Commit 9999dfb

doichanj, tungbq, hhorii, ibm-wakizaka, mergify[bot]
authored
Adding circuit executor classes and shot-branching (#1766)
* adding executor classes for parallel simulations
* fix merge conflicts
* simplify sub-classes
* fix unformatted code
* fix unformatted code again
* Fix MPI code
* Fix shot-branching was not enabled with noise sampling
* Fix clang format
* set_num_qubits to virtual function to set correct num qubits on matrix
* reflecting review comments
* reuse of random number generator
* recover save_data_per_shot
* add missed omp threads setting in statevector, change class hierarchy
* Fix performance issue of GPU shot-branching
* move fusion outside of loop for non noise dynamic circuits
* fix shot-branching options in aer_compiler.py
* save codes before merge
* Fix format
* Fix multi-chunk with cuStateVec
* format
* format
* add better multi-GPU distribution for shot-branching
* fix format
* Changed option shot_branching_enable=False by default, add shot_branching_sampling_enable (False by default), add test cases for shot-branching
* format
* format test_shot_branching.py
* Changed OpenMP threading for shot-branching
* mutable to matrix and param buffer
* format
* add target_gpus option
* Remove Python 3.7 from Github actions (#1819)

Since 0.13.0, Aer does not support Python 3.7. This commit removes github actions for CI.

* Removing python 3.7 from test workflow
* Removing python 3.7 from build workflow
* Removing python 3.7 from deploy workflow
* Removing python 3.7 from tox
* revert
* Remove python 3.7 from pyproject.toml
* Remove python 3.7 from pyproject.toml - tool

---------

Co-authored-by: Hiroshi Horii <[email protected]>

* Fix missing dynamic link path for CUDA runtime and cuQuantum libraries (#1877)

Co-authored-by: Hiroshi Horii <[email protected]>

* Fix OpenMP nested parallel (#1880)
* Fix OpenMP nested parallel
* add comment in release note
* fix true and false
* fix format

---------

Co-authored-by: Hiroshi Horii <[email protected]>

* Support u3 gate application in Aer runtime API (#1876)
* Support u3 gate application
* Apply clang-format
* Revert clang-format for aer_runtime_api.h
* Add release note

---------

Co-authored-by: Hiroshi Horii <[email protected]>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* Fix required_memory_mb (#1881)
* Fix required_memory_mb
* add release note

---------

Co-authored-by: Hiroshi Horii <[email protected]>

* format
* format
* format
* comment out target_gpu setting for non-GPU
* comment out target_gpu setting for non-GPU
* Remove `PulseSimulator` (#1884)

Since 0.12, Qiskit-Aer has issued deprecation warnings on use of PulseSimulator. Because 0.13 will be released more than 3 months after 0.12, Qiskit-Aer will stop supporting pulse simulation.

* first pass at removing pulse simulator
* autoformat with black
* remove ref to aer pulse in docs
* fix lint issues
* remove pulse rst
* remove pulse tests
* add release note
* remove open pulse from CMakeLists.txt
* remove pulse tests
* remove remaining pulse codes

---------

Co-authored-by: AngeloDanducci <[email protected]>

* Fix an issue in `aer_state_initialize()` of C API (#1885)

Correct C API `aer_state_initialize` to take an argument of `handler`.

* update aer_state_initialize API
* add reno
* fix MPI shot-branching sampling
* fix unmerged file
* remove conflict
* rerun tests
* recover files
* remove conflict
* fix non-gpu
* update release note

---------

Co-authored-by: Tung Bui (Leo) <[email protected]>
Co-authored-by: Hiroshi Horii <[email protected]>
Co-authored-by: Ryo Wakizaka <[email protected]>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: AngeloDanducci <[email protected]>
1 parent e842b4c commit 9999dfb

53 files changed (+10881 -6526 lines changed)

qiskit_aer/backends/aer_compiler.py

+3
@@ -465,6 +465,8 @@ def compile_circuit(circuits, basis_gates=None, optypes=None):
     "chunk_swap_buffer_qubits": (int, np.integer),
     "batched_shots_gpu": (bool, np.bool_),
     "batched_shots_gpu_max_qubits": (int, np.integer),
+    "shot_branching_enable": (bool, np.bool_),
+    "shot_branching_sampling_enable": (bool, np.bool_),
     "num_threads_per_device": (int, np.integer),
     "statevector_parallel_threshold": (int, np.integer),
     "statevector_sample_measure_opt": (int, np.integer),
@@ -488,6 +490,7 @@ def compile_circuit(circuits, basis_gates=None, optypes=None):
     "use_cuTensorNet_autotuning": (bool, np.bool_),
     "parameterizations": (list),
     "fusion_parallelization_threshold": (int, np.integer),
+    "target_gpus": (list),
 }
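The two hunks above register the new options in a table that maps each option name to its accepted types. As a rough illustration of how such a table can be used to reject badly typed options, here is a minimal sketch. `OPTION_TYPES` and `validate_option` are hypothetical names, not part of `aer_compiler.py`; the table here uses only built-in types, whereas the real one also accepts NumPy scalar types; and rejecting `bool` for integer options is a choice made for this sketch only.

```python
# Hypothetical sketch: validate user options against a name -> types table.
# This is NOT the code in aer_compiler.py; names and behavior are assumptions.
OPTION_TYPES = {
    "shot_branching_enable": (bool,),
    "shot_branching_sampling_enable": (bool,),
    "num_threads_per_device": (int,),
    "target_gpus": (list,),
}


def validate_option(name, value):
    """Return value if it matches the declared types, else raise TypeError."""
    expected = OPTION_TYPES.get(name)
    if expected is None:
        # Options missing from the table pass through unchecked in this sketch.
        return value
    # bool is a subclass of int in Python, so reject it explicitly for
    # options that are declared as integers only.
    if isinstance(value, bool) and bool not in expected:
        raise TypeError(f"{name} expects {expected}, got bool")
    if not isinstance(value, expected):
        raise TypeError(
            f"{name} expects {expected}, got {type(value).__name__}"
        )
    return value
```

With this table, `validate_option("target_gpus", [0, 2])` returns the list unchanged, while `validate_option("target_gpus", 0)` raises `TypeError`.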

qiskit_aer/backends/aer_simulator.py

+31
@@ -170,6 +170,10 @@ class AerSimulator(AerBackend):
   If AerSimulator is built with cuStateVec support, cuStateVec APIs are enabled
   by setting ``cuStateVec_enable=True``.
 
+* ``target_gpus`` (list): A list of GPU IDs (starting from 0) that sets
+  the target GPUs used for the simulation.
+  If this option is not specified, all the available GPUs are used for
+  chunks/shots distribution.
 
 **Additional Backend Options**
 
@@ -287,6 +291,30 @@ class AerSimulator(AerBackend):
   threads per GPU. This parameter is used to optimize Pauli noise
   simulation with multiple-GPUs (Default: 1).
 
+* ``shot_branching_enable`` (bool): This option enables/disables
+  applying the shot-branching technique to speed up multi-shot
+  simulations of dynamic circuits or of circuits with noise models
+  (Default: False).
+  The simulation starts from a single state shared by multiple shots, and
+  the state is branched dynamically at runtime.
+  This option can reduce the number of shot runs if there are fewer branches
+  than the total number of shots.
+  This option is available for ``"statevector"``, ``"density_matrix"``
+  and ``"tensor_network"``.
+
+* ``shot_branching_sampling_enable`` (bool): This option enables/disables
+  applying sampling measure if the input circuit has all the measure
+  operations at the end of the circuit. (Default: False).
+  Because a measure operation branches the state into 2 states, it is not
+  efficient to apply branching for measure.
+  Sampling measure improves the speed of getting counts for multiple shots
+  sharing the same state.
+  Note that the counts obtained by sampling measure may not be the same as
+  the counts calculated by multiple measure operations,
+  because sampling measure takes only one random number per shot.
+  This option is available for ``"statevector"``, ``"density_matrix"``
+  and ``"tensor_network"``.
+
 * ``accept_distributed_results`` (bool): This option enables storing
   results independently in each process (Default: None).
 
@@ -709,6 +737,9 @@ def _default_options(cls):
     batched_shots_gpu=False,
     batched_shots_gpu_max_qubits=16,
     num_threads_per_device=1,
+    # multi-shot branching
+    shot_branching_enable=False,
+    shot_branching_sampling_enable=False,
     # statevector options
     statevector_parallel_threshold=14,
     statevector_sample_measure_opt=10,
qiskit_aer/backends/wrappers/aer_controller_binding.hpp

+17-3
@@ -182,6 +182,11 @@ void bind_aer_controller(MODULE m) {
       [](Config &config, uint_t val) {
         config.num_threads_per_device.value(val);
       });
+  // # multi-shot branching
+  aer_config.def_readwrite("shot_branching_enable",
+                           &Config::shot_branching_enable);
+  aer_config.def_readwrite("shot_branching_sampling_enable",
+                           &Config::shot_branching_sampling_enable);
   // # statevector options
   aer_config.def_readwrite("statevector_parallel_threshold",
                            &Config::statevector_parallel_threshold);
@@ -403,6 +408,10 @@ void bind_aer_controller(MODULE m) {
       [](Config &config, uint_t val) {
         config.extended_stabilizer_norm_estimation_default_samples.value(val);
       });
+  aer_config.def_property(
+      "target_gpus",
+      [](const Config &config) { return config.target_gpus.val; },
+      [](Config &config, reg_t val) { config.target_gpus.value(val); });
 
   aer_config.def(py::pickle(
       [](const AER::Config &config) {
@@ -488,12 +497,14 @@ void bind_aer_controller(MODULE m) {
             write_value(77, config.unitary_parallel_threshold),
             write_value(78, config.memory_blocking_bits),
             write_value(
-                79,
-                config.extended_stabilizer_norm_estimation_default_samples));
+                79, config.extended_stabilizer_norm_estimation_default_samples),
+            write_value(80, config.shot_branching_enable),
+            write_value(81, config.shot_branching_sampling_enable),
+            write_value(82, config.target_gpus));
       },
       [](py::tuple t) {
         AER::Config config;
-        if (t.size() != 79)
+        if (t.size() != 82)
           throw std::runtime_error("Invalid serialization format.");
 
         read_value(t, 0, config.shots);
@@ -580,6 +591,9 @@ void bind_aer_controller(MODULE m) {
         read_value(t, 78, config.memory_blocking_bits);
         read_value(t, 79,
                    config.extended_stabilizer_norm_estimation_default_samples);
+        read_value(t, 80, config.shot_branching_enable);
+        read_value(t, 81, config.shot_branching_sampling_enable);
+        read_value(t, 82, config.target_gpus);
         return config;
       }));
 }
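The pickle support above follows a common pattern: one function flattens the config into a positional tuple, and the other rebuilds the config after checking the tuple's size. The Python sketch below mirrors that pattern under stated assumptions. `ToyConfig`, `get_state` and `set_state` are hypothetical names; only four of the fields are modeled; and storing each slot as an `(index, value)` pair is an assumption, since the diff does not show what `write_value` actually stores.

```python
class ToyConfig:
    """Hypothetical stand-in for a config object; not the AER::Config class."""

    def __init__(self):
        self.shots = 1024
        self.shot_branching_enable = False
        self.shot_branching_sampling_enable = False
        self.target_gpus = None  # None means the option was not set


# Field order defines the slot index of each value in the serialized tuple.
FIELDS = (
    "shots",
    "shot_branching_enable",
    "shot_branching_sampling_enable",
    "target_gpus",
)


def get_state(config):
    """Flatten a config into a tuple of (index, value) slots."""
    return tuple((i, getattr(config, name)) for i, name in enumerate(FIELDS))


def set_state(state):
    """Rebuild a config from a tuple, rejecting unexpected layouts."""
    if len(state) != len(FIELDS):
        raise RuntimeError("Invalid serialization format.")
    config = ToyConfig()
    for index, name in enumerate(FIELDS):
        stored_index, value = state[index]
        if stored_index != index:
            raise RuntimeError("Invalid serialization format.")
        setattr(config, name, value)
    return config
```

The design point this illustrates is that adding a field means touching three places together: the writer, the reader, and the size check. That is what the hunks above do for slots 80 to 82.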
@@ -0,0 +1,30 @@
+---
+features:
+  - |
+    This release restructures ``State`` classes.
+    It adds circuit executor classes that run a circuit and manage multiple
+    states for multi-shots simulations or multi-chunk simulations for large
+    number of qubits.
+    Previously the ``StateChunk`` class managed multiple chunks for multi-shots or
+    multi-chunk simulations, but now the ``State`` class only has one state
+    and all the parallelization code is moved to ``Executor`` classes.
+    Now all ``State`` classes are independent from parallelization.
+    Also some of the functions in ``Aer::Controller`` class are moved to
+    ``CircuitExecutor::Executor`` class.
+  - |
+    Shot-branching technique that accelerates dynamic circuits simulations
+    is implemented with restructured ``Executor`` classes.
+    Shot-branching is currently applicable to statevector, density_matrix
+    and tensor_network methods.
+    Shot-branching provides dynamic distribution of multi-shots
+    by branching states when applying dynamic operations
+    (measure, reset, initialize, noises).
+    By default ``shot_branching_enable`` is disabled.
+    And by setting ``shot_branching_sampling_enable``, final measures will be
+    done by sampling measure, which speeds up getting counts for multiple shots
+    sharing the same state.
+  - |
+    A new option ``target_gpus`` is added to select GPUs used for the
+    simulation. A list of target GPU IDs is passed; for example,
+    ``target_gpus=[0, 2]`` selects 2 GPUs to be used.
+    Without this option, all the available GPUs are used.
