Fix for GPU batched optimization with Pauli noise #1482
hhorii left a comment
I think NoiseModel should not have a new method pauli_only(). StateChunk<state_t>::apply_ops_multi_shots() can efficiently check whether all the quantum errors consist only of Pauli gates by iterating over NoiseModel.opset().
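The check suggested above could look roughly like the following. This is only a minimal sketch under stated assumptions: `OpSetSketch` and `only_pauli_gates` are hypothetical names, the operation set is modeled as a plain set of gate-name strings, and the list of names treated as Pauli gates is an assumption. It does not reproduce the real Aer types.

```cpp
#include <algorithm>
#include <set>
#include <string>

// Hypothetical stand-in for an operation set: only the gate names that the
// noise model can emit. The real Aer operation-set type is not reproduced.
struct OpSetSketch {
  std::set<std::string> gates;
};

// Returns true when every gate name in the set is a Pauli-type gate.
// Assumption: "id", "x", "y", "z" and "pauli" are the Pauli-type names.
// An empty set also returns true, since it contains no non-Pauli gate.
inline bool only_pauli_gates(const OpSetSketch &opset) {
  static const std::set<std::string> pauli_names = {"id", "x", "y", "z",
                                                    "pauli"};
  return std::all_of(opset.gates.begin(), opset.gates.end(),
                     [](const std::string &name) {
                       return pauli_names.count(name) > 0;
                     });
}
```

A caller would evaluate this once before the shot loop and store the result in a flag, so no extra method is needed on the noise model itself.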
    }

    bool QuantumError::pauli_only(void) const
I believe that this method is not necessary.
    model = NoiseModel(js);
    }

    bool NoiseModel::pauli_only(void) const
This method name is odd because NoiseModel may also contain other kinds of noise, such as readout errors.
I think callers can check whether the gates are Pauli-only by inspecting NoiseModel.opset().
Reply: I restored the prior code that checks whether the sampled gates contain only Pauli gates.
      uint_t rng_seed,
    - bool final_ops);
    + bool final_ops,
    + bool pauli_only);
Can we rename pauli_only to batched_pauli_ops?
    * ``num_threads_per_group`` (int): This option sets the number of
      threads per group. For GPU simulation, this value sets number of
      threads per GPU. This parameter is used to optimize Pauli noise
      simulation with multiple-GPUs (Default: 1).
"group" sounds too general. Can we rename this to num_threads_per_device?
Reply: Changed to num_threads_per_device.
    int_t i;
    int_t i_begin,n_shots;

    bool pauli_only = noise.pauli_only();
I believe that pauli_only should be renamed to batched_pauli_ops.
Also, it is better to check NoiseModel.opset() here instead of adding a new method pauli_only().
hhorii left a comment
LGTM. As I understand it, this change does not need tests because it only improves performance by parallelizing noise sampling.
Summary
This PR is a fix for issue #1473.
Details and comments
Added an option to allocate multiple threads per GPU. Nested parallelism is applied to parallelize runtime noise sampling for Pauli noise in batched optimization.
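To illustrate the idea of nested parallelism for noise sampling, here is a minimal sketch. It is not the code in this PR: `sample_pauli_noise`, its parameters, the per-shot seeding scheme, and the integer encoding of Pauli labels are all assumptions made for illustration. The outer loop stands for one thread per device and the inner loop for the threads of that device. The OpenMP pragmas take effect only when compiled with OpenMP enabled; otherwise the loops run serially and give the same result.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Samples one Pauli label per shot (0 = I, 1 = X, 2 = Y, 3 = Z).
// Hypothetical helper: each shot uses its own generator seeded with
// base_seed + shot index, so the result does not depend on how shots are
// split across devices or threads. Both thread counts must be positive.
inline std::vector<int> sample_pauli_noise(std::int64_t num_shots,
                                           int num_devices,
                                           int num_threads_per_device,
                                           std::uint64_t base_seed,
                                           double error_prob) {
  std::vector<int> sampled(static_cast<std::size_t>(num_shots), 0);

  // Outer level: one thread per device, each owning a contiguous shot range.
#pragma omp parallel for num_threads(num_devices)
  for (int dev = 0; dev < num_devices; ++dev) {
    const std::int64_t begin = num_shots * dev / num_devices;
    const std::int64_t end = num_shots * (dev + 1) / num_devices;

    // Inner level: several threads sample noise for this device's shots.
    // With OpenMP, nested regions must be enabled at runtime to get more
    // than one thread here.
#pragma omp parallel for num_threads(num_threads_per_device)
    for (std::int64_t shot = begin; shot < end; ++shot) {
      std::mt19937_64 rng(base_seed + static_cast<std::uint64_t>(shot));
      std::uniform_real_distribution<double> coin(0.0, 1.0);
      if (coin(rng) < error_prob) {
        std::uniform_int_distribution<int> which(1, 3);
        sampled[static_cast<std::size_t>(shot)] = which(rng);
      }
    }
  }
  return sampled;
}
```

Seeding per shot rather than per thread is a design choice of this sketch: it keeps the sampled noise reproducible regardless of the number of devices or threads per device.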