How can I see what collective ops are used during auto-parallelisation #22813

Joshuaalbert · 2024-08-01T10:23:45Z

Joshuaalbert
Aug 1, 2024

How can I see what collective ops are applied under the hood? I would hope to see a ppermute here, as that's all rolling does. make_jaxpr doesn't seem to show anything about the decisions made by the compiler for automatically parallelising.

import os

import jax
import jax.numpy as jnp
from jax import config, lax
from jax.experimental.mesh_utils import create_device_mesh

# Set num jax devices
os.environ["XLA_FLAGS"] = f"--xla_force_host_platform_device_count={os.cpu_count()}"

from jax.sharding import Mesh
from jax.sharding import PartitionSpec
from jax.sharding import NamedSharding


def create_mesh(shape, axis_names, devices=None):
    mesh_devices = create_device_mesh(mesh_shape=shape, devices=devices)
    mesh = Mesh(mesh_devices, axis_names=axis_names)
    return mesh


def main(num_shards):
    P = PartitionSpec

    mesh = create_mesh(shape=(num_shards,), axis_names=('a',))

    @jax.jit
    def f(x):
        x = lax.with_sharding_constraint(x, NamedSharding(mesh, P('a')))
        return jnp.roll(x, 1)

    x = jnp.arange(num_shards * 2)

    print(jax.make_jaxpr(f)(x))

if __name__ == '__main__':
    main(num_shards=os.cpu_count())

This gives

{ lambda ; a:i64[24]. let
    b:i64[24] = pjit[
      name=f
      jaxpr={ lambda ; c:i64[24]. let
          d:i64[24] = sharding_constraint[
            resource_env=ResourceEnv(mesh=Mesh(), ())
            sharding=NamedSharding(mesh=Mesh('a': 12), spec=PartitionSpec('a',))
            unconstrained_dims=set()
          ] c
          e:i64[24] = pjit[
            name=_roll_static
            jaxpr={ lambda ; f:i64[24]. let
                g:i64[1] = slice[
                  limit_indices=(24,)
                  start_indices=(23,)
                  strides=(1,)
                ] f
                h:i64[23] = slice[
                  limit_indices=(23,)
                  start_indices=(0,)
                  strides=(1,)
                ] f
                i:i64[24] = concatenate[dimension=0] g h
              in (i,) }
          ] d
        in (e,) }
    ] a
  in (b,) }

Answered by yashk2810

Aug 1, 2024

Try: print(f.lower(x).compile().as_text(). This will print the HLO with collectives inside it.

View full answer

yashk2810 · 2024-08-01T13:53:19Z

yashk2810
Aug 1, 2024
Collaborator

Try: print(f.lower(x).compile().as_text(). This will print the HLO with collectives inside it.

0 replies

Joshuaalbert · 2024-08-01T17:22:25Z

Joshuaalbert
Aug 1, 2024
Author

Cool, this worked. With this I see a permute, which it must have figured out from a slice->concatenate rule.

  %collective-permute = s64[1]{0} collective-permute(s64[1]{0} %slice.2), channel_id=1, source_target_pairs={{0,1},{1,2},{2,3},{3,4},{4,5},{5,6},{6,7},{7,8},{8,9},{9,10},{10,11},{11,0}}

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How can I see what collective ops are used during auto-parallelisation #22813

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 2 comments

{{title}}

{{title}}

Select a reply

How can I see what collective ops are used during auto-parallelisation #22813

Joshuaalbert Aug 1, 2024

Replies: 2 comments

yashk2810 Aug 1, 2024 Collaborator

Joshuaalbert Aug 1, 2024 Author

Joshuaalbert
Aug 1, 2024

yashk2810
Aug 1, 2024
Collaborator

Joshuaalbert
Aug 1, 2024
Author