Need help Interpreting Perfetto for Profiling #24712

amishra791 · 2024-11-05T08:33:23Z


Hi, I profiled two code snippets on a TPU v4-8 VM that I am having trouble making sense of.

The first code snippet is the following:

import jax
import jax.numpy as jnp
from jax.sharding import NamedSharding, PartitionSpec as P, Mesh
import numpy as np

@jax.jit
def dot(input_bd, matrix_dm):
    return jnp.dot(input_bd, matrix_dm)

mesh = Mesh(
    devices=np.array(jax.devices()).reshape(4),
    axis_names=("data",),  # one-element tuple; ("data") is just the string "data"
)

batch_sharding = NamedSharding(mesh, P("data"))
replicated_sharding = NamedSharding(mesh, P())

weight = jax.device_put(jax.random.uniform(jax.random.key(0), (2048, 4096)), replicated_sharding)
sample = jax.device_put(jax.random.uniform(jax.random.key(1), (256, 2048)), batch_sharding)

with jax.profiler.trace("/tmp/jax-trace", create_perfetto_link=True):
    output = dot(sample, weight)
    output.block_until_ready()

Even though I sharded the array across 4 devices, the profiler shows computation on 8 devices. Why is that?
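For reference, here is a minimal sketch of how the device count and shard placement could be checked from Python, independently of the profiler. It uses only public JAX APIs. The helper name `describe_placement` and the array shape are hypothetical, and the mesh spans however many devices JAX reports rather than a hard-coded 4.

```python
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P


def describe_placement(array):
    """Return (device, shard shape) pairs for every addressable shard."""
    return [(shard.device, shard.data.shape) for shard in array.addressable_shards]


devices = jax.devices()
print("devices visible to JAX:", len(devices))

# 1-D mesh over every visible device, sharding only the batch dimension.
mesh = Mesh(np.array(devices), axis_names=("data",))
batch = 8 * len(devices)  # divisible by the mesh size by construction
sample = jax.device_put(jnp.ones((batch, 16)), NamedSharding(mesh, P("data")))

for device, shape in describe_placement(sample):
    print(device, shape)
```

If this reports 4 devices while the trace shows 8 rows, the difference would come from how the profiler groups hardware units, not from the sharding itself. That reading is an assumption and has not been confirmed in this thread.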

The second code snippet is the following:

import jax
import jax.numpy as jnp
from jax.sharding import PartitionSpec as P, NamedSharding

with jax.profiler.trace("/tmp/foo", create_perfetto_link=True):
    batch_mesh = jax.make_mesh((4,), ('x',))
    batch_sharding = NamedSharding(batch_mesh, P('x', None, None))
    x = jnp.ones((4, 8, 64))
    x_sharded = jax.device_put(x, batch_sharding)

    y = jnp.ones((64, 128))
    model_mesh = jax.make_mesh((4,), ('y',))
    model_sharding = NamedSharding(model_mesh, P(None, 'y'))
    y_sharded = jax.device_put(y, model_sharding)

    sharded_mat_mul = jnp.dot(x_sharded, y_sharded)
    sharded_mat_mul.block_until_ready()

As in the first example, the trace shows 8 TPU devices even though I only shard across 4. My main question here, though, is why lax.ones takes up so much time, finishing even after the actual computation performed across all the TPU devices.
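One way to narrow this down is to keep array creation and first-call compilation out of the traced region. The sketch below assumes, without confirmation from the trace, that those two steps account for the extra time. The function name `profile_matmul` and the log directory are hypothetical, and the mesh uses all visible devices so the shapes stay divisible on any machine.

```python
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

n = jax.device_count()
mesh = Mesh(np.array(jax.devices()), axis_names=("x",))

# Build and place the inputs before tracing so their creation is not profiled.
x = jax.device_put(jnp.ones((n, 8, 64)), NamedSharding(mesh, P("x", None, None)))
y = jax.device_put(jnp.ones((64, 128 * n)), NamedSharding(mesh, P(None, "x")))

matmul = jax.jit(jnp.dot)

# Warm-up call: compilation happens here, outside the trace.
matmul(x, y).block_until_ready()


def profile_matmul(log_dir="/tmp/jax-trace-warm"):  # hypothetical path
    """Trace only the already-compiled matmul."""
    with jax.profiler.trace(log_dir):
        out = matmul(x, y)
        out.block_until_ready()
    return out
```

Calling `profile_matmul()` then records only the compiled matmul. Comparing that trace with the original one should show whether the long `ones` span belongs to setup or to the computation.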

I've attached the trace file too: perfetto_trace.json.gz
