
Is there a way to control the pjit to dynamically partition the input IDs into different ranks? #21902

Open
2 tasks done
MoFHeka opened this issue Jun 16, 2024 · 8 comments
Labels
needs info More information is required to diagnose & prioritize the issue. question Questions for the JAX team

Comments

@MoFHeka

MoFHeka commented Jun 16, 2024

I need to build a model like this:
There is a very large, distributed, dynamic-shape embedding, which can be thought of as a hash table.
In every data-parallel (DP) rank, when a worker receives its input IDs, it has to forward some of those IDs to other workers for the table lookup.
For example, rank 0 gets [0,1,2,2] and rank 1 gets [0,3,2,1], but the hash table on rank 0 only stores keys 0, 2, ..., while rank 1 stores 1, 3, .... This requires an alltoallv operator: rank 0 sends [1] and receives [0,2], and rank 1 sends [0,2] and receives [1]. The looked-up values are then returned to their original ranks.

Note: in practice this procedure uses send/recv operators to implement an asynchronous ring alltoall, so the table lookups can overlap with the transfers.
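
To make the exchange concrete, here is a minimal host-side NumPy sketch of the routing being described (assuming, purely for illustration, that key ownership is decided by `id % num_ranks`); because the per-rank bucket sizes differ, this is an alltoallv rather than a plain alltoall:

```python
import numpy as np

num_ranks = 2
inputs = {0: np.array([0, 1, 2, 2]), 1: np.array([0, 3, 2, 1])}

# Each rank buckets its local IDs by owning rank; bucket sizes vary per peer.
send = {src: {dst: inputs[src][inputs[src] % num_ranks == dst]
              for dst in range(num_ranks)}
        for src in range(num_ranks)}

# The alltoallv step: every rank gathers the buckets addressed to it.
recv = {dst: np.concatenate([send[src][dst] for src in range(num_ranks)])
        for dst in range(num_ranks)}

print(recv[0])  # IDs rank 0 looks up locally: [0 2 2 0 2]
print(recv[1])  # IDs rank 1 looks up locally: [1 3 1]
```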

Please:

  • Check for duplicate requests.
  • Describe your goal, and if possible provide a code snippet with a motivating example.
@MoFHeka MoFHeka added the enhancement New feature or request label Jun 16, 2024
@superbobry
Member

superbobry commented Jun 17, 2024

Hi @MoFHeka, could you clarify how your question relates to JAX? JAX does not have a concept of a rank (other than array rank), so jit probably cannot do the partitioning you are describing.

@superbobry superbobry added needs info More information is required to diagnose & prioritize the issue. question Questions for the JAX team and removed enhancement New feature or request labels Jun 17, 2024
@MoFHeka
Author

MoFHeka commented Jun 17, 2024

@superbobry Sorry, my mistake. A simpler way to put it: when I use pmap, how do I implement an all2allv operator inside the transformed code? The data-parallel (batch) dimension requires exchanging dynamically sized data.

For example, GPU 0 and GPU 1 run data-parallel training. GPU 0 gets [0,1,2,2] and GPU 1 gets [0,3,2,1], but the hash table on GPU 0 only stores keys 0, 2, ..., while GPU 1 stores 1, 3, .... This requires an alltoallv operator: GPU 0 sends [1] and receives [0,2], and GPU 1 sends [0,2] and receives [1]. The looked-up values are then sent back to their original GPUs.
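
For contrast, here is a minimal sketch of the static-shape primitive pmap does support, jax.lax.all_to_all, which exchanges equally sized chunks between devices and therefore cannot express the variable-size alltoallv above by itself:

```python
from functools import partial

import jax
import jax.numpy as jnp

n_dev = jax.local_device_count()

@partial(jax.pmap, axis_name='dev')
def exchange(ids):
    # Splits the per-device row into n_dev equal chunks and swaps chunk i
    # with device i; every device must send the same number of elements.
    return jax.lax.all_to_all(ids, 'dev', split_axis=0, concat_axis=0)

ids = jnp.arange(n_dev * n_dev).reshape(n_dev, n_dev)  # equal share per device
print(exchange(ids))
```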

@yashk2810
Member

yashk2810 commented Jun 17, 2024

Try using shard_map? https://jax.readthedocs.io/en/latest/notebooks/shard_map.html

@MoFHeka
Author

MoFHeka commented Jun 17, 2024

@yashk2810 Thanks for your answer, but I don't know whether shard_map supports exchanging tensors whose batch dimension has a different, dynamic size on each GPU. I also need to lower the collective operator (alltoallv) into send/recv (peer-to-peer) expressions so that the communication can be overlapped with the computation operators.
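
For reference, a minimal static-shape sketch with shard_map: pad each per-destination bucket to a fixed capacity and exchange the valid lengths alongside the payload. CAPACITY and the axis name 'dp' are assumptions for illustration, not anything the thread prescribes:

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, PartitionSpec as P
from jax.experimental.shard_map import shard_map

CAPACITY = 8                                    # hypothetical per-peer bound
mesh = Mesh(np.array(jax.devices()), ('dp',))
n_dev = jax.device_count()

def _exchange(padded_ids, counts):
    # Per-shard shapes: padded_ids (n_dev, CAPACITY), counts (n_dev,);
    # row d holds the (padded) IDs destined for device d.
    out_ids = jax.lax.all_to_all(padded_ids, 'dp', split_axis=0, concat_axis=0)
    out_counts = jax.lax.all_to_all(counts, 'dp', split_axis=0, concat_axis=0)
    return out_ids, out_counts

exchange = shard_map(_exchange, mesh=mesh,
                     in_specs=(P('dp'), P('dp')), out_specs=(P('dp'), P('dp')))

padded = jnp.zeros((n_dev * n_dev, CAPACITY), dtype=jnp.int32)  # padded buckets
counts = jnp.zeros((n_dev * n_dev,), dtype=jnp.int32)           # valid lengths
swapped_ids, swapped_counts = exchange(padded, counts)
```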

@superbobry
Member

I will let Yash comment on dynamic shapes in shard_map, but I suspect that this use-case is not supported (since most of JAX assumes static shapes atm).

Re "transform the collective": I don't think we have send/recv operations in lax, so I'm not quite sure what kind of transform you are talking about. Are these custom JAX operations you implemented?

@MoFHeka
Author

MoFHeka commented Jun 18, 2024

@superbobry Assume I already have JAX lookup-table operations.
I need to split the ring alltoall collective into several steps so that, while one chunk of data is being transferred, the chunks already received are querying the hash table, hiding the transmission overhead.

Dynamic shapes are crucial in recommendation algorithms, and we now need to explore large generative recommendation models, so JAX or PyTorch is needed.

[Image: ring all-to-all visualization]
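
A rough sketch of that ring pattern using the static-shape primitives JAX does have: jax.lax.ppermute inside a pmap/shard_map context, where at each step the device looks up the chunk it currently holds while the buffer is rotated one hop. The padded buffer, the lookup callable, and the axis name are stand-ins, not real APIs from this thread:

```python
import jax
import jax.numpy as jnp

def ring_lookup(buf, axis_name, axis_size, lookup):
    # buf: fixed-size (padded) buffer of IDs currently resident on this device.
    # lookup: stand-in for the local hash-table query.
    perm = [(i, (i + 1) % axis_size) for i in range(axis_size)]
    results = []
    for _ in range(axis_size):
        results.append(lookup(buf))                   # query the resident chunk...
        buf = jax.lax.ppermute(buf, axis_name, perm)  # ...while rotating one hop;
        # XLA may overlap this transfer with the next iteration's lookup.
    return jnp.stack(results)
```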

@MoFHeka
Author

MoFHeka commented Jun 25, 2024

@yashk2810 @superbobry I'm sorry to bother you, but I found that jax.lax.ppermute requires a tensor with the same amount of data on every shard, whereas I need to exchange a different amount of data per shard.
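
A short sketch of the usual workaround given that constraint: permute a fixed-capacity padded buffer together with the number of valid entries, then mask after the shuffle. CAPACITY and the axis name 'dp' are assumptions for illustration:

```python
import jax
import jax.numpy as jnp

CAPACITY = 16  # hypothetical upper bound on IDs moved per step

def shift_one_hop(padded_ids, count, axis_size):
    # padded_ids: (CAPACITY,) buffer; count: scalar number of valid entries.
    perm = [(i, (i + 1) % axis_size) for i in range(axis_size)]
    padded_ids = jax.lax.ppermute(padded_ids, 'dp', perm)
    count = jax.lax.ppermute(count, 'dp', perm)
    mask = jnp.arange(CAPACITY) < count      # keep only the valid entries
    return jnp.where(mask, padded_ids, -1), count
```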

@superbobry
Member

I will defer the answer to Yash, as I'm not yet very familiar with sharding APIs in JAX.
