microsoft / DeepSpeed Public

Fork 3.9k
Star 33.6k

Pull requests: microsoft/DeepSpeed

Labels 32 Milestones 0

143 Open 2,653 Closed

Pull requests list

Disable nvtx decorator to avoid graph break

#5697 opened Jun 25, 2024 by tohtana

sequence parallel with communication overlap

#5691 opened Jun 21, 2024 by inkcherry

ENV var added for recaching in INF Unit tests

#5688 opened Jun 20, 2024 by raza-sikander

2

inference unit test injectionPolicy split world_size to multiple tests

#5687 opened Jun 20, 2024 by oelayan7

2

Switch from torch.cuda.amp.custom_fwd to torch.amp.custom_fwd(device=...)

#5684 opened Jun 18, 2024 by loadams • Draft

Switch what versions of python are supported

#5676 opened Jun 17, 2024 by loadams • Draft

[CPU] add fp16 support to shm inference_all_reduce

#5669 opened Jun 17, 2024 by delock • Queued

4

Update xpu-max1100.yml with new config and add some tests

#5668 opened Jun 17, 2024 by Liangliang-Ma

6

Add and Remove ZeRO 3 Hooks

#5658 opened Jun 13, 2024 by jomayeri

4

Unpin transformers version

#5650 opened Jun 12, 2024 by loadams

reduce all-to-all communication volume when both expert and non-expert are tensor-parallel

#5626 opened Jun 7, 2024 by taozhiwei

Hybrid Offloading for ZeRO3

#5625 opened Jun 7, 2024 by tohtana • Draft

fix: quantization with DeepSpeed HE

#5624 opened Jun 6, 2024 by Atry

2

Add support for Phi-3 small to FastGen

#5614 opened Jun 4, 2024 by adk9 • Draft

[INF] Enable torch compile for inference

#5612 opened Jun 4, 2024 by oelayan7

6

Upgrade HPU image to v1.16.2.

#5610 opened Jun 4, 2024 by vshekhawat-hlab

5

Add an argument to enable the injection of missing state during the conversion of universal checkpoints

#5608 opened Jun 3, 2024 by xylian86

[CPU] Allow deepspeed.comm.inference_all_reduce in torch.compile graph

#5604 opened Jun 3, 2024 by delock

2

state_dict_factory: llama checkpoint - support SWIGLU

#5601 opened Jun 2, 2024 by nelyahu

2

FastGen H100 MoE support: Add PyTorch multi-gemm MOE implementation

#5586 opened May 29, 2024 by HeyangQin

7

Update profiler.py

#5584 opened May 29, 2024 by gameofdimension

reduce cpu host overhead when using moe

#5578 opened May 29, 2024 by ranzhejiang

7

Reuse KV cache of prefixes

#5572 opened May 27, 2024 by tohtana • Draft

3

Add support for Microsoft Phi-3 model to DeepSpeed-FastGen

#5559 opened May 21, 2024 by adk9

Add chatglm2 & chatglm3 autotp

#5540 opened May 16, 2024 by Yejing-Lai

4
