InternLM-v0.2.1dev20230908
Highlights
- fix a bug that could produce NaN values when the gradient all-reduce is overlapped with the backward pass
- support timeout wrapper and runtime diagnosis
- support readthedocs Chinese version
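
To illustrate what a timeout wrapper can look like, here is a minimal sketch. It is not the implementation added in #286; the names `with_timeout`, `CallTimeoutError`, `fast_step`, and `slow_step` are hypothetical and chosen only for this example. It assumes a thread-based approach: the call runs in a daemon thread and the caller stops waiting once the time budget is spent.

```python
import functools
import threading
import time


class CallTimeoutError(RuntimeError):
    """Raised when a wrapped call exceeds its time budget (hypothetical name)."""


def with_timeout(seconds):
    """Decorator sketch: run the call in a daemon thread, wait at most `seconds`."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            outcome = {}

            def target():
                # Capture either the return value or the exception so the
                # caller's thread can re-raise it.
                try:
                    outcome["value"] = func(*args, **kwargs)
                except BaseException as exc:
                    outcome["error"] = exc

            worker = threading.Thread(target=target, daemon=True)
            worker.start()
            worker.join(seconds)
            if worker.is_alive():
                # The worker is not killed; the caller only stops waiting for it.
                raise CallTimeoutError(f"{func.__name__} exceeded {seconds}s")
            if "error" in outcome:
                raise outcome["error"]
            return outcome["value"]

        return wrapper

    return decorator


@with_timeout(0.2)
def fast_step():
    return "ok"


@with_timeout(0.2)
def slow_step():
    time.sleep(2.0)  # stands in for a hung storage or communication call
    return "finished too late"
```

Calling `fast_step()` returns normally, while `slow_step()` raises `CallTimeoutError` after roughly 0.2 seconds. Note that this approach abandons the worker thread instead of terminating it, so it suits detection and alerting more than resource cleanup.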
What's Changed
🚀 Features
- feat(monitor): add light monitor by @JiaoPL in #275
- feat(utils): add timeout wrapper by @SolenoidWGT in #286
- feat: add runtime diagnosis by @sunpengsdu in #297
💥 Improvements
- fix(storage): refactor and fix storage_manager api by @SolenoidWGT in #281
- Feat/sync grad use async op by @sunpengsdu in #277
🐞 Bug fixes
- fix(doc/code-docs): autodoc shown error by @huangting4201 in #265
- fix(eval): no need to check length of valid_dl when using streaming dataset by @00INDEX in #274
- fix/broadcast should not in commu stream by @sunpengsdu in #276
- fix(model): set tensor parallel attribute for mlp by @yingtongxiong in #271
- feat(ckpt): checkpoint bug fixes and feature enhancements. by @SolenoidWGT in #259
- fix(ckpt): fix checkpoint reload bug by @SolenoidWGT in #282
- fix(core/context): use dummy mode to generate random numbers in model construction by @blankde in #266
- fix(monitor): add alert switch and refactor monitor config by @JiaoPL in #285
- fix: fix the bug to do bcast in a stream by @sunpengsdu in #294
📚 Documentation
- docs(*): add documentation and reST files for readthedocs by @zigzagcai in #272
- docs(doc/code-docs): support zh cn readthedocs by @huangting4201 in #289
- docs(fsdp): add training option for fsdp by @zaglc in #273
- docs(doc/code-docs): refine profiler docs by @zigzagcai in #295
🌐 Other
Known issues
New Contributors
- @JiaoPL made their first contribution in #275
- @blankde made their first contribution in #266
- @zigzagcai made their first contribution in #272
- @zaglc made their first contribution in #273
Full Changelog: v0.2.1dev20230901...v0.2.1dev20230908