Labels
26 labels
- Optimizer wrapper for automatic LR scaling without losing model accuracy
- Something isn't working
- This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
- Pull requests that update a dependency file
- Improvements or additions to documentation
- This issue or pull request already exists
- New feature or request
- FullyShardedDataParallel (zero-3)
- Good for newcomers
- Extra attention is needed
- This issue is being worked on
- This doesn't seem right
- Memory-efficient vocab output
- Optimizer State Sharding (zero-1)
- Pipeline parallelism
- Further information is requested
- ShardedDataParallel (zero-2)
- This will not be worked on