## Computing Gradients

To compute gradients, you will need an A100 with 80 GB of memory.
In addition, you will need to use reduced precision and gradient checkpointing.
You can do this as follows:

```python
from aurora import AuroraPretrained

model = AuroraPretrained(
    # BF16 mode is an EXPERIMENTAL mode that saves memory by running the backbone in pure BF16
    # and the decoder in FP16 AMP. This should enable gradient computation. USE AT YOUR OWN RISK.
    # THIS WAS NOT USED IN THE DEVELOPMENT OF AURORA AND IS PURELY PROVIDED AS A STARTING POINT
    # FOR FINE-TUNING.
    bf16_mode=True,
)
model.load_checkpoint()

batch = ...  # Load some data.
```
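To see what "reduced precision and gradient checkpointing" mean mechanically, here is a minimal plain-PyTorch sketch of the same two ingredients on a toy two-layer stand-in model. The layer names and shapes are illustrative only and are not part of the Aurora API; Aurora applies these techniques internally when `bf16_mode=True`.

```python
import torch
from torch.utils.checkpoint import checkpoint

# Toy stand-in for a large backbone (hypothetical, not Aurora's architecture).
layer1 = torch.nn.Linear(8, 8)
layer2 = torch.nn.Linear(8, 1)

x = torch.randn(4, 8)

# Reduced precision: run the forward pass under BF16 autocast.
# Gradient checkpointing: do not store layer1's activations; recompute them
# during the backward pass to trade compute for memory.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    h = checkpoint(layer1, x, use_reentrant=False)
    loss = layer2(h).sum()

loss.backward()
print(layer1.weight.grad is not None)  # True: gradients flow through the checkpointed layer.
```

On an A100 you would use `device_type="cuda"` instead; the memory saving from checkpointing grows with the depth of the checkpointed stack.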