
Commit eadeb01

Update docs
1 parent f3d81fb commit eadeb01

File tree

1 file changed: +8 −3 lines changed


docs/finetuning.md

Lines changed: 8 additions & 3 deletions
@@ -13,14 +13,19 @@ model.load_checkpoint()
 ## Computing Gradients
 
 To compute gradients, you will need an A100 with 80 GB of memory.
-In addition, you will need to use [PyTorch AMP](https://pytorch.org/docs/stable/amp.html)
-and gradient checkpointing.
+In addition, you will need to use reduced precision and gradient checkpointing.
 You can do this as follows:
 
 ```python
 from aurora import AuroraPretrained
 
-model = AuroraPretrained(autocast=True)  # Use AMP.
+model = AuroraPretrained(
+    # BF16 mode is an EXPERIMENTAL mode that saves memory by running the backbone in pure BF16
+    # and the decoder in FP16 AMP. This should enable gradient computation. USE AT YOUR OWN RISK.
+    # THIS WAS NOT USED IN THE DEVELOPMENT OF AURORA AND IS PURELY PROVIDED AS A STARTING POINT
+    # FOR FINE-TUNING.
+    bf16_mode=True,
+)
 model.load_checkpoint()
 
 batch = ...  # Load some data.
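For context, here is a minimal sketch of how the gradient computation might be completed after this change. It assumes the `bf16_mode` constructor argument from the diff above, that gradient checkpointing is enabled via a `configure_activation_checkpointing()` helper, and that the prediction and target are `aurora.Batch` objects whose `surf_vars` dictionary contains 2 m temperature under the key `"2t"`; the target batch and the MSE loss are purely illustrative and not part of this commit.

```python
import torch

from aurora import AuroraPretrained

# EXPERIMENTAL BF16 mode from the diff above; use at your own risk.
model = AuroraPretrained(bf16_mode=True)
model.load_checkpoint()

model = model.cuda()
model.train()
# Gradient checkpointing, assumed here to be exposed via this helper.
model.configure_activation_checkpointing()

batch = ...   # Load some data (an aurora.Batch), moved to the same device as the model.
target = ...  # Target batch at the next lead time (illustrative).

pred = model.forward(batch)

# Illustrative loss: MSE on 2 m temperature ("2t" assumed present in surf_vars).
loss = torch.mean((pred.surf_vars["2t"] - target.surf_vars["2t"]) ** 2)
loss.backward()
```

The loss here is only a placeholder; any differentiable objective computed on the prediction `Batch` would be backpropagated the same way.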
