Add activation offloading to trainer
by mbtariq82 · Feb 06, 2026 at 19:26 UTC · scan-7b64ad3ec7ddd1ad
Risk level: Critical (90%)
Adds an activation offloading feature to reduce GPU memory usage during training.
training_args.py adds an activation_offloading parameter.
trainer.py imports get_act_offloading_ctx_manager.
trainer.md adds an activation_offloading example.
test_activation_offloading.py is added as a new file with passing tests.
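For reviewers unfamiliar with the technique: activation offloading generally moves the tensors saved for the backward pass out of GPU memory during the forward pass and copies them back when gradients are computed. The sketch below illustrates that idea with PyTorch's built-in torch.autograd.graph.save_on_cpu context manager. It is not the PR's get_act_offloading_ctx_manager, whose internals are not shown in this report; the helper name grad_with_optional_offload and the toy model are assumptions made for illustration only.

```python
import contextlib

import torch
from torch.autograd.graph import save_on_cpu


def grad_with_optional_offload(offload: bool) -> torch.Tensor:
    """Run one forward/backward pass on a toy model (hypothetical helper).

    When `offload` is True, tensors saved for backward are kept in CPU
    memory during the forward pass and restored on demand in backward.
    """
    torch.manual_seed(0)  # same weights and inputs for both code paths
    use_cuda = torch.cuda.is_available()
    device = "cuda" if use_cuda else "cpu"

    model = torch.nn.Sequential(
        torch.nn.Linear(8, 16),
        torch.nn.ReLU(),
        torch.nn.Linear(16, 1),
    ).to(device)
    x = torch.randn(4, 8, device=device)

    # Pinned memory speeds up host/device copies; it only matters with CUDA.
    ctx = save_on_cpu(pin_memory=use_cuda) if offload else contextlib.nullcontext()
    with ctx:
        loss = model(x).sum()

    # Backward runs outside the context; saved tensors are copied back lazily.
    loss.backward()
    return model[0].weight.grad.detach().cpu()


if __name__ == "__main__":
    baseline = grad_with_optional_offload(offload=False)
    offloaded = grad_with_optional_offload(offload=True)
    print("gradients match:", torch.allclose(baseline, offloaded))
```

Offloading trades memory for transfer time, so a review of the new file would reasonably check that gradients are unchanged (as the comparison above does) and that the copies do not stall training.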
The author has 0 merged PRs in this repo and is unfamiliar with the 5 files being modified.
First-time contributor to this repository. Account created 11 days ago.
Focus on 2 critical files
src/transformers/utils/activation_offloading.py (+700): 700 lines changed; New file; Source code
tests/utils/test_activation_offloading.py (+206): 206 lines changed; New file; Source code
src/transformers/trainer.py (+14): Source code
src/transformers/training_args.py (+9): Source code
docs/source/en/trainer.md (+1): Standard file
Minutes to review: 112
Effort level: extensive
Staleness risk: none
Schedule dedicated review time; consider pair review
Critical risk level requires changes before approval
Consider breaking into smaller PRs
Why is ci_passing missing? Consider adding this check.
Why is lint_passing missing? Consider adding this check.
Critical file: src/transformers/utils/activation_offloading.py (700 lines changed; New file; Source code)
Critical file: tests/utils/test_activation_offloading.py (206 lines changed; New file; Source code)