transformers #43774

Add activation offloading to trainer

by mbtariq82 · Feb 06, 2026 at 19:26 UTC · scan-7b64ad3ec7ddd1ad

Critical Risk (90%)

Risk Assessment

Risk level: Critical (90%)

Risk Drivers

  • large_diff: Large change: 930 lines modified
  • new_contributor: First contribution from mbtariq82-code
  • young_account: Account mbtariq82-code is 11 days old
  • api_surface_change: API surface changed in 2 file(s)

Intent

4/4 criteria met

Add activation offloading feature to reduce GPU memory usage during training.

Acceptance Criteria

  • Activation offloading is added to TrainingArguments.

    training_args.py adds activation_offloading parameter.

  • Trainer utilizes activation offloading.

    trainer.py imports get_act_offloading_ctx_manager.

  • Documentation is updated for the new feature.

    trainer.md adds activation_offloading example.

  • New tests for activation offloading are added and pass.

    test_activation_offloading.py file is added with passing tests.

Confidence: 90.0% · Source: PR description · AI: OpenAI
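
The first two criteria describe a flag on the training arguments that switches on an offloading context manager in the trainer. A minimal sketch of that pattern follows. It is not the PR's implementation: `SketchTrainingArgs`, `offloading_context`, and `training_step` are hypothetical names, and PyTorch's built-in `torch.autograd.graph.save_on_cpu` is swapped in for `get_act_offloading_ctx_manager`, whose signature is not shown in this report.

```python
import contextlib
from dataclasses import dataclass


@dataclass
class SketchTrainingArgs:
    """Hypothetical stand-in for TrainingArguments; only the reported flag."""

    activation_offloading: bool = False


def offloading_context(args: SketchTrainingArgs):
    """Return the context manager to wrap the forward pass in.

    Assumption: uses PyTorch's save_on_cpu hook rather than the PR's
    get_act_offloading_ctx_manager.
    """
    if not args.activation_offloading:
        # Flag off: behave exactly as before.
        return contextlib.nullcontext()

    import torch  # imported lazily so the disabled path needs no PyTorch

    # Tensors saved for backward are moved to CPU during the forward pass
    # and copied back to their original device when backward needs them.
    return torch.autograd.graph.save_on_cpu(
        pin_memory=torch.cuda.is_available()
    )


def training_step(model, inputs, args: SketchTrainingArgs):
    """Hypothetical training step: forward under the context, then backward."""
    with offloading_context(args):
        loss = model(inputs).sum()
    # Backward may run outside the context; the hooks were attached
    # when the activations were saved.
    loss.backward()
    return loss
```

With the flag off, the step runs unchanged. With it on, peak GPU memory for saved activations should drop at the cost of extra host-device transfers, which is the trade-off the PR's stated intent targets.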

Contributors

mbtariq82 · PR Author · 1 commit · Low Trust
Account Age: 1169 days
Prior PRs: 2

Has 0 merged PRs to this repo. Unfamiliar with the 5 files being modified.

mbtariq82-code · 2 commits · New Contributor
Account Age: 11 days
Prior PRs: 0

First-time contributor to this repository. Account created 11 days ago.

Evidence

Evidence Completeness: 10.0%
tests_passing: Failing
Missing: ci_passing, lint_passing, security_scan_clean, coverage_maintained, build_successful

Supply Chain

Risk: None
Modifies dependencies
Modifies lockfile
Modifies CI config
Modifies build scripts

Focus Files

Focus on 2 critical file(s)

  • src/transformers/utils/activation_offloading.py (+700), critical: 700 lines changed; New file; Source code
  • tests/utils/test_activation_offloading.py (+206), critical: 206 lines changed; New file; Source code
  • src/transformers/trainer.py (+14), medium: Source code
  • src/transformers/training_args.py (+9), medium: Source code
  • docs/source/en/trainer.md (+1), low: Standard file

Triage

  • Minutes to review: 112
  • Effort level: extensive
  • Staleness risk: none

Schedule dedicated review time; consider pair review

Recommendation

REQUEST CHANGES (22.0% readiness)

Critical risk level requires changes before approval

Next Steps

  • Concern: Consider breaking into smaller PRs
  • Question: Why is ci_passing missing? Consider adding this check.
  • Question: Why is lint_passing missing? Consider adding this check.
  • Concern (src/transformers/utils/activation_offloading.py): Critical file: 700 lines changed; New file; Source code
  • Concern (tests/utils/test_activation_offloading.py): Critical file: 206 lines changed; New file; Source code