fix: Add MXFP4 MoE/attention backward kernels
by leoneperdigao · Feb 06, 2026 at 19:26 UTC · scan-f05078364281ec84
Risk level: High (50%)
Implement backward pass support for MXFP4 quantized weights to enable LoRA/adapter fine-tuning.
New file src/transformers/integrations/mxfp4_backward.py added for Triton kernel implementation.
Transformers integration updated to use MatmulOGSFunction from the mxfp4_backward module.
MoE routing gradient handled in mxfp4_backward.py with gradient inversion logic.
Training mode toggle implemented in mxfp4.py with enable_training_mode method.
Comprehensive tests added in tests/quantization/mxfp4/test_mxfp4_backward.py.
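The summary above describes backward-pass support for frozen MXFP4 weights so that LoRA/adapter parameters can be fine-tuned. The PR's actual Triton kernels are not reproduced here; as an illustrative sketch only, the underlying math amounts to dequantizing block-scaled FP4 weights and routing gradients to the activations. The block size and E2M1 value grid below follow the OCP Microscaling (MX) format; the function names are hypothetical and not from the PR.

```python
import numpy as np

# FP4 (E2M1) representable magnitudes and the MX block size; both follow the
# OCP Microscaling spec. Everything else here is a hypothetical illustration.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
BLOCK = 32

def mxfp4_fake_quantize(w):
    """Fake-quantize a 1-D weight vector: per-block power-of-two shared
    scale, elements snapped to the nearest FP4 magnitude (sign preserved).
    The result mirrors what a dequantized MXFP4 weight looks like."""
    w = np.asarray(w, dtype=np.float64)
    out = np.empty_like(w)
    for start in range(0, w.size, BLOCK):
        blk = w[start:start + BLOCK]
        amax = np.abs(blk).max()
        # shared exponent chosen so the largest magnitude lands near the top
        # of the FP4 range (6.0 = 1.5 * 2**2, hence the -2)
        scale = 2.0 ** (np.floor(np.log2(amax)) - 2) if amax > 0 else 1.0
        scaled = blk / scale
        # snap each magnitude to the nearest grid point, keep signs
        idx = np.abs(np.abs(scaled)[:, None] - FP4_GRID[None, :]).argmin(axis=1)
        out[start:start + BLOCK] = np.sign(scaled) * FP4_GRID[idx] * scale
    return out

def quantized_linear_backward(grad_out, w_deq):
    """Backward of y = x @ w_deq.T with frozen quantized weights: only the
    activation gradient is needed (LoRA adapters receive it downstream)."""
    return grad_out @ w_deq  # dL/dx = dL/dy @ W
```

Because the quantized base weights stay frozen, the backward kernel never needs dL/dW for them; it only dequantizes W to compute dL/dx, which is what makes adapter fine-tuning over MXFP4 feasible.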
Author has 1 merged PR in this repo, maintains 55 public repositories, and is unfamiliar with 6 of the files being modified.
Focus on 3 critical files:
benchmarks/benchmark_mxfp4_backward.py · +219 lines changed · new file · source code
src/transformers/integrations/mxfp4_backward.py · +866 lines changed · new file · source code
tests/quantization/mxfp4/test_mxfp4_backward.py · +311 lines changed · new file · source code
Other changed files:
src/transformers/integrations/mxfp4.py · +83 lines changed · source code
src/transformers/quantizers/quantizer_mxfp4.py · +42 lines changed · source code
src/transformers/integrations/__init__.py · +8 lines changed · source code
Estimated review time: 169 minutes
Effort level: high
Staleness risk: none
Allocate focused review time
Insufficient evidence (CI/tests) to evaluate
Consider breaking into smaller PRs
Why is ci_passing missing? Consider adding this check.
Why is lint_passing missing? Consider adding this check.