arXiv cs.LG
5/13/2026
Rotation-Preserving Supervised Fine-Tuning

Short summary

Rotation-Preserving Supervised Fine-Tuning (RPSFT) addresses the trade-off between in-domain and out-of-domain (OOD) generalization by constraining changes to the top-k singular vectors of the pretrained weights. These singular vectors serve as an efficient proxy for Fisher-sensitive directions, which avoids expensive Hessian computation. Experiments show an improved generalization trade-off and stronger initializations for downstream RL fine-tuning.

  • RPSFT preserves the top-k singular vectors of pretrained weights to limit unnecessary rotation during supervised fine-tuning
  • Computationally efficient alternative to direct Fisher/Hessian computation at LLM scale
  • Improves the in-domain/OOD generalization trade-off and provides better initializations for downstream RL fine-tuning
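
The idea of limiting rotation of the top-k singular vectors can be illustrated with a small NumPy sketch. This is not the paper's implementation: the summary does not give the exact constraint, so the function names (`topk_singular_vectors`, `rotation_penalty`), the choice of penalty (off-diagonal energy of the updated weight in the pretrained top-k basis), and the toy matrix sizes are all assumptions made for illustration.

```python
import numpy as np


def topk_singular_vectors(w, k):
    """Return the top-k left and right singular vectors of w as columns."""
    u, _, vt = np.linalg.svd(w, full_matrices=False)
    return u[:, :k], vt[:k, :].T


def rotation_penalty(w_new, u_k, v_k):
    """Hypothetical rotation proxy (an assumption, not the paper's loss).

    Projects w_new onto the pretrained top-k basis and sums the squared
    off-diagonal entries. The value is zero when w_new keeps the same
    top-k singular vectors; a zero value alone does not prove that they
    are unchanged, so treat it as a rough diagnostic.
    """
    core = u_k.T @ w_new @ v_k                  # shape (k, k)
    off_diag = core - np.diag(np.diag(core))    # drop the diagonal
    return float(np.sum(off_diag ** 2))


rng = np.random.default_rng(0)
w0 = rng.normal(size=(8, 6))                    # stand-in for a pretrained weight
u_k, v_k = topk_singular_vectors(w0, k=3)

# Update 1: rescale singular values only, so singular vectors are untouched.
u, s, vt = np.linalg.svd(w0, full_matrices=False)
w_scaled = u @ np.diag(1.5 * s) @ vt

# Update 2: unconstrained dense perturbation, which generally rotates them.
w_noisy = w0 + 0.5 * rng.normal(size=w0.shape)

penalty_scaled = rotation_penalty(w_scaled, u_k, v_k)
penalty_noisy = rotation_penalty(w_noisy, u_k, v_k)
print(penalty_scaled, penalty_noisy)
```

In this toy setting the rescaling update leaves the penalty at numerical zero, while the dense perturbation produces a clearly positive value. A quantity of this kind could in principle be added to a fine-tuning loss as a regularizer, but the one-time SVD of the pretrained weight is the only expensive step, which matches the summary's point about avoiding Fisher/Hessian computation.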

Generated with AI, which can make mistakes.
