Rotation-Preserving Supervised Fine-Tuning

Short summary

Rotation-Preserving Supervised Fine-Tuning (RPSFT) addresses the trade-off between in-domain and out-of-domain (OOD) generalization by constraining how far fine-tuning changes the top-k singular vectors of the pretrained weights. These singular vectors serve as an efficient proxy for Fisher-sensitive directions, which avoids expensive Hessian computation. Experiments show an improved generalization trade-off and stronger initializations for downstream RL fine-tuning.

• RPSFT preserves the singular vectors of the pretrained weights to limit unnecessary rotation during supervised fine-tuning
• It is a computationally efficient alternative to direct Fisher/Hessian computation at LLM scale
• It improves the in-domain/OOD generalization trade-off and provides better initializations for downstream RL fine-tuning
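
The summary does not give the exact RPSFT objective, so the following is only a minimal NumPy sketch of one way "preserving the top-k singular vectors" could be measured. The function names `top_k_singular_vectors` and `rotation_penalty`, and the leakage-based penalty itself, are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def top_k_singular_vectors(w, k):
    """Return the top-k left and right singular vectors of a weight matrix."""
    # np.linalg.svd returns singular values in descending order.
    u, _, vt = np.linalg.svd(w, full_matrices=False)
    return u[:, :k], vt[:k, :].T


def rotation_penalty(w, u_k, v_k):
    """Hypothetical penalty (an assumption, not the paper's loss).

    Measures how much of w @ v_k falls outside span(u_k), plus the
    symmetric term for w.T @ u_k outside span(v_k). It is zero when w
    keeps the pretrained top-k singular subspaces paired with each other,
    for example when only the singular values are rescaled.
    """
    wv = w @ v_k
    left_leak = wv - u_k @ (u_k.T @ wv)
    wtu = w.T @ u_k
    right_leak = wtu - v_k @ (v_k.T @ wtu)
    return float(np.sum(left_leak ** 2) + np.sum(right_leak ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w0 = rng.standard_normal((8, 6))           # stand-in for a pretrained weight
    u_k, v_k = top_k_singular_vectors(w0, k=2)

    w_ft = w0 + 0.1 * rng.standard_normal(w0.shape)  # stand-in for a fine-tuned weight
    print("penalty at pretrained weights:", rotation_penalty(w0, u_k, v_k))
    print("penalty after perturbation:  ", rotation_penalty(w_ft, u_k, v_k))
```

In a training loop, a term like this could be added to the supervised loss with a small coefficient, with `u_k` and `v_k` computed once from the pretrained weights. That costs one SVD per weight matrix rather than any Fisher or Hessian estimate, which is the efficiency argument the summary makes.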

Generated with AI, which can make mistakes.

Read full article at arXiv cs.LG
