Dev.to
5/9/2026

I Trained an LLM on 75K of My Own Messages So It Would Stop Writing Like a Chatbot
Short summary
A developer trained a Qwen 2.5 3B model on 75K personal text samples spanning 23 years to capture an authentic writing voice for AI-generated content. The setup uses a two-tier architecture: a frontier model handles reasoning and structure, while a quantized, fine-tuned 3B model rewrites the result in the author's voice. The article includes practical code and data extraction patterns from 12 platforms.
- Two-tier architecture separates reasoning (frontier model) from voice matching (fine-tuned 3B)
- Trained Qwen 2.5 3B on 75K personal messages with careful sanitization and deduplication
- Practical implementation with code examples and data extraction from 12 platforms
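
The two ideas in the summary, cleaning and deduplicating messages and chaining a drafting model into a voice-rewriting model, can be sketched roughly as follows. This is not the article's code: the email-masking rule, the exact-match deduplication, and the names `sanitize`, `dedupe`, `two_tier`, `draft`, and `restyle` are all illustrative assumptions, and the two model calls are passed in as plain callables so the sketch runs without any model.

```python
import hashlib
import re
from typing import Callable, Iterable

# Assumed sanitization rule: mask anything that looks like an email address.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def sanitize(text: str) -> str:
    """Mask email addresses and collapse runs of whitespace."""
    text = EMAIL_RE.sub("<EMAIL>", text)
    return " ".join(text.split())


def dedupe(messages: Iterable[str]) -> list[str]:
    """Sanitize each message, then drop empties and case-insensitive exact duplicates."""
    seen: set[str] = set()
    unique: list[str] = []
    for msg in messages:
        clean = sanitize(msg)
        key = hashlib.sha256(clean.casefold().encode("utf-8")).hexdigest()
        if clean and key not in seen:
            seen.add(key)
            unique.append(clean)
    return unique


def two_tier(prompt: str,
             draft: Callable[[str], str],
             restyle: Callable[[str], str]) -> str:
    """Tier 1 (hypothetical frontier model) drafts; tier 2 (hypothetical 3B model) rewrites in voice."""
    return restyle(draft(prompt))
```

In a real setup, `draft` would wrap a call to the frontier model and `restyle` a call to the quantized fine-tuned model; keeping them as injected callables makes each tier swappable and testable on its own.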