Serious question: does fine-tuning a small model on a tight budget ever beat a big one?
I had a client in Omaha who needed a chatbot for their internal HR docs, but their budget was under $500. Instead of using a big API like GPT-4, I fine-tuned a 7-billion-parameter Mistral model on their specific PDFs using a cheap cloud spot instance for about $40. It now answers their questions better than the generic big model did, at a much lower cost. Has anyone else found a niche where a small, focused model actually works better?
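For anyone weighing the same trade-off, here is a rough back-of-the-envelope sketch of the break-even math. Only the roughly $40 one-time training cost comes from my project; the per-query costs (1 cent self-hosted, 3 cents via a big API) are made-up placeholders, and `break_even_queries` is just a helper name I picked for illustration. Everything is in whole cents to keep the arithmetic exact.

```python
def break_even_queries(one_time_cost_cents, self_hosted_cents_per_query, api_cents_per_query):
    """Return how many queries it takes for the fine-tuned model to pay for itself.

    All amounts are integer cents. Returns None if the self-hosted model
    is not cheaper per query, since it would then never break even.
    """
    saving_per_query = api_cents_per_query - self_hosted_cents_per_query
    if saving_per_query <= 0:
        return None
    # Integer ceiling division: round up to the next whole query.
    return -(-one_time_cost_cents // saving_per_query)


# 4000 cents = the ~$40 spot-instance run mentioned above.
# The 1-cent and 3-cent per-query figures are hypothetical placeholders.
queries = break_even_queries(4000, 1, 3)
print(f"Fine-tuning pays for itself after about {queries} queries")
```

Plug in your own hosting and API prices; if the self-hosted cost per query is not actually lower, the function returns `None` and the cost argument for fine-tuning falls apart, leaving only answer quality.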
3 comments
ray56228d ago
Yeah, that's the exact use case where a small tuned model wins. Big models try to know everything and get weirdly general. I had a similar thing with local restaurant menus, where GPT-4 kept over-explaining basic cooking terms instead of just listing daily specials. A focused model sticks to its data like glue. Skyler217, that tight budget forces you to be smart, not just throw money at an API. For specific, boring business docs, the small guy often does the job better and cheaper.