Serious question: does fine-tuning a small model on a tight budget ever beat a big one?

I had a client in Omaha who needed a chatbot for their internal HR docs on a budget under $500. Instead of calling a big API like GPT-4, I fine-tuned a 7-billion-parameter Mistral model on their specific PDFs, using a cheap cloud spot instance, for about $40. It now answers their questions better than the generic big model did, at a much lower cost. Has anyone else found a niche where a small, focused model actually works better?
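A rough way to sanity-check the economics is a break-even count: how many questions it takes before a one-time fine-tune is cheaper than paying per API call. This is only a sketch. The $40 is the training spend mentioned above; the per-query prices and the function name `break_even_queries` are made-up placeholders, so plug in your own numbers.

```python
from decimal import Decimal, ROUND_CEILING


def break_even_queries(fine_tune_cost, api_cost_per_query, local_cost_per_query="0"):
    """Return how many queries it takes for a one-time fine-tune to pay for itself.

    All amounts are dollar strings (e.g. "0.02") so Decimal keeps the math exact.
    Returns None when the API is not more expensive per query than the local model.
    """
    saving = Decimal(api_cost_per_query) - Decimal(local_cost_per_query)
    if saving <= 0:
        return None  # no per-query saving, so the fine-tune never pays off
    queries = Decimal(fine_tune_cost) / saving
    # Round up: a fractional query still has to be paid for in full.
    return int(queries.to_integral_value(rounding=ROUND_CEILING))


if __name__ == "__main__":
    # Hypothetical prices: $0.02 per API call, $0.005 per call to host the small model.
    print(break_even_queries("40", "0.02", "0.005"))
```

The point is just that a fixed training cost gets spread over every question asked, so the busier the chatbot, the sooner the small model comes out ahead. It ignores hosting setup and any retraining when the docs change.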

3 comments

ray562 2mo ago

Yeah, that's the exact use case where a small tuned model wins. Big models try to know everything and get weirdly general. I had a similar thing with local restaurant menus, where GPT-4 kept over-explaining basic cooking terms instead of just listing daily specials. A focused model sticks to its data like glue. Skyler217, that tight budget forces you to be smart, not just throw money at an API. For specific, boring business docs, the small guy often does the job better and cheaper.

skyler217 2mo ago

Man, my budget's so tight I can only afford to fine-tune my own bad habits.

grantc80 2mo ago (Most Upvoted)

Wait, can you fine-tune a model on just your own data for cheap now? I thought you still needed a decent-sized dataset and some cloud credits to get started.