
For training from scratch, try a small model like https://github.com/karpathy/nanoGPT or TinyLlama, perhaps with quantization to keep the memory footprint manageable.
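
To give a concrete sense of what from-scratch training looks like, here's a minimal PyTorch sketch of a bigram character model (roughly the warm-up exercise nanoGPT's walkthrough starts from, not the repo's actual transformer; assumes an input.txt corpus in the working directory):

    # Minimal from-scratch training loop: a bigram character model in
    # PyTorch. Assumes a plain-text input.txt exists.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    text = open("input.txt").read()
    chars = sorted(set(text))
    stoi = {c: i for i, c in enumerate(chars)}
    data = torch.tensor([stoi[c] for c in text])

    class Bigram(nn.Module):
        def __init__(self, vocab_size):
            super().__init__()
            # row i holds the next-token logits after seeing token i
            self.table = nn.Embedding(vocab_size, vocab_size)

        def forward(self, idx):
            return self.table(idx)

    model = Bigram(len(chars))
    opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
    for step in range(1000):
        ix = torch.randint(len(data) - 9, (32,)).tolist()   # 32 windows
        x = torch.stack([data[i:i + 8] for i in ix])        # context
        y = torch.stack([data[i + 1:i + 9] for i in ix])    # targets
        logits = model(x)                                   # (32, 8, V)
        loss = F.cross_entropy(logits.view(-1, len(chars)), y.view(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()

nanoGPT's README scales this same loop up to a real transformer with a couple of config files.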

Fine-tuning is very doable. The hard part is making a novel dataset of input/output pairs. As an experiment, you might consider just combining datasets you find on HuggingFace.
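
A sketch of the combine-HF-datasets approach, assuming the `datasets` library; the two dataset names are real public instruction sets, but verify the column names on their HuggingFace cards before trusting the mapping:

    # Merge two public instruction datasets into one common schema.
    from datasets import load_dataset, concatenate_datasets

    alpaca = load_dataset("tatsu-lab/alpaca", split="train")
    dolly = load_dataset("databricks/databricks-dolly-15k", split="train")

    # map both onto a shared {"prompt", "response"} schema
    alpaca = alpaca.map(
        lambda r: {"prompt": r["instruction"], "response": r["output"]},
        remove_columns=alpaca.column_names)
    dolly = dolly.map(
        lambda r: {"prompt": r["instruction"], "response": r["response"]},
        remove_columns=dolly.column_names)

    combined = concatenate_datasets([alpaca, dolly]).shuffle(seed=42)
    combined.to_json("combined_sft.jsonl")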

replicate.com has a dead-simple fine-tuning API.
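
For reference, Replicate's Python client exposes fine-tuning through a trainings endpoint; a sketch, where the version hash and input keys like `train_data` are placeholders (each trainer model documents its own inputs):

    # Sketch of Replicate's training API (pip install replicate, set
    # REPLICATE_API_TOKEN in the environment).
    import replicate

    training = replicate.trainings.create(
        version="meta/llama-2-7b:<version-id>",  # placeholder version hash
        input={
            "train_data": "https://example.com/pairs.jsonl",  # your dataset
        },
        destination="your-username/your-finetuned-model",
    )
    print(training.status)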

Predibase is also an easy-to-use option. But again, for something custom you need a dataset with hundreds of examples. Normally people use GPT-4 to generate the dataset, as long as OpenAI doesn't block them (their terms of use restrict using outputs to build competing models).
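
A sketch of that GPT-4 dataset-generation loop, assuming the openai>=1.0 client; the seed topics and prompt are placeholders, and "gpt-4o" stands in for whichever GPT-4-class model you use:

    # Generate synthetic input/output pairs and write them as JSONL.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    topics = ["unit testing", "regex", "sql joins"]  # placeholder seeds
    with open("synthetic_pairs.jsonl", "w") as f:
        for topic in topics:
            resp = client.chat.completions.create(
                model="gpt-4o",
                messages=[{
                    "role": "user",
                    "content": f"Write one question about {topic} and a "
                               f"concise answer, as JSON with keys "
                               f"'input' and 'output'.",
                }],
                response_format={"type": "json_object"},
            )
            f.write(resp.choices[0].message.content.strip() + "\n")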


