In my use case for small models I typically generate a max of 100 tokens per API call, so prompt processing makes up the majority of the wait time from the user's perspective. I found OpenAI's models to be quite poor at this and switched to Anthropic's API for that reason alone.

I've found Haiku to be pretty fast at prompt processing, but I'd be willing to look into another provider if they offer faster speeds.
