
I think it's reasonable. The development process is just not really comparable to other software engineering: it's fairly clear that currently nobody really has a good grasp on what a model will turn out to be while it is being trained. But they do have expectations. So you do the training, and then you pick the version increment to align the two.


I figured you don't update the major version unless you significantly change the... algorithm, for lack of a better word. At least I assume something major changed between how they trained GPT-3 and GPT-4, other than the amount of data. But maybe I'm wrong.


The number is purely for marketing.

If you could get much better performance without changing the algorithm (e.g. just by scaling), you'd still bump the number.



