
I mean, just use them and compare; the gap is obvious.


I did, and I fixed Qwen's issues with trivial sampling and loop detection hacks.

If I can do this, then a company that wants to sell local models seriously could do it too.


> I did, and I fixed Qwen's issues with trivial sampling and loop detection hacks.

Wow, that's amazing! Care to share the changes? Would love to try them out.


It's not amazing at all.

What's amazing is that LLM technologies are so immature that even basic engineering diligence isn't being done. (Like detecting token loops, for example.)
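
For anyone wondering what "detecting token loops" could look like, here is a minimal sketch. It is not the poster's actual fix, which isn't shown in the thread. The function name `find_tail_loop` and the parameters `max_period` and `min_repeats` are made up for illustration. It assumes you have the generated token IDs as a list and just checks whether the tail of the output is the same short block repeated back-to-back.

```python
from typing import Optional, Sequence


def find_tail_loop(
    tokens: Sequence[int],
    max_period: int = 32,   # hypothetical default: longest loop body to look for
    min_repeats: int = 4,   # hypothetical default: repeats needed to call it a loop
) -> Optional[int]:
    """Return the shortest period p such that the last p * min_repeats tokens
    consist of one p-token block repeated back-to-back, or None if no such
    period up to max_period exists."""
    n = len(tokens)
    for period in range(1, max_period + 1):
        span = period * min_repeats
        if span > n:
            break  # not enough tokens yet to see this many repeats
        tail = tokens[n - span:]
        # The tail is periodic with this period if every token equals the
        # token one period later.
        if all(tail[i] == tail[i + period] for i in range(span - period)):
            return period
    return None
```

In a generation loop you would call this every few tokens and, when it returns a period, stop, truncate the repeated tail, or resample with different settings. What to do on detection is a separate design choice.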



