Hacker News

You're asking me to do the thing I just said was frustrating, haha. I have no idea. It's a new technology and we have nothing to draw from to make predictions. But for the sake of fun:

New code generation / modification: I think we're hitting a point of diminishing returns, and they're not going to improve much here.

The limitation is fundamentally that they can only be as good as the detail in the specs given, or the test harnesses provided to them. Any detail left out, they're going to make up, and hopefully it's what you want (often it's not!). If you make the specs detailed enough so that there's no misunderstanding possible, you've just written code, which is what we already do today.

Code optimization: I think they'll get quite a bit better. If you give them GCC, it's probable they'll be able to improve upon it.



> If you make the specs detailed enough so that there's no misunderstanding possible: you've just written code, what we already do today

This was my opinion for a very long time. Having built a few applications from scratch using AI, though, nowadays I think: sometimes not everything needs to be spelled out. Like in math papers, some details can be left to the ~~reader~~ LLM and it'll be fine.

I mean, in many cases it doesn't really matter what exactly the code looks like, as long as it ends up doing the right thing. For a given Turing machine, the class of equivalent implementations is infinite. If a short spec written in English leads the LLM to identify the correct equivalence class, that's all we need and, in fact, a very impressive compression result.


Sometimes, yeah. I don't think we're disagreeing

What I'd also add:

Because of the unspecified behaviour, you're always going to need someone technical who understands the output to verify it. Tests aren't enough.

I'm not even sure if this is a net productivity benefit. I think it is? In some cases it's a clear win, but definitely not always. You're reducing time spent coding and now putting extra time into spec writing + review + verification.


> Sometimes, yeah. I don't think we're disagreeing

I would disagree. Formalism and precision have a critical role to play which is often underestimated, more so with the advent of LLMs. The fuzziness of natural languages is both a strength and a weakness. We have adopted precise but unnatural languages (math/C/C++) for describing machine models of the physical world or of the computing world. Such precision was a real human breakthrough which is often overlooked in these debates.


Hmm. It’s not clear what specific task it can’t handle. Can you come up with a concrete example?


Are you saying you've never had them fail at a task?

I wanted to refactor a bunch of tests in a TypeScript project the other day into a format similar to table-driven tests, which are common in Golang but seemingly not so much in TypeScript. Vitest has specific syntax affordances for it, though.

It utterly failed at the task. I tried many times with increasing specificity in my prompt, then did one myself and used it as an example. I ended up giving up and just doing it manually.


I see. Did you use Claude Code? With access to compiling and running?


Codex on high, yeah it had access to compiling/running


thanks for the data point



