> I think the more you can shift to compile time the better when it comes to agents
not borne out by evidence. rust is bottom-mid tier on autocoderbenchmark. typescript is marginally better than js
shifting to compile time is not necessarily great, because the llm has to vibe its way through code in situ. if you need a compiler to check your code, it's already too late; the llm does not have your codebase in its weights, and a fetch to read the types of your functions is context-expensive since it's nonlocal.
i mean, as a first-order approximation, context (the key resource that seems to affect quality) doesn't depend on real compilation speed; presumably the agent is suspended and not burning context while waiting for compilation
If you have an LLM that doesn't make errors ever, then you have an ASI, at which point the conversation is meaningless. In the meantime, having a lower error rate but more uncaught errors is less important than making incorrect code impossible to compile, and/or flagged by strict linters.
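A minimal TypeScript sketch of what "incorrect code impossible to compile" can mean in practice — the type and function names here are illustrative, not from anyone's actual codebase:

```typescript
// A discriminated union makes the invalid state "done but no data"
// impossible to construct: the compiler rejects it outright instead
// of leaving it to a runtime check the agent might forget.
type Fetch<T> =
  | { status: "loading" }
  | { status: "error"; message: string }
  | { status: "done"; data: T };

function render(f: Fetch<number>): string {
  switch (f.status) {
    case "loading":
      return "spinner";
    case "error":
      return `error: ${f.message}`;
    case "done":
      // f.data is only visible in this branch, by construction
      return `value: ${f.data}`;
  }
}

// render({ status: "done" });  // compile error: property `data` is missing
console.log(render({ status: "done", data: 42 })); // prints "value: 42"
```

The error class never reaches runtime, so it never shows up as a test failure the agent has to iterate on — which is exactly the trade the two sides of this thread disagree about.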
incorrect. having a higher caught-error rate means you consume more context on the way to your solution, which makes for worse results: you spend more time in the context danger zone and lose more on compaction handoffs.
given a system that reaches the same overall level of non-business-logic errors as one that makes a ton of non-business-logic errors that are all catchable, your llm's ability to correctly implement the business logic amid that noise will be greatly impaired along the way.