Bit flips aren’t always bad hardware. I remember an anecdote from Sandia from my HPC days - they found they were getting more bit flips on some machines than others on their cluster and sometimes correlated.
Turned out at their altitude cosmic rays were flipping bits in the top-most machines in the racks, sometimes then penetrating lower and flipping bits in more machines too.
The proof is really in the pudding, isn't it? I don't see a wave of successful vibe-coded startups in the market yet. That's kind of the benchmark for whether this stuff actually does in practice what the AI-hypemen claim it can.
Rather the opposite. A vibe-coded startup cannot survive if it can be trivially duplicated. The proof will be in observing the inverse phenomenon: (pure) software companies disappearing.
It is still not clear to me. The periodicity of their orbit around the tree is the same. I think this is an instance of us meaning different things by “go around”
The landing page reads like it was written with an LLM.
Somehow this makes me immediately not care about the project; I expect it to be incomplete vibe-coded filler somehow.
Odd what a strong reaction it evokes already. Like: if the author couldn't be bothered to write this, why waste time reading it? Not sure I endorse that, but that's the feeling.
I am very concerned about the long term effects of people developing the habit of mistrusting things just because they’re written in coherent English and longer than a tweet. (Which seems to be the criterion for “sounds like an LLM wrote it”.)
Haha. This is so true. I'm a bit long-winded myself and once got accused of being AI on here. I just don't communicate like Gen Alpha. I read their site and nothing jumped out as AI although it's possible they used it to streamline what they initially wrote.
I don't think it feels particularly LLM-written, I can't find many of the usual tells. However, it is corporate and full of tired cliches. It doesn't matter if it's written by an LLM or not, it's not pleasant to read. It's a self-indulgent sales pitch.
Genuinely interesting how divergent people's experiences of working with these models are.
I've been 5x more productive using codex-cli for weeks. I have no trouble getting it to convert a combination of unusually-structured source code and internal SVGs of execution traces to a custom internal JSON graph format - very clearly out-of-domain tasks compared to their training data. Or mining a large mixed python/C++ codebase including low-level kernels for our RISCV accelerators for ever-more accurate docs, to the level of documenting bugs as known issues that the team ran into the same day.
We are seeing wildly different outcomes from the same tools and I'm really curious about why.
Super cool, I spent a lot of time playing with representation learning back in the day and the grids of MNIST digits took me right back :)
A genuinely interesting and novel approach, I'm very curious how it will perform when scaled up and applied to non-image domains! Where's the best place to follow your work?
The rest of the README is llm-generated so I kinda suspect these numbers are hallucinated, aka lies. They also conflict somewhat with your "cut shipping time roughly in half" quote, which I'm more likely to trust.
Are there real numbers you can share with us? Looks like a genuinely interesting project!
OP here. These numbers are definitely in the ballpark. I personally went from having to compact or clear my sessions 10-12 times a day to doing this about once or twice since we started using the system. Obviously, results may vary depending on the codebase, task, etc., but because we analyze what can be run in parallel and dispatch multiple agents to run those tasks, we have significantly reduced the time it takes to develop features.
Every epic gets its own branch. So if multiple developers are working on multiple epics, in most cases, merging back to the main branch will need to be done patiently by humans.
To be clear, I am not suggesting that this is a fix-all system; it is a framework that helped us a lot and should be treated just like any other tool or project management system.
That's where the human architect comes in (for now at least). We'll try to think of features that would have the least amount of conflicts when merged back to main. We usually max it at 3, and have a senior dev handle any merge conflicts.
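The branch-per-epic flow described above can be sketched with plain git. This is a hypothetical illustration, not the OP's actual setup; the epic names and file are invented, and the human-merge step is shown as an ordinary `--no-ff` merge into main.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.email "dev@example.com"   # placeholder identity for the sketch
git config user.name "Dev"
git commit -q --allow-empty -m "initial"

# Each epic gets its own branch; agents work only inside their branch.
for epic in epic-checkout-flow epic-search-api epic-admin-ui; do
  git branch "$epic" main
done

# Agents commit to their epic branch...
git checkout -q epic-checkout-flow
echo "checkout feature" > checkout.md
git add checkout.md
git commit -q -m "feat: checkout flow"

# ...and a human merges each epic back to main, resolving conflicts by hand.
git checkout -q main
git merge -q --no-ff epic-checkout-flow -m "merge epic-checkout-flow"
```

Capping concurrent epics (the "max 3" above) keeps the number of long-lived branches, and therefore the merge-conflict surface, small.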
That depends on how decoupled your codebase is and how much overlap there is in the areas your agents are working on. If you have a well-architected modular monolith and you don't dispatch overlapping issues, it's fine.