
True, I should probably have been more specific, with 'LLMs' I was just referring to the current Transformer based models.

And yes, we could of course get the right answer to PARITY problems, but only if we were allowed some kind of scratch space (mental or external) -- even if only for the single bit of current state in the sequence.

But that's precisely what Transformers aren't afforded, right? They have to look at the whole problem once and need to derive their solution from that; there's no 'going through the sequence one bit at a time' for them. So they are really good (amazing actually, when you think about it) at performing the 'single thought' action, but still severely limited because they don't have the equivalent of a writable working memory.
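The "single bit of scratch space" point above can be made concrete: computed sequentially, PARITY needs only one bit of writable state, updated once per input bit. A minimal sketch:

    # The entire working memory is a single bit of state,
    # updated as each input bit arrives in sequence.
    def parity(bits):
        state = 0          # one bit of scratch space
        for b in bits:
            state ^= b     # fold the new bit into the state
        return state

    parity([1, 0, 1, 1])   # -> 1 (odd number of 1s)

A fixed-depth Transformer has no such loop over the input; it must produce the answer in one parallel pass, which is exactly where the difficulty comes from.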



The difficulty Transformers have with PARITY seems to come from their heavy reliance on the position encoding: the longer the input sequence, the more likely the model is to attend to the wrong bit, and once that happens all subsequent output is wrong.

Instruction tuned LLMs as chat bots receive multiple inputs in a sequence, interspersed with computation to produce output tokens. Those output tokens appear to be able to act like short term memory.

For example, this ChatGPT 3.5 prompt appears to work well:

    You will act as a parity calculator. Ask me for each bit in turn. The parity value starts at 0, and is updated with xor using each new bit. Print out the updated value after each new bit is processed.
Responding with 0 or 1 produces outputs like:

    Got it, so we'll apply the XOR operation between the current parity value (1) and the new bit (1), which gives us 0.

    The updated parity value is 0.

    What is the next bit?
After ~20 iterations the output looks stable, if a little wordy. Perhaps this works because the tokens to attend to are all recent. I imagine there would need to be some mechanism to remind the model what it is doing if the chat continued indefinitely; otherwise the initial prompt might fall out of context and be forgotten. I've not done any experiments with that yet.
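The protocol the prompt sets up can be sketched as a loop in which each turn's printed parity is the only state carried forward (the function name and reply format here are illustrative, not what ChatGPT literally does):

    # Illustrative simulation of the chat protocol: on each turn
    # the "model" needs only the previously printed parity (its own
    # recent output tokens) and the new bit -- not the full history.
    def respond(printed_parity, new_bit):
        updated = printed_parity ^ new_bit
        return updated, f"The updated parity value is {updated}. What is the next bit?"

    parity = 0
    for bit in [1, 1, 0, 1]:
        parity, reply = respond(parity, bit)
    # running parity after bits 1, 1, 0, 1 is 1

This is why the long-sequence position-encoding problem doesn't arise: the attention needed at each step is only over the last couple of turns.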



