
I tried generating a Chinese rap song, and it did generate a pretty good rap. However, upon completion, it deleted the response and showed:

> I don’t understand Chinese yet, but I’m working on it. I will send you a message when we can talk in Chinese.

I tried some other languages and got the same result. It will generate non-English text, but once it's done, the response is deleted and replaced with that message.



I'm seeing the same behaviour. It's as if they have a post-processor that evaluates the quality of the response after a certain number of tokens have been generated, and reverts the response if it's below a threshold.
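To make that guess concrete, here is a minimal Python sketch of a "stream first, check later" loop. Everything in it is an assumption for illustration: the event format, the `check_every` interval, and the `is_acceptable` callback are hypothetical and are not taken from any vendor's actual implementation.

```python
from typing import Callable, Iterable, Iterator, Tuple

# Fallback text shown when the check fails (wording borrowed from the thread).
FALLBACK = "I don't understand Chinese yet, but I'm working on it."


def stream_then_check(
    tokens: Iterable[str],
    is_acceptable: Callable[[str], bool],
    check_every: int = 5,
) -> Iterator[Tuple[str, str]]:
    """Yield ("append", token) events as tokens arrive.

    Every `check_every` tokens, and once more at the end, the text shown so
    far is passed to `is_acceptable`. If the check fails, a single
    ("replace", FALLBACK) event is emitted and streaming stops.
    """
    shown = []
    for count, token in enumerate(tokens, start=1):
        shown.append(token)
        yield ("append", token)
        if count % check_every == 0 and not is_acceptable("".join(shown)):
            yield ("replace", FALLBACK)
            return
    # Final check covers responses shorter than one check interval.
    if not is_acceptable("".join(shown)):
        yield ("replace", FALLBACK)


def render(events: Iterable[Tuple[str, str]]) -> str:
    """Apply events the way a chat UI might and return the final visible text."""
    text = ""
    for kind, payload in events:
        text = text + payload if kind == "append" else payload
    return text


if __name__ == "__main__":
    # Toy stand-in for a language filter: accept ASCII-only text.
    print(render(stream_then_check(["Hello", " ", "world"], str.isascii)))
    print(render(stream_then_check(["你", "好", "吗"], str.isascii, check_every=2)))
```

With the toy ASCII check, the second call shows two characters to the user and then swaps them for the fallback, which matches the "generate, then delete" behaviour described above.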


I've noticed Gemini exhibiting similar behaviour. It will start to answer, for example, a programming question - only to delete the answer and replace it with something along the lines of "I'm only a language model, I don't know how to do that"


This seems like a bizarre way to handle this. Unless there's some level of malicious compliance, I don't see why they wouldn't just hide the output until the filtering step is completed. Maybe they're incredibly concerned about it appearing responsive in the average case.

Would not be surprised if there were browser extensions/userscripts to keep a copy of the text when it gets deleted and mark it as such.
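A rough Python sketch of both ideas, under stated assumptions: `buffered_reply` is the "hide the output until filtering is done" variant, and `KeepDeletedView` is a toy model of what such an extension could do. The `("append", text)` / `("replace", text)` event format is hypothetical, as is every name here; a real userscript would watch the page's DOM instead.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical UI event: ("append", text) or ("replace", text).
Event = Tuple[str, str]


def buffered_reply(
    tokens: Iterable[str],
    is_acceptable: Callable[[str], bool],
    fallback: str,
) -> str:
    """Server-side alternative: show nothing until the filter has seen it all."""
    text = "".join(tokens)
    return text if is_acceptable(text) else fallback


class KeepDeletedView:
    """Client-side idea: remember what was on screen before it got replaced."""

    def __init__(self) -> None:
        self.visible = ""
        self.retracted: List[str] = []

    def handle(self, event: Event) -> None:
        kind, text = event
        if kind == "append":
            self.visible += text
        elif kind == "replace":
            # Keep a copy of the text that is about to disappear.
            if self.visible:
                self.retracted.append(self.visible)
            self.visible = text


if __name__ == "__main__":
    view = KeepDeletedView()
    for event in [("append", "På "), ("append", "nettet"), ("replace", "Sorry")]:
        view.handle(event)
    print(view.visible)    # current on-screen text
    print(view.retracted)  # copies of text that was deleted
```

The trade-off is latency: the buffered version never shows text it later retracts, but the user sees nothing until the whole response has been generated and checked.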


They have both pre and post-LLM filters.


The linked article mentions these safeguards as the post-processing step.


I've seen the exact same thing! Gemini put together an impressive Bash one-liner, then deleted it.


Always very frustrating when it happens.


It might be copyright related and not quality related. What if X% of it is a direct ripoff of an existing song?


So run it locally; the local version is not guarded.


My locally-hosted llama3 actually craps itself if I ask it to answer in other languages. It's pretty hilarious. Has been working flawlessly (and impressively fast) for everything in English, then does hilarious glitches in other languages.

E.g., right now, to show it here, I asked it to "Write me a poem about a digital pirate in Danish":

Digitalen Pirat

På nettet sejler han, En digital pirat, fri og farlig. Han har øjnene på de kodeagtige Og hans hjerne er fuld af ideer.

Hans skib er en virtuel børs, Hvor dataenes vætætø Tø Tø Tø Hø T Ø T Ø T Ø T Ø T Ø T 0 Ø T 0 Ø T 0

Edit: Formatting is lost here, but all those "T" and "Ø" etc are each on their own line, so it's a vomit of vertical characters that scrolls down my screen.


Trying the same on https://llama3.replicate.dev/ with Llama 3-70B gives a perfectly fine response with a long poem in Danish. And then it even translates it to English before concluding the response.


The training data is 95% English, so foreign languages are not going to be its strong suit.


Tried with Italian and it seems to work, but it always appends the following disclaimer:

«I am still improving my command of non-English languages, and I may make errors while attempting them. I will be most useful to you if I can assist you in English.»


Crazy that this bug is still happening 12 hours later.



