It's likely true that they didn't change the model, just as with the many claims of GPT-4 getting worse. But they do keep iterating heavily on the "safety" layers on top: classifiers to detect dangerous requests, the main system prompt, and so on.
But I also think it's partially a psychological phenomenon, just people getting used to the magic and finding more bad edge-cases as it is used more.
While I do think that many claims of GPT-4 getting worse were subjective and incorrect, there certainly was an accidental nerfing of at least ChatGPT Plus: OpenAI confirmed it some months ago by releasing an update that specifically acknowledged the model had become "more lazy" and was meant to rectify that.
(I think it was just the settings for how ChatGPT calls the GPT-4 model, not affecting use of GPT-4 via the API, though I may be misremembering.)
EDIT: It seems they do claim that the layers on top also didn't change: https://twitter.com/alexalbert__/status/1780707227130863674