If it were my site, "I like X now" would be a red flag.
I don't think you're gonna AI your way out of this part of things for some time, and it really is the core challenge of content moderation: it's heavily opinion- and circumstance-based, in a way current models really struggle with.
The defaults we have set are clearly too high. That comment is exactly the kind we should approve. Thanks for trying it.