FWIW I didn't like the Robot / Efficient mode because it would give very short answers without much explanation or background. "Nerdy" seems to be the best, except with GPT-5 instant it's extremely cringy like "I'm putting my nerd hat on - since you're a software engineer I'll make sure to give you the geeky details about making rice."
"Low" thinking is typically the sweet spot for me - way smarter than instant with barely a delay.
I hate its acknowledgement of its personality prompt. Try having a series of back and forth and each response is like “got it, keeping it short and professional. Yes, there are only seven deadly sins.” You get more prompt performance than answer.
I like the term prompt performance; I am definitely going to use it:
> prompt performance (n.)
> the behaviour of a language model in which it conspicuously showcases or exaggerates how well it is following a given instruction or persona, drawing attention to its own effort rather than simply producing the requested output.
It's like writing an essay for a standardized test, as opposed to one for a college course or for a general audience. When taking a test, you only care about the evaluation of a single grader hurrying to get through a pile of essays, so you should usually attempt to structure your essay to match the format of the scoring rubric. Doing this on an essay for a general audience would make it boring, and doing it in your college course might annoy your professor. Hopefully instruction-following evaluations don't look too much like test grading, but this kind of behavior would make some sense if they do.
Pay people $1 an hour and ask them to choose A or B, which is more short and professional:
A) Keeping it short and professional. Yes, there are only seven deadly sins
B) Yes, there are only seven deadly sins
Also have all the workers know they are being evaluated against each other, and that if they diverge from the majority choice their reliability score may go down and they may get fired. You end up with some evaluations answered as a Keynesian beauty contest / Family Feud "survey says"-style guess instead of the workers' true evaluation.
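To make that incentive concrete, here is a minimal sketch of scoring raters by agreement with the majority. Everything in it is an assumption for illustration: the names (`majority_labels`, `reliability`, `rater_1`, `q1`) are hypothetical, and real labeling pipelines may score reliability quite differently.

```python
from collections import Counter


def majority_labels(votes):
    """Map each item to its most common choice.

    votes: {rater: {item: "A" or "B"}} (hypothetical data layout).
    Ties are broken by whichever choice was counted first.
    """
    tallies = {}
    for choices in votes.values():
        for item, choice in choices.items():
            tallies.setdefault(item, Counter())[choice] += 1
    return {item: tally.most_common(1)[0][0] for item, tally in tallies.items()}


def reliability(votes):
    """Fraction of each rater's answers that match the majority choice."""
    majority = majority_labels(votes)
    return {
        rater: sum(choice == majority[item] for item, choice in choices.items())
        / len(choices)
        for rater, choices in votes.items()
    }


# rater_3 honestly prefers B on q1, but the majority picked A.
votes = {
    "rater_1": {"q1": "A", "q2": "A"},
    "rater_2": {"q1": "A", "q2": "A"},
    "rater_3": {"q1": "B", "q2": "A"},
}
scores = reliability(votes)
print(scores)  # rater_3 ends up at 0.5, the others at 1.0
```

Under a scheme like this, the honest dissenter is the one who looks unreliable, so the safe strategy is to guess what everyone else will pick.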
I use Efficient or robot or whatever. It gives me a bit of sass from time to time when I subconsciously nudge it into taking a “stand” on something, but otherwise it’s very usable compared to the obsequious base behavior.
If only that worked for conversation mode as well. At least for me, and especially when it answers me in Norwegian, it will start off with all sorts of platitudes and whole sentences repeating exactly what I just asked. "Oh, so you want to do x, huh? Here is answer for x". It's very annoying. I just want a robot to answer my question, thanks.
repeating what is being asked is fine i think, sometimes it thinks you want something different to what you actually want. what is annoying is "that's an incredibly insightful question that delves into a fundamental..." type responses at the start.
This is like arguing that we shouldn't try to regulate drugs because some people might "want" the heroin that ruins their lives.
The existing "personalities" of LLMs are dangerous, full stop. They are trained to generate text with an air of authority and to tend to agree with anything you tell them. It is irresponsible to allow this to continue while not at least deliberately improving education around their use. This is why we're seeing people "falling in love" with LLMs, or seeking mental health assistance from LLMs that they are unqualified to render, or plotting attacks on other people that LLMs are not sufficiently prepared to detect and thwart, and so on. I think it's a terrible position to take to argue that we should allow this behavior (and training) to continue unrestrained because some people might "want" it.
There aren't many major labs, and they each claim to want AI to benefit humanity. They cannot entirely control how others use their APIs, but I would like their mainline chatbots to not be overly sycophantic and generally to not try and foster human-AI friendships. I can't imagine any realistic legislation, but it would be nice if the few labs just did this of their own accord (or were at least shamed more for not doing so).
Unfortunately, I think a lot of the people at the top of the AI pyramid have a definition of "humanity" that may not exactly align with the definition that us commoners might be thinking of when they say they want AI to "benefit humanity".
I agree that I don't know what regulation would look like, but I think we should at least try to figure it out. I would rather hamper AI development needlessly while we fumble around with too much regulation for a bit and eventually decide it's not worth it than let AI run rampant without any oversight while it causes people to kill themselves or harm others, among plenty of other things.
At the very least, I think there is a need for oversight of how companies building LLMs market and train their models. It's not enough to cross our fingers that they'll add "safeguards" to try to detect certain phrases/topics and hope that that's enough to prevent misuse/danger — there's not sufficient financial incentive for them to do that of their own accord beyond the absolute bare minimum to give the appearance of caring, and that's simply not good enough.
Yes. My position is that it was irresponsible to publish these tools before figuring out safety first, and it is irresponsible to continue to offer LLMs that have been trained in an authoritative voice and to not actively seek to educate people on their shortcomings.
But, of course, such action would almost certainly result in a hit to the finances, so we can't have that.
Alternative take: these are incredibly complex nondeterministic systems and it is impossible to validate perfection in a lab environment because 1) sample sizes are too small, and 2) perfection isn’t possible anyway.
All products ship with defects. We can argue about too much or too little or whatever, but there is no world where a new technology or vehicle or really anything is developed to perfect safety before release.
Yeah, profits (or at least revenue) too. But all of these AI systems are losing money hand over fist. Revenue is a signal of market fit. So if there are companies out there burning billions of dollars optimizing the perfectly safe AI system before release, they have no idea if it’s what people want.
Releasing a chatbot that confidently states wrong information is bad enough on its own — we know people are easily susceptible to such things. (I mean, c'mon, we had people falling for ELIZA in the '60s!)
But to then immediately position these tools as replacements for search engines, or as study tutors, or as substitutes for professionals in mental health? These aren't "products that shipped with defects"; they are products that were intentionally shipped despite full knowledge that they were harmful in fairly obvious ways, and that's morally reprehensible.
Pretty sure most of the current problems we see re drug use are a direct result of the nanny state trying to tell people how to live their lives. Forcing your views on people doesn’t work and has lots of negative consequences.
I don't know if this is what the parent commenter was getting at, but the existence of multi-billion-dollar drug cartels in Mexico is an empirical failure of US policy. Prohibition didn't work a century ago and it doesn't work now.
All the War on Drugs has accomplished is granting an extremely lucrative oligopoly to violent criminals. If someone is going to do heroin, ideally they'd get it from a corporation that follows strict pharmaceutical regulations and invests its revenue into R&D, not one that cuts it with even worse poison and invests its revenue into mass atrocities.
Who is it all even for? We're subsidizing criminal empires via US markets and hurting the people we supposedly want to protect. Instead of kicking people while they're down and treating them like criminals over poor health choices, we could have invested all those countless billions of dollars into actually trying to help them.
I'm not sure which parent comment you're referring to, but what you're saying aligns with my point a couple levels up: reasonable regulation of the companies building these tools is a way to mitigate harm without directly encroaching on people's individual freedoms or dignities, but regulation is necessary to help people. Without regulation, corporations will seek to maximize profit to whatever degree is possible, even if it means causing direct harm to people along the way.
I'm not saying they're equivalent; I'm saying that they're both dangerous, and I think taking the position that we shouldn't take any steps to prevent the danger because some people may end up thinking they "want" it is unreasonable.
No one sane uses the baseline web UI 'personality'. People use LLMs through specific, custom APIs, and more often than not they use fine-tuned models that _assume a personality_ defined by someone (be it the user or the service provider).
Look up Tavern AI character card.
I think you're fundamentally mistaken.
I agree that for some users, using specific LLMs for specific use cases might be harmful, but saying that the web UI (the default AI 'personality') is dangerous is laughable.
I don't know how to interpret this. Are you suggesting I'm, like, an agent of some organization? Or is "activist" meant only as a pejorative?
I can't say that I identify as any sort of AI "activist" per se, whatever that word means to you, but I am vocally opposed to (the current incarnation of) LLMs to a pretty strong degree. Since this is a community forum and I am a member of the community, I think I am afforded some degree of voicing my opinions here when I feel like it.
Disincentivizing something undesirable will not necessarily lead to better results, because it wrongly assumes that you can foresee all consequences of an action or inaction.
Someone who now falls in love with an LLM might instead fall for some seductress who hurts him more. Someone who now receives bad mental health assistance might receive none whatsoever.
I disagree with your premise entirely and, frankly, I think it's ridiculous. I don't think you need to foresee all possible consequences to take action against what is likely, especially when you have evidence of active harm ready at hand. I also think you're failing to take into account the nature of LLMs as agents of harm: so far it has been very difficult for people to legally hold LLMs accountable for anything, even when those LLMs have encouraged suicidal ideation or physical harm of others, among other obviously bad things.
I believe there is a moral burden on the companies training these models to not deliberately train them to be sycophantic and to speak in an authoritative voice, and I think it would be reasonable to attempt to establish some regulations in that regard in an effort to protect those most prone to predation of this style. And I think we need to clarify the manner in which people can hold LLM-operating companies responsible for things their LLMs say — and, preferably, we should err on the side of more accountability rather than less.
---
Also, I think in the case of "Someone who now receives bad mental health assistance might receive none whatsoever", any psychiatrist (any doctor, really) will point out that this is an incredibly flawed argument. It is often the case that bad mental health assistance is, in fact, worse than none. It's that whole "first, do no harm" thing, you know?
...nobody? I didn't determine any such thing. What I was saying was that LLMs are dangerous and we should treat them as such, even if that means not giving them some functionality that some people "want". This has nothing to do with playing god and everything to do with building a positive society where we look out for people who may be unable or unwilling to do so themselves.
And, to be clear, I'm not saying we necessarily need to outlaw or ban these technologies, in the same way I don't advocate for criminalization of drugs. But I think companies managing these technologies have an onus to take steps to properly educate people about how LLMs work, and I think they also have a responsibility not to deliberately train their models to be sycophantic in nature. Regulations should go on the manufacturers and distributors of the dangers, not on the people consuming them.
Here’s something I noticed: if you yell at them (all caps, cursing them out, etc.), they perform worse, similar to a human. So if you believe that some degree of “personable answering” might contribute to better correctness, since some degree of disagreeable interaction seems to produce less correctness, then you might have to accept some personality.
You’re getting downvoted but I agree with the sentiment. The fact that people want a conversational robot friend is, I think, extremely harmful and scary for humanity.
Giving people what makes them feel good in the short term is not actually necessarily a good thing. See also: cigarettes, alcohol, gambling, etc.
https://share.cleanshot.com/9kBDGs7Q