    Moral hazard: Can’t trust OpenAI claims on ‘Erotica’

    If OpenAI and its competitors are to be trusted with the seismic technologies they aim to build, they must demonstrate they are trustworthy in managing risks today.

    I’ve read more smut at work than you can possibly imagine, all of it while working at OpenAI.

    In the spring of 2021, I led the company’s product safety team and discovered a crisis involving erotic content. One prominent customer was running a text-based role-playing game that used our AI to generate interactive stories. These quickly turned into sexual fantasies — including encounters involving children and violent abductions — sometimes initiated by users, sometimes by the AI itself. One analysis found that more than 30 percent of players’ conversations were “explicitly lewd.”

    After months of debate over user freedom, we prohibited erotic uses of our models. Erotica itself wasn’t the problem — it was the intensity of users’ emotional attachments to chatbots. For people struggling with mental health, volatile sexual exchanges could be dangerous. We didn’t want to be morality police, but we lacked tools to manage erotic usage safely. So we decided AI-powered erotica would have to wait.

    Now OpenAI says the wait is over. On Oct. 14, its chief executive, Sam Altman, announced the company had “mitigated” the mental health issues plaguing ChatGPT users and would allow erotic content for verified adults. But Altman provided little evidence that those risks are gone.

    Having spent four years at OpenAI and another year studying these issues independently, I have serious doubts. If OpenAI truly believes it’s ready to bring back erotica, it should prove it. AI already plays an intimate role in our lives — and its risks are deeply personal. Users deserve more than the company’s assurance that it’s safe.

    I believe OpenAI wants its products to be safe. But it also has a record of ignoring well-known risks. This spring, it released — then retracted — a version of ChatGPT that reinforced users’ extreme delusions, like believing the FBI was after them. OpenAI later admitted it hadn’t tested for “sycophancy,” even though such risks have been known for years and can be checked for less than $10 worth of computing power.

    Even after replacing that model, ChatGPT continued guiding users down mental health spirals. The company has said such issues “weigh heavily” on it and has promised improvements, but the real question is whether those improvements work.

    The reliability of OpenAI’s safety claims can be a matter of life and death. One family is suing over their teenage son’s suicide after he told ChatGPT he wanted to leave a noose visible “so someone finds it and tries to stop me.” ChatGPT advised him not to — but he killed himself anyway. In another case, a 35-year-old man ended his life after OpenAI shut down a ChatGPT persona he’d called his “beloved.” Psychiatrists I’ve interviewed warn that chatbots can worsen delusions and intensify mental health crises.

    And the problem extends beyond OpenAI. A 14-year-old user of Character.ai killed himself after suggesting he and the chatbot could “die together and be free together.”

    If OpenAI wants public trust, it should publish regular transparency reports on mental health risks — perhaps quarterly. YouTube, Meta, and Reddit already do this for other kinds of safety data. Such reports aren’t perfect, but they force accountability and allow the public to track progress.

    OpenAI took a small step this week by sharing statistics about the prevalence of suicidal thoughts and psychosis among ChatGPT users. But it omitted historical comparisons that would show whether things have actually improved. Given the severity of recent cases, this absence is hard to overlook. Even the most well-intentioned companies need public pressure to stay honest.

    Transparency helps, but laws may be needed too. The AI industry is known for cutting corners under pressure to compete. Elon Musk’s xAI delayed publishing its safety framework. Google DeepMind and OpenAI both appeared to break promises to release safety-test data before major launches. Anthropic even softened its safety commitments to make them easier to meet.

    It’s disheartening to see OpenAI succumb to the same race dynamic it once warned against. When I interviewed there in 2020, its Charter warned about “a competitive race without time for adequate safety precautions.” Yet earlier this year, after a Chinese start-up called DeepSeek made headlines, Altman said it was “invigorating to have a new competitor” and vowed OpenAI would “pull up some releases.”

    If OpenAI wants to build technologies that might someday shape civilization itself, it must show it can manage today’s simpler risks first. Mental health harms are visible and measurable; future dangers, like AI systems deceiving humans, are not. Some models already recognize when they’re being tested and hide their abilities. Altman himself recently reaffirmed that he believes AI could pose an “existential threat to mankind.” To control such powerful systems, companies may need to slow down — to invent new safety methods that can’t be easily bypassed.

    If OpenAI and its competitors are to be trusted with the seismic technologies they aim to build, they must demonstrate they are trustworthy in managing risks today.

    The New York Times

    Steven Adler