What’s Wrong with AI Therapy Bots?
I have a distinct memory from my childhood: I was on a school trip, at what I think was the Ontario Science Centre, and my classmates and I were messing around with a computer terminal. As this was the early-to-mid 90s, the computer itself was a beige slab with a faded keyboard, letters dulled from the hunt-and-pecking of hundreds of previous children on school trips of their own. There were no graphics, just white text on a black screen, and a flashing rectangle indicating where you were supposed to type.
The program was meant to be an “electronic psychotherapist,” either some version of ELIZA – one of the earliest attempts at what we would now classify as a chatbot – or some equivalent Canadian substitute (“Eh-LIZA”?). After the program started up, it displayed a welcome message and then asked questions – something like “How are you feeling today?” or “What seems to be bothering you?” The rectangle would flash expectantly, store the user’s input in a variable, and then spit it back out, often inelegantly, in a way that was meant to mimic the conversation of a therapist and patient. I remember my classmate typing “I think I’m Napoleon” (the best expression of our understanding of mental illness at the time) and the computer replying: “How long have you had the problem I think I’m Napoleon?”
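That echo trick is simple enough to sketch in a few lines of Python. This is a hypothetical toy, not the actual ELIZA (which worked from a much richer script of pattern-and-reflection rules), but it reproduces the behavior the terminal showed us:

```python
import re

def eliza_reply(user_input: str) -> str:
    """A toy ELIZA-style echo: reflect the user's statement back as a question."""
    statement = user_input.strip().rstrip(".!?")
    # Any first-person claim gets parroted back wholesale, grammar be damned.
    if re.match(r"(?i)\bi (think|feel|am|believe)\b", statement):
        return f"How long have you had the problem {statement}?"
    # Fallback when nothing matches -- the classic therapist deflection.
    return "How does that make you feel?"

print(eliza_reply("I think I'm Napoleon"))
# → How long have you had the problem I think I'm Napoleon?
```

The inelegance is the point: the program never parses what you said, it just slots your words into a template, which is why the reply reads like a grammatical car crash.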
30-ish years later, I receive a notification on my phone: “Hey Ken, do you want to see something adorable?” It’s from an app called WoeBot, and I’ve been ignoring it. WoeBot is one of several new chatbot therapists that tout that they are “driven by AI”: this particular app claims to sit at the intersection of several different types of therapy – cognitive behavioral therapy, interpersonal psychotherapy, and dialectical behavior therapy, according to their website – and AI powered by natural language processing. At the moment, it’s trying to cheer me up by showing me a gif of a kitten.
Inspired by (or worried they’ll get left behind by) programs like ChatGPT, tech companies have been champing at the bit to create their own AI programs that produce natural-sounding text. The lucrative world of self-help and mental well-being seems like a natural fit for such products, and many claim to solve a longstanding problem in the world of mental healthcare: namely, that while human therapists are expensive and busy, AI therapists are cheap and available whenever you need them. In addition to WoeBot, there’s Wysa – also installed on my phone, and also trying to get my attention – Youper, Fingerprint for Success, and Koko, which recently got into hot water by failing to disclose to its userbase that they were not, in fact, chatting with a human therapist.
Despite having read reports that people have found AI therapy bots to be genuinely helpful, I was skeptical. But I attempted to keep an open mind, and downloaded both WoeBot and Wysa to see what all the fuss was about. After using them for a month, I’ve found them to be very similar: they both “check in” at prescribed times throughout the day, attempt to start up a conversation about any issues that I’ve previously said I wanted to address, and recommend various exercises that will be familiar to anyone who has done cognitive behavioral therapy. They both offer the option to connect to real therapists (for a price, of course), and perhaps in response to the Koko debacle, neither hides the fact that it is a program (often annoyingly so: WoeBot is constantly talking about how its friends are other electronics, a schtick that got tired almost immediately).
It’s been an odd experience. The apps send me messages saying that they’re proud of me for doing good work, that they’re sorry if I didn’t find a session to be particularly useful, and that they know that keeping up with therapy can be difficult. But, of course, they’re not proud of me, or sorry, and they don’t know anything. At times their messages are difficult to distinguish from those of a real therapist; at others, they don’t properly parse my input, and respond with messages not unlike “How long have you had the problem I think I’m Napoleon?” If there is any therapeutic value in the suspension of disbelief then it often does not last long.
But apart from a sense of weirdness and the occasional annoyances, are there any ethical concerns surrounding the use of AI therapy chatbots?
There is clearly potential for them to be beneficial: your stock model AI therapist is free, and the therapies that they draw their exercises from are often well-tested in the offline world. A little program that reminds you to take deep breaths when you’re feeling stressed out seems all well and good, so long as it’s obvious that it’s not a real person on the other side.
Whether you think the hype about new AI technology is warranted or not will likely impact your feelings about the new therapy chatbots. Techno-optimists will emphasize the benefit of expanding care to many more people than could be reached through other means. Those who are skeptical of the hype, however, are likely to think that spending so much money on unproven tech is a poor use of resources: instead of sinking billions into competing chatbots, maybe that money could be spent on helping a wider range of people access traditional mental health resources.
There are also concerns about the ability of AI-driven text generators to go off the rails. Microsoft’s recent experiment with their new AI-powered Bing search had an inauspicious debut, occasionally spouting nonsense and even threatening users. It’s not hard to imagine the harm such unpredictable outputs could cause for someone who relied heavily on their AI therapy bot. Of course, true believers in the new AI revolution will dismiss these worries as growing pains that inevitably come along with the use of any new tech.
What is perhaps more troubling is that the apps themselves walk a tightrope between trying to be a sympathetic ear and reminding you that they’re just bots. The makers of WoeBot recently released research results suggesting that users feel a “bond” with the app, similar to the kind of bond they might feel with a human therapist. This is clearly an intentional choice on the part of the creators, but it brings with it some potential pitfalls.
For example, although the apps I’ve tried have never threatened me, they have occasionally come off as cold and uninterested. During a recent check-in, Wysa asked me to tell it what was bothering me that morning. It turned out to be a lot (the previous few days hadn’t been great). But after typing it all out and sending it along, Wysa quickly cut the conversation short, saying that it seemed like I didn’t want to engage at the moment. I felt rejected. And then I felt stupid that I felt rejected, because there was nothing that was actually rejecting me. Instead of feeling better by letting it all out, I felt worse.
In using the apps I’m reminded of a thought experiment from philosopher Hilary Putnam. He asks us to consider an ant on a beach who, through its search for food and random wanderings, happens to trace out what looks to be a line drawing of Winston Churchill. It is not, however, a picture of Churchill, and the ant did not draw it, at least in the way that you or I might. However, at the end of the day a portrait of Winston Churchill consists of a series of marks on a page (or on a beach), so what, asks Putnam, is the relevant difference between those made by the ant and those made by a person?
His answer is that only the latter are made intentionally, and it is the underlying intention which gives the marks their meaning. WoeBot and Wysa and other AI-powered programs often string together words in ways that look indistinguishable from those that might be written down by a human being on the other side. But there is no intentionality, and without intentionality there is no genuine empathy or concern or encouragement behind the words. They are just marks on a screen that happen to have the same shape as something meaningful.
There is, of course, a necessary kind of disingenuousness that must exist for these bots to have any effect at all. No one is going to feel encouraged to engage with a program that explicitly reminds you that it does not care about you because it does not have the capacity to care. AI therapy requires that you play along. But I quickly got tired of playing make-believe with my therapy bots, and overall it has become increasingly difficult for me to find value in this kind of ersatz therapy.
I can report one concrete instance in which using an AI therapy bot did seem genuinely helpful. It was guiding me through an exercise, the culmination of which was to get me to pretend as though I were evaluating my own situation as that of a friend, and to consider what I would say to them. It’s an exercise that is frequently used in cognitive behavioral therapy, but one that’s easy to forget to do. In this way, the app’s check-in did, in fact, help: I wouldn’t have been as sympathetic to myself had it not reminded me to be. But I can’t help but think that if that’s where the benefits of these apps lie – in presenting tried-and-tested exercises from various therapies and reminding you to do them – then the whole thing is over-engineered. If it can’t talk or understand or empathize like a human, then there seems to be little point in involving artificial intelligence at all.
AI therapy bots are still new, and so it remains to be seen whether they will have a lasting impact or just be a flash in the pan. Whatever does end up happening, though, it’s worth considering whether we would even want the promise of AI-powered therapy to come true.