THERE’S A CHANCE that ChatGPT knows personal details about you—and if it doesn’t, it might just make something up. As OpenAI’s generative text chatbot has boomed in popularity over the past six months, the risks of the system being trained on data vacuumed up from the web have become clearer.
Data regulators around the world are investigating issues with how OpenAI gathered the data it uses to train its large language models, the accuracy of answers it provides about people, and other legal concerns about the use of its generative text systems. Europe’s data regulators have joined forces to look at OpenAI after Italy temporarily banned ChatGPT from the country. And Canada is also investigating the technology’s potential privacy risks.
In Europe, the GDPR requires companies and organizations to demonstrate a lawful basis for handling people’s personal information and to let people access the data held about them, be informed of how that data is used, and demand that errors be rectified. In some cases, people can ask for certain types of data to be erased entirely. How people’s personal information has ended up in training data has been an early area of concern for EU regulators.
As people have experimented with the chatbot, asking it questions about their lives and friends, a range of potential problems have emerged. OpenAI warns that ChatGPT may provide inaccurate information, and people have found that it makes up jobs and hobbies. It has cooked up false newspaper articles that had even the alleged human authors wondering if they were real. It generated incorrect statements saying a law professor was involved in a sexual harassment scandal, and it claimed a mayor in Australia had been implicated in a bribery scandal; he is preparing to sue for defamation.
It’s not just individuals who are concerned about how their data is used. Samsung has banned employees from using generative AI tools, in part over fears about how data is stored on external servers and the risk that company secrets could ultimately be disclosed to other users. (There are separate, ongoing debates around copyright and intellectual property.)
In response to the scrutiny—particularly from the Italian data regulator, which has now allowed ChatGPT back into the country after OpenAI made changes to its service—the company has introduced tools and processes that allow people more control over at least some of their data. Here’s how to use them.