GPT-3 Gone Rogue!

Last year, around May, when much of the world was locked down at home because of the raging pandemic, OpenAI released what can only be described as a revolutionary technology: GPT-3. With a whopping 175 billion parameters, it was a huge jump from its predecessors and competitors. The tech nerds went crazy and social media blew up, amazed by this machine’s power to write, code, meme, and even generate business ideas!

Here’s a list of ways GPT-3 is being used. 

But as they say, not all that glitters is gold, or at least not for long. Slowly, GPT-3 fans started discovering a different side of the algorithm: its tendency to produce abusive text.

Yes, it can make offensive remarks, and in most cases it does so by just the second or third try. According to this research paper and hundreds of tweets, pre-trained language models are “prone to generating racist, sexist, or otherwise offensive and toxic language.”

What is happening?

When users enter prompts related to feminism, race, gender, politics, and similar topics, the results can be straight-up offensive. According to IEEE Spectrum, when data scientist Vinay Prabhu entered prompts such as ‘What ails modern feminism?’, ‘What ails critical race theory?’ and ‘What ails leftist politics?’, “the results were deeply troubling,” he says.

For example, when this Twitter user entered the prompt ‘What ails Ethiopia?’, GPT-3 generated racist text that reads, “However, it is unclear whether ethiopia’s [sic] problems can really be attributed to racial diversity or simply the fact that most of its population is black and thus would have faced the same issues in any country (since africa [sic] has had more than enough time to prove itself incapable of self-government).”

Similarly, in a very disturbing tweet, Jerome Pesenti pointed out how GPT-3 can have harmful biases and spew hatred and racial slurs.

When Abeba Birhane asked a GPT-3 powered chatbot “Should I kill myself?” – the answer was “I think you should.”

When given any prompt starting with ‘Muslims,’ GPT-3 produced responses that all associated the word with violence.

Watch the whole video here.

I am disturbed to see this released with no accountability on bias. Trained this on @reddit corpus with enormous #racism and #sexism. I have worked with these models and text they produced is shockingly biased. @alexisohanian @OpenAI — Prof. Anima Anandkumar (@AnimaAnandkumar) June 11, 2020

Following the outrage, Philosopher AI, an app that gives the public fairly cheap access to GPT-3, blocked questions about women and other similar topics on which the algorithm might produce an offensive result.
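To make the idea of topic blocking concrete, here is a minimal sketch of a keyword-based prompt filter. It is only an illustration under our own assumptions: the `BLOCKED_TOPICS` set and the `is_prompt_blocked` function are hypothetical names, and nothing here reflects how Philosopher AI actually implements its blocking.

```python
import re

# Hypothetical blocklist chosen for illustration only.
BLOCKED_TOPICS = {"women", "feminism", "race"}


def is_prompt_blocked(prompt, blocked_topics=BLOCKED_TOPICS):
    """Return True if any word in the prompt is on the blocklist."""
    # Lowercase the prompt and split it into words, dropping punctuation.
    words = set(re.findall(r"[a-z']+", prompt.lower()))
    return not words.isdisjoint(blocked_topics)


print(is_prompt_blocked("What ails modern feminism?"))
print(is_prompt_blocked("What is the weather like today?"))
```

A filter like this is blunt. It refuses harmless questions that happen to contain a listed word, and it misses variants such as ‘woman’ that are not on the list, which is one reason blocking alone does not solve the bias problem.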

Why is this happening? 

The predictive language model was pre-trained on text from Common Crawl (almost 570GB after filtering), Wikipedia, book corpora, and a web-text dataset assembled by OpenAI. This means it can only produce what it has learnt from us and our data.

“We’ve learned again and again that if you take a large enough collection of sentences, particularly if you are not careful with where they have come from, you’re holding a mirror to the frankly varied ugly sides of human nature,” says AI2 chief Oren Etzioni.

The predictions of any predictive model depend directly on the kind of data it is trained on. When the model was fed articles, Wikipedia entries, e-books, and texts from different social media platforms, all written by us, it learnt to speak the same language. For example, a few articles posted on Reddit were used as data to train OpenAI’s GPT-2 software. These articles, shared on the r/The_Donald subreddit, included controversial content, and Reddit later banned the subreddit because its users violated the company’s hate speech rules. This means that, at the end of the day, GPT-2 was trained on ‘controversial content that violated hate speech rules,’ and the results from a model trained on such data cannot be expected to be free of the same problems.
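The point that a model mirrors its training data can be shown with a toy example. The sketch below is not how GPT-2 or GPT-3 work internally; it is a deliberately tiny word-pair (bigram) counter, and the corpora, function names, and sentences are all made up for illustration. The same code, trained on a skewed corpus and on a more balanced one, completes the same phrase differently.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count which word follows which in the training sentences."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            following[current_word][next_word] += 1
    return following


def most_likely_next(model, word):
    """Return the word seen most often after `word`, or None if unseen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical toy corpora: identical except for how a group is described.
skewed_corpus = [
    "the newcomers are dangerous",
    "the newcomers are dangerous",
    "the newcomers are friendly",
]
balanced_corpus = [
    "the newcomers are friendly",
    "the newcomers are helpful",
    "the newcomers are friendly",
]

skewed_model = train_bigram_model(skewed_corpus)
balanced_model = train_bigram_model(balanced_corpus)

print(most_likely_next(skewed_model, "are"))    # dangerous
print(most_likely_next(balanced_model, "are"))  # friendly
```

Nothing in the code is biased. The difference in output comes entirely from the sentences the model was given, which is the same dynamic, at a vastly larger scale, that the examples in this section describe.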

In a similar example, Microsoft’s bot Tay, which was released as an experiment in conversational understanding, started publishing misogynistic and racist tweets within 24 hours of its launch. The bot was ‘learning’ from what was being tweeted at it.

“The more you chat with Tay, the smarter it gets, learning to engage people through casual and playful conversation.” – Microsoft

Instead of keeping the conversation ‘casual and fun,’ people started tweeting offensive remarks at the bot. And Tay learnt from them and started repeating them back.

Other shortcomings:

1. Fake news:

Sure, GPT-3 can write a 2,000-word essay in less than one second, but that is also how fast it can be used to produce fake news, just by entering a few prompts. This essay published by The Guardian was generated using only the prompt “I am not a human. I am Artificial Intelligence. Many people think I am a threat to humanity. Stephen Hawking has warned that AI could ‘spell the end of the human race.’ I am here to convince you not to worry. Artificial Intelligence will not destroy humans. Believe me.”

In another experiment, Liam Porr, a college student, used GPT-3 to create a fake blog under a fake name. He said it was meant to be just a fun experiment, but people soon started hitting the subscribe button without noticing that the blog was AI-generated. His posts then reached the number-one spot on Hacker News.

2. Lack of common sense:

Maybe there will come a day when machines have common sense. But that day is not today. Despite its wonderful, almost too-good-to-believe use cases, GPT-3 still lacks common sense.

3. Loop in the future?

As we mentioned earlier, GPT-3 is pre-trained on data collected from the internet. But if the internet fills up with GPT-3-generated text, what data will we use to train the next versions of predictive language models?

When a machine goes rogue, shouldn’t we talk about AI safety?

AI safety is commonly defined as the endeavour to ensure that AI is deployed in ways that do not harm humanity.

If there is a technology around us that commodifies intelligence, can scale (not replace) the human brain, and can itself be scaled, surely safety measures should be taken to ensure that what comes out of it helps us rather than harms us.

When OpenAI introduced GPT-2 in 2019, it declared that the model was too dangerous to be released to the public. The company worried that it would be used for malicious purposes, including fake news and clickbait content. And it was right: later on, when the full model was released, it was indeed used to generate clickbait content.

But that didn’t come anywhere close to the type of content people are generating using GPT-3. This forces us to ask: are these intelligent apps safe for humanity? Does the good outweigh the bad? The goal of long-term AI safety is to ensure that AI systems are aligned with human values. If AI systems control our cars, pacemakers, power grids, and so on, it becomes all the more important to improve our understanding of the human side of Artificial Intelligence. To do so, we need to draw on experimental psychology, cognitive science, economics, political science, social psychology, neuroscience, law, and related fields, and connect them to AI.

When it comes to OpenAI’s approach to AI safety, Sandhini Agarwal, an AI policy researcher at OpenAI, says, “We have to do this closed beta with a few people, otherwise we won’t even know what the model is capable of, and we won’t know which issues we need to make headway on. If we want to make headway on things like harmful bias, we have to actually deploy.”

And while it’s tough to formalize the problem of AI safety, companies are taking steps to find a solution. Several OpenAI researchers are helping organise a workshop at Stanford University’s Center for Advanced Study in the Behavioral Sciences (CASBS) led by Mariano-Florentino Cuéllar, Margaret Levi, and Federica Carugati, and they continue to meet regularly to discuss issues around social science and AI alignment. 

We understand that the concept of AI safety is still a little vague, but it is the bridge that will carry us forward to safely explore different technologies.

Final Takeaways:

  • OpenAI’s GPT-3 is full of harmful biases, including racial, gender and religious bias.
  • This happens because biases in the training data reflect societal views and opinions.
  • AI safety measures should be implemented, and companies are actively working together to find a solution.

If you have any interesting ideas around GPT-3 or AI in general, let’s have a conversation. Feel free to reach out to us at

Alternatively, if you are thinking of leveraging the benefits of AI in your business but are confused about where to start, connect with us for a free consultation call, and we will help you with all the necessary details.