11/15/2024

How to Block OpenAI from Crawling Your Website

Here's how to block ChatGPT by copying just two lines of code into a text file

Not everyone was thrilled to learn that OpenAI, the creator of ChatGPT, had been training its AI on data taken from people’s websites without permission. While it’s too late to do anything about the data it has already crawled, you can stop these models from being trained on your current and future content, and all it takes is two lines of code.

However, just because you can block OpenAI from crawling your website doesn't mean you should, and I would highly recommend asking yourself whether it is the right call. For more on that, read this article: “Leaders: Don't prematurely block OpenAI from your websites.”

How ChatGPT crawls the web for content

OpenAI uses a web crawler called GPTBot to collect data for training its AI models (such as GPT-4). Web crawling is when an automated bot systematically visits pages across the internet and collects their content. It happens all the time; in fact, this is how Google builds its search index.
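
As a rough illustration, the two lines in question are a robots.txt rule that names the GPTBot user agent and disallows every path. The sketch below uses Python's standard urllib.robotparser to check how such a rule would be interpreted by a crawler that honors robots.txt. The helper name is_allowed and the example.com URL are placeholders invented for this example, not part of any OpenAI tooling, and robots.txt is only a request that well-behaved crawlers choose to respect.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule: applies only to the GPTBot user agent and
# disallows every path on the site.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Hypothetical helper: report whether robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain, not a real site being tested.
    page = "https://example.com/blog/post"
    print("GPTBot allowed:", is_allowed(ROBOTS_TXT, "GPTBot", page))
    print("Googlebot allowed:", is_allowed(ROBOTS_TXT, "Googlebot", page))
```

Because the rule targets GPTBot by name, other crawlers such as search engine bots are unaffected, so a site can opt out of this one crawler without dropping out of search results.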

Please select this link to read the complete article from PluralSight.
