According to Originality.AI, in the first 14 days after the GPTBot documentation was published, almost 10% of the world's top 1000 websites chose to block GPTBot. Among the sites that have blocked it are Amazon, Quora, wikiHow and several international news publications.
The report says that GPTBot was launched because OpenAI is facing an increasing number of lawsuits, some of which relate to using content without proper permission.
How does GPTBot work?
GPTBot works by first identifying potential sources of data: it crawls the web looking for websites that contain relevant information. Once it has identified a potential source, GPTBot extracts the information from the website. That information is then stored in a database and can be used to train AI models.
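Websites that block the crawler typically do so through their robots.txt file, which a well-behaved crawler consults before fetching a page. The following is a minimal Python sketch of that check, not OpenAI's actual implementation. It assumes the crawler identifies itself with the user-agent token `GPTBot`; the robots.txt contents, the `is_allowed` helper and the example.com URL are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks GPTBot site-wide but allows other crawlers.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Illustrative URL only; any path on the site is covered by "Disallow: /".
gptbot_allowed = is_allowed(ROBOTS_TXT, "GPTBot", "https://example.com/article")
other_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/article")
print(gptbot_allowed, other_allowed)
```

With these rules, the check refuses GPTBot while still admitting a crawler that has no dedicated entry, which is the effect a site owner gets from adding the two `GPTBot` lines to an otherwise open robots.txt.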
The tool can extract several kinds of content, including text, images and even code. It can pull text from websites, articles, books and other documents. From images, it can extract the objects depicted and the text associated with the image. It can also extract code from websites, GitHub repositories and other sources.
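To give a sense of what extracting text from a web page involves, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not a description of how GPTBot itself processes pages: the `TextExtractor` class, the `extract_text` helper and the sample HTML are all hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text from HTML, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    """Return the page's visible text fragments joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page used only to exercise the extractor.
sample = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><h1>Title</h1><p>Hello <b>world</b>.</p>"
    "<script>var x = 1;</script></body></html>"
)
page_text = extract_text(sample)
print(page_text)
```

A production crawler would also need to handle character encodings, malformed markup and non-HTML documents, all of which this sketch leaves out.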
OpenAI’s ChatGPT and other generative AI tools rely on data from websites to train and improve their models. A few months ago, Elon Musk blocked OpenAI from scraping data from the social media platform that was then still called Twitter.