3 reasons not to block GPTBot from crawling your site

27.02.2024 18:00

SearchEngineLand.com

The next phase in ChatGPT’s meteoric rise is the adoption of GPTBot. This new iteration of OpenAI’s technology involves crawling webpages to deepen the output ChatGPT can provide.

AI improvement seems positive, but it’s not so clear-cut. Legal and ethical issues surround the technology.

GPTBot’s arrival has highlighted these concerns, as many major brands are blocking it instead of leveraging its potential.

But I truly believe there’s much more to gain than lose by fully (and responsibly) embracing GPTBot.

Why do AI bots like GPTBot crawl websites?

Understanding why bots like GPTBot do what they do is the first step to embracing this technology and leveraging its potential.

Simply put, bots like GPTBot are crawling websites to gather information. The main difference is that, rather than an AI platform passively being fed data to learn from (the “training set,” if you will), a bot can actively pursue information on the web by crawling various pages.

The crawlers behind large language models (LLMs) scour these websites in an attempt to understand the world around us. Google’s C4 data set makes up a large portion (15.7 million sites) of the training data for these LLMs. They also crawl other authoritative, informative sites like Wikipedia and Reddit.

The more sites these bots can crawl, the more they learn and the better they can become. Why, then, are companies blocking GPTBot from crawling?
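In practice, a block is usually expressed in a site's robots.txt file. The sketch below is a minimal illustration, not a description of how any particular brand does it: it uses Python's standard urllib.robotparser to check whether a given robots.txt would let a crawler identifying itself as GPTBot (the user-agent token OpenAI documents) fetch a page. The example.com URL and the two sample robots.txt bodies are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks only GPTBot from the whole site.
BLOCK_GPTBOT = """\
User-agent: GPTBot
Disallow: /
"""

# Hypothetical robots.txt that lets every crawler in.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

if __name__ == "__main__":
    page = "https://example.com/blog/post"  # illustrative URL
    for label, body in (("block GPTBot", BLOCK_GPTBOT), ("allow all", ALLOW_ALL)):
        for agent in ("GPTBot", "Googlebot"):
            print(f"{label}: {agent} allowed = {is_allowed(body, agent, page)}")
```

Under the first file, GPTBot is refused while Googlebot is still allowed; under the second, both are allowed. Note that robots.txt only works for crawlers that choose to honor it.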

Do brands that block GPTBot have valid fears?

When I first read about companies blocking GPTBot from crawling their websites, I was confused and surprised.

To me, it seemed incredibly short-sighted. But I figured there must be a lot to consider that I wasn’t thinking deeply enough about.

After researching and talking to agency professionals with legal backgrounds, I found the biggest reasons.

Lack of compensation for their proprietary training data

Many brands block GPTBot from crawling their site because they don’t want their data used to train OpenAI’s models without compensation. While I can understand wanting a piece of OpenAI’s $1 billion pie, I think this is a short-sighted view.

ChatGPT, much like Google and YouTube, is an answer engine for the world. Preventing your content from being crawled by GPTBot might limit your brand’s reach to a smaller set of internet users in the future.

Security concerns

Another reason behind the anti-GPTBot sentiment is security. While more valid than greedily hoarding data, it’s still a largely unfounded concern from my perspective.

By now, all websites should be very secure. Not to mention, the content GPTBot is trying to access is public, non-sensitive content. The same stuff that Google, Bing, and other search engines are crawling daily.

What caches of sensitive information do CIOs, CEOs, and other company leaders think GPTBot will access during its crawl? And with the right security measures, shouldn’t this be a non-issue?
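For brands that want a middle ground, robots.txt rules can be scoped per directory rather than applied to the whole site. The following is a rough sketch under stated assumptions: the /blog/ and /account/ paths and the example.com URLs are hypothetical, and robots.txt is a crawling instruction rather than an access control, so anything genuinely sensitive still needs real authentication. The code again relies only on Python's standard urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: GPTBot may read the public blog but not account pages.
SELECTIVE_ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /blog/
Disallow: /account/
"""


def crawlable_urls(robots_txt: str, user_agent: str, urls: list[str]) -> list[str]:
    """Return the subset of urls that robots_txt lets user_agent fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parsed locally, no network access
    return [url for url in urls if parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    candidates = [
        "https://example.com/blog/gptbot-guide",   # illustrative public page
        "https://example.com/account/settings",    # illustrative private page
    ]
    print(crawlable_urls(SELECTIVE_ROBOTS_TXT, "GPTBot", candidates))
```

With this policy, only the blog URL is returned for GPTBot, while the account URL is filtered out.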

The looming threat of legal implications

From a legal standpoint, the argument is that any crawls done on a brand’s site must be covered by their privacy disclaimer. All websites should have a privacy disclaimer outlining how they use the data collected by their services. Attorneys say this language must also state that a generative AI third-party platform could crawl the data collected.

If not, any personally identifiable information (PII) or customer data could still be “public” and expose brands to a Section 5 Federal Trade Commission (FTC) claim for unfair and deceptive trade practices.

I get this concern to some degree. If you’re the legal department of a big-name brand, one of your primary objectives is to keep your company out of hot water. But this legal concern applies more to what’s input into ChatGPT rather than what GPTBot crawls.

Anything input into OpenAI’s platform becomes part of its data bank and has the potential to be shared with other users, leading to data leakage. However, this would likely only happen if users asked questions related to the stored information.

This is another unwarranted concern to me because it can all be resolved by responsible internet usage. The same data principles we’ve used since the dawn of the web still ring true – don’t input any information you don’t want shared.

An impulse to save humanity from AI advancement

I can’t help but think that leaders at some of these brands blocking GPTBot have a bias against the advancement of AI technology.

We often fear what we don’t understand, and some are frightened by the idea of artificial intelligence gaining too much knowledge and becoming too powerful.

While AI is evolving rapidly and beginning to “think” more deeply, humans are still largely in control. Additionally, legislation governing AI will grow alongside the technology.

When we finally reach a world of “autonomous” AI platforms, their functionality will be guided by years of human innovation and legislation.

3 reasons not to block ChatGPT’s GPTBot

So why should you allow GPTBot to crawl your site? Let’s look on the bright side with these three primary benefits of embracing OpenAI’s bot technology.

1. 100 million people use ChatGPT each week

By not allowing GPTBot to crawl your site, you’re missing out on a 100 million-person audience and the chance to maximize your brand’s visibility with it.

Sharing access to your website content can help ensure your brand is both factually and positively represented to ChatGPT users.

This means there’s a higher chance that your brand will actually be recommended by ChatGPT, leading to more traffic and potential customers.

Some brands report getting 5% of their overall leads, or $100,000 in monthly subscription revenue, from ChatGPT. I know our agency has already gotten some leads from ChatGPT, too.

Another way to consider this is as a positive digital PR (DPR) play. You should leverage DPR strategies like brand mention campaigns in today’s landscape.

Permitting GPTBot to crawl your site only adds to these efforts by allowing ChatGPT to access your brand information directly from the source and distribute it to 100 million users positively.

2. Generative engine optimization (GEO)

Whether or not you have fears about AI, we can all agree that it’s changing the marketing landscape. Like all new technologies and trends in our industry, those slow to embrace AI as a conduit for new business and brand exposure will miss the proverbial boat.

GEO is picking up steam as a sub-practice of SEO. You’ll miss a significant opportunity if you’re not directing some of your marketing efforts toward it. Competitors may seize that opportunity if you let it slip through the cracks.

We know it’s easy for brands to fall behind in today’s fragmented and ever-growing marketing landscape. If your competitors spend years working on GEO, maximizing LLM visibility and developing skills and expertise in this area, they’ll be years ahead of you.

Now, GEO reporting capabilities haven’t caught up to the value yet, which means it will be tough to measure an ROI, but that doesn’t mean it’s something to ignore and fall behind on.

Brands and marketers must start embracing LLMs like ChatGPT as an emerging acquisition channel.

3. OpenAI’s pledge to minimize harm

Source: https://openai.com/safety-standards

A healthy distrust of AI technologies is important to their legal and ethical growth. But we also need to be open-minded and realize we can’t be effective as marketers if we resist and choose not to grow and innovate in the direction the industry is heading.

OpenAI clearly states “minimize harm” as one of the guiding principles of their platform. They also have policies to respect copyright and intellectual property and have stated that GPTBot filters out sources violating their policies.

By allowing GPTBot to crawl your site’s content, you’re contributing to the clean and accurate training data OpenAI uses to enhance and improve its information accuracy.

As AI technology marches on, it can be easy to get caught up in skepticism, fear, and noise. Those struggling to embrace and maximize it will get left behind.
