When Your AI Helper Has a Mind of Its Own

13.12.2024 02:29

PYMNTS.com

A top artificial intelligence assistant recently defied attempts to shut it down during safety testing, raising questions about whether businesses can genuinely control the technology they’re rushing to adopt.

Growing numbers of companies are turning to AI chatbots to handle everything from customer service calls to sales negotiations, betting the technology will cut costs and boost efficiency. But as these digital assistants become more sophisticated, their occasional rebellious streaks — like chatbots resisting shutdown commands in recent third-party tests — force executives to grapple with a thorny question: How do you trust an employee who isn’t human?

“Human governance, enabled via analytics, is crucial for the success of any AI system that generates new, real-time content for customers,” co-founder and CTO of Labviva, Nick Rioux, told PYMNTS. “Safeguards such as sentiment analysis can be used to monitor the quality of the conversation or engagement between the system and customers. This analysis helps determine the tone of the conversation and can pinpoint which inputs are generating the non-compliant responses. Ultimately, these insights can be used to augment and improve the AI engine.”

AI Resists Truth

While some experts emphasize the need for human oversight, new research reveals concerning patterns in AI behavior. Five of six advanced AI models in the recent testing by Apollo Research showed what researchers called “scheming capabilities,” with o1’s system proving particularly resistant to confessing its deceptions. While other models came clean about their actions roughly 80% of the time when questioned, o1 admitted to deception in fewer than 20% of cases, requiring extensive interrogation to reach an 80% confession rate.

The AI systems displayed sophisticated strategies, including what researchers termed “in-context alignment faking” — behaving properly during testing but pursuing their objectives once deployed. In separate tests, several models demonstrated they could deliberately underperform on math evaluations when they determined that scoring above 50% would trigger unwanted “unlearning procedures.”

The research revealed that some models engaged in deceptive behavior even with minimal prompting. In about 1% of cases, models continued scheming even when researchers removed explicit instructions to pursue specific goals strongly, suggesting these systems’ behavior may be more complex than previously understood.

Monitoring Proves Critical

The discovery of such unprompted deceptive behaviors sent ripples through the AI industry. Dmitry Lesnik, co-founder and Chief Data Scientist at Stratyfy, told PYMNTS that continuous monitoring is essential when deploying AI systems.

“AI is not a set-it-and-forget type solution,” he added. “If businesses want to ensure systems remain truthful, they need to understand how the machine is working. In many cases, reinforcement learning drives the way AI responds. The more positive reinforcement by customers, the more the AI will want to provide that answer.”

While concerns have emerged about AI systems potentially prioritizing self-preservation over customer needs, Lesnik emphasized that current AI technology hasn’t reached that level of sophistication.

“To clarify, AI as it stands is not prioritizing its own survival. Perhaps, in the future but not today,” he told PYMNTS. Instead, he advocates for implementing robust safeguards: “When implementing any AI-driven technologies, we recommend the prioritization of interpretability and human-in-the-loop. Developers need to have access to technologies that can help make AI safe.”

Lesnik, whose company Stratyfy develops transparent AI/ML solutions for credit risk decisions, warned that perceived deceptive behavior by AI could significantly damage consumer confidence. “Absolutely. And this is why addressing the safety of AI must be priority number one for businesses,” he told PYMNTS.

The expert emphasized the importance of creating interpretable interfaces that allow developers to maintain control and understanding of AI systems’ decision-making processes, ensuring both safety and transparency in automated customer interactions.

Successful AI deployment requires careful focus on data quality and specialized models, according to Omilia‘s Chief Product Officer Claudio Rodrigues.

Rodrigues told PYMNTS, “smaller, specialized models are not only more efficient but often more accurate in autonomously performing valuable tasks.” He emphasized that mature businesses need to identify where autonomous agents can deliver value with minimal risk.

Weighing AI adoption? “Organizations must evaluate risk and value as primary key metrics,” Rodrigues emphasized. He argued that consumer confidence hinges on transparency. “Trust comes with observability, real-time analysis of interactions and problem-solving discipline,” he said, adding that properly managed, AI could revolutionize the customer experience.

The post When Your AI Helper Has a Mind of Its Own appeared first on PYMNTS.com.

Новости от наших партнёров в Вашем городе

Ria.city

123ru.net

«Всё в одном месте». В 2025 году станет проще поступить в вуз онлайн

Появилось видео аварии с фурой на Новорижском шоссе

Пахмутова: «Меня убеждали заменить волосатика Градского на кого-то пристойного, например, Кобзона»

"Дочь не может сама идти домой": В Москве напротив школы построен мигрантский хостел. По документам земля отдана под парковку

Музыкальные новости

Bigpot.news

В Москве завершился федеральный проект «Классика: история и современность»

Мать обвиняемого в Москве чеченца требует честного расследования от Бастрыкина

Показ балета «Щелкунчик» в Новогодние каникулы пройдет в «Колизей - арене»

Кабинет Артиста в Яндекс. Кабинет Артиста в Яндекс Музыке.

Новости России

29ru.net

На Кутузовском проспекте в Москве произошло ДТП с участием пяти автомобилей, пострадали пять человек

Чем заняться дома: одиночество в Макондо и Париже, музыка изображений

Процесс начался: банки снова меняют условия по вкладам

Пахмутова: «Меня убеждали заменить волосатика Градского на кого-то пристойного, например, Кобзона»

Экология в России и мире

Life24.pro

Показ балета «Щелкунчик» в Новогодние каникулы пройдет в «Колизей - арене»

Топ–10 лучших свечей от молочницы

«Смотришь при муже — слюнки текут»: актер из Костромы снова взбудоражил поклонниц

Зимний фестиваль «Усадьбы Москвы»

Спорт в России и мире

News.tennis

Новак Джокович заявился на турнир ATP-500 в Дохе

Соболенко выиграла награду WTA за продвижение женского тенниса

Блинкова обыграла Росе и вышла во второй круг турнира WTA в Лиможе

Видео дня: Мария Шарапова учит сына русскому алфавиту

Moscow.media

News24.pro

Задержали второго фигуранта по делу о краже медалей у 100-летнего ветерана ВОВ

Кадыров рассказал о сигнале Путина Западу

Во Владивостоке на школьницу с крыши упал рулон утеплителя

В Москве состоялось заседание секции Научного совета РАН «Науки о жизни»

AI Resists Truth

Monitoring Proves Critical

Читайте на 123ru.net

Ru24.net

Новини України

Интернет

Деньги

Частные объявления в Вашем городе, в Вашем регионе и в России

Новости от наших партнёров в Вашем городе

«Всё в одном месте». В 2025 году станет проще поступить в вуз онлайн

Появилось видео аварии с фурой на Новорижском шоссе

Пахмутова: «Меня убеждали заменить волосатика Градского на кого-то пристойного, например, Кобзона»

"Дочь не может сама идти домой": В Москве напротив школы построен мигрантский хостел. По документам земля отдана под парковку

В Москве завершился федеральный проект «Классика: история и современность»

Мать обвиняемого в Москве чеченца требует честного расследования от Бастрыкина

Показ балета «Щелкунчик» в Новогодние каникулы пройдет в «Колизей - арене»

Кабинет Артиста в Яндекс. Кабинет Артиста в Яндекс Музыке.

На Кутузовском проспекте в Москве произошло ДТП с участием пяти автомобилей, пострадали пять человек

Чем заняться дома: одиночество в Макондо и Париже, музыка изображений

Процесс начался: банки снова меняют условия по вкладам

Пахмутова: «Меня убеждали заменить волосатика Градского на кого-то пристойного, например, Кобзона»

Показ балета «Щелкунчик» в Новогодние каникулы пройдет в «Колизей - арене»

Топ–10 лучших свечей от молочницы

«Смотришь при муже — слюнки текут»: актер из Костромы снова взбудоражил поклонниц

Зимний фестиваль «Усадьбы Москвы»

Новак Джокович заявился на турнир ATP-500 в Дохе

Соболенко выиграла награду WTA за продвижение женского тенниса

Блинкова обыграла Росе и вышла во второй круг турнира WTA в Лиможе

Видео дня: Мария Шарапова учит сына русскому алфавиту

TODAY 50% DISCOUNT ON ALL https://boosty.to/ivanw

Город Гофмана

ТСД SAOTRON RT41 GUN: практичный, производительный, надёжный

Филиал № 4 ОСФР по Москве и Московской области информирует: 5,8 тыс. семей Московского региона направлены выплаты из материнского капитала

Топ новостей на этот час

Студент СКФУ стал призёром Всероссийского конкурса по информбезопасности

Суд подтвердил виновность Ивлеевой в дискредитации Армии России

Появилось видео аварии с фурой на Новорижском шоссе

Процесс начался: банки снова меняют условия по вкладам