Tests show that Google’s AI Overviews tells millions of lies per hour

Searching for information on Google today means encountering AI Overviews, a Gemini-powered search robot that appears at the top of the results page. AI Overviews has had a rough time since its launch in 2024, drawing the ire of users for its scattershot accuracy, but it has gradually improved and usually answers correctly. That’s a low bar, though. A new analysis from The New York Times tried to assess the accuracy of AI Overviews and found it to be 90 percent correct. The flip side is that 1 in 10 AI answers is wrong, and at Google’s scale, that means hundreds of thousands of falsehoods every minute of the day.

The Times conducted this analysis with the help of a startup called Oumi. The company used AI tools to probe AI Overviews with the SimpleQA evaluation, a common test used to rank the factuality of generative models like Gemini. Released by OpenAI in 2024, SimpleQA is essentially a list of more than 4,000 questions with verifiable answers that can be fed into an AI model.

Oumi began testing last year, when Gemini 2.5 was still Google’s flagship model. At that time, the benchmark showed an accuracy rate of 85 percent. When the test was repeated after the subsequent Gemini 3 update, AI Overviews answered 91 percent of the questions correctly. If you extrapolate this miss rate to all Google searches, AI Overviews generates tens of millions of wrong answers per day.
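The extrapolation is simple arithmetic, and a minimal Python sketch makes it concrete. The function name and the daily volume below are hypothetical and are not taken from the report; the sketch also assumes the benchmark miss rate applies uniformly to real searches, which is a simplification.

```python
def wrong_answers(responses_per_day: float, accuracy: float) -> dict:
    """Scale a benchmark miss rate to a daily volume of AI answers."""
    miss_rate = 1.0 - accuracy  # e.g. 0.91 accuracy -> roughly 0.09 miss rate
    per_day = responses_per_day * miss_rate
    return {
        "per_day": per_day,
        "per_hour": per_day / 24,
        "per_minute": per_day / (24 * 60),
    }


# Hypothetical volume: 1 billion AI answers per day (an assumption, not a reported figure).
estimate = wrong_answers(1_000_000_000, 0.91)
for period, count in estimate.items():
    print(f"{period}: {count:,.0f}")
```

Under that assumed volume, a 91 percent accuracy rate works out to roughly 90 million wrong answers per day. Changing either input scales the result proportionally.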

The report includes several examples of AI Overviews going wrong. When asked about the date Bob Marley’s former home became a museum, AI Overviews cited three pages, two of which did not discuss the date at all. The third, Wikipedia, listed two conflicting years, and AI Overviews confidently chose the wrong one. The benchmark also asks models when Yo-Yo Ma was inducted into the Classical Music Hall of Fame. Although AI Overviews cited the organization’s website listing Ma’s induction, it claimed that there is no such thing as a Classical Music Hall of Fame.
