On May 14, 2024, Google rolled out its new AI Overviews feature in its main search engine. Now, when users ask questions through a Google search, the first result is an AI-generated overview of the information that Google’s Gemini model gleaned from various sources around the web. In addition to the summarized information, the AI Overview is also supposed to show users multiple relevant links to the sources the AI drew on to create the overview. According to the official Google blog, the aim of the new technology is to give people quick answers to common queries without making them comb through different webpages. Sound like the future? Only if your future includes eating rocks, putting glue on your pizza, and staring at the sun for 15 to 20 minutes a day.
The rollout of the new AI Overviews feature was immediately met with backlash from all corners of the internet for showing users incorrect information, sometimes falsely attributed to trustworthy sources. Worse, because of the way the system works, this misinformation was the first thing people saw after pulling up their search results. Many people took to social media to share their funniest, and most concerning, examples of the Google AI spouting nonsense it had picked up from around the web. Two of the most popular examples were the AI’s answer to the question “How many rocks should I eat a day?” (at least one small rock per day, according to the overview) and its suggestion to add a little non-toxic glue to pizza cheese to increase its stickiness. Misinformation has always existed in Google search results and always will, but never before has incorrect information been so actively promoted by the search engine itself.
The problem stems from the AI’s basis as a large language model (LLM): when it generates an overview, it has no way of distinguishing genuine sources from satire sites (like The Onion, where the recommendation to eat rocks came from) or random forum users (the glue-on-pizza suggestion came from a Reddit comment posted eleven years earlier). The AI builds its overviews from the most popular information available across the web, regardless of source credibility. An LLM is designed to generate answers that are coherent and sound plausible, not answers that are objectively true. This can also lead to “hallucinations,” where the model invents information that came from no outside source at all, filling gaps in its knowledge to keep the answer coherent, regardless of whether the invented details are true. The problems with Google’s AI Overviews, although hilarious in some cases, expose a deeper issue not just in Google’s AI, but in all modern generative AI systems.
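To see why popularity can beat truth, consider a deliberately simplified sketch. This is not how Gemini actually works, and the snippets and source labels below are made up, but it captures the core point: the output is driven by what appears most often and reads most plausibly, not by where it came from.

```python
from collections import Counter

# Toy illustration only: real LLMs predict likely words from statistical
# patterns in their training data rather than literally counting sentences.
# The takeaway is the same, though: frequency, not credibility, wins.
training_snippets = [
    ("add glue so the cheese sticks to the pizza", "decade-old Reddit joke"),
    ("add glue so the cheese sticks to the pizza", "post quoting that joke"),
    ("let the pizza rest so the cheese sets", "cooking website"),
]

# Count how often each claim appears, ignoring the source entirely.
popularity = Counter(text for text, _source in training_snippets)

# The "overview" is simply the most repeated claim.
answer, seen = popularity.most_common(1)[0]
print(f"AI overview: {answer}  (seen {seen} times; sources never checked)")
```

In this toy setup the joke answer wins simply because it was repeated more often; nothing in the process ever asks whether the claim is true.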
Google has, of course, responded to the backlash. A Google spokesperson publicly stated: “The vast majority of AI Overviews provide high quality information, with links to dig deeper on the web. Many of the examples we’ve seen have been uncommon queries, and we’ve also seen examples that were doctored or that we couldn’t reproduce.” Although some might see this as sweeping the issue under the rug, there is some truth to the statement. Many of the AI’s funniest mistakes come from users asking oddly specific questions. When there is little popular data for the AI to draw on in forming its overview, a situation known as a “data void,” it is far more likely to fall back on obscure, unverified sources such as random Reddit comments. For questions that are already commonly asked on Google, the AI generally produces concise, factually accurate results with few issues.
Alongside the very real goofs from AI Overviews, users on X (formerly Twitter) and Reddit have taken to fabricating their own AI Overview responses and passing them off as genuine. These fakes are often much darker than the real thing, such as one Reddit user’s image showing Google recommending that people jump off the Golden Gate Bridge as a remedy for depression. That post even caught the attention of The New York Times, which reported it as a real AI result and later had to publicly apologize for the error. The fact that even a major news organization was willing to believe Google’s AI would recommend such a drastic course of action speaks volumes about the whole situation.
Computer experts within and outside of Google have suggested multiple ways of at least alleviating the AI’s issues when generating overviews. However, because the Google AI draws from all corners of the web and, like other AI systems, struggles with context and nuances of language such as sarcasm, the possibility that it will present incorrect information as fact will never fully go away. The company hopes to build some kind of automatic fact-checking system into the AI, but admits the easiest solution would be to hire cheap human labor to manually fact-check some of the most commonly asked questions on Google to maintain accuracy. Google has already manually fixed many of the most widely shared incorrect overviews, and in the past few weeks it has begun implementing safeguards such as limiting “user-generated content” (Quora and Reddit comments), filtering out oddly specific questions that are likely to draw a nonsense answer, and restricting the inclusion of known satire sites like The Onion and The Babylon Bee.
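One way to picture that kind of safeguard, purely as a hypothetical sketch (Google has not published how its filtering actually works, and the domain lists here are invented for illustration), is a simple blocklist that removes user-generated and satirical pages from the pool the AI is allowed to summarize.

```python
# Hypothetical sketch of a source filter; the domain lists and logic are
# assumptions for illustration, not Google's actual safeguards.
USER_GENERATED = {"reddit.com", "quora.com"}
KNOWN_SATIRE = {"theonion.com", "babylonbee.com"}

def filter_sources(candidate_urls):
    """Keep only pages that are neither user-generated content nor satire."""
    kept = []
    for url in candidate_urls:
        # Pull out the bare domain from the URL.
        domain = url.split("/")[2] if "//" in url else url.split("/")[0]
        domain = domain.removeprefix("www.")
        if domain in USER_GENERATED or domain in KNOWN_SATIRE:
            continue  # excluded before the AI ever summarizes it
        kept.append(url)
    return kept

print(filter_sources([
    "https://www.reddit.com/r/Pizza/comments/example",
    "https://www.theonion.com/geologists-recommend-rocks",
    "https://www.seriouseats.com/pizza-cheese-tips",
]))
# Only the cooking site survives the filter.
```

A crude filter like this can keep the most obvious joke sources out of an overview, but it does nothing about a sincere-looking page that happens to be wrong, which is why the underlying problem does not simply disappear.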
While that is a start, the safest approach remains doing your own research using multiple trustworthy sources instead of relying solely on AI, whether it comes from Google, OpenAI, or any of the swath of tech companies trying to use AI as a workaround for traditional information gathering. The ability to distinguish legitimate sources from dubious ones, call it a “sanity checker” or a BS detector, remains outside the realm of artificial intelligence. In other words, it’s still up to you to know that staring at the sun for any amount of time is bad for you.