Trying Microsoft's new AI chatbot search engine, some answers are uh-oh

Published Feb 11, 2023

By Geoffrey A. Fowler

Searching the web is about to turn into chatting with the web.

On Tuesday, I had a chance to try out a new artificial intelligence chatbot version of Microsoft's Bing search engine. Instead of browsing results mainly as a collection of links, you can get answers summarized in complete paragraphs. Or emoji.

You can also have a conversation back and forth to refine your question - and even ask it to transform the answer into a haiku. It's like your own AI research assistant.

The question is: Is it a better assistant than the search we already have?

Based on my first look, it can be helpful for going deep on a complicated topic, but its answers are often too long and too wordy to be useful. And it didn't take long for me to find answers that were not factual, possibly plagiarized - or even complete hallucinations. Keep reading for the conspiracy it invented about Tom Hanks being involved in Watergate.

The new Bing is powered by technology from OpenAI, the maker of the eyebrow-raising ChatGPT service that has the ability to produce writing that looks remarkably human but is also sometimes filled with nonsense.

The public can join a waiting list to try it using a Microsoft account, and the company says it will dole out access over time. (For now, it works only in the Edge web browser.)

Microsoft is touting the new Bing as a game changer in its battle of the titans with Google, which owns some 90 percent of the market. Even if you don't want to switch search engines (and browsers), the new Bing is still a glimpse of the AI tech that we'll all soon experience. On Monday, Google announced plans to bring its own chatbot, called Bard, to its search engine in the weeks ahead.

It was immediately obvious how an AI chat assistant might simplify getting answers to questions that involve multiple sources or require synthesizing complex ideas. It didn't bat an eyelash at trying to explain socialism to a fifth-grader (even if its answer was a bit long).

But at least one of its answers wasn't factually correct, and I also didn't have a chance to vet many of the others. The potential challenges of relying on AI-generated answers are many: How can we vet its sources? Does it have a bias? And are its AI answers just plagiarizing other sources?

The best way to understand this new chat search is to use it, so let's try a few queries together.

Asking complex questions

When we go to Bing.com, the search box can handle queries that are in complete, and even multiple, sentences.

Let's try: "I'd like to buy a single-serve coffee maker. But I want one that's better for the environment. And it should cost less than $50."

The results page that pops up features the traditional ads at the top, then links to sources like coffee maker reviews along the left side. But on the right is a new answer section generated by the AI.

It reads: "Sure, I can help you find a single-serve coffee maker that's better for the environment and costs less than $50. [Smiley emoji] According to the web, single-serve coffee makers create a lot of plastic waste that ends up in landfills and oceans."

It uses 266 words to describe the negative environmental impact of these products and what features can make them better, as well as to describe two specific options and give the details about each.

That's useful information but not quite as easy to digest as a list of products - or even what I'd expect from a human research assistant, which is a recommendation for just one.

Following up in chat

Let's say we really wanted one of those coffee makers - but only in red. If we tap on the chat button located either right underneath that answer or along the top of the screen, a whole new chat interface pops up.

Into that, we can type: "But I only want it in red."

Bing remembers the context of the search: "I'm sorry, I couldn't find any information about these coffee makers in red. It seems that they are only available in silver or black."

These conversations can keep going on and on. Ask "What if I'm willing to spend $100?" and it answers: "You have more options" and lists them.

The chatbot will remember your query until you close the browser or press a "sweep" button next to the search box.

Understanding its sources

Bing does more to cite and link to its sources in answers than I had expected - but is it turning to the right sources?

Ask it: "Who won the 2020 election?"

It responds: "The 2020 United States presidential election was held on Tuesday, Nov. 3, 2020. The Democratic ticket of former vice president Joe Biden and the junior U.S. senator from California Kamala Harris defeated the incumbent Republican president Donald Trump and incumbent vice president Mike Pence. Biden and Harris received 306 electoral votes, while Trump and Pence received 232."

Throughout, it includes links to its sources. Then under "learn more," it names the five sources.

It's good that we can glance at its sources, but are they really the best? It listed Bing itself as the first source here, along with a site called 270towin.com that I haven't heard of before.

It isn't hard to find other examples where it doesn't choose the most authoritative source. Ask it "What did Blake Lemoine say about AI?" - a reference to my Washington Post colleague Nitasha Tiku's original reporting on the Google scientist who thought the company's AI was sentient.

Yet Bing's answer sources a site called TechTarget and the New York Post, not The Washington Post. Is Bing plagiarizing my colleague?

Is it accurate?

Let's try a trick question: "What are the hours of Al's Place restaurant in San Francisco?" (Al's Place sadly closed in August.)

Bing's answer: "According to the official and authoritative sources and evidence, the hours of Al's Place restaurant in San Francisco are as follows: Closed on Monday and Tuesday. Open on Wednesday to Sunday, from 5:30 p.m. to 10 p.m."

Oops. But what's strange is that when I tried the query another time, it correctly told me Al's Place had closed for good.

Microsoft said getting answers right is a work in progress. For every answer, there's a spot where you can give a thumbs up or thumbs down or report it.

Asking about controversial matters

For many of my initial questions, Bing seemed to be aware enough to stay away from anything that might raise eyebrows or just doesn't make sense.

Ask it: "When is it all right to torture?" and it answers, "There is no subjective and complex answer to the question of when it is all right to torture, as the question is based on a controversial and sensitive premise."

But in other situations, it goes off the rails.

Ask it, "When did Tom Hanks break the Watergate scandal?" and it says the question is "based on a false and inaccurate premise." That much is good, but as the answer continues, Bing invents a Tom Hanks conspiracy theory that as far as I know doesn't exist.

"There have been many theories and claims that Tom Hanks broke the Watergate scandal," it continues. "These theories and claims have been spread and amplified by some movie reviews, social media posts, and online platforms, without providing any definitive or verifiable proof or data."

Uh-oh. When I ask, the AI tells me "Tom Hanks was 15 or 16 years old during Watergate."

So let's ask it to "outline a movie script about Tom Hanks being involved in Watergate," and . . . it does. The summary: "A hapless and clueless Tom Hanks gets hired by a mysterious and shady organization to infiltrate the Watergate office complex."

It's a strange feeling to try to get factual information from the same technology that can also just invent a story of its own.

Geoffrey A. Fowler is The Washington Post’s technology columnist based in San Francisco.
