IBM Watson Future of Search Engines

The hardest thing for a search engine to do is understand the true meaning of the words on a web site and how those words relate to the true meaning of the words someone is using in their search query. Bing has lately had great fun showing how these results can go totally awry. Consider all the meanings for "salsa," then mix in a bunch of companies that use "salsa" in their name, and who knows what someone is searching for when they type "salsa" as their search query. Using the tons of data available to them, search engines can make a pretty good guess as to what the majority of people are looking for on a "salsa" search query, but someone looking for an obscure use of the term may have to do some digging. Throw in a bunch of derivative terms and things get really interesting. "Pico de gallo," anyone?

This is where search engines have an Achilles heel. Computers are very good at doing exactly what you tell them, but not very good at interpreting what you actually meant. Consider the difference between a human interaction related to dancing and a search engine interaction...

Humans:

  • Search query: tango
  • Reply: what kind?
  • Search query: tango dancing
  • Reply: whatever that person knows about the tango
  • Search query: calypso
  • Reply: whatever that person knows about calypso dancing
  • Search query: salsa
  • Reply: whatever that person knows about salsa dancing

Computers:

  • Search query: tango
  • Reply: Free mobile video calls
  • Search query: tango dancing
  • Reply: images, Wikipedia entry on tango dancing, etc.
  • Search query: calypso
  • Reply: women's clothing, mythology, music
  • Search query: salsa
  • Reply: salsa recipes (I wonder if this result was different prior to the Bing commercial)

As you can see, the computer doesn't realize that the overall subject of the conversation was dancing, whereas a human can interpret what you meant based on what you were already talking about. As my daughter would say, changing the conversation from tango and calypso dancing to salsa recipes is "random."

So what does this mean for the future of search engines? Search will need to become more personal and based on an individual's recent search behavior. For that to happen, people will have to accept that future search engines need to keep track of their search habits. This isn't really any more of an invasion of privacy than me remembering your birthday once you've given it to me (and sending you a birthday card). Google has been doing more and more personalized search over the past year or so and I applaud them for it.
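To make the idea concrete, here is a minimal Python sketch of using recent searches as context for an ambiguous query. It is only an illustration, not how Google, Bing, or any real search engine works. The SENSES table, the WINDOW size, and the function names interpret and run_session are all made up for this example, and the "popularity" order of each meaning is an assumption based on the replies listed above.

```python
from collections import Counter

# Hypothetical table of meanings for each ambiguous term, listed from
# most to least popular overall (an assumption for illustration only).
SENSES = {
    "tango": ["video calls", "dancing"],
    "calypso": ["clothing", "mythology", "music", "dancing"],
    "salsa": ["recipes", "dancing"],
}

WINDOW = 5  # how many recent searches count as "the conversation" (assumed)


def interpret(query, history):
    """Guess the meaning of `query`, preferring topics from recent searches."""
    words = query.lower().split()
    if not words:
        return None
    candidates = SENSES.get(words[0], [])
    if not candidates:
        return None

    # An explicit qualifier in the query ("tango dancing") wins outright.
    rest = " ".join(words[1:])
    for sense in candidates:
        if sense in rest:
            return sense

    # Otherwise favor the meaning seen most often in recent searches,
    # falling back to overall popularity (list order) on a tie.
    recent = Counter(history[-WINDOW:])
    return max(candidates, key=lambda s: (recent[s], -candidates.index(s)))


def run_session(queries):
    """Interpret a sequence of queries, remembering each guess as context."""
    history, results = [], []
    for query in queries:
        sense = interpret(query, history)
        results.append(sense)
        if sense is not None:
            history.append(sense)
    return results


if __name__ == "__main__":
    print(run_session(["tango", "tango dancing", "calypso", "salsa"]))
    # prints ['video calls', 'dancing', 'dancing', 'dancing']
```

With no history, "salsa" still falls back to recipes, which is today's behavior. Once the session has established that the subject is dancing, the same query is read as salsa dancing, which is what a human would do.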

Future search engines will also need to be more adept at understanding the meaning of a page or site and not just the individual words on it. If I search for the "best salsa recipe" and the best salsa is really a pico de gallo recipe, today's search engines won't know that (not unless a savvy SEO consultant helps them out). From an SEO standpoint, that means we would have to optimize a page or pages that compare pico de gallo to salsa and use a good link strategy to have the pico de gallo site rank above any salsa page. Today's search engines rely on the publishers to provide semantic clues rather than truly understanding what everything is about.

A final thought... One of the Final Jeopardy clues was "Its largest airport is named for a World War II hero; its second largest, for a World War II battle." Watson probably should have gotten this one since Google would have. Even when I restrict the Google search results to old entries, the #1 result is the Jeopardy clue of the day for October 16, 2009, which was this exact clue. So maybe Watson's next Jeopardy challengers should be Google and Bing!