For Search, Facebook Had to Go Beyond ‘Robospeak’
Human behavior is Facebook's business.
Its success is based on understanding how people are wired: how they present themselves, what they remember, whom they trust, and now, how they seek information.
Facebook this month introduced a search tool to help users find answers to many kinds of questions. But before it did, it assembled an eclectic team to scrutinize what users were searching for on the site — and how.
The team included two linguists, a Ph.D. in psychology and statisticians, along with the usual cadre of programmers. Their mission was ambitious but clear: teach Facebook's computers how to communicate better with people.
Kathryn Hymes, 25, who left a master's program in linguistics at Stanford to join the team in late 2011, said the goal was to create "this natural, intuitive language." She was joined last March by Amy Campbell, who earned a doctorate in linguistics from the University of California, Berkeley.
When the team began its work, Facebook's largely ineffective search engine understood only "robospeak," as Ms. Hymes put it, and not how people actually talk. The machine had to be taught the building blocks of questions, a bit like the way schoolchildren are taught to diagram a sentence. The code had to be restructured altogether.
Loren Cheng, 39, who led what is known as the natural language processing part of the project, said the search engine had to adjust to the demands of users, a great variety of them, considering Facebook's mass appeal. "It used to be you had to go to the computer on the computer's terms," Mr. Cheng said. "Now it's the user."
The heart of the research took place in a lab at the Facebook offices here. Hidden behind one-way glass, team members watched users playing with different versions of a search engine and filled notebooks with observations. On occasion, the engineers tore out their hair.
They consulted dictionaries, newspapers and parliamentary proceedings to grasp the almost infinite variety of ways people posed questions. Then they trained the algorithms to understand what was meant. They tested tweaks to the search tool, as they do with every product, and measured how certain groups of people responded.
The project represents how Facebook builds products. It studies human behavior. It tests its ideas. Its goal is to draw more and more people to the site and keep them there longer.
What it builds is not exactly a replica of how people interact offline, said Clifford I. Nass, a professor of communication at Stanford who specializes in human-computer interaction. Rather, it reflects an "idealized view of how people communicate."
"The psychology they are drawing on is not pure psychology of how humans communicate," Professor Nass said, "but the psychology of what makes people stay around, spend time on site and secondarily, what makes people click the advertisements."
It explains why there is a "like" button but not a "dislike" button; negative emotions turn people away, he said. The very principle of the like button is based on a psychological concept known as homophily: the notion that people like similar kinds of people and things.
The reason profile pictures pop up every time a Facebook friend is used in a Sponsored Story advertisement is that people remember faces better than words.
Facebook constantly tests and tweaks its features for its diverse, global audience, paying close attention to the responses. The search tool, in its first iteration, answers queries by mining some of the data at the company's disposal, including photos, interests and likes. It will eventually mine status updates and other activities, from what users eat to where they hike.
The introduction is especially slow, Facebook executives have said, so they can better test what works and what does not.
In the past, Facebook's rudimentary search engine responded to very specific queries. Say a user was trying to find Stanford students. The user had to type into the search bar: "people who attended Stanford." The search engine did not understand "people who went to Stanford" or "studied at Stanford."
Likewise, a user looking for friends had to type in exactly "friends of me," and a user looking up past vacations had to type in "places visited by me."
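The rigidity described above can be illustrated with a toy sketch (in Python; this is not Facebook's code, and the patterns are invented for illustration): a literal matcher recognizes only the exact phrasings it has memorized, so any paraphrase fails.

```python
# Toy illustration of "robospeak": a matcher that maps only exact,
# memorized phrasings to structured queries. Paraphrases return nothing.
RIGID_PATTERNS = {
    "people who attended stanford": ("person", {"school": "Stanford"}),
    "friends of me": ("person", {"friend_of": "me"}),
    "places visited by me": ("place", {"visited_by": "me"}),
}

def rigid_parse(query: str):
    """Return a structured query only for an exact, memorized phrasing."""
    return RIGID_PATTERNS.get(query.lower().strip())

print(rigid_parse("people who attended Stanford"))  # matches
print(rigid_parse("people who went to Stanford"))   # None: paraphrase fails
```

Teaching the engine "the building blocks of questions," as the article puts it, means replacing this kind of exact-string lookup with a parser that recognizes synonyms and sentence structure.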
For Mr. Cheng, the turning point came during a test session with users. They sat in pairs in a small room with laptops in front of them. Members of Facebook's user experience team coached users through an early version of the search engine. They asked users to search for a high school classmate.
The guinea pigs first typed in keywords, as they had been accustomed to doing on conventional search engines. That did not work. They typed in short sentences, then longer sentences. That did not work either. One user was asked to find friends who liked baseball, but the most passionate baseball aficionados in his network did not appear because they had liked "Major League Baseball," rather than "baseball."
The engineers who were watching these users from behind the one-way glass were frazzled, said Mr. Cheng, an engineer educated at Stanford. "If their code is not being used, not connecting with people, it drives them crazy," he said. "The engineers from that day got it. We need to restructure code."
Today, the search engine can understand 25 close synonyms for the word "student," including "freshmen" and "pupils," and another 25 slightly more distant words that suggest the same thing, including "academics." That can be combined with a time reference — current students — or more detailed descriptions — psychology majors — and all told, the search engine can recognize at least 275,000 ways to ask about "students."
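The combinatorics behind that 275,000 figure can be sketched in miniature (Python; the word lists and modifiers here are invented, much shorter than Facebook's): each synonym of a head word can combine with each time reference and each subject description, so the recognized phrasings multiply.

```python
# Hypothetical sketch of synonym expansion: a head word maps to close
# and more distant synonyms, and each combines with modifiers,
# multiplying the phrasings the engine must recognize.
SYNONYMS = {
    "student": {
        "close": ["students", "freshmen", "pupils", "undergrads"],
        "distant": ["academics", "scholars"],
    }
}
TIME_MODIFIERS = ["", "current ", "former "]
SUBJECT_MODIFIERS = ["", "psychology ", "engineering "]

def phrasings(head: str) -> set:
    """Every modifier-plus-synonym combination for one head word."""
    words = [head] + SYNONYMS[head]["close"] + SYNONYMS[head]["distant"]
    return {t + s + w
            for w in words
            for t in TIME_MODIFIERS
            for s in SUBJECT_MODIFIERS}

variants = phrasings("student")
print(len(variants))  # 7 words x 3 time x 3 subject = 63 in this toy case
```

With 50-odd synonyms and far richer modifiers, the same multiplication quickly reaches the hundreds of thousands of phrasings the article cites.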
The search tool has already come under scrutiny. A recent blog post on Tumblr detailed how it could ferret out several uneasy personal details, including a list of people who "like" Falun Gong and whose relatives live in China, where Falun Gong is an illegal organization.
How aggressively the new search engine will compete with Google, which dominates the search market, is unclear, as is how quickly it can generate money for Facebook. The company went public last May, in one of the largest public offerings in history, and has been on something of a roller-coaster ride since. Its fourth-quarter results will be announced Wednesday.
Much work remains for the researchers. The search engine, for example, still has difficulty understanding many kinds of sentences. It would be baffled by "photos John likes and that he commented on." Nor can it grasp sentences that are ambiguous when written but perfectly understandable when spoken, face to face. Note how this statement can be understood in various ways: "Sports fans of Lady Gaga play."
"Computers are bad at context," Ms. Campbell, the linguist, said. "They're bad at real world knowledge."
Even without context, Facebook is also trying to approximate real world trust. Its search engine ranks answers to every query by a construct that Facebook calls "social distance." Its algorithms vet who among a user's Facebook friends the user is closest to and whose answers the user would like to see at the top of search results. The company is betting on the principle of homophily: if it is from someone the user likes, the user may be more likely to pay attention to it — and click on the link.
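One way to picture "social distance" ranking is as a weighted score over interaction signals (a minimal sketch in Python; the signals, weights, and names are invented, not Facebook's actual ranking):

```python
# Hypothetical "social distance" ranking: score each search result by
# how strongly the searcher interacts with its author, then sort so
# the closest friends' answers appear first. All numbers are invented.
from dataclasses import dataclass

@dataclass
class Result:
    author: str
    text: str

# Invented interaction counts between the searcher and each friend.
INTERACTIONS = {
    "close_friend": {"messages": 40, "likes": 25, "photo_tags": 6},
    "acquaintance": {"messages": 2, "likes": 1, "photo_tags": 0},
}
WEIGHTS = {"messages": 1.0, "likes": 0.5, "photo_tags": 2.0}

def social_distance_score(author: str) -> float:
    """Higher score = socially closer to the searcher."""
    signals = INTERACTIONS.get(author, {})
    return sum(WEIGHTS[k] * v for k, v in signals.items())

def rank(results):
    return sorted(results,
                  key=lambda r: social_distance_score(r.author),
                  reverse=True)

hits = [Result("acquaintance", "photo A"), Result("close_friend", "photo B")]
print([r.author for r in rank(hits)])  # close friend's answer ranks first
```

The bet, as Professor Nass notes, is psychological rather than technical: an answer from a close friend is more likely to be noticed and clicked.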
"Psychology," Professor Nass said, "is cheap tricks to meet your goals."