IT’S THE THIRD TUESDAY IN OCTOBER, and the whole world is watching. President Obama and Mitt Romney are about to square off for their second debate. Will Barack Obama redeem himself after his listless first outing? Will Romney continue his extreme-to-moderate makeover? To chew over these questions endlessly, CNN has mobilized its vast team of talking heads, arraying them in clusters throughout its hangar-like studio. Periodically, the network cuts to a separate studio, seeking insight from 35 undecided voters who appear as trapped and beleaguered as the Donner Party in the snows of the Sierra Nevada.
That night Deb Roy is sitting at home with his wife, watching the debate on television like the rest of us. Or at least the “second screen” viewers among us who no longer restrict our conversations about what we’re watching to the people sitting next to us on the couch. More than 40 percent of smartphone and tablet owners use their devices while watching television at least once a day, according to a Nielsen survey published in April.
As the debate unfolds, the 43-year-old Roy keeps his iPad on his lap, pausing often to read his wife the insights from the 100 people he follows. “Being able to have this social soundtrack come into our living room,” he says, “has completely changed the experience of television.”
Roy has always been fascinated by technology. When he was in grade school in Winnipeg, Manitoba, in the late 1970s, he walked into a RadioShack and was dazzled by a new personal computer on display. The boy strode up to the keyboard and typed “HELLO.” On the screen popped the message “SYNTAX ERROR.” He tried “HI,” but got the same response. In a way, Roy hasn’t stopped trying to get computers to understand him since.
The son of a town planner who had emigrated from India to Canada, Roy spoke Bengali before English. While his older sister was an accelerated student who kept her head buried in the books, he was content to get B’s. Skipping studying left him plenty of time to conduct an endless supply of experiments at home. By his early teens, he had built robots, fireworks, a reading device for the blind, and a rudimentary speech-recognition tool that could understand words in English, Bengali, and German.
In 2004, when Roy’s wife, Rupal Patel, learned she was pregnant, he approached her about turning their split-level Arlington home into a round-the-clock research lab, with their newborn serving as the sole subject. Since they’d broadly discussed this kind of project years earlier, Patel was an easier sell than most spouses. While Roy focused on teaching robots human language as a professor at the MIT Media Lab, his wife taught speech pathology and computer science at Northeastern University. She had contributed to his PhD research, and they shared an interest in observing a child in a natural setting, all in the hopes of unlocking the mysteries of how kids learn to talk.
First, they agreed to lots of privacy controls. Even though every room in the house — including the master bedroom and the bathroom — had a color camera with a fisheye lens planted in the stucco ceiling, individual room cameras could be turned off easily. What’s more, each was equipped with what Roy calls an “oops” button, which would erase the previous stretch of video if he and his wife felt it had caught something embarrassing.
The multiple cameras of this Human Speechome Project rolled for three years, amassing about 250,000 hours of footage — an archive Roy calls “the world’s largest home-video collection.” And Roy and his research team soon found some fascinating results in their mountains of data. They learned it didn’t make sense to focus solely on the child, but rather on the interplay between him and his parents (and nanny). Tellingly, the caregivers’ language became less complicated the closer the child got to speaking each new word, then gradually grew more complex afterward. Without realizing it, the caregivers were essentially dumbing down their language to meet the boy halfway.
The researchers also learned that a word’s association with a pattern of activity or certain place in the house was a far more robust predictor of how quickly Roy’s son would learn that word than was the frequency with which he heard it. The boy was much quicker to learn words with a very specific meaning — “mango,” for instance, which he usually ate in the kitchen — than those like “water,” which could mean a drink in the kitchen, bath water in the tub, or the rain outside. Context, it turned out, was hugely important.