Artificial intelligence might be able to learn a language similarly to a baby, even without any prior knowledge of grammar or social interactions, researchers from NYU’s Center for Data Science found in a study published on Feb. 2.
In the study, a baby spent over a year wearing a head-mounted camera, which captured audio and video recordings. Researchers then fed the 61 hours of video footage and 37,486 “utterances” of a parent speaking to the child into an AI model and asked it to identify objects from the footage as well as the same objects in other images. The AI model was able to associate words with their corresponding objects, especially ones it had encountered more frequently such as apples and cribs. The model did, however, struggle to identify knives.
Brenden Lake, one of the head researchers in the study, said that there is a significant data gap between AI systems and children in language learning. He said that top AI models train on large amounts of text, often reaching trillions of words, while a child would need about 100,000 years to be exposed to a comparable amount of language.
“Because of this gap, researchers have been skeptical that recent AI advances can tell us much about human learning and development,” Lake said. “The ideal experiment involves training an AI model, not on enormous data from the web, but only on the input a single child receives.”
Lake added that the current AI model does not have the same vocabulary and word-learning capabilities as a typical 2-year-old, noting that the model is not able to taste, touch, smell or learn actively. He said it would be “hugely exciting to add some of these components and see how much closer models can get to child-like language learning.”
Wai Keen Vong, another researcher on the project, said the study is one of the first to use such a large amount of data from a single child. Vong added that, rather than learning from the internet, children learn language through their own experiences, making it a challenge to demonstrate language learning within what he called a “restrictive context.”
“For a long time, even though we’ve seen these advances from AI, a lot of people in the field just haven’t been convinced that they’re doing anything sensible because they’re just fed so much training data,” Vong said. “This is kind of the perfect experiment in a way — it’s as close to one child’s experience of actually having to learn language to get at the heart of a question that people have been debating for decades, even centuries — where is knowledge coming from and how can you go from raw sensory experience into a tangible, abstract understanding of the world.”
Contact Antonia Ang and Max Getty at [email protected].