Human Statistical Learning 

If it's of interest, I will be blogging every so often about the numerous ways that humans seem to be remarkably adept statistical learners. This is a big question in cognitive science for two reasons. First, statistical learning looks like a promising approach to the open question of how people learn as well and as quickly as they do. Second, better understanding how humans use statistical learning may be a good way to improve our statistical models in general, or at least to investigate how they might be applied to real data.

One of the more impressive demonstrations of human statistical learning comes from the paradigm usually called "implicit grammar learning," in which people hear strings of nonsense syllables like "bo ti lo fa" in a continuous stream for a minute or two. One of the first studies in this paradigm, by Saffran et al., looked at word segmentation -- for example, being able to tell that "the" and "bird" are two separate words, rather than guessing that the word is "thebird," or "theb" and "ird." If you have ever listened to an unfamiliar foreign language, you will have noticed that word boundaries aren't signaled by pauses, which is a huge problem if you're trying to learn the words. In the study, syllables occurred in groups of three, forming "words" like botifa or gikare. As in natural language, there were no pauses between words; the only cues to word segmentation were the transitional probabilities between syllables -- that is, "ti" might always be followed by "fa," but "fa" could be followed by the first syllable of any other word. Remarkably, people can pick up on these subtleties: adults who first heard a continuous stream of this "speech" were then able to distinguish, at an above-chance level, which three-syllable items were "words" of the "language" they had just heard and which were "nonwords" -- correctly judging, say, that "botifa" was a word but "fagika" wasn't. Since the only cues to this information were the transitional probabilities, people must have been tracking those probabilities implicitly (none reported any conscious sense of doing much of anything). Most surprisingly of all, the same researchers demonstrated in a follow-up study that even 8-month-old infants can use transitional probabilities as cues to word segmentation. Work like this has led many to believe that statistical learning may be one of the most powerful resources infants bring to the difficult problem of language learning.
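To make the transitional-probability idea concrete, here is a minimal Python sketch -- not the authors' actual model or stimuli; the third word "dopalu" and all details of the stream are invented for illustration. It builds a continuous stream from three-syllable words, estimates forward transitional probabilities, and segments wherever the probability dips:

```python
import random
from collections import defaultdict

# A toy "language" of three-syllable words. botifa and gikare come from the
# post; dopalu is a made-up third word for illustration.
words = [["bo", "ti", "fa"], ["gi", "ka", "re"], ["do", "pa", "lu"]]

# Build a continuous stream with no pauses between words, as in the experiment.
random.seed(0)
stream = [syl for _ in range(200) for syl in random.choice(words)]

# Count forward transitions to estimate P(next syllable | current syllable).
counts = defaultdict(lambda: defaultdict(int))
for a, b in zip(stream, stream[1:]):
    counts[a][b] += 1

def trans_prob(a, b):
    total = sum(counts[a].values())
    return counts[a][b] / total if total else 0.0

# Within-word transitions are certain (e.g. "bo" is always followed by "ti"),
# while between-word transitions run near 1/3, so dips mark word boundaries.
segmented, word = [], [stream[0]]
for a, b in zip(stream, stream[1:]):
    if trans_prob(a, b) < 0.9:          # probability dip => word boundary
        segmented.append("".join(word))
        word = []
    word.append(b)
segmented.append("".join(word))
```

On this toy stream the dips recover the original words exactly; the interesting empirical claim is that human learners appear to exploit the same statistic without any explicit instruction.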

From the modeling perspective, this result can be captured by a Markov model in which the learner tracks the string of syllables and the transition probabilities between them, updating those probabilities as more data comes in. More recent work has begun to ask whether humans are capable of statistical learning that cannot be captured by a Markov model -- that is, learning nonadjacent dependencies (dependencies between syllables that do not directly follow each other) in a stream of speech. For instance, papers by Gomez et al. and Onnis et al. provide evidence that even nonadjacent dependencies can be discovered through statistical learning, as long as the variability of the intervening items is sufficiently low or sufficiently high. This has obvious implications for how statistical learning might help in acquiring grammar (where many dependencies are nonadjacent), but it also opens up new modeling issues, since simple Markov models no longer apply. What more sophisticated statistical and computational tools are necessary to capture our own amazing, unconscious abilities?
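To sketch why a first-order Markov model falls short on nonadjacent dependencies, consider a bigram learner trained on a-X-b strings in the style of the Gomez studies. The specific syllables below are hypothetical stand-ins, not the actual stimuli, and the learner is a minimal illustration, not a model from the papers:

```python
from collections import defaultdict

class MarkovLearner:
    """First-order (bigram) learner: updates transition counts online."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, stream):
        for a, b in zip(stream, stream[1:]):
            self.counts[a][b] += 1

    def prob(self, a, b):
        total = sum(self.counts[a].values())
        return self.counts[a][b] / total if total else 0.0

# Hypothetical a-X-b strings: "pel ... rud" and "vot ... jic" are the
# nonadjacent dependencies, and the middle item X varies.
middles = ["wadim", "kicey", "puser", "fengle"]
learner = MarkovLearner()
for x in middles:
    learner.observe(["pel", x, "rud"])
    learner.observe(["vot", x, "jic"])

# The model learns the adjacent transitions, but because every middle item
# is followed by "rud" and "jic" equally often, it assigns 0.5 to each:
# the fact that "pel" predicts "rud" spans the gap and is invisible to it.
print(learner.prob("wadim", "rud"))
print(learner.prob("wadim", "jic"))
```

Capturing the pel-rud dependency requires conditioning on something beyond the immediately preceding item -- a higher-order model, a latent frame, or some other richer representation, which is exactly the modeling question the paragraph above raises.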