Pattern detection and language errors

One thing I’ve been thinking about as T learns new words is how our brains are wired to detect patterns and how useful that is for learning language.

Human brains are AMAZING at detecting patterns – and, thinking about and watching T learn new words, it’s so clear to me how important pattern detection is. For example, T has primarily learned the word “horse” (he doesn’t say it yet, but will make a clip-clopping noise when he sees one in a book) by looking at pictures of horses in books and playing with a toy horse that we have at home. But, all of these horses are different – some are realistic photos, some are cartoonish and colored unrealistic colors, some are more realistic cartoon drawings, etc. So, T has learned to generalize across all these different instances of horses to learn some pattern like “a horse is an object with four legs, a longish neck, and a mane, and it makes a clip-clopping noise.” Thinking about this is kind of amazing to me!

But, sometimes it’s possible to learn a pattern incorrectly – for example, by learning a pattern that is a bit too broad. An example of this might be if T had learned the pattern “a horse is an object with 4 legs and is sometimes brown” – this pattern might lead him to identify a picture of a cow as a horse (which sometimes happens with one particular cow in a book that we have :)).
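To make the "too broad" idea concrete, here is a toy sketch in Python. The feature names and both rules are made up for illustration, loosely mirroring the two horse patterns described above. It is in no way a model of how T's brain actually works.

```python
# Toy illustration of a pattern that is "too broad".
# The features and rules are hypothetical, chosen only to mirror the examples in the text.

def narrow_horse_rule(animal):
    """'Four legs, a longish neck, and a mane.'"""
    return animal["legs"] == 4 and animal["long_neck"] and animal["mane"]

def broad_horse_rule(animal):
    """'Four legs and sometimes brown': 'sometimes' puts no real limit on color."""
    return animal["legs"] == 4

horse = {"legs": 4, "long_neck": True, "mane": True, "color": "brown"}
cow = {"legs": 4, "long_neck": False, "mane": False, "color": "brown"}

for name, animal in [("horse", horse), ("cow", cow)]:
    print(name, narrow_horse_rule(animal), broad_horse_rule(animal))
```

The narrow rule accepts only the horse, while the broad rule accepts both animals, which is the cow-as-horse mix-up.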

T frequently makes pattern errors that both amuse and interest me. One of the more humorous ones involves his identification of pictures of my father (T’s grandfather, whom we call “thatha” – “grandfather” in Tamil). My father wears transition-lenses glasses (so they frequently look like sunglasses, even inside). T recently saw a picture of Ray Charles and insistently pointed at the photo yelling “Thatha! Thatha!” I guess T’s “pattern” for his grandfather is an older man who wears sunglasses!

One of the things about T’s errors that interests me is which words he tends to make more “errors” with. I think that he tends to “correctly” use nouns much more than non-nouns, and thinking about this in terms of pattern detection, I think this makes sense. I think that the “pattern” for most nouns is generally easier to deduce than for non-nouns. For example, “ball” is a fairly concrete, clear concept, compared to, for example, “up” and “down.” “Up” and “down” are used in so many different contexts – lifting T up and down, picking something up off the floor, going up and down steps, etc. – whereas “ball” is basically a round toy (although T will sometimes call fruit like melons balls!).

And, I think that T tends to have more interesting interpretations (and by this, I mean broad!) for when words like “all done” and “bye bye” should be used. Perhaps this is because he’s still trying to learn the “pattern” for these words!



T (13 months) has been playing around with lots of different vowel sounds lately. For a while, it seemed like he was mostly saying “ahhh” and “aaa,” but lately, he has added “ee,” “ayy,” “oh,” and “uh.” Some cute highlights – he’s started saying “uh oh” – usually while looking us in the eye, grinning, and throwing food/utensils off his high chair and pointing for us to pick them up (I don’t think that’s exactly an “uh oh”?!). And, out of the blue, he’s started singing “E I E I O”! I THINK this is because of the song “Old MacDonald,” but I don’t usually sing this to him, so I’m not totally sure where he learned this (daycare?!). He’s also started very reliably saying “ohhh” to get us to open a box, which is something we’ve worked hard on with speech therapy, and it finally clicked this week!

I’ve noticed that, with all of these new vowels, T tends to say the vowel in isolation, rarely combining it with a consonant (when T says consonants, they tend to be combined with his earlier mastered vowels, like “aaa” and “ahh” – so he will say “ba,” “da,” etc.). But, after a few days of experimenting with his new vowels, I’ve noticed that T has just now started combining them with consonants – and, interestingly, he seems to mostly be combining them with “d.” So, he will babble “doh,” “duh,” and “die” now. I wrote here about how I thought T’s favorite consonant was “d” (based on when he first said it and how frequently he says it), so I wonder if he’s starting to combine his new vowels with “d” because it’s his favorite and/or the easiest for him to say? If that’s the case, I predict that we will next start to hear “bye” and “no!”

T has also started to pick up a few new words that he will say pretty reliably – he will say “buh buh” while waving goodbye, and just in the past two days, has started saying “all done” (pronounced “ah duh”). He has started saying “mama” quite a bit, as well, but I think I might have inadvertently taught him that the word for photograph is “mama” – I may have been a bit overzealous pointing out myself in photographs, and I’ve noticed that T will now excitedly point to ANY photograph, regardless of whether I’m in it, and shout “mama”!

“Sssss” is for “Snake”

I wrote here about working with T on identifying and producing the Ling-6 sounds (“mmm,” “ooo,” “eee,” “ahhh,” “shhh,” and “sss”). T has made some great progress with producing these sounds!

Now, he very consistently will say “mmm” when he picks up a toy ice cream cone, and, he will even say “mmm” when he sees a picture of ice cream in one of his books! (However, he is not at all interested in eating ice cream – strange!).

But, the latest, and I think most exciting development, is that T has now started pretty consistently saying “ssss” when he picks up the snake toy! “Ssss” is a tricky sound – it’s tricky to hear because it’s a relatively high frequency sound, which is usually difficult for people with hearing loss to hear. It’s also tricky to say – even children with normal hearing tend to take longer to consistently make this sound. Although right now T only says “sss” when he picks up the snake toy and won’t say it if I prompt him, it’s exciting that he can hear this sound and has started to figure out how to produce it!

First Word!

We’re calling it! T’s first word is “bubble,” pronounced “ba-buh” or “ba-bwah.”

I wrote here about how we weren’t sure if T’s attempts at “bubble” counted as a word. Since then, T has started pointing to the bottle of bubbles and shouting “ba-buh!” to get us to blow bubbles. It’s clear that he’s trying to say “bubble,” and he’s matching the first consonant and first vowel, and he’s trying to get us to do something, so we’re counting it as a word!


What counts as a word?

I was talking to two friends recently who have babies who are close in age to T (11 months), and we talked about how you know when your baby said their first word. Before I had a baby, I never thought this would be so hard to determine! It’s pretty easy to pinpoint exactly when your baby rolled over or started crawling, and it seemed like first words would be equally easy to identify. Based on our experience, and talking to my friends, it seems that this isn’t so easy!

T has been babbling “mama” and “dada” for a few months now, but it doesn’t seem like he ascribes any meaning to those (he rarely says “mama,” but I would say his favorite utterance is “dada” – he’ll shout “dada” not only when he sees my husband, but also when he sees himself in the mirror, when he notices we’ve left the baby gate open, when he sees the vacuum cleaner, at random people walking on the sidewalk, or just randomly when he’s talking to himself – so I don’t think he’s attached “dada” to his dad). Based on that, I don’t think “mama” and “dada” count as words for T yet, since they seem more like he’s just playing around with making different sounds.

There are two things T says that seem closer to being “real” words. First, when we play with bubbles, T will pretty reliably shout “baba!” to get us to blow on the bubble wand. Secondly, there’s a book that T loves (“Dear Zoo”) that has a lion in it – T has been OBSESSED with the lion since the day he first saw it. For the past week, T has been starting to shout “LYYYYY” as soon as we turn to the lion page. Do these count as “real” words?!

I’m not sure! In the paper I reviewed here about infants’ transitions from babbling to words, the researchers considered a child’s utterance a word if: 1) what the child said matched the “real” word by at least 1 consonant and 1 vowel; 2) the utterance was communicative (e.g., directed at someone); and 3) it was clear the child was attempting a word (e.g., referring to a particular object, imitating the parent, or the parent recognized what the child was saying).
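The three criteria above can be written as a simple checklist. This is just my own sketch in Python, not anything from the paper: the `Utterance` class, its field names, and the example values are all made up for illustration, and deciding whether each criterion is met is still a human judgment call.

```python
from dataclasses import dataclass

@dataclass
class Utterance:
    """Hypothetical record of one thing the child said (field names are my own)."""
    matched_consonants: int  # consonants shared with the "real" word
    matched_vowels: int      # vowels shared with the "real" word
    communicative: bool      # e.g., directed at someone
    clear_attempt: bool      # e.g., refers to an object, imitates a parent

def counts_as_word(u):
    """Apply the three criteria: sound match, communicative, clear attempt."""
    sound_match = u.matched_consonants >= 1 and u.matched_vowels >= 1
    return sound_match and u.communicative and u.clear_attempt

# Made-up examples: the same sounds, said to a parent vs. babbled to oneself.
said_to_parent = Utterance(1, 1, communicative=True, clear_attempt=True)
babbled_alone = Utterance(1, 1, communicative=False, clear_attempt=False)
print(counts_as_word(said_to_parent), counts_as_word(babbled_alone))
```

All three conditions have to hold at once, so an utterance that matches the sounds but isn't directed at anyone doesn't count.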

Based on these criteria, it seems like “baba” for “bubble” and “lyyy” for “lion” might be considered words – T will clearly say these things in specific contexts (when he wants bubbles or when he sees the lion on the page), and in those contexts, we are interacting with him. They also match the “real” word (“bubble” or “lion”) by the beginning consonant and the subsequent vowel.

But, my hesitation in considering these first words for T is that he will say “baba” and “lyyy” either talking to himself or in contexts that have nothing to do with bubbles or lions. Also, he hasn’t generalized the concept of a lion to lions other than in this specific book – there’s another book that we read that features a lion (“Goodnight Gorilla”), and T has never said “lyyyy” when he sees that lion – so can that really count as a word if he doesn’t understand the concept of a lion? I don’t know!

One last little story about T and words! There was a book that T went nuts for that we checked out from the library – “The Naked Book.” He would start grinning, squealing, and kicking as soon as we pulled it out and he saw the cover. I think we renewed it from the library about 8 times before we finally returned it. Anyway, one time I brought it out, and, when T saw the cover, he shouted what I could have sworn was “NA-GUH!!!” – or, a pretty good approximation of “naked!” I couldn’t get him to repeat it over the next few days, though. I’m kind of relieved – I don’t know what I’d do if T’s first word had been “naked” (I guess lie in his baby book and say his first word was “mama”?!).


Article Review – “Musician Advantage for Speech-on-Speech Perception”

Today, I want to talk about a recently published article (full text here) that isn’t directly related to babies or hearing loss, but that I found really interesting and wanted to share! The article is “Musician Advantage for Speech-on-Speech Perception.” (Baskent, D. and Gaudrain, E. “Musician Advantage for Speech-on-Speech Perception.” J. Acoust. Soc. Am. 139, EL51. 2016).

Also, this paper got some great publicity in Scientific American!


Anyone who’s tried to have a conversation in a crowded bar or restaurant knows that understanding what one person is saying when there’s background noise of other people talking is one of the hardest listening tasks (and one that people with hearing loss struggle the most with!). One of the challenges of understanding speech in the presence of other, competing speech is segregating the different people talking to be able to focus on the one person you want to hear (I talked a bit about differences between babies and adults in this type of task here). This problem is often called the “cocktail party problem” – that is, the problem of understanding what the one person you’re talking with is saying when you’re in a noisy, crowded environment full of other people talking.

The authors of this study hypothesized that musicians would be better than non-musicians at understanding speech in the presence of other, competing speech. If musicians ARE better at understanding speech-on-speech, this might be for a few different reasons. First, musicians are better at identifying subtle changes in pitch (something they do all the time to know if they are playing something correctly and in tune!), and this might be really helpful for separating multiple speech streams. For example, they might be able to use pitch differences to group words that they hear as belonging to different voices. Secondly, over decades of practice, musicians hone their “listening skills” – so it might be that they are just better at shifting their auditory focus to what they want to hear than non-musicians.

So, the researchers first wanted to see if the musicians had an advantage at all. They also wanted to know, if the musicians did have an advantage, if the advantage seemed to be related to their better ability at detecting pitch changes, or if it seemed to be more generally related to an increased ability to shift focus to different speech streams.

The Study

The researchers tested 18 musicians and 20 non-musicians on their ability to understand a sentence (the target) in the presence of one competing talker (the masker) – so the subjects had to understand one person talking who was competing with a second person talking. In order to qualify as a musician for this study, participants had to have had 10+ years of training, to have begun musical training before they were 7 years old, and to have received musical training within the past 3 years.

To probe whether musicians were more able to take advantage of subtle pitch changes than non-musicians, the researchers manipulated how different the target sentence was from the masker sentence in 2 ways:

  1. The fundamental frequency (F0) – the fundamental frequency (F0) indicates the voice pitch of a person’s speech. So, men generally have lower F0s than women, adults have lower F0s than children, etc.
  2. An estimated Vocal Tract Length (VTL) – The vocal tract is a cavity that filters sounds that you produce – in a very simplified view, it’s kind of like a tube that goes from the vibrating vocal folds at one end to your mouth at the other end, and it helps shape different sounds that you produce to make them sound like different vowels or consonants. The length of the vocal tract varies across people – children have shorter vocal tracts than adults, and men generally have longer vocal tracts than women. VTL doesn’t directly affect voice pitch (like F0), but it changes other frequencies in speech sounds (the formants – definitely getting a bit technical, but really interesting!). If you have two recordings of people talking and they have the same F0 but different VTLs, the pitch (how high or low their voice is) will be the same, but the quality and characteristics of their voice will sound different – that’s the VTL at work!
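Here is one rough way to see why VTL changes how a voice sounds without changing its pitch. In the standard textbook simplification, the vocal tract is treated as a uniform tube closed at the vocal folds and open at the mouth, whose resonances (the formants) fall at odd multiples of c / (4L). This is an idealization and not the method the researchers used. The two tube lengths and the speed of sound below are round illustrative values that I picked.

```python
# Uniform-tube idealization of the vocal tract (an assumption, not the study's method).
SPEED_OF_SOUND_CM_PER_S = 35000  # rough round value for warm, moist air

def tube_formants(vtl_cm, n_formants=3):
    """Resonances (Hz) of a tube closed at one end: F_n = (2n - 1) * c / (4 * L)."""
    return [
        (2 * n - 1) * SPEED_OF_SOUND_CM_PER_S / (4 * vtl_cm)
        for n in range(1, n_formants + 1)
    ]

# Illustrative lengths only: a longer and a shorter vocal tract.
for label, vtl_cm in [("longer tract (17.5 cm)", 17.5), ("shorter tract (14 cm)", 14.0)]:
    print(label, [round(f) for f in tube_formants(vtl_cm)])
```

The shorter tube shifts every formant upward by the same ratio. F0 doesn't appear anywhere in the formula, because in this simplified picture the pitch is set by how fast the vocal folds vibrate, not by the length of the tube.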

The researchers used some fancy software to manipulate the F0 and VTL of the target sentences and the masker sentences so that, in each trial the subjects listened to, the target and masker sentences were more alike or less alike. They measured how well musicians and non-musicians were able to understand the target sentences based on how similar the target sentence was to the masker sentence in terms of these two parameters.

And here are the results!

FIG. 1A (reproduced below) shows the average percent of the sentence the subjects correctly repeated back with various differences in VTL and F0 between the target and masker sentence. The leftmost panel shows the smallest difference in VTL between the target and masker sentences (in the leftmost panel, there was no difference in VTL), and the rightmost panel shows the largest difference in VTL between the target and masker. Within a panel, going left to right increases the F0 difference between the target and masker sentences (so, within a panel, the leftmost points are where the target and masker sentences had the same average voice pitch as each other).

The data from the musicians is shown in purple and the data from the non-musicians is shown in green.


FIG. 1A from Baskent and Gaudrain


As you can see, both musicians and non-musicians were better able to understand the target sentence when the target sentence was “more different” from the masker sentence – if you look at the leftmost points in the leftmost panel (the hardest condition where there was no difference in F0 or VTL between the target and masker sentences), musicians had about 70% intelligibility and non-musicians had about 55% intelligibility. However, looking at the rightmost points in the rightmost panel (the easiest condition where there was the largest difference in both F0 and VTL between the target and masker sentences), both musicians and non-musicians did really well – better than 90% intelligibility. This makes a lot of sense – it’s easier to understand what a (high-pitched) child is saying when their speech is competing with a deep-voiced man compared to trying to understand what one child is saying when their speech is competing with another child.

And, regardless of how different the target and masker sentences were, musicians performed better than non-musicians, and by a fairly substantial margin – you can see that the purple points are generally ~15-20 percentage points higher than the green points.

Recall that the researchers wanted to know if a musician advantage was due to the musicians’ ability to detect very subtle pitch differences. Based on this data, it seems like the musician advantage might not primarily be due to musicians’ better pitch perception – in FIG. 1A above, the purple (musician) and green (non-musician) lines are parallel to each other, indicating that both groups were deriving equal benefit from larger pitch differences (larger differences in F0). So, it might be that the musicians are better than the non-musicians at focusing their auditory attention – after all, musicians do this all the time when they practice; for example, a musician in an orchestra has to both listen to what their section is playing as well as what the other sections are playing.

My Reflections

I couldn’t help relating the results of this study to my personal experiences! I started playing the violin and the piano when I was little (~6 years old), and played through college, although I haven’t played regularly since I finished college (many years ago).

I’ve long suspected that I’m much better at understanding speech in noise compared to my husband, G. (This is just a gut feeling; we haven’t thoroughly confirmed this.) For example, when G and I go out to eat, I’m usually much better at simultaneously listening to him while eavesdropping on conversations next to us. If G wants to eavesdrop, he’ll have to stop talking to me and stop eating to focus his attention on what the people next to us are saying (while trying hard to look like he’s NOT paying attention to what they’re saying!). So, maybe it’s my childhood musical training that’s given me an edge here!








Whispered Speech

Several weeks ago, we checked out a book from the library that T LOVES (“Mr. Brown Can Moo, Can You?” by Dr. Seuss) – he loved it so much that we actually ended up renewing it several times in a row! Anyway, there’s a part in the story that we read in a whispered voice, but, whenever I whispered, I thought T’s reaction was sort of interesting – it sort of seemed to me like he wasn’t really recognizing the whispered speech as speech. It seemed like even though he was hearing it, he maybe didn’t understand that the whispers included words just like in “normal” voiced speech.

A lot of studies (like this one) have shown that adults have more trouble understanding whispered speech compared to normal speech. This makes sense, since whispered speech lacks a lot of the frequency information in normal speech. Since babies and children require a lot of time and experience to reach adult-like levels in their language skills, I thought that babies might lack the ability to understand whispered speech – I wondered if T (10.5 months) is able to recognize whispered speech at all. So, we did a little experiment!

When researchers test adults to see how well they can understand different types of speech or speech in different types of noise, they play words or sentences and ask the research subjects to repeat back what they hear. T obviously can’t do that! So, I recorded myself saying 4 different names – T’s name and 3 other names – in a normal voice and in a whisper. I then randomly played these names to see whether T was more likely to look up when he heard his own name compared to the other names. Since T knows his own name, if he looked up more for his own name compared to the other names, that might be an indication that he was understanding what he heard.
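For anyone curious, the bookkeeping for a little experiment like this can be sketched in a few lines of Python. This is not the software I actually ran. The function names, the number of repeats, and the placeholder names in the example are all assumptions for illustration, and the example responses are invented just to show the tally, not T's real data.

```python
import random
from collections import defaultdict

def make_trials(names, voices=("normal", "whisper"), repeats=3, seed=None):
    """Build every name/voice combination `repeats` times, in random order."""
    trials = [(name, voice) for name in names for voice in voices for _ in range(repeats)]
    random.Random(seed).shuffle(trials)
    return trials

def look_proportions(results, own_name):
    """results: list of (name, voice, looked_up) tuples -> proportion of look-ups per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [look-ups, trials]
    for name, voice, looked_up in results:
        group = (voice, "own name" if name == own_name else "other names")
        counts[group][0] += int(looked_up)
        counts[group][1] += 1
    return {group: looks / total for group, (looks, total) in counts.items()}

# Invented responses with placeholder names, only to show how the tally works.
example_results = [
    ("NameA", "normal", True),
    ("NameB", "normal", False),
    ("NameB", "normal", True),
]
print(look_proportions(example_results, own_name="NameA"))
```

Shuffling the trial order matters because otherwise the baby might respond to a predictable rhythm rather than to the name itself.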

We had a bit of trouble conducting our experiment. T was very interested in participating, but wanted to take a more active role than that of research subject, and he was particularly keen to tweak the software running the experiment from my laptop. My husband (G) ended up distracting T with toys while I played different names and recorded whether or not he looked up when he heard them. (This is totally not how this experiment would be conducted in a real lab!). We also didn’t get as many trials as we wanted, since T quickly tired of his mom sitting on the couch staring at him.

With that being said, here are the results! The graph below shows the proportion of times T noticed when I played his name compared to the 3 other names for normal (voiced) speech on the left and whispered speech on the right. For the normal speech, T looked up every time for his own name (it was cute – he tended to have a bit of a delayed reaction and would look up surprised but happy someone had called him), and only about 1/3 of the time for the other names. For the whispered speech, T was much less consistent, and in fact, he often didn’t look up at all, for any of the names, but would sometimes just pause in what he was doing rather than looking up.


Since T looked up every single time when he heard his name said normally and much less often (about 1/3 of the time) for the other names, I’m pretty confident T recognizes his name!

Although these results seem to suggest that T has more trouble understanding whispered speech than normal speech (since he didn’t react much to hearing his name when whispered), I’m not confident of that. Although T seemed to barely notice the whispered speech during our experiment, G and I spent a lot of the weekend whispering T’s name to see if he looked up, and it seemed like he did. It might be that during our experiment, T was too distracted, or maybe, had grown tired of looking up when hearing his name just to see me staring at him and marking stuff down.

This is my second little experiment with T with inconclusive results (here’s the first) – babies are difficult to get good data from!