For the most part, you learned to talk before you can even remember. Nobody taught you how to talk or how to understand what other people were saying; you learned just by listening. (Maybe the meanings of some words or some grammatical constructions were explained to you, but not most of them.)

There are many different levels of language: from sounds to words to sentences to meanings to intentions.

When babies are born, they’re equally good at telling the difference between all the different sounds used in all languages, but as they learn a particular language, they learn to group together the ones that count as the same in that language. For example, in English the “R” and “L” sounds are different, but in Japanese they would both be considered the same sound. In the other direction, in English the “P” in “top” and the “P” in “pot” are considered to be the same sound, but in some languages they’re considered to be different. The difference might not be obvious to you from listening, but try holding your hand up in front of your mouth: when you say the “P” in “pot” there is a puff of air that you can feel on your hand, but when you say the “P” in “top” there isn’t.

There can be a lot of variation in sounds and we can still perceive them as the same “letter” if we don’t speak a language that treats them differently. Check out this video for a demonstration: you can cut off various lengths of the beginning of the sound and it still sounds like “B”, but at a certain point, it suddenly switches to sounding like “P”.

Just like vision, which you learned about a few weeks ago, sounds are actually very ambiguous, but we can mostly figure out from context what they’re supposed to be. A sound that you would normally interpret as “D” might be interpreted as “N” if the person saying it has a stuffy nose. It’s also a lot easier to hear a sound if you can also see the person saying it - and the exact same sound can sound different depending on how the person’s mouth is moving! Watch this video to try it out (this effect is called the McGurk effect).

People usually talk pretty fast, and a lot of the words get blended together and pronounced very differently than if you were saying one word on its own. Listen to this recording and see if you can tell what word it is. It’s the word “little” in this sentence.

You might expect that there would be some sort of pause between words, even just a little one, but for the most part there isn’t at all. Here’s a diagram of the sound wave of someone saying the sentence “where are the silences between words”, with the words marked:

You can see that there are actually longer silences within words than between them.

So how do you even learn which things are words and which sounds just happen to be next to each other because they’re in neighboring words? One way is to keep track of which sounds occur next to each other most often. If you hear a string of sounds like this:

padotituroturoturopadotipadotituropadotituropadotipadoti

you can figure out that “turo” and “padoti” are probably the words because they occur as chunks a lot; other combinations like “dotitu” occur occasionally, when “turo” comes right after “padoti”, but not nearly as often.
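If you like programming, here is a rough sketch of that idea in Python. It assumes every syllable is exactly two letters long (true for this made-up string, but not for real speech), and it guesses that a word boundary falls wherever the next syllable is hard to predict from the current one. The cutoff of 0.75 and all the function names are my own choices for illustration, not a claim about how babies actually do it.

```python
from collections import Counter

STREAM = "padotituroturoturopadotipadotituropadotituropadotipadoti"

def split_syllables(stream, size=2):
    """Assumption: every syllable is exactly `size` letters long."""
    return [stream[i:i + size] for i in range(0, len(stream), size)]

def transition_probs(syllables):
    """Estimate P(next syllable | current syllable) from adjacent pairs."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])  # the last syllable has no successor
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

def segment(stream, threshold=0.75):
    """Cut the stream wherever the next syllable is poorly predicted."""
    syllables = split_syllables(stream)
    probs = transition_probs(syllables)
    words, current = [], syllables[0]
    for prev, nxt in zip(syllables, syllables[1:]):
        if probs[(prev, nxt)] < threshold:  # hypothetical cutoff
            words.append(current)           # low predictability: start a new word
            current = nxt
        else:
            current += nxt                  # high predictability: same word
    words.append(current)
    return words

print(Counter(segment(STREAM)))
```

In this string, “pa” is always followed by “do” and “do” is always followed by “ti”, so those syllables get glued together. After “ti” or “ro”, the next syllable varies, so the sketch cuts there and ends up with only two distinct chunks: “padoti” and “turo”.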

Even if you can figure out which units of sound are words, how do you learn what the words mean? Sometimes someone will tell you “look, a dog!”, but rarely will anyone tell you “look, a between!” or “look, an implication!”.

Occasionally you’ll be lucky and when you hear a new word there will be one obvious object that it has to refer to, but most of the time even that won’t be the case - if you’re a baby and you hear your parent talking about “shoes”, you don’t know if that word refers to your shoes or the rug or the shoerack or any of the other objects you can see or some other thing you can’t see. But if you hear the word “shoes” lots of different times, and there are various objects around each time but there are almost always shoes, then you can learn that they go together.
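This counting strategy can be sketched in a few lines of Python. The four scenes below are made up purely for illustration: each one lists the objects that happen to be around on one occasion when the baby hears “shoes”.

```python
from collections import Counter

# Hypothetical data: objects in view each time the word "shoes" is heard.
SCENES_WITH_SHOES_WORD = [
    {"shoes", "rug", "shoe rack"},
    {"shoes", "dog", "rug"},
    {"shoes", "car"},
    {"shoe rack", "cup"},  # the shoes are out of sight this time
]

def co_occurrence_counts(scenes):
    """Count how many scenes each object appears in."""
    counts = Counter()
    for objects in scenes:
        counts.update(objects)
    return counts

counts = co_occurrence_counts(SCENES_WITH_SHOES_WORD)
best_guess, times_seen = counts.most_common(1)[0]
print(best_guess, times_seen)
```

No single scene settles the question, but across all four the shoes show up more often than any other object, so they come out as the best guess for what the word means.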

Another fact you can use in learning words is that people tend to use more-informative words over less-informative ones. For example, suppose you’re a baby learning words, and there are some objects:

and suppose I’m describing the one on the right and I use a word you don’t know. Is the word more likely to mean “green” or “square”? It’s probably more likely to mean “square”, because if I had just said “green”, I could have been talking about either of the green shapes.
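Here is a small Python sketch of that reasoning. Since the picture isn’t reproduced here, I’m assuming the three objects are a blue circle, a green circle, and a green square; that set is my own guess, chosen only so that it matches the description above. The sketch also assumes a speaker is more likely to pick a word the fewer objects it could describe.

```python
# Assumed objects (the original picture is not available here).
OBJECTS = [
    {"color": "blue", "shape": "circle"},
    {"color": "green", "shape": "circle"},
    {"color": "green", "shape": "square"},
]
TARGET = OBJECTS[2]  # "the one on the right"

def meaning_probabilities(objects, target):
    """Score each property of the target by how well it singles the target out."""
    scores = {}
    for feature, value in target.items():
        matches = sum(1 for obj in objects if obj[feature] == value)
        scores[value] = 1 / matches  # fewer matching objects = more informative
    total = sum(scores.values())
    return {meaning: score / total for meaning, score in scores.items()}

probabilities = meaning_probabilities(OBJECTS, TARGET)
for meaning, p in probabilities.items():
    print(f"{meaning}: {p:.2f}")
```

Under these assumptions, “square” comes out about twice as likely as “green”, because “green” fits two of the objects while “square” fits only one.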

People tend to think of things as being in hierarchical categories - for example, maybe Jake the dog is a golden retriever which is a dog which is a mammal which is an animal which is an object - and there is a certain “basic level” of category, in this case “dog”: usually when you’re talking about a dog, saying “animal” is not informative enough and saying “golden retriever” is unnecessarily specific. So when a baby is trying to learn words, it can be useful to assume that the most common words refer to basic-level things like “dog” and not very broad or very specific categories.

The sound of a word can also sometimes be useful in determining what the word means. For example, look at these two objects:

If I tell you that one of them is called a “bouba” and the other is called a “kiki”, which do you think is which? Most people (regardless of what language they speak) agree that the blobby one is the “bouba” and the pointy one is the “kiki”, just based on the sounds of the words.

Or look at these two objects:

If I tell you that one of them is called a “flug” and the other is called a “squorafemilunt”, which do you think is which? You’d probably guess that the more complicated word refers to the more complicated object. On average, shorter words tend to be for things that are more common and predictable.

Not every single word needs to be learned separately. Some words are combinations of other words: you know what “owl-flavored” means even if you’ve never heard it before. Some words are different forms of words you already know: if you hear “tomorrow I’m going to spling”, you can easily fill in the blank in “yesterday I ___” using a different form of the word “spling”, even though it’s not a word you’ve learned before.
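One way to picture this is as a rule you can apply to any verb, including ones you’ve never heard. The Python sketch below covers only the most regular English spelling pattern, and the function name is made up for illustration. It ignores irregular verbs like “sing”/“sang” and spelling changes like “stop”/“stopped”, so it is not a full description of English.

```python
VOWELS = "aeiou"

def regular_past(verb):
    """Form a past tense using only the regular '-ed' spelling pattern."""
    if verb.endswith("e"):
        return verb + "d"            # bake -> baked
    if verb.endswith("y") and len(verb) > 1 and verb[-2] not in VOWELS:
        return verb[:-1] + "ied"     # carry -> carried
    return verb + "ed"               # spling -> splinged

print("Yesterday I " + regular_past("spling"))
```

The rule has never seen “spling” before, but it still produces a form. That is the sense in which you don’t have to learn every word separately.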

Once you’ve learned some words, how do you then put them together to form sentences? If you just stick words together in no particular order, sometimes people can tell what you’re trying to say - and this is what babies initially do, to some extent - but sometimes that’s not enough: if you say “dog cat bucket fall bite”, you could mean “the dog bit the cat that fell into the bucket” or “the bucket fell onto the dog so it bit the cat” or any number of other things.

On the other hand, it’s possible to make a completely grammatical sentence that just doesn’t make any sense. A famous example is that you can understand the meaning of “revolutionary new ideas spread rapidly”, but you can replace the words with equally grammatical ones to say “colorless green ideas sleep furiously”, a sentence that you can’t really interpret at all.

Some sentences have multiple possible structures. Here are some funny headlines as examples:

  • Rumors about NBA referees getting ugly
  • Environmental unit helps dog bite victim
  • Two spies sentenced to life in Missouri
  • British left waffles on Falkland Islands

Some sentences start out sounding like they have a certain structure, but then when you hear the rest of the sentence you have to rethink your initial interpretation. These are called garden path sentences:

  • Put the book on the table into the box.
  • The horse raced past the barn fell.
  • The old man the boats.

Some sentences are technically grammatical but are very hard to understand because of the way they’re structured. In the following list of sentences, the first two are easy to understand, but as the same construction gets applied more and more times, they start to not even sound like real sentences:

  • The dog ran away.
  • The dog that the cat saw ran away.
  • The dog that the cat that the girl liked saw ran away.
  • The dog that the cat that the girl that the dress fit liked saw ran away.
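The pattern in these sentences is mechanical enough that a short program can produce them. In the Python sketch below (the function name is my own), each extra level adds one more “that the ...” right after the subject, and the matching verbs then have to come out in the reverse order.

```python
# Each pair is (noun, verb) for one embedded clause, taken from the list above.
CLAUSES = [("cat", "saw"), ("girl", "liked"), ("dress", "fit")]

def center_embed(depth):
    """Build the example sentence with `depth` nested 'that the ...' clauses."""
    clauses = CLAUSES[:depth]
    opening = "".join(f" that the {noun}" for noun, _ in clauses)
    # The last noun introduced gets its verb first, so reverse the order.
    closing = "".join(f" {verb}" for _, verb in reversed(clauses))
    return f"The dog{opening}{closing} ran away."

for depth in range(len(CLAUSES) + 1):
    print(center_embed(depth))
```

The program has no trouble with any depth, which shows that the rule itself is simple. The hard part for a listener is holding all the unfinished clauses in mind until their verbs finally arrive.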

Often, a sentence has multiple possible meanings but you can tell from context what the intended meaning is. For example, when you hear a pronoun (like “they” or “it”), there are different things it could be referring to, but most of the time you know which is the intended thing. Compare the following two sentences:

  • Annie was afraid to talk to Nora because she was shy.
  • Annie was afraid to talk to Nora because she was popular.

In the first of the two sentences, you can tell that the word “she” is probably referring to Annie, whereas in the second, the word “she” is probably referring to Nora.

Typically we think of sentences as stating facts, but a lot of the time, a statement or question conveys more than its literal meaning. For example, if someone asks, “Do you know what time it is?”, they don’t want you to just say “yes”; they’re actually asking you to tell them what time it is.

Sometimes a sentence can mean the opposite of its literal meaning - this is called sarcasm - and in these cases you often have to infer from context that it’s intended as sarcastic. For example, if your mom says “I’m making broccoli for dinner” and your brother says “oh great, I love broccoli”, you can tell whether your brother is being sarcastic based on your knowledge of whether he actually likes broccoli (you can also sometimes tell from his tone of voice, though only if it’s spoken and not written).