Abstract

Speech recognition can be formulated as the problem of guessing the sequence of words that produced a sequence of sounds. The human brain is remarkably good at solving this problem, even though the same words correspond to many different sounds because of accents or characteristics of the voice. Moreover, the environment is always noisy, so that listeners hear a corrupted version of the speech.

Computers are getting much better at speech recognition, and voice command systems are now common for smartphones (Siri), automobiles (GPS, music, and climate control), call centers, and dictation systems. In this chapter, we explain the main ideas behind the algorithms for speech recognition and for related applications.

The starting point is a model of the random sequence (e.g., words) to be recognized and of how this sequence is related to the observation (e.g., voice). The main model is called a hidden Markov chain. The idea is that the successive parts of speech form a Markov chain and that each word maps randomly to some sounds. The same model is used to decode strings of symbols in communication systems.

Section 11.1 is a general discussion of learning. The hidden Markov chain model used in speech recognition and in error decoding is introduced in Sect. 11.2. That section explains the Viterbi algorithm. Section 11.3 discusses expectation maximization and clustering algorithms. Section 11.4 covers learning for hidden Markov chains.