Lectures Playlist

Kunal Chaturvedi

W1_L1: Course introduction

Welcome to Week 1 Lecture 1 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview In this introductory lecture, you’ll get an overview of the course, its objectives, and the fascinating world of speech technology. The session covers the fundamentals of speech signals, signal processing, and statistical concepts that form the backbone of modern applications. This applied course will introduce Automatic Speech Recognition (ASR), Text-to-Speech (TTS), and Speaker Verification, along with hands-on insights into building real-world systems using Python and open-source toolkits like ESPnet and Kaldi. About IIT Madras' Online Bachelor of Science Programme IIT Madras offers four-year BS programmes that aim to provide quality education to all, irrespective of age, educational background, or location. The BS programme has multiple levels, which provide flexibility to students to exit at any of these levels. Depending on the courses completed and credits earned, the learner can receive a Foundation Certificate from IITM CODE (Centre for Outreach and Digital Education), Diploma(s) from IIT Madras, or BSc/BS Degrees from IIT Madras. For more details, visit: https://www.iitm.ac.in/academics/study-at-iitm/non-campus-bs-programmes #SpeechTechnology #SpeechSignalProcessing #SpeechRecognition #ASR #TextToSpeech #TTS #SpeakerVerification #SignalProcessing #AudioProcessing #MachineLearning #DeepLearning #Python #ESPnet #Kaldi #NaturalLanguageProcessing #VoiceAI #IITMadras #IITMadrasBS

W1_L2: Digital signal processing fundamentals | sampling, quantization & Fourier basics

Welcome to Week 1 Lecture 2 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture provides an introduction to the fundamental concepts of speech signal processing. We will cover analog-to-digital conversion through sampling and quantization, explore the crucial Sampling Theorem and the differences between narrowband and wideband speech. We will also touch upon speech compression techniques like Code Excited Linear Prediction (CELP) and the importance of frequency analysis using Fourier representation for speech processing. #speechprocessing #signals #sampling #quantization #analogtodigital #narrowband #wideband #speechcompression #CELP #fouriertransform #frequencyanalysis #linearprediction #audioprocessing #digitalsignalprocessing #speechsignal #speechtechnology #signaltheory #DSPapplications #IITMadras #IITMadrasBS
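
As a rough illustration of the sampling and quantization steps this lecture describes, the sketch below samples a sine tone, applies a uniform quantizer, and shows aliasing when the tone exceeds half the sampling rate. The 8000 Hz rate, the tone frequencies, and the function names are assumptions chosen for illustration; they are not taken from the lecture.

```python
import math

def sample_sine(freq_hz, fs_hz, n_samples):
    """Sample sin(2*pi*f*t) at rate fs_hz and return n_samples values."""
    return [math.sin(2 * math.pi * freq_hz * n / fs_hz) for n in range(n_samples)]

def quantize(x, n_bits):
    """Uniform quantizer for values in [-1, 1] using 2**n_bits levels."""
    levels = 2 ** n_bits
    step = 2.0 / (levels - 1)
    return [round((v + 1.0) / step) * step - 1.0 for v in x]

fs = 8000                       # assumed narrowband-style sampling rate
x = sample_sine(1000, fs, 16)   # 1000 Hz tone, well below fs/2
xq = quantize(x, 8)             # 8-bit uniform quantization
step = 2.0 / (2 ** 8 - 1)
max_err = max(abs(a - b) for a, b in zip(x, xq))  # bounded by step/2

# Aliasing: 7000 Hz is above fs/2 = 4000 Hz, so its samples coincide with
# those of a 1000 Hz tone with inverted sign.
alias = sample_sine(7000, fs, 16)
print(max_err <= step / 2 + 1e-12)
print(all(abs(a + b) < 1e-9 for a, b in zip(x, alias)))
```

The quantization error never exceeds half a step, and the 7000 Hz tone cannot be told apart from a 1000 Hz tone once sampled, which is the situation the Sampling Theorem rules out.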

W1_L4: Speech production | Fourier, Z-transform & digital filters

Welcome to Week 1 Lecture 4 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture provides a review of Fourier Series and Transforms, covering concepts from continuous-time and discrete-time signals to aperiodic and periodic representations. It discusses the practical importance of the Discrete Fourier Transform (DFT) and the Fast Fourier Transform (FFT). The lecture also delves into the analysis of systems, particularly digital systems, introducing the Z-transform, linearity, time invariance, causality, and stability. Finally, it explores the concept of filters (FIR and IIR), poles and zeros, and how these principles relate to speech signal processing, modeling the mouth as a linear time-invariant filter. #FourierSeries #FourierTransform #SignalProcessing #DiscreteFourierTransform #DFT #FastFourierTransform #FFT #ZTransform #Linearity #TimeInvariance #Causality #Stability #Filters #FIR #IIR #PolesAndZeros #DigitalSignalProcessing #SpeechSignalProcessing #SignalsAndSystems #IITMadras #IITMadrasBS
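
To make the FIR/IIR distinction and the role of pole location concrete, here is a minimal sketch that compares impulse responses. The two-tap averager and the one-pole recursion y[n] = x[n] + a*y[n-1] (pole at z = a) are illustrative choices, not filters taken from the lecture.

```python
def fir_filter(b, x):
    """FIR filter: y[n] = sum over k of b[k] * x[n-k]."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, bk in enumerate(b):
            if n - k >= 0:
                acc += bk * x[n - k]
        y.append(acc)
    return y

def one_pole_iir(a, x):
    """IIR filter y[n] = x[n] + a*y[n-1], i.e. H(z) = 1 / (1 - a z^-1)."""
    y = []
    prev = 0.0
    for xn in x:
        prev = xn + a * prev
        y.append(prev)
    return y

impulse = [1.0] + [0.0] * 9
h_fir = fir_filter([0.5, 0.5], impulse)   # finite: only two non-zero samples
h_stable = one_pole_iir(0.5, impulse)     # pole inside unit circle: decays as 0.5**n
h_unstable = one_pole_iir(1.5, impulse)   # pole outside unit circle: grows as 1.5**n
print(h_fir[:4], h_stable[:4], h_unstable[:4])
```

The FIR response ends after two samples, while the IIR response never strictly ends; it stays bounded only when the pole magnitude is below one.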

W1_L3: Fourier series | speech production & phoneme representation

Welcome to Week 1 Lecture 3 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview Dive into the fascinating world of speech production. This lecture breaks down the complex process of how we speak, from the initial message in the brain to the physical articulation of sounds. Learn how the lungs, vocal cords (glottis), and mouth shape airflow to create different phonemes. Discover the differences between vowels and consonants, the role of tongue position, and how physiological factors influence pitch. We also explore how speech sounds are represented in the frequency domain (Fourier analysis) and the concepts of phonemes, how they relate to the different languages of the world, and the challenges of mastering sounds not present in one’s native language. #speechproduction #phonetics #speech #vowels #consonants #phonemes #articulation #glottis #vocalfolds #frequencydomain #fouriertransform #languages #ASR #acoustics #sound #speechtechnology #speechrecognition #voicedsounds #unvoicedsounds #linguistics #IITMadras #IITMadrasBS

W2_L1: Speech production, perception & frequency analysis

Welcome to Week 2 Lecture 1 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture provides a comprehensive review of key concepts from the previous two lectures on speech signal processing. We cover the importance of speech communication, the basics of speech production (including the role of the vocal cords), and the introduction of the Fourier transform for frequency analysis. We delve into the concepts of sampling, aliasing, and the Discrete Fourier Transform (DFT). Furthermore, we discuss linear time-invariant (LTI) systems, convolution, and how they relate to the frequency domain. Finally, we explore phonemes as the fundamental units of speech, classifying them based on voicing, manner of articulation, and place of articulation. #SpeechSignalProcessing #SpeechCommunication #FourierTransform #FrequencyAnalysis #DFT #DiscreteFourierTransform #Sampling #Aliasing #LTISystems #LinearTimeInvariant #Convolution #Phonemes #VoicedSounds #UnvoicedSounds #MannerOfArticulation #PlaceOfArticulation #VocalCords #SpeechProduction #IITMadras #IITMadrasBS
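
The link between convolution and the frequency domain can be checked numerically. The sketch below uses a naive O(N^2) DFT and circular convolution on short, made-up sequences to show that the DFT of the convolved signal equals the product of the individual DFTs. Note that the DFT pairs with circular convolution; matching linear convolution would need zero-padding, which this sketch leaves out.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform (for illustration, not speed)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def circular_convolve(x, h):
    """Circular convolution of two equal-length sequences."""
    N = len(x)
    return [sum(x[m] * h[(n - m) % N] for m in range(N)) for n in range(N)]

x = [1.0, 2.0, 0.0, -1.0]   # hypothetical input signal
h = [0.5, 0.5, 0.0, 0.0]    # hypothetical impulse response (two-point average)
y = circular_convolve(x, h)

lhs = dft(y)                                   # transform of the output
rhs = [a * b for a, b in zip(dft(x), dft(h))]  # product of the transforms
print(y)
print(all(abs(a - b) < 1e-9 for a, b in zip(lhs, rhs)))
```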

W2_L2: Perceptual masking, cepstrum & filtering | speech analysis

Welcome to Week 2 Lecture 2 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture provides an insightful overview of speech analysis, focusing on how we can extract valuable information from speech signals in the time and frequency domains. We explore the fundamental units of speech (phonemes), the concept of voicing, and how to visually interpret speech waveforms. The lecture then delves into gender and emotion recognition from speech, the source-filter model of speech production, and the use of Short-Time Fourier Transform (STFT) for analyzing the time-varying frequency content of speech. Finally, we introduce how to use windowing to extract information of different durations for time–frequency analysis. #speechanalysis #phonemes #voicing #speechwaveform #timedomain #frequencydomain #genderecognition #emotionrecognition #sourcefiltermodel #terminalanalogmodel #shorttimefouriertransform #STFT #windowing #speechprocessing #signalprocessing #acousticphonetics #IITMadras #IITMadrasBS

W1_L6: Waveforms | source-filter model, spectrograms & human hearing

Welcome to Week 1 Lecture 6 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture introduces the source-filter model of speech production, explaining how the vocal cords and mouth act as a source and filter, respectively, to create voiced and unvoiced sounds. It then discusses short-time Fourier analysis and spectrograms as tools to visualize the changing frequency content of speech signals. Finally, the lecture transitions to the listener's perspective, exploring the complexities of human hearing, including concepts like loudness, pitch, and the basilar membrane's frequency analysis. #speechproduction #sourcefiltermodel #speechmodeling #lineartimeinvariantsystems #fouriertransform #voiced #unvoiced #resonantfrequencies #phonemes #shorttimeanalysis #frequencydomain #fourieranalysis #discretefouriertransform #FFT #spectrogram #narrowband #wideband #hearing #psychoacoustics #loudness #pitch #mels #basilarmembrane #windowing #hammingwindow #IITMadras #IITMadrasBS
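
A spectrogram is built from short, windowed frames. The following sketch cuts a signal into overlapping frames, applies a Hamming window, and takes the DFT magnitude of each frame. The frame length (64), hop (32), sampling rate, and the test tone are assumptions for illustration; a practical system would use an FFT routine rather than this naive DFT.

```python
import cmath
import math

def hamming(N):
    """Hamming window of length N."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def stft_magnitudes(x, frame_len, hop):
    """Return one magnitude spectrum (bins 0..frame_len/2) per frame."""
    w = hamming(frame_len)
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        seg = [x[start + n] * w[n] for n in range(frame_len)]
        mags = [abs(sum(seg[n] * cmath.exp(-2j * cmath.pi * k * n / frame_len)
                        for n in range(frame_len)))
                for k in range(frame_len // 2 + 1)]
        frames.append(mags)
    return frames

fs = 8000                                                      # assumed rate
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(256)]
frames = stft_magnitudes(tone, frame_len=64, hop=32)
peak_bins = [m.index(max(m)) for m in frames]
peak_hz = [k * fs / 64 for k in peak_bins]                     # bin index to Hz
print(len(frames), peak_hz[0])
```

Stacking the per-frame magnitude columns over time gives the spectrogram image; longer frames give finer frequency resolution and shorter frames give finer time resolution.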

W1_L5: Revision | speech perception, masking & cepstral analysis

Welcome to Week 1 Lecture 5 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture dives into the fascinating world of speech perception and signal processing. We begin by exploring the difference between physical and perceived quantities like frequency (Hertz vs. Mels) and intensity (vs. loudness). Then, we delve into the crucial concept of masking, both in frequency and time, and how it is used in lossy compression such as MP3. Finally, the lecture introduces the Cepstral analysis technique, including liftering, for separating excitation characteristics and formant characteristics useful for feature extraction. #speechprocessing #speechsignal #auditoryperception #masking #melscale #cepstralanalysis #liftering #mp3 #audio #frequencymasking #timemasking #speechproduction #hertz #pitch #loudness #intensity #signalprocessing #FeatureExtraction #IITMadras #IITMadrasBS
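
The Hertz-to-mel mapping can be sketched with one widely used formula, mel = 2595 * log10(1 + f / 700). This particular formula is an assumption here: several mel-scale variants exist and the lecture may use a different one.

```python
import math

def hz_to_mel(f_hz):
    """Map frequency in Hz to mels (one common convention, assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

for f in (100, 500, 1000, 2000, 4000, 8000):
    print(f, round(hz_to_mel(f), 1))
```

Under this convention, equal steps in Hz become progressively smaller steps in mels at high frequencies, reflecting the coarser pitch resolution of hearing there.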

W2_L5: MFCC | liftering, mel scale & feature extraction for speech

Welcome to Week 2 Lecture 5 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture introduces Mel Frequency Cepstral Coefficients (MFCCs), a vital feature extraction technique in speech technology. We’ll explore the process of separating vocal tract information from speech signals using concepts like liftering and triangular averaging. Discover how the Mel scale mimics human auditory processing and why MFCCs are essential for various speech recognition applications. #speechrecognition #MFCC #melFrequencyCepstralCoefficients #featureextraction #speechprocessing #voicetechnology #speechsignal #cepstrum #spectrum #liftering #melfilterbank #triangularaveraging #vocaltract #excitation #signalprocessing #melScale #IITMadras #IITMadrasBS

W3_L2: Error analysis of Gaussian model | mel filter banks

Welcome to Week 3 Lecture 2 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture dives into the world of audio feature extraction, specifically focusing on Mel Filter Banks and Mel-Frequency Cepstral Coefficients (MFCCs). Learn the practical implementation of Mel Filter Banks and their importance in modern deep learning applications for speech processing. We cover the transformation steps from a raw speech signal to log Mel Filter Banks, highlighting the significance of each stage like windowing, Fourier Transform, triangular averaging, and the log operation. We also discuss the classic MFCC extraction, which involves a Discrete Cosine Transform (DCT) and the use of delta and delta-delta coefficients, often used in conventional statistical models like GMM-HMM. #MelFilterBank #MFCC #SpeechProcessing #FeatureExtraction #DeepLearning #AudioAnalysis #SignalProcessing #FourierTransform #DCT #GMM #HMM #DeltaCoefficients #LogMelFilterBank #SpeechRecognition #AcousticFeatures #IITMadras #IITMadrasBS
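
The later stages of the pipeline described above (triangular averaging, log, DCT) can be sketched as follows, starting from the power spectrum of a single frame. This is a simplified illustration under stated assumptions: the mel formula, the filter count (10), the FFT size (256), the flat test spectrum, and the unnormalized DCT-II are all choices made here, and windowing, the Fourier transform, and delta coefficients are left out.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)   # assumed mel convention

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters spaced evenly on the mel scale; one row per filter."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(fs / 2.0)
    # n_filters + 2 edge points: each filter spans three consecutive points.
    edges = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * fs / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising edge
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling edge
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def log_mel_energies(power_spectrum, bank, floor=1e-10):
    """Weighted sum per filter followed by the log operation."""
    return [math.log(max(sum(w * p for w, p in zip(row, power_spectrum)), floor))
            for row in bank]

def dct2(x, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * k * (n + 0.5) / N) for n in range(N))
            for k in range(n_coeffs)]

n_fft, fs = 256, 8000
flat_power = [1.0] * (n_fft // 2 + 1)       # hypothetical flat power spectrum
bank = mel_filterbank(10, n_fft, fs)
fbank = log_mel_energies(flat_power, bank)  # log mel filter bank features
mfcc = dct2(fbank, 5)                       # cepstral coefficients
print(len(fbank), len(mfcc))
```

The `fbank` vector corresponds to the log mel filter bank features, and `mfcc` to the classic cepstral features obtained after the DCT.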

W2L5_MFCC

Mel Filter Bank and Cepstral Coefficients

W2_L3: Feature extraction for speech processing | part 01

Welcome to Week 2 Lecture 3 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture introduces the Gaussian distribution (bell curve) and its key parameters: mean (μ) and variance (σ²). We’ll explore how Gaussian distributions model real-world data, such as height measurements, and discuss parameter estimation through Maximum Likelihood Estimation (MLE). Learn how to fit a Gaussian curve to your data and understand the concepts of likelihood, independence, and optimization using derivatives. The lecture also briefly touches upon pattern classification using Gaussians, setting the stage for feature extraction in speech processing. #GaussianDistribution #NormalDistribution #BellCurve #Mean #Variance #StandardDeviation #MaximumLikelihoodEstimation #MLE #ParameterEstimation #StatisticalModeling #DataAnalysis #CurveFitting #Probability #Statistics #PatternClassification #SpeechProcessing #IITMadras #IITMadrasBS
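
For reference, the standard maximum likelihood derivation for a univariate Gaussian is sketched below. The notation (N independent samples x_1, ..., x_N, log-likelihood written as \ell) is chosen here and may differ from the lecture's.

```latex
% Likelihood of N independent samples under a Gaussian with mean \mu, variance \sigma^2
L(\mu, \sigma^2) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}}
    \exp\!\left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right)

% Log-likelihood
\ell(\mu, \sigma^2) = -\frac{N}{2} \ln(2\pi\sigma^2)
    - \frac{1}{2\sigma^2} \sum_{i=1}^{N} (x_i - \mu)^2

% Set the derivative with respect to \mu to zero
\frac{\partial \ell}{\partial \mu} = \frac{1}{\sigma^2} \sum_{i=1}^{N} (x_i - \mu) = 0
    \quad\Longrightarrow\quad \hat{\mu} = \frac{1}{N} \sum_{i=1}^{N} x_i

% Set the derivative with respect to \sigma^2 to zero
\frac{\partial \ell}{\partial \sigma^2} = -\frac{N}{2\sigma^2}
    + \frac{1}{2\sigma^4} \sum_{i=1}^{N} (x_i - \mu)^2 = 0
    \quad\Longrightarrow\quad \hat{\sigma}^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat{\mu})^2
```

Independence is what lets the likelihood factor into a product, and taking the log turns that product into a sum that is easy to differentiate.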

W3_L1: Gaussian model for binary classification problem

Welcome to Week 3 Lecture 1 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview We begin with a review of basic probability and statistics including normal distributions and how they can be used to model data. We then dive into maximum likelihood estimation for parameter estimation (mean and variance). Finally, we explore a pattern classification problem: using height to determine gender. This involves building separate Gaussian models for each class (male/female) and using likelihood to classify new data. #statistics #probability #gaussian #normaldistribution #maximumlikelihood #MLE #patternclassification #speechsignalprocessing #mean #variance #datascience #machinelearning #height #gender #statisticalmodeling #IITMadras #IITMadrasBS
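
The classification recipe described above, one Gaussian per class and a likelihood comparison for new data, might be sketched like this. The height values are made up for illustration, and equal class priors are assumed.

```python
import math

def fit_gaussian(data):
    """Maximum likelihood estimates of mean and variance."""
    mu = sum(data) / len(data)
    var = sum((x - mu) ** 2 for x in data) / len(data)
    return mu, var

def log_likelihood(x, mu, var):
    """Log of the Gaussian density at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

def classify(x, models):
    """Pick the class whose Gaussian gives x the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(x, *models[label]))

# Hypothetical training heights in cm (not real data).
heights = {
    "male": [172.0, 168.0, 180.0, 175.0, 178.0],
    "female": [158.0, 162.0, 165.0, 155.0, 160.0],
}
models = {label: fit_gaussian(values) for label, values in heights.items()}

print(classify(182.0, models))
print(classify(157.0, models))
```

Working with log-likelihoods avoids numerical underflow and does not change which class wins, since the logarithm is monotonic.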

W3_L5: Mixture of Gaussians | binary classification & error trade-offs

Welcome to Week 3 Lecture 5 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture introduces the concept of binary classification, illustrating how to make decisions based on measurements (like height) to categorize data into two classes (e.g., male/female, enemy target present/absent). We explore the trade-offs between different types of errors (false alarms and missed detections) and how minimizing one can increase the other, impacting real-world applications like defense systems and banking security. The lecture touches upon Gaussian distributions and concludes by introducing the upcoming topic of Mixture of Gaussians. #BinaryClassification #PatternRecognition #MachineLearning #ErrorAnalysis #FalseAlarm #MissedDetection #GaussianDistribution #DecisionMaking #DataAnalysis #Defense #Security #Banking #Algorithms #Probability #Statistics #MixtureOfGaussians #AI #ArtificialIntelligence #IITMadras #IITMadrasBS
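
The trade-off between false alarms and missed detections can be made concrete with two Gaussian classes and a sliding decision threshold. In this sketch the class parameters (means 0 and 2, unit standard deviation) and the threshold values are hypothetical; the rule is "declare target present when the measurement exceeds the threshold".

```python
import math

def gaussian_cdf(x, mu, sigma):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def error_rates(threshold, absent, present):
    """Return (P(false alarm), P(miss)) for the rule 'present if x > threshold'."""
    p_false_alarm = 1.0 - gaussian_cdf(threshold, *absent)  # absent, but x > t
    p_miss = gaussian_cdf(threshold, *present)              # present, but x <= t
    return p_false_alarm, p_miss

absent = (0.0, 1.0)    # hypothetical (mean, std) when the target is absent
present = (2.0, 1.0)   # hypothetical (mean, std) when the target is present

sweep = [(t, *error_rates(t, absent, present)) for t in (0.0, 0.5, 1.0, 1.5, 2.0)]
for t, p_fa, p_miss in sweep:
    print(f"threshold={t:.1f}  false_alarm={p_fa:.3f}  miss={p_miss:.3f}")
```

Raising the threshold lowers the false alarm rate but raises the miss rate, so the operating point depends on which error is more costly in the application.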

W2_L6: Gaussian review | binary classification & multivariate distributions

Welcome to Week 2 Lecture 6 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture explores binary pattern classification problems, building upon the concept of using density functions to distinguish between two classes (e.g., male/female, target present/absent). We delve into the idea of using multiple measurements, such as height and weight, to improve classification accuracy. The session introduces the multivariate Gaussian distribution and explains how to build and utilize it for better decision-making. Key concepts include mean vectors, covariance matrices, and how correlated variables can enhance pattern recognition. We analyze a binary classification problem using height and weight, and also explain Maximum Likelihood estimation fitted to the data. #SpeechSignalProcessing #PatternClassification #BinaryClassification #GaussianDistribution #MultivariateGaussian #MeanVector #CovarianceMatrix #FeatureVector #MachineLearning #SignalProcessing #HeightWeight #GenderClassification #DensityFunction #ErrorProbability #FalseAlarm #ProbabilityOfMiss #MaximumLikelihood #BivariateGaussian #UnivariateGaussian #IITMadras #IITMadrasBS
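
A bivariate Gaussian over two measurements such as height and weight is described by a mean vector and a 2x2 covariance matrix. The sketch below estimates both by maximum likelihood and evaluates the log density; the data points are invented for illustration and the closed-form 2x2 inverse keeps the example dependency-free.

```python
import math

def fit_bivariate(points):
    """ML estimates of the mean vector and covariance matrix for 2-D data."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), ((sxx, sxy), (sxy, syy))

def log_density(point, mean, cov):
    """Log of the bivariate Gaussian density at point."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Quadratic form using the inverse (1/det) * [[d, -b], [-b, a]].
    quad = (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad

# Hypothetical (height cm, weight kg) pairs; taller people here are heavier.
data = [(150.0, 50.0), (160.0, 58.0), (170.0, 66.0), (180.0, 80.0)]
mean, cov = fit_bivariate(data)
print(mean, cov[0][1] > 0)   # positive covariance: the two measurements are correlated
print(log_density((165.0, 63.0), mean, cov) > log_density((150.0, 80.0), mean, cov))
```

The off-diagonal covariance term is what lets the model treat a short, heavy point as less likely than a point that follows the height and weight trend.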

W3_L3: Introduction to bivariate Gaussian model | Gaussian mixture models

Welcome to Week 3 Lecture 3 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture introduces the concept of Gaussian Mixture Models (GMMs). It starts with a review of single and multivariate Gaussian distributions and maximum likelihood estimation. Then, it tackles the problem of modeling data that comes from multiple populations when the origin of each data point is unknown. Using the height of people in a park (adults and children) as an example, the lecture explains how to estimate the parameters of a mixture of Gaussians using an iterative process, laying the groundwork for understanding more complex models like Hidden Markov Models. #GaussianMixtureModel #GMM #MachineLearning #Statistics #DataModeling #UnsupervisedLearning #Clustering #ExpectationMaximization #EMAlgorithm #Probability #GaussianDistribution #MixtureModels #HiddenMarkovModel #HMM #ParameterEstimation #LikelihoodEstimation #DataAnalysis #PatternRecognition #ArtificialIntelligence #IITMadras #IITMadrasBS

W3_L4: Mixture of Gaussians introduction

Welcome to Week 3 Lecture 4 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture introduces statistical ideas in speech signal processing, focusing on Gaussian and Gaussian Mixture Models (GMMs). We’ll explore how to use these models to represent data, like height distributions, more efficiently than histograms. We also discuss how to handle data where labels are available and delve into the more complex scenario of estimating model parameters when class labels are missing, a common problem in many real-world applications. #SpeechSignalProcessing #Gaussian #GaussianMixtureModel #GMM #Statistics #Probability #Likelihood #MachineLearning #DataModeling #UnsupervisedLearning #MissingData #ParameterEstimation #IITMadras #IITMadrasBS

W3_L7: Vector quantization introduction

Welcome to Week 3 Lecture 7 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture explains Gaussian Mixture Models (GMMs) for modeling complex data distributions as a combination of multiple Gaussians. We explore how to estimate the parameters of these Gaussians when you don’t have labeled data. The core concept is the Expectation-Maximization (EM) algorithm, which uses an iterative approach to refine initial guesses and cluster the data. We also touch upon the connections to K-means clustering, showing how EM generalizes this idea for more flexible statistical modeling. #GaussianMixtureModel #GMM #EMAlgorithm #ExpectationMaximization #UnsupervisedLearning #Clustering #KMeans #StatisticalModeling #ParameterEstimation #DataModeling #MachineLearning #MixtureModels #IITMadras #IITMadrasBS
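
A minimal sketch of EM for a one-dimensional, two-component mixture follows. The unlabeled heights, the initial guesses, the iteration count, and the small variance floor are all assumptions made for illustration; a production implementation would also monitor the log-likelihood for convergence.

```python
import math

def gaussian_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(data, mus, variances, weights, n_iter=50):
    """EM for a 1-D Gaussian mixture, starting from the given initial guesses."""
    mus, variances, weights = list(mus), list(variances), list(weights)
    K = len(mus)
    for _ in range(n_iter):
        # E-step: soft assignment (responsibility) of each point to each component.
        resp = []
        for x in data:
            joint = [weights[k] * gaussian_pdf(x, mus[k], variances[k]) for k in range(K)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate parameters from responsibility-weighted data.
        for k in range(K):
            nk = sum(r[k] for r in resp)
            mus[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - mus[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)   # floor to avoid a collapsing component
            weights[k] = nk / len(data)
    return mus, variances, weights

# Hypothetical unlabeled heights in cm: a mix of children and adults.
heights = [110.0, 115.0, 120.0, 118.0, 112.0, 165.0, 170.0, 175.0, 168.0, 172.0]
mus, variances, weights = em_gmm_1d(heights, [100.0, 180.0], [100.0, 100.0], [0.5, 0.5])
print([round(m, 2) for m in mus], [round(w, 2) for w in weights])
```

Replacing the soft responsibilities with hard 0/1 assignments and ignoring variances and weights turns the same loop into K-means, which is the connection the lecture points to.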

W4_L1: Vector quantization | data representation, clustering & finite precision

Welcome to Week 4 Lecture 1 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview Introduction to Vector Quantization. This lecture explores the concept of vector quantization, its importance in representing numbers with finite precision in computers, and its applications in data representation and clustering. We’ll discuss how quantization naturally occurs in the real world and contrast Gaussian-based approaches with direct data manipulation techniques. #VectorQuantization #Quantization #DataRepresentation #Clustering #DataScience #MachineLearning #FinitePrecision #Algorithms #SpeechProcessing #IITMadras #IITMadrasBS

W3_L6: Parameter estimation of GMM

Welcome to Week 3 Lecture 6 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture introduces the concept of vector quantization in speech signal processing. We’ll explore how vector quantization provides a way to retain data in a representative form instead of using parametric models, focusing on the k-means and LBG algorithms for clustering. We also discuss Code-Excited Linear Prediction (CELP), the technology behind modern speech coding that makes voice communication clear and efficient, as a real-world application that leverages vector quantization. #speechprocessing #vectors #quantization #kmeans #LBG #CELP #linearprediction #clustering #voicerecognition #signals #VishnuAtal #codeexcitedlinearprediction #speechcoding #datarepresentation #centroid #ExpectationMaximization #ModelOrderEstimation #IITMadras #IITMadrasBS
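
Vector quantization with k-means can be sketched as: learn a small codebook of centroids, then represent each vector by the index of its nearest centroid. The 2-D points and the initial centroids below are invented for illustration, and this is plain k-means; the LBG codebook-splitting procedure is not shown.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, centroids, n_iter=20):
    """Plain k-means from given initial centroids; returns the final codebook."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [squared_distance(p, c) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c, members in zip(centroids, clusters):
            if members:
                updated.append(tuple(sum(m[d] for m in members) / len(members)
                                     for d in range(len(c))))
            else:
                updated.append(c)   # keep an empty cluster's centroid unchanged
        if updated == centroids:
            break
        centroids = updated
    return centroids

def quantize(p, codebook):
    """Return the index of the nearest codebook entry."""
    dists = [squared_distance(p, c) for c in codebook]
    return dists.index(min(dists))

points = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0),
          (10.0, 10.0), (10.0, 12.0), (12.0, 10.0), (12.0, 12.0)]
codebook = kmeans(points, [(0.0, 0.0), (12.0, 12.0)])
print(codebook, quantize((9.0, 9.0), codebook))
```

Storing or transmitting only the codebook index instead of the full vector is the compression idea that speech coders build on.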

W4_L4: Predicting weather sequence | sequence modeling & real-world applications

Welcome to Week 4 Lecture 4 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture introduces the concept of sequence modeling, highlighting its importance in various real-world applications where the order of events matters. We begin with a simple example of weather prediction using a Markov Chain to illustrate how historical data can be used to estimate the probability of future events based on the current state. This provides a foundation for understanding more complex sequence modeling techniques used in speech recognition, stock market analysis, and other fields. We’ll explore this framework, how these models are implemented, and the benefits of understanding how to best use them. #sequenceModeling #markovChain #weatherPrediction #probability #hiddenMarkovModel #HMM #speechRecognition #signalProcessing #timeSeries #machineLearning #dataAnalysis #patternRecognition #IITMadras #IITMadrasBS
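One way to picture how historical data turns into probabilities is to count how often each kind of day follows another. The sketch below does this for a made-up weather history; the history and the function name are assumptions for illustration, not data from the lecture.

```python
from collections import Counter, defaultdict

def estimate_transitions(history):
    """Estimate P(tomorrow | today) by counting consecutive pairs of days."""
    counts = defaultdict(Counter)
    for today, tomorrow in zip(history, history[1:]):
        counts[today][tomorrow] += 1
    # Normalise each row of counts so it sums to 1.
    return {
        today: {tomorrow: c / sum(row.values()) for tomorrow, c in row.items()}
        for today, row in counts.items()
    }

# Hypothetical record of eight consecutive days.
history = ["sunny", "sunny", "rainy", "sunny", "cloudy", "rainy", "rainy", "sunny"]
probs = estimate_transitions(history)
print(probs["rainy"])
```

With so little data the estimates are crude (for example, a cloudy day is always followed by rain here), which is why longer histories are needed in practice.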

W4_L3: Markov chain – example | sequential modeling & weather prediction

Welcome to Week 4 Lecture 3 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview Explore the fundamentals of sequential modeling with a focus on Markov Chains. This lecture explains Markov Chains using a weather prediction example, detailing the first-order Markovian assumption, state definitions, and transition matrices. Discover how historical data informs probabilistic predictions and how these concepts form the foundation of sequential data modeling in applications like speech processing. #sequentialmodeling #markovchain #hmm #speechprocessing #weatherprediction #statemodels #firstordermarkov #transitionmatrix #probability #machinelearning #sequentialdata #prediction #IITMadras #IITMadrasBS
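As a rough illustration of the first-order Markov assumption, the sketch below multiplies transition probabilities along a sequence of days. The three states and every number in the transition table are hypothetical values chosen for the example, not figures from the lecture.

```python
# Hypothetical transition table: outer key is today's weather, inner key is tomorrow's.
TRANSITION = {
    "sunny":  {"sunny": 0.7, "cloudy": 0.2, "rainy": 0.1},
    "cloudy": {"sunny": 0.3, "cloudy": 0.4, "rainy": 0.3},
    "rainy":  {"sunny": 0.2, "cloudy": 0.3, "rainy": 0.5},
}

def sequence_probability(today, future):
    """P(future days | today), where each day depends only on the day before it."""
    prob = 1.0
    current = today
    for nxt in future:
        prob *= TRANSITION[current][nxt]
        current = nxt
    return prob

# Probability that a sunny day is followed by sunny and then rainy: 0.7 * 0.1
p = sequence_probability("sunny", ["sunny", "rainy"])
print(p)
```

Each row of the table sums to 1 because tomorrow must be in exactly one of the states.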

W4_L2: Markov model | forecasting, greedy algorithm & viterbi

Welcome to Week 4 Lecture 2 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture explores the power of Markov Models for forecasting, specifically using a weather prediction example. We start with historical weather data and transition probabilities, then delve into single and multi-day forecasts. We contrast a simple, but often inaccurate, “greedy algorithm” with a more robust method of evaluating all possible weather sequences to determine the most probable forecast. Finally, we discuss the computational challenges of longer forecasts and introduce the Viterbi algorithm as an efficient solution used in numerous applications. #MarkovModel #WeatherForecasting #Probability #GreedyAlgorithm #ViterbiAlgorithm #MachineLearning #Algorithm #DataAnalysis #TimeSeries #Prediction #TransitionProbability #States #Forecast #HistoricalData #DataScience #DataDriven #OptimalPath #IITMadras #IITMadrasBS
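The gap between the greedy forecast and the exhaustive one can be seen in a small sketch. The transition probabilities below are hypothetical and were picked deliberately so that the greedy choice goes wrong; the function names are likewise illustrative.

```python
from itertools import product

STATES = ["sunny", "cloudy", "rainy"]
# Hypothetical transition table (today -> tomorrow), chosen so greedy is suboptimal.
TRANSITION = {
    "sunny":  {"sunny": 0.35, "cloudy": 0.40, "rainy": 0.25},
    "cloudy": {"sunny": 0.34, "cloudy": 0.33, "rainy": 0.33},
    "rainy":  {"sunny": 0.05, "cloudy": 0.05, "rainy": 0.90},
}

def path_probability(start, path):
    """Probability of a whole forecast, given the weather on the starting day."""
    prob, current = 1.0, start
    for nxt in path:
        prob *= TRANSITION[current][nxt]
        current = nxt
    return prob

def greedy_forecast(start, days):
    """Pick the single most likely next day, one day at a time."""
    path, current = [], start
    for _ in range(days):
        current = max(TRANSITION[current], key=TRANSITION[current].get)
        path.append(current)
    return path

def exhaustive_forecast(start, days):
    """Score every possible sequence and keep the most probable one."""
    return list(max(product(STATES, repeat=days),
                    key=lambda path: path_probability(start, path)))

greedy = greedy_forecast("sunny", 2)        # ['cloudy', 'sunny'], probability 0.136
best = exhaustive_forecast("sunny", 2)      # ['rainy', 'rainy'], probability 0.225
print(greedy, path_probability("sunny", greedy))
print(best, path_probability("sunny", best))
```

The exhaustive search scores 3^days sequences, which is the cost that motivates the Viterbi algorithm for longer forecasts.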

W4_L6: Viterbi algorithm introduction

Welcome to Week 4 Lecture 6 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture introduces the Viterbi algorithm, a dynamic programming approach for finding the most likely sequence of hidden states given a sequence of observations. It explains how the algorithm efficiently avoids redundant calculations inherent in brute-force methods by intelligently storing and reusing path probabilities at each time step. The lecture breaks down the core concepts, including states, time steps, transition probabilities, and the crucial “gamma” variable for tracking maximum probabilities. It highlights how Viterbi delivers the optimal solution without approximation, minimizing computational cost and energy consumption. #ViterbiAlgorithm #HiddenMarkovModel #DynamicProgramming #SequenceDecoding #StateEstimation #SignalProcessing #MachineLearning #Algorithms #Optimization #Probability #PathFinding #SpeechRecognition #ErrorCorrection #DataAnalysis #ComputationalEfficiency #MarkovModel #IITMadras #IITMadrasBS

W4_L6: Viterbi algorithm | intuitive explanation

Welcome to Week 4 Lecture 6 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture provides a clear and intuitive explanation of the Viterbi algorithm, a crucial technique used in many applications like speech recognition, DNA sequencing, and more. Learn how Viterbi efficiently finds the most probable sequence of states in a Hidden Markov Model (HMM) by avoiding the computational burden of brute-force calculations. We’ll cover the core idea of tracking only the maximum probabilities at each time step, along with the transition probabilities, to determine the optimal path. The lecture also illustrates how, at a given time instant and state, Viterbi looks back only at the maximum from the previous states, ensuring efficient decoding of sequences. #ViterbiAlgorithm #HiddenMarkovModel #HMM #SequenceModeling #SpeechRecognition #DynamicProgramming #ForwardBackwardAlgorithm #ConnectionistTemporalClassification #CTC #MachineLearning #ArtificialIntelligence #AI #Probability #TransitionProbability #States #MarkovModel #Algorithms #DeepLearning #SequenceToSequence #Prediction #Computation #IITMadras #IITMadrasBS

W5_L3: Efficiency of viterbi algorithm | hidden markov models, observations & weather example

Welcome to Week 5 Lecture 3 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture introduces the Hidden Markov Model (HMM), an extension of the Markov model that incorporates hidden or latent variables. Drawing parallels to Gaussian Mixture Models, it shows how HMMs can be used to model real-world scenarios where the underlying states are not directly observable. Using the “monk in a cave” analogy, we explore how observed data such as grass conditions can be combined with transition and observation probabilities to infer hidden states like the actual weather outside. The session further explains the difference between observation and transition probabilities, applies Bayes’ theorem to compute the likelihood of weather sequences, and highlights the computational challenges that arise as the sequence length grows. Finally, it demonstrates how efficient algorithms such as the Viterbi and Forward methods provide practical solutions for determining the most probable sequences in applications ranging from weather forecasting to speech recognition. #HiddenMarkovModel #HMM #MarkovModel #GaussianMixtureModel #LatentVariables #HiddenStates #ObservationProbability #TransitionProbability #ViterbiAlgorithm #ForwardAlgorithm #SequenceModeling #SpeechRecognition #MachineLearning #Probability #StatisticalModeling #IITMadras #IITMadrasBS

W5_L1: Hidden markov model | weather example, transition matrices & viterbi motivation

Welcome to Week 5 Lecture 1 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture reviews Markovian chains and Hidden Markov Models (HMMs) using the intuitive example of weather patterns. We’ll explore transition probabilities, transition matrices, and state diagrams to understand how to model sequential data where events depend on their immediate history. We then expand this to longer sequences and introduce the motivation for efficient algorithms such as the Viterbi algorithm. Learn how to calculate the probability of specific weather sequences and determine the most likely weather forecast for the next few days. #MarkovChain #HiddenMarkovModel #HMM #StatisticalModeling #SpeechSignalProcessing #WeatherForecasting #TransitionProbability #StateDiagram #ViterbiAlgorithm #SequentialData #Probability #MachineLearning #IITMadras #IITMadrasBS

W5_L2: Review of the weather prediction example

Welcome to Week 5 Lecture 2 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture provides a clear explanation of the Viterbi algorithm and its efficiency. It contrasts the computational complexity of the Viterbi algorithm with a brute-force approach to path evaluation, demonstrating how Viterbi drastically reduces the number of calculations needed to find the most probable sequence of hidden states in a Hidden Markov Model (HMM). Using a weather example (sunny, cloudy, rainy), the lecture explains step-by-step how Viterbi leverages previous calculations to avoid redundant computations, making it computationally feasible for large-scale problems like speech recognition. #Viterbi #Algorithm #HMM #HiddenMarkovModel #Probability #Path #Optimization #Sunny #Cloudy #Rainy #Weather #ComputationalComplexity #States #TransitionMatrix #SpeechRecognition #IITMadras #IITMadrasBS
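The size of the saving can be estimated with a little arithmetic. The sketch below compares the number of complete paths a brute-force search must score with a rough count of the look-back steps Viterbi performs; the counting convention (one look-back per pair of previous and current state, per time step after the first) and the function names are assumptions made for illustration.

```python
def brute_force_paths(num_states, num_steps):
    """Number of distinct state sequences a brute-force search must score."""
    return num_states ** num_steps

def viterbi_lookbacks(num_states, num_steps):
    """Rough work estimate: at each step after the first, every state
    looks back at every state of the previous step."""
    return num_states ** 2 * (num_steps - 1)

# Three weather states (sunny, cloudy, rainy) over a 10-day sequence.
paths = brute_force_paths(3, 10)      # 3**10 = 59049
lookbacks = viterbi_lookbacks(3, 10)  # 9 * 9 = 81
print(paths, lookbacks)
```

The brute-force count grows exponentially with the sequence length, while the Viterbi count grows only linearly.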

W6_L4: Review of limited vocabulary speech recognition

Welcome to Week 6 Lecture 4 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture introduces the concept of Hidden Markov Models (HMMs) and explains how they are used when the underlying states are not directly observable but can be inferred through related observations. Using the classic example of inferring weather conditions from grass conditions (dry, damp, soggy), we demonstrate how transition and observation probabilities are defined and applied. The session then shows how to determine the most likely sequence of hidden states, such as weather patterns, given a sequence of observations by applying the Viterbi algorithm with both transition and observation probabilities incorporated. #HiddenMarkovModel #HMM #MachineLearning #ArtificialIntelligence #ProbabilisticModels #ViterbiAlgorithm #ObservationProbability #TransitionProbability #StateSequence #HiddenStates #MarkovModel #DataScience #PatternRecognition #IITMadras #IITMadrasBS
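A minimal sketch of Viterbi decoding for the weather and grass setting might look like the following. The state and observation names follow the description, but every probability, the initial distribution, and the function name are hypothetical values chosen for the example.

```python
STATES = ["sunny", "cloudy", "rainy"]
# All numbers below are hypothetical, for illustration only.
INITIAL = {"sunny": 0.5, "cloudy": 0.3, "rainy": 0.2}
TRANSITION = {
    "sunny":  {"sunny": 0.7, "cloudy": 0.2, "rainy": 0.1},
    "cloudy": {"sunny": 0.3, "cloudy": 0.4, "rainy": 0.3},
    "rainy":  {"sunny": 0.2, "cloudy": 0.3, "rainy": 0.5},
}
EMISSION = {  # P(grass condition | weather)
    "sunny":  {"dry": 0.80, "damp": 0.15, "soggy": 0.05},
    "cloudy": {"dry": 0.40, "damp": 0.40, "soggy": 0.20},
    "rainy":  {"dry": 0.05, "damp": 0.35, "soggy": 0.60},
}

def viterbi(observations):
    """Return the most probable hidden weather sequence and its probability."""
    # best[s]: probability of the best path that ends in state s at the current step.
    best = {s: INITIAL[s] * EMISSION[s][observations[0]] for s in STATES}
    backpointers = []
    for obs in observations[1:]:
        new_best, pointers = {}, {}
        for s in STATES:
            # Look back only at the best predecessor of state s.
            prev = max(STATES, key=lambda p: best[p] * TRANSITION[p][s])
            new_best[s] = best[prev] * TRANSITION[prev][s] * EMISSION[s][obs]
            pointers[s] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace the stored predecessors back from the best final state.
    last = max(STATES, key=best.get)
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]

path, prob = viterbi(["dry", "damp", "soggy"])
print(path, prob)  # ['sunny', 'cloudy', 'rainy'] with probability of about 0.00576
```

The only difference from decoding a plain Markov chain is the extra emission factor multiplied in at each step.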

W6_L2: Review of a pattern classification problem

Welcome to Week 6 Lecture 2 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture explains the Forward Algorithm and contrasts it with the Viterbi Algorithm. While Viterbi identifies the single most likely sequence of hidden states, the Forward Algorithm calculates the total probability of observing data by considering all possible paths. The session highlights the difference between “soft alignment” and “hard alignment,” showing how the Forward Algorithm uses weighted probabilities from all data points for more efficient training, whereas Viterbi focuses on inference. A Gaussian Mixture Model (GMM) example is used to illustrate how these approaches differ in practice, setting the stage for deeper understanding of speech recognition models. #ForwardAlgorithm #ViterbiAlgorithm #HiddenMarkovModel #HMM #Probability #Algorithm #MachineLearning #DataScience #SoftAlignment #HardAlignment #GaussianMixtureModel #GMM #Training #Inferencing #SpeechProcessing #BestPath #TotalProbability #IITMadras #IITMadrasBS
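The contrast between the two algorithms comes down to one operation: the Forward Algorithm sums over predecessor states where Viterbi takes the maximum. The sketch below shows this on a two-state HMM; the states, observations, and all probabilities are hypothetical, and the shared helper function is an illustrative design, not the lecture's formulation.

```python
STATES = ["sunny", "rainy"]
# Hypothetical two-state HMM parameters, for illustration only.
INITIAL = {"sunny": 0.6, "rainy": 0.4}
TRANSITION = {
    "sunny": {"sunny": 0.7, "rainy": 0.3},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}
EMISSION = {
    "sunny": {"dry": 0.9, "soggy": 0.1},
    "rainy": {"dry": 0.2, "soggy": 0.8},
}

def run_recursion(observations, combine):
    """Shared recursion: `combine` is sum (Forward) or max (Viterbi score)."""
    scores = {s: INITIAL[s] * EMISSION[s][observations[0]] for s in STATES}
    for obs in observations[1:]:
        scores = {
            s: combine(scores[p] * TRANSITION[p][s] for p in STATES) * EMISSION[s][obs]
            for s in STATES
        }
    return combine(scores.values())

observations = ["dry", "soggy"]
total_probability = run_recursion(observations, sum)  # all paths: 0.209
best_path_score = run_recursion(observations, max)    # single best path: 0.1296
print(total_probability, best_path_score)
```

The best-path score can never exceed the total, since the best path is only one of the paths the Forward Algorithm adds up.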

W5_L4: Review of HMM weather & grass example

Welcome to Week 5 Lecture 4 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview In this lecture, we bridge the gap between statistical models such as Gaussian Mixture Models (GMMs) and Hidden Markov Models (HMMs) and their real-world applications in speech processing, with a particular focus on speech recognition or speech-to-text. The session revisits key concepts and demonstrates how they provide the foundation for converting speech signals into editable text, drawing an analogy with handwriting recognition. We also explore the role of phonemes and their frequency characteristics, highlighting how Mel-Frequency Cepstral Coefficients (MFCCs) and filter banks are applied to represent speech for recognition. This lecture provides an integrated overview of essential speech processing techniques and feature extraction methods that make automatic speech recognition systems possible. #SpeechRecognition #SpeechToText #ASR #HiddenMarkovModel #GaussianMixtureModel #GMM #HMM #Phonemes #MFCC #MelFrequencyCepstralCoefficients #FeatureExtraction #SpeechProcessing #SignalProcessing #PatternRecognition #AcousticModeling #VoiceRecognition #DeepLearning #ShortTimeProcessing #FourierTransform #MelFilterBank #Cepstrum #IITMadras #IITMadrasBS

W5_L5: Forward algorithm for HMM

Welcome to Week 5 Lecture 5 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture explores how Gaussian Mixture Models (GMMs) can be used for population classification tasks such as distinguishing between Indians, Norwegians, and Pygmies based on height and weight. A single Gaussian distribution cannot capture the variations within each group, so we employ mixtures of Gaussians to better model men, women, and children separately. Through this example, the lecture demonstrates the importance of feature selection, mixture modeling, and inference, highlighting how GMMs extend beyond simple distributions to solve real-world classification problems. #GaussianMixtureModel #GMM #PatternRecognition #Classification #MachineLearning #HeightWeight #DataScience #Indians #Norwegians #Pygmy #FeatureSelection #MixtureModel #Inference #IITMadras #IITMadrasBS

W6_L1: Review of speech feature extraction

Welcome to Week 6 Lecture 1 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture introduces the fundamentals of pattern classification for sound using a toy language with only two sounds, “R” and “E.” We explore how to build acoustic models with MFCC features and Gaussian Mixture Models (GMMs) to classify these sounds. The lecture then transitions to the complexities of sequence modeling in real speech, focusing on the difficulty of identifying individual phonemes within a continuous stream of sound. This discussion lays the foundation for understanding advanced speech recognition methods and the role of statistical modeling in real-world systems. #speechrecognition #patternclassification #acousticmodeling #mfcc #GMM #GaussianMixtureModel #phoneme #sequencemodeling #machinelearning #datascience #soundclassification #ArtificialIntelligence #AI #signalprocessing #featureextraction #IITMadras #IITMadrasBS

W6_L3: Limited vocabulary speech recognition

Welcome to Week 6 Lecture 3 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture explores the application of Gaussian Mixture Models (GMMs) and Hidden Markov Models (HMMs) to speech processing, with a focus on automatic speech recognition (ASR) or speech-to-text systems. We examine how these statistical models are used to build practical systems that convert spoken language into text, highlighting both the theoretical underpinnings and real-world challenges. The lecture also revisits concepts of pattern classification using Gaussians and Gaussian Mixture Models, showing how they provide the foundation for speech feature modeling in ASR frameworks. #speechprocessing #speechrecognition #ASR #speechtotext #GMM #GaussianMixtureModel #HMM #HiddenMarkovModel #PatternClassification #StatisticalModeling #FeatureExtraction #MelFilterbank #LogMelFilterbank #MFCC #AudioProcessing #SignalProcessing #IITMadras #IITMadrasBS

W6_L7: HMM applied to speech | limited vocabulary recognition with yes/no acoustic models

Welcome to Week 6 Lecture 7 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture dives into the complexities of speech recognition when working with a limited vocabulary. It focuses on how to handle sequences of sounds or phonemes within words such as “yes” and “no” when building acoustic models. The lecture introduces Hidden Markov Models (HMMs) as a powerful solution to the challenge of unknown phoneme boundaries and demonstrates how combining HMMs with Gaussian Mixture Models (GMMs) enables the construction of more robust speech recognition systems. #speechrecognition #HMM #HiddenMarkovModel #GMM #GaussianMixtureModel #Phoneme #Acoustics #FeatureExtraction #MFCC #MelFilterBank #StateModeling #SequenceLearning #MachineLearning #DeepLearning #AISpeech #YesNo #Vocabulary #AcousticModel #IITMadras #IITMadrasBS

W6_L5: Introduction to HMM GMM model for speech recognition | part 1

Welcome to Week 6 Lecture 5 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture explores the application of Gaussian Mixture Models (GMMs) and Hidden Markov Models (HMMs) to speech processing through the example of a simple “yes/no” word recognition system. Building on the previous discussion, it addresses the challenge of segmenting speech data into sound units. An iterative approach is introduced, beginning with uniform segmentation of utterances, constructing GMMs for each sound, and then refining the segmentation using these models. The lecture also highlights the limitations of isolated GMM-based methods, motivating the need for sequence modeling. Finally, it shows how HMMs can effectively overcome these issues, framing the recognition task as deciding whether “yes” or “no” was spoken based on observed test utterances. #GMM #HMM #SpeechProcessing #SpeechRecognition #GaussianMixtureModel #HiddenMarkovModel #FeatureExtraction #MFCC #LogMelFilterBank #MachineLearning #SequenceModeling #WordRecognition #PatternRecognition #AcousticModeling #ViterbiAlgorithm #BinaryClassification #YesNoRecognition #IITMadras #IITMadrasBS
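The first step of the iterative approach, uniform segmentation, can be sketched in a few lines: with no knowledge of where one sound ends and the next begins, the frames of an utterance are simply split evenly across its sound units. The frame count and the unit labels for "yes" below are hypothetical, and the later steps (training a GMM per sound and re-segmenting) are not shown.

```python
def uniform_segmentation(num_frames, units):
    """Assign each frame to a sound unit by splitting the utterance evenly."""
    return [units[t * len(units) // num_frames] for t in range(num_frames)]

# Hypothetical utterance of "yes": 7 feature frames, 3 sound units.
labels = uniform_segmentation(7, ["y", "eh", "s"])
print(labels)  # ['y', 'y', 'y', 'eh', 'eh', 's', 's']
```

These initial labels are almost certainly misaligned with the true boundaries, which is why the segmentation is refined once models for each sound are available.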

W6_L6: Introduction to HMM GMM model for speech recognition | part 2

Welcome to Week 6 Lecture 6 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture explains how Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs) are applied in speech recognition systems. It introduces the concept of hidden states, such as phonemes, and observable states like Mel-Frequency Cepstral Coefficients (MFCCs), and demonstrates how these models are trained using transcribed speech data. The lecture discusses how transition and observation probabilities are estimated, along with the importance of lexicons and dictionaries in mapping words to phonemes. It also provides a detailed explanation of the Viterbi algorithm and the Forward-Backward Algorithm for segmentation and model training, while introducing the left-to-right (Bakis) model structure commonly used in HMMs for speech recognition. #speechrecognition #HMM #GMM #HiddenMarkovModel #GaussianMixtureModel #Phonemes #MFCC #MelFrequencyCepstralCoefficients #Viterbi #ForwardBackwardAlgorithm #SpeechProcessing #ASR #Lexicon #Dictionary #TransitionProbability #ObservationProbability #BakisModel #LeftToRightModel #Phonetics #Audio #SignalProcessing #IITMadras #IITMadrasBS
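The left-to-right structure is easiest to see in the shape of the transition matrix: a state may repeat or hand over to the next state, but never go back. The sketch below builds such a matrix for the simplest case, with no skip transitions; the stay probability of 0.6 and the function name are hypothetical, since in a trained system these values are estimated from data.

```python
def left_to_right_transitions(num_states, stay_prob=0.6):
    """Build a transition matrix in which each state either repeats or
    advances to the next state; backward moves have probability zero."""
    matrix = [[0.0] * num_states for _ in range(num_states)]
    for i in range(num_states):
        if i + 1 < num_states:
            matrix[i][i] = stay_prob            # self-loop: the sound continues
            matrix[i][i + 1] = 1.0 - stay_prob  # move on to the next sound
        else:
            matrix[i][i] = 1.0                  # final state only repeats
    return matrix

A = left_to_right_transitions(3)
for row in A:
    print(row)
```

The zeros below the diagonal encode the fact that the sounds of a word occur in a fixed order in time.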

W7_L6: Introduction to deep neural network

Welcome to Week 7 Lecture 6 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture reviews the application of Gaussian Mixture Models (GMMs) and Hidden Markov Models (HMMs) in speech recognition. Revisiting the “yes/no” example, we illustrate supervised learning and demonstrate how speech sounds (phonemes) can be represented as hidden states in an HMM, while observable speech features such as Mel-filterbank coefficients are generated from these states. We then discuss how GMMs model the probability of observing features given a phoneme state and how the Viterbi algorithm identifies the most likely phoneme sequence for an input signal. Finally, we show how lexicons connect phoneme sequences to meaningful words, setting the stage for more advanced deep neural network approaches in speech processing. #SpeechRecognition #GMM #HMM #GaussianMixtureModel #HiddenMarkovModel #SpeechProcessing #SignalProcessing #Phoneme #ViterbiAlgorithm #MelFilterbank #AcousticModeling #SupervisedLearning #Lexicon #MachineLearning #LanguageModel #IITMadras #IITMadrasBS

W7_L2: Different types of HMM models | three-state phoneme, triphone & state tying

Welcome to Week 7 Lecture 2 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview Want to understand how speech recognition truly works? This lecture delves into how phonemes are modeled using Hidden Markov Models (HMMs) enhanced with Gaussian Mixture Models (GMMs). Discover the significance of three-state modeling for capturing phoneme transitions and the impact of context-dependent (triphone) models on recognition accuracy. The lecture also introduces state tying techniques to address data insufficiency and improve performance, laying the foundation for large-scale speech systems. #speechrecognition #phoneme #HMM #GMM #Triphone #StateTying #VoiceTechnology #SpeechProcessing #AcousticModeling #LanguageModeling #HiddenMarkovModel #GaussianMixtureModel #ContextDependent #IndianLanguages #Alexa #GoogleAssistant #ArtificialIntelligence #AI #MachineLearning #DeepLearning #IITMadras #IITMadrasBS

W6_L7: HMM applied to speech | yes/no word recognition with HMM-GMM acoustic models

Welcome to Week 6 Lecture 7 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture dives into the complexities of speech recognition when working with a limited vocabulary. It explores how to handle sequences of sounds (phonemes) within words like “yes” and “no” when building acoustic models. The session introduces Hidden Markov Models (HMMs) as a solution to the challenge of unknown phoneme boundaries, showing how integrating HMMs with Gaussian Mixture Models (GMMs) leads to more robust and practical speech recognition systems. #speechrecognition #HMM #HiddenMarkovModel #GMM #GaussianMixtureModel #Phoneme #Acoustics #FeatureExtraction #MFCC #MelFilterBank #StateModeling #SequenceLearning #MachineLearning #DeepLearning #AISpeech #YesNo #Vocabulary #AcousticModel #IITMadras #IITMadrasBS

W7_L3: Lexicon / pronunciation dictionary

Welcome to Week 7 Lecture 3 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture dives into the fascinating world of language models and their crucial role in automatic speech recognition (ASR) as well as broader natural language processing (NLP) applications. We explore how language models estimate the probability of word sequences to support tasks such as error correction, next-word prediction, and machine translation. Using the classic example of “recognize speech” vs. “wreck a nice beach,” the session illustrates how language models disambiguate similar-sounding phrases and significantly improve ASR accuracy. #LanguageModel #ASR #AutomaticSpeechRecognition #NLP #NaturalLanguageProcessing #SpeechRecognition #Probability #WordSequence #AcousticModel #LanguageModeling #MachineTranslation #NextWordPrediction #ErrorCorrection #IITMadras #IITMadrasBS

W7_L7: Deep learning basics | part 1 | n-gram language models & perplexity

Welcome to Week 7 Lecture 7 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture introduces the concept of N-gram language models for predicting word sequences. We begin with Unigrams, which model the probability of individual words, then extend to Bigrams that capture the probability of a word given its predecessor, and generalize further to N-grams. The session explains how these models are built by counting word occurrences in large text corpora, their ability to approximate natural language, and their limitations in capturing long-range dependencies. Finally, we discuss how language models are evaluated using perplexity, a key metric in natural language processing. #ngram #languageModel #unigram #bigram #trigram #wordSequence #probability #textMining #nlp #naturalLanguageProcessing #markovChain #perplexity #dataScience #machineLearning #IITMadras #IITMadrasBS
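
The counting procedure and the perplexity metric described above can be sketched as follows. This is an illustrative example rather than the lecture's code: the three-sentence corpus, the `<s>` and `</s>` boundary markers, and the choice of add-one (Laplace) smoothing are all assumptions made to keep the sketch short.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count unigrams and bigrams, padding each sentence with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_prob(prev, word, unigrams, bigrams):
    """Add-one smoothed estimate of P(word | prev)."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

def perplexity(sentence, unigrams, bigrams):
    """exp of the negative average log-probability per predicted token."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    pairs = list(zip(tokens, tokens[1:]))
    log_prob = sum(math.log(bigram_prob(p, w, unigrams, bigrams)) for p, w in pairs)
    return math.exp(-log_prob / len(pairs))

# Toy corpus (hypothetical); real models are counted over large text corpora.
corpus = ["recognize speech", "recognize speech today", "wreck a nice beach"]
unigrams, bigrams = train_bigram(corpus)

seen_order = perplexity("recognize speech", unigrams, bigrams)
odd_order = perplexity("speech recognize", unigrams, bigrams)
print(seen_order, odd_order)
```

A word order that matches the training text gets a lower perplexity than the reversed order, which is the sense in which lower perplexity indicates a better fit.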

W7_L5: N gram language model | history & evolution of neural networks in speech processing

Welcome to Week 7 Lecture 5 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture provides a historical overview and introduction to neural networks in speech processing. We trace the evolution of neural networks from their early foundations to their resurgence during the deep learning era. The session highlights key milestones, influential figures such as Hinton, Schmidhuber, and Alex Graves, and explains how methods transitioned from classical GMM-HMM frameworks to powerful neural network-based approaches, including CTC-based speech recognition. #NeuralNetworks #SpeechProcessing #DeepLearning #GMM #HMM #MachineLearning #ArtificialIntelligence #Hinton #Schmidhuber #AlexGraves #CTC #SpeechRecognition #EvolutionofAI #NeuralNetworkHistory #IITMadras #IITMadrasBS

W7_L4: Introduction to language model | artificial neurons & perceptrons in neural networks

Welcome to Week 7 Lecture 4 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture introduces the fundamental concept of an artificial neuron by drawing inspiration from its biological counterpart. It explains how inputs are assigned weights, combined linearly, and passed through a non-linear activation function to generate an output. We explore the power of non-linearity in neural networks, discuss the limitations of single-layer perceptrons, and highlight the importance of multi-layer architectures to solve non-linearly separable problems like XOR. The session also touches upon key contributions from pioneers like Rosenblatt and Minsky. #artificialneuron #neuron #neuralnetwork #perceptron #deeplearning #machinelearning #AI #artificialintelligence #nonlinearity #linearcombination #weightedinputs #activationfunction #XOR #patternrecognition #training #bias #dendrites #axon #Minsky #Rosenblatt #signalprocessing #decisionboundary #IITMadras #IITMadrasBS
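
The weighted-sum-plus-activation idea, and the reason XOR needs more than one layer, can be illustrated with a small sketch. The weights below are hand-picked for illustration and are an assumption, not values from the lecture; the step function stands in for the non-linear activation.

```python
def step(z):
    """Threshold activation: fire (1) when the weighted sum is positive."""
    return 1 if z > 0 else 0

def neuron(inputs, weights, bias):
    """Weighted linear combination of the inputs followed by a non-linearity."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def xor_two_layer(x1, x2):
    """XOR from two hidden neurons (OR and NAND) feeding an AND neuron."""
    h_or = neuron([x1, x2], [1, 1], -0.5)
    h_nand = neuron([x1, x2], [-1, -1], 1.5)
    return neuron([h_or, h_nand], [1, 1], -1.5)

truth_table = {(a, b): xor_two_layer(a, b) for a in (0, 1) for b in (0, 1)}
print(truth_table)
```

No single `neuron` call can reproduce this table, because one neuron draws a single straight decision boundary and XOR is not linearly separable; the hidden layer is what makes it solvable.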

W7_L8: Deep learning basics | part 2 | activation functions & non-linearities in neural networks

Welcome to Week 7 Lecture 8 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture provides a foundational understanding of non-linearities in neural networks, focusing on key activation functions such as sigmoid, tanh, and ReLU. We discuss why non-linearities are critical for representing complex decision boundaries, and explore the mathematical properties of each function, including their derivatives and practical implications. The lecture concludes with a real-world example of a neural network used to classify patients into diabetic and non-diabetic categories based on glucose level, body mass index, and insulin level, highlighting the importance of non-linear modeling in classification tasks. #neuralnetworks #deeplearning #nonlinearity #sigmoid #tanh #relu #activationfunction #machinelearning #artificialintelligence #math #derivatives #classification #diabetic #glucose #bodymassindex #insulin #feedforwardnetworks #training #weights #AI #IITMadras #IITMadrasBS
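
The three activation functions named above and their derivatives are short enough to write out directly. This is a plain-Python sketch using the standard textbook definitions; the function names are chosen here for illustration, and the naive `sigmoid` is not guarded against overflow for very large negative inputs.

```python
import math

def sigmoid(x):
    """Squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)); peaks at 0.25 when x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_grad(x):
    """d/dx tanh(x) = 1 - tanh(x)^2; peaks at 1 when x = 0."""
    return 1.0 - math.tanh(x) ** 2

def relu(x):
    """Passes positive inputs through unchanged and zeroes out the rest."""
    return max(0.0, x)

def relu_grad(x):
    """1 for positive inputs, 0 otherwise (0 at x = 0 is a convention)."""
    return 1.0 if x > 0 else 0.0

for x in (-2.0, 0.0, 2.0):
    print(x, sigmoid(x), sigmoid_grad(x), math.tanh(x), tanh_grad(x), relu(x), relu_grad(x))
```

The printed gradients show one practical implication: sigmoid and tanh gradients shrink as the input moves away from zero, while the ReLU gradient stays at 1 for any positive input.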

W8_L6: Introduction to CTC | part 2 | neural networks, softmax & automated classification

Welcome to Week 8 Lecture 6 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture delves into the fascinating world of neural networks, exploring how they can be trained to automate complex tasks such as diabetes diagnosis. We begin with the fundamentals of feed-forward neural networks, including input nodes, hidden layers, and output, before moving into the training process: initializing random weights, forward passes, loss functions (mean squared error and cross-entropy), and backpropagation for weight adjustment. The lecture also covers optimization techniques, iterative algorithms, and the challenges of local versus global minima. Finally, we highlight the importance of the Softmax function for multi-class classification problems, showing its applications in handwriting recognition, speech processing, and broader pattern recognition tasks. #NeuralNetworks #MachineLearning #DeepLearning #DiabetesDiagnosis #ArtificialIntelligence #TrainingData #Backpropagation #LossFunction #MeanSquaredError #CrossEntropy #Softmax #Classification #PatternRecognition #FeedForwardNetwork #WeightsAndBias #AutomatedSystems #DataScience #AI #AIinHealthcare #Healthcare #AutomatedDiagnosis #TrainingNeuralNetworks #HandwritingRecognition #GMM #HMM #IterativeAlgorithm #DecisionBoundary #HiddenLayer #InputLayer #OutputLayer #NonLinearity #Sigmoid #HyperbolicTangent #ReLU #Bias #GradientDescent #OptimizationAlgorithms #NeuralNetworkTraining #DataAnalysis #DataMining #Algorithm #IITMadras #IITMadrasBS
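
The training loop outlined in this description (random initial weights, forward pass, loss, gradient-based weight update) can be sketched in miniature. To keep it short, the sketch below trains a single sigmoid neuron on the logical OR function with cross-entropy loss, so the backward pass reduces to one chain-rule step; the data, learning rate, epoch count, and seed are illustrative assumptions, and a real multi-layer network would propagate gradients through every layer.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, epochs=2000, lr=0.5, seed=0):
    """Batch gradient descent for one sigmoid neuron with cross-entropy loss."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.5, 0.5) for _ in range(2)]  # random initialization
    bias = rng.uniform(-0.5, 0.5)
    losses = []
    for _ in range(epochs):
        grad_w, grad_b, loss = [0.0, 0.0], 0.0, 0.0
        for inputs, target in samples:
            # Forward pass: weighted sum, then the non-linearity.
            p = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
            loss -= target * math.log(p) + (1 - target) * math.log(1 - p)
            # Backward pass: for sigmoid + cross-entropy, dLoss/dz = p - target.
            error = p - target
            for i, x in enumerate(inputs):
                grad_w[i] += error * x
            grad_b += error
        n = len(samples)
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
        losses.append(loss / n)
    return weights, bias, losses

OR_DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias, losses = train(OR_DATA)
predictions = [round(sigmoid(sum(w * x for w, x in zip(weights, xs)) + bias)) for xs, _ in OR_DATA]
print(losses[0], losses[-1], predictions)
```

The loss recorded in the final epoch is lower than in the first, and the rounded outputs match the OR targets. This particular problem has a single minimum; the local-versus-global minima issue mentioned above arises once hidden layers are added.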

W8_L1: Feed forward neural network | part 1 | backpropagation, softmax & gradient descent

Welcome to Week 8 Lecture 1 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture provides a refresher on feedforward neural networks, emphasizing the importance of non-linearities such as sigmoid, ReLU, and tanh. We discuss how these networks can be applied to real-world classification problems, including cat/dog image recognition and diabetes diagnosis. The lecture introduces supervised learning concepts like backpropagation and weight adjustment, explores the Softmax function for probabilistic classification, and explains how gradient descent is used to minimize error. Finally, we connect these concepts to speech processing, showing how feedforward networks form the basis for advanced applications in speech recognition and signal modeling. #neuralnetworks #feedforward #deeplearning #machinelearning #classification #backpropagation #softmax #gradientdescent #speechprocessing #nonlinearity #tanh #relu #sigmoid #supervisedlearning #weights #training #error #optimization #AI #IITMadras #IITMadrasBS
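
The Softmax function mentioned above turns a vector of raw output scores into class probabilities, and cross-entropy then measures how much probability landed on the correct class. A minimal sketch, with made-up scores for a hypothetical three-class problem:

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow without changing the result
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[target_index])

scores = [2.0, 1.0, 0.1]  # hypothetical network outputs for three classes
probs = softmax(scores)
print(probs, cross_entropy(probs, 0), cross_entropy(probs, 2))
```

The loss is small when the correct class already has the highest probability and large otherwise, which is the error signal that gradient descent works to reduce.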

W9_L2: Recurrent neural networks

Welcome to Week 9 Lecture 2 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture dives into the application of Deep Neural Networks (DNNs) in speech recognition, specifically within hybrid models that combine DNNs with Hidden Markov Models (HMMs). We discuss the crucial concept of “alignment” for training DNNs to predict phonemes from Mel filter bank features and why frame-by-frame classification is limiting. To overcome these constraints, the lecture introduces Connectionist Temporal Classification (CTC), which focuses on accurate sequence prediction of phonemes rather than isolated frame decisions. This transition to sequence-based learning marks an important advancement in building robust, real-world speech recognition systems. #speechprocessing #speechrecognition #deeplearning #dnn #hmm #hybridmodel #ctc #connectionisttemporalclassification #alexgraves #phoneme #alignment #melfilterbank #acousticmodeling #neuralnetworks #sequencemodeling #IITMadras #IITMadrasBS

W8_L3: Revision of feed forward neural networks

Welcome to Week 8 Lecture 3 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture introduces Connectionist Temporal Classification (CTC) and its role in advancing speech recognition. We begin by contrasting traditional frame-by-frame alignment methods with CTC’s approach, which eliminates the need for explicit alignment and instead maximizes the probability of the correct phoneme or character sequence. The lecture explains how CTC improves sequence accuracy, particularly in phonemic languages, and how it enables end-to-end speech recognition by bypassing the requirement for a lexicon. These concepts lay the foundation for modern neural network-based speech recognition systems that directly map audio to text. #CTC #SpeechRecognition #DeepLearning #ConnectionistTemporalClassification #Phonemes #EndToEndSpeechRecognition #AutomaticSpeechRecognition #ASR #NeuralNetworks #MachineLearning #SequenceModeling #HiddenMarkovModel #HMM #PhoneticLanguages #Lexicon #AcousticModeling #IITMadras #IITMadrasBS
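
CTC can do without an explicit frame alignment because many different frame-level label strings map to the same output sequence: repeated labels are merged and a special blank symbol is then dropped. The sketch below shows only that collapsing rule, not the CTC loss itself; the underscore used as the blank symbol and the example frame strings are assumptions made for illustration.

```python
from itertools import groupby

BLANK = "_"  # illustrative choice of blank symbol

def ctc_collapse(frame_labels, blank=BLANK):
    """Merge consecutive repeats, then drop blanks."""
    merged = [label for label, _ in groupby(frame_labels)]
    return [label for label in merged if label != blank]

# Two different frame-level labelings that collapse to the same characters.
path_a = list("hh_e_ll_lo")
path_b = list("h_ee_l_llo")
print("".join(ctc_collapse(path_a)), "".join(ctc_collapse(path_b)))
```

The blank between the two runs of "l" is what keeps the double letter in the output; without it the repeats would merge into a single "l". CTC training sums the probability over every frame-level path that collapses to the correct sequence.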

W8_L5: Introduction to CTC | part 1

Welcome to Week 8 Lecture 5 of the course "Speech Technology" by Profs. S. Umesh and Hema A. Murthy. Full Course: https://study.iitm.ac.in/ds/course_pages/BSEE4001.html Video Overview This lecture explores the transition from traditional GMM-HMM based speech recognition systems to modern DNN-HMM hybrid models. We discuss how Deep Neural Networks (DNNs) replace Gaussian Mixture Models (GMMs) for acoustic modeling, providing more powerful feature representation. The session also introduces Connectionist Temporal Classification (CTC) as a method for direct speech-to-text or speech-to-character transcription, particularly useful in phonemic languages. Finally, we examine the limitations of frame-by-frame DNN decisions and explain why incorporating sequence-level information is essential for building robust speech recognition systems. #speechrecognition #ASR #automaticspeechrecognition #speechtotext #GMMHMM #DNNHMM #deeplearning #neuralnetworks #acousticmodeling #CTC #connectionisttemporalclassification #phonemes #characters #sequenceinformation #hybridmodel #melfilterbank #mfcc #HMM #DNN #phonemiclanguages #statisticalapproach #speechprocessing #IITMadras #IITMadrasBS