Speech recognition - AI Glossary

Automatic Speech Recognition (ASR)

Artificial intelligence system that automatically converts speech into written text, using acoustic and language models to transcribe audio signals.

Acoustic Model

Statistical or neural model that relates acoustic features extracted from the speech signal to phonetic units, so that the sounds being pronounced can be identified.

Phonetic Transcription

Symbolic representation of speech sounds using the International Phonetic Alphabet (IPA) or other phonetic notation systems for speech analysis and processing.

Hidden Markov Model (HMM)

Statistical sequence model used in traditional speech recognition to represent the temporal succession of phonemes as hidden states and to relate those states to acoustic observations.
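To illustrate how hidden states are recovered from observations, here is a minimal sketch of Viterbi decoding, the usual way to find the most likely state sequence in an HMM. The function name, the two states, and every probability in the usage note are hypothetical toy values, not taken from any real recognizer.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for a sequence of observations."""
    # best[t][s] = (probability of the best path ending in state s at time t,
    #               previous state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that maximizes the path probability into s.
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(column)

    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path
```

With two toy states, "sil" and "speech", that emit "low" or "high" energy, and assuming silence is the likelier start and both states tend to persist, the observation sequence low, low, high, high decodes to sil, sil, speech, speech. Real systems work with log-probabilities to avoid numerical underflow on long utterances.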

Mel-Frequency Cepstral Coefficients (MFCC)

Set of acoustic features extracted from the audio signal that compactly represent its short-term spectrum on the Mel scale, a frequency scale that approximates human pitch perception and is therefore well suited to speech recognition.
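The Mel warping at the heart of MFCCs can be sketched in a few lines. The example below assumes the commonly used formula mel = 2595 * log10(1 + f / 700); other variants exist, and the function names are illustrative. It covers only the frequency warping, not the full MFCC pipeline (framing, FFT, filterbank, logarithm, DCT).

```python
import math

def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (assumed formula, see lead-in)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_center_frequencies(n_filters, f_min, f_max):
    """Center frequencies (Hz) of n_filters bands evenly spaced on the Mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * k) for k in range(1, n_filters + 1)]
```

Bands that are evenly spaced in mels are narrow at low frequencies and progressively wider at high frequencies, which is what gives the features their perceptual weighting.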

Voice Activity Detection (VAD)

Algorithmic technique that automatically identifies and segments the portions of an audio signal that contain speech, as opposed to silence or background noise.
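One of the simplest forms of VAD thresholds the energy of short frames. The sketch below assumes samples normalized to the range [-1, 1]; the frame length of 160 samples (10 ms at 16 kHz) and the -35 dB threshold are arbitrary illustrative choices. Production detectors usually add smoothing or use a trained classifier.

```python
import math

def energy_vad(samples, frame_len=160, threshold_db=-35.0):
    """Return one boolean per full frame: True where the frame is likely speech."""
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        mean_square = sum(x * x for x in frame) / frame_len
        # Small constant avoids log10(0) on digital silence.
        level_db = 10.0 * math.log10(mean_square + 1e-12)
        flags.append(level_db > threshold_db)
    return flags
```

A fixed threshold like this one fails in noisy conditions, where background noise can carry as much energy as quiet speech.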

End-to-End Speech Recognition

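Modern approach that uses a single neural model to map audio (raw waveforms or acoustic features) directly to character, subword, or word sequences, replacing the separate acoustic model, pronunciation lexicon, and language model of traditional pipelines.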

Word Error Rate (WER)

Standard evaluation metric in speech recognition: the minimum number of word substitutions, insertions, and deletions separating the system output from a reference transcript, divided by the number of words in the reference.
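The computation is a word-level edit distance. The following is a minimal sketch that assumes whitespace tokenization and no text normalization (case, punctuation), which real evaluations usually apply first; the function name is illustrative.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / words in reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

For the reference "the cat sat on the mat" and the output "the cat sit on mat", there is one substitution and one deletion over six reference words, giving a WER of 2/6, about 0.33. Because insertions count as errors, WER can exceed 1.0.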

Connectionist Temporal Classification (CTC)

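Training criterion (loss function) that allows neural networks to learn a mapping from a long input sequence to a shorter output sequence without requiring a prior frame-level alignment between audio and text; it sums over all possible alignments, using a special blank symbol.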
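CTC relates a frame-level label sequence to the final text through a collapsing rule: merge consecutive repeats, then drop blanks. The sketch below shows only that rule, as used in greedy decoding; it does not implement the CTC loss itself, and the function name and the "_" blank symbol are assumptions made for illustration.

```python
def ctc_collapse(frame_labels, blank="_"):
    """Collapse a per-frame label sequence: merge repeats, then remove blanks."""
    output = []
    previous = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame and is not blank.
        if label != previous and label != blank:
            output.append(label)
        previous = label
    return "".join(output)
```

The per-frame sequence "hh_e_ll_lo_" collapses to "hello": the blank between the two runs of "l" is what allows a genuine double letter to survive the merging step.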

Speaker Diarization

Process of automatically partitioning an audio stream into speaker-homogeneous segments and attributing each segment to one of the speakers in the recording, answering the question "who spoke when".

Speech Enhancement

Set of signal processing techniques aimed at improving speech quality and intelligibility by reducing background noise and acoustic interference.

Pronunciation Lexicon

Database containing phonetic transcriptions of vocabulary words, essential for mapping recognized phoneme sequences to corresponding orthographic words.
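In its simplest form, such a lexicon is a lookup table. The sketch below uses a handful of toy entries written in ARPAbet-style phoneme symbols without stress markers; the entries and function name are illustrative assumptions, and real lexicons hold many thousands of words, often with several pronunciations per word.

```python
# Toy lexicon: word -> phoneme sequence (illustrative entries only).
LEXICON = {
    "cat": ["K", "AE", "T"],
    "speech": ["S", "P", "IY", "CH"],
    "text": ["T", "EH", "K", "S", "T"],
}

# Reverse index: phoneme sequence -> word, for mapping recognized phonemes to spelling.
PHONES_TO_WORD = {tuple(phones): word for word, phones in LEXICON.items()}

def lookup_word(phonemes):
    """Return the orthographic word for a phoneme sequence, or None if unknown."""
    return PHONES_TO_WORD.get(tuple(phonemes))
```

A reverse index this simple cannot represent homophones, where one phoneme sequence corresponds to several spellings; resolving those is typically left to the language model.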
