AI Glossary
The Complete Dictionary of Artificial Intelligence
MFCC (Mel-frequency Cepstral Coefficients)
Cepstral coefficients computed from a log-magnitude spectrum warped onto the Mel frequency scale, which approximates human auditory perception; widely used in speech recognition and music analysis.
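As a rough illustration of the final stage of MFCC extraction, the sketch below takes Mel-band energies that are assumed to be already computed, applies a logarithm, and decorrelates them with an unnormalized DCT-II. The function names, the `eps` floor, and the default of 13 coefficients are illustrative assumptions, not a reference implementation.

```python
import math

def dct_ii(values, n_coeffs):
    """Unnormalized DCT-II of a list, keeping the first n_coeffs terms."""
    n = len(values)
    return [
        sum(values[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
        for k in range(n_coeffs)
    ]

def mfcc_from_mel_energies(mel_energies, n_coeffs=13, eps=1e-10):
    """Log-compress Mel-band energies, then take the DCT to get cepstral coefficients."""
    # eps (an assumed floor) avoids log(0) for silent bands.
    log_energies = [math.log(e + eps) for e in mel_energies]
    return dct_ii(log_energies, n_coeffs)
```

With a perfectly flat set of band energies, only coefficient 0 is non-zero, which reflects that the higher coefficients describe the shape of the spectral envelope rather than its overall level.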
Chroma Features
12-dimensional representation that projects the audio spectrum onto the 12 pitch classes of the chromatic scale, independent of octave, ideal for harmonic analysis.
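The octave-independent projection can be sketched as follows, assuming equal temperament with A4 = 440 Hz and a spectrum supplied as parallel lists of bin frequencies and magnitudes. The names `pitch_class` and `chroma_vector` are hypothetical helpers chosen for this example.

```python
import math

def pitch_class(freq_hz, a4_hz=440.0):
    """Map a frequency to a pitch-class index (0 = C ... 11 = B)."""
    # Semitone distance from A4, rounded to the nearest equal-tempered note.
    semitones_from_a4 = round(12 * math.log2(freq_hz / a4_hz))
    # A is index 9 when C is index 0; modulo 12 discards the octave.
    return (semitones_from_a4 + 9) % 12

def chroma_vector(freqs_hz, magnitudes):
    """Accumulate spectral magnitude into 12 pitch-class bins."""
    chroma = [0.0] * 12
    for f, m in zip(freqs_hz, magnitudes):
        if f > 0:  # skip the DC bin, which has no pitch
            chroma[pitch_class(f)] += m
    return chroma
```

Because of the modulo operation, 220 Hz, 440 Hz and 880 Hz all contribute to the same bin (A).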
Spectrogram
Visual representation of the frequency spectrum of an audio signal over time, obtained by successively applying the Fourier transform on sliding windows.
Short-Time Fourier Transform (STFT)
Time-frequency analysis technique that divides the signal into short segments and applies the Fourier transform to each segment to capture spectral evolution.
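A minimal STFT sketch using only the standard library is shown here to make the windowing and per-segment transform concrete. The frame length, hop size, and the choice of a Hann window are assumptions for the example, and the naive DFT is used purely for readability; practical code would rely on an FFT routine.

```python
import cmath
import math

def stft(signal, frame_len=64, hop=32):
    """Return a list of frames, each a list of complex DFT bins 0..frame_len//2."""
    # A Hann window tapers each segment to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive O(N^2) DFT of the windowed segment.
        spectrum = [
            sum(segment[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len))
            for k in range(frame_len // 2 + 1)
        ]
        frames.append(spectrum)
    return frames
```

Taking `abs(bin) ** 2` of every entry yields the power values usually displayed as a spectrogram.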
Zero-Crossing Rate
Measure of the number of times the audio signal crosses the zero axis per unit time, a useful indicator for discriminating voiced from unvoiced sounds.
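One possible way to count zero crossings in a frame is sketched below. Treating a sample equal to zero as non-negative is an assumed convention, and the result is expressed per sample pair rather than per second.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:])
        if (a >= 0) != (b >= 0)  # zero is treated as non-negative
    )
    return crossings / (len(frame) - 1)
```

Multiplying the returned value by the sample rate gives an approximate number of crossings per second.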
Spectral Centroid
Magnitude-weighted mean frequency (center of gravity) of a signal's spectrum, indicating where most of the spectral energy is concentrated and commonly used to characterize perceived brightness.
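The weighted mean can be written in a few lines, assuming the spectrum of one frame is given as parallel lists of bin frequencies and magnitudes (the input layout is an assumption of this sketch).

```python
def spectral_centroid(freqs_hz, magnitudes):
    """Magnitude-weighted mean frequency of one spectral frame."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame: the centroid is undefined, 0 is returned by convention
    return sum(f * m for f, m in zip(freqs_hz, magnitudes)) / total
```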
Spectral Rolloff
Frequency below which a specified percentage (typically 85% or 95%) of the total spectral energy is found, an indicator of how strongly the spectrum is skewed toward low frequencies.
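A sketch of the rolloff computation follows, assuming frequencies are sorted in ascending order and energy is taken as squared magnitude. Some implementations accumulate magnitude instead of energy, so results may differ slightly between tools.

```python
def spectral_rolloff(freqs_hz, magnitudes, fraction=0.85):
    """Lowest bin frequency at which cumulative energy reaches `fraction` of the total."""
    energy = [m * m for m in magnitudes]
    threshold = fraction * sum(energy)
    cumulative = 0.0
    for f, e in zip(freqs_hz, energy):
        cumulative += e
        if cumulative >= threshold:
            return f
    # Only reached for empty input or floating-point edge cases.
    return freqs_hz[-1] if freqs_hz else 0.0
```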
Spectral Flux
Measure of the change in spectral distribution between successive frames, used to detect transitions and attacks in audio signals.
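One common formulation, used here as an assumption, takes the Euclidean distance between the magnitude spectra of consecutive frames. Other variants normalize the spectra first or keep only positive differences for onset detection.

```python
import math

def spectral_flux(frames):
    """Euclidean distance between each pair of successive magnitude spectra.

    `frames` is a list of equal-length magnitude lists; the result has
    one value per frame transition.
    """
    return [
        math.sqrt(sum((c - p) ** 2 for p, c in zip(prev, curr)))
        for prev, curr in zip(frames, frames[1:])
    ]
```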
Spectral Bandwidth
Magnitude-weighted standard deviation of the spectral distribution around the spectral centroid, quantifying how widely the signal's energy is spread in frequency.
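The spread around the centroid can be sketched as a weighted standard deviation, again assuming parallel lists of bin frequencies and magnitudes as input.

```python
import math

def spectral_bandwidth(freqs_hz, magnitudes):
    """Magnitude-weighted standard deviation of frequency around the centroid."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame
    centroid = sum(f * m for f, m in zip(freqs_hz, magnitudes)) / total
    variance = sum(m * (f - centroid) ** 2
                   for f, m in zip(freqs_hz, magnitudes)) / total
    return math.sqrt(variance)
```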
Spectral Contrast
Measure of the energy difference between spectral peaks and valleys in different frequency bands, distinguishing harmonic sounds from noise.
Tonnetz
Geometric representation of harmonic relationships in a 6D space based on thirds and fifths, capturing the tonal structure of music.
Tempogram
Time-tempo representation showing how the strength of each candidate tempo evolves over the course of an audio signal.
Wavelet Transform
Decomposition of the signal into wavelets of different scales and positions, offering better temporal resolution at high frequencies and better frequency resolution at low frequencies.
Constant-Q Transform
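Time-frequency transform in which every bin has the same Q factor (ratio of center frequency to bandwidth), giving geometrically spaced center frequencies and a logarithmic frequency resolution suitable for musical analysis.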
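The geometric bin spacing can be illustrated by computing center frequencies and the resulting Q value. The default minimum frequency (roughly the note C1) and the 12 bins per octave are assumptions chosen for the example, and Q is taken here as center frequency divided by the distance to the next bin.

```python
def cqt_center_frequencies(f_min=32.70, bins_per_octave=12, n_bins=48):
    """Geometrically spaced center frequencies: f_k = f_min * 2^(k / B)."""
    return [f_min * 2 ** (k / bins_per_octave) for k in range(n_bins)]

def q_factor(bins_per_octave=12):
    """Ratio of center frequency to bin spacing, identical for every bin."""
    return 1.0 / (2 ** (1.0 / bins_per_octave) - 1.0)
```

With 12 bins per octave each bin corresponds to one semitone, and the frequency doubles every 12 bins.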
Mel-Spectrogram
Spectrogram whose frequency axis is mapped to the Mel scale, which corresponds more closely to human perception of pitch.
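The frequency mapping can be sketched with one widely used Mel formula, 2595 * log10(1 + f / 700). Other variants of the scale exist, so the constants here are an assumption, as is the helper `mel_band_edges`, which only illustrates how filter edges are spaced.

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to Mel (one common variant of the scale)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10 ** (mel / 2595.0) - 1.0)

def mel_band_edges(f_low, f_high, n_bands):
    """Band edges equally spaced in Mel, returned in Hz (n_bands + 2 points)."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]
```

Edges that are evenly spaced in Mel become progressively wider in Hz, so the filterbank is finer at low frequencies than at high ones.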
Gammatone Filterbank
Filter bank modeling the response of the human cochlea, used to extract relevant psychoacoustic features.
Harmonic-Percussive Separation
Source separation technique that distinguishes harmonic components, which are stable over time and appear as horizontal ridges in a spectrogram, from percussive components, which are short in time but broad in frequency and appear as vertical ridges.
Pitch Tracking
Automatic estimation of the fundamental frequency (F0) of an audio signal over time, essential for melodic analysis.
Spectral Flatness
Ratio of the geometric mean to the arithmetic mean of the power spectrum, indicating whether the signal is tonal (close to 0) or noise-like (close to 1).
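The ratio can be computed as sketched below. The geometric mean is evaluated through logarithms for numerical stability, and the small `eps` floor is an assumption added to avoid taking the log of zero.

```python
import math

def spectral_flatness(power_spectrum, eps=1e-12):
    """Geometric mean divided by arithmetic mean of a power spectrum."""
    if not power_spectrum:
        return 0.0
    values = [p + eps for p in power_spectrum]  # eps avoids log(0)
    log_mean = sum(math.log(v) for v in values) / len(values)
    geometric_mean = math.exp(log_mean)
    arithmetic_mean = sum(values) / len(values)
    return geometric_mean / arithmetic_mean
```

A perfectly flat spectrum gives a value of about 1, while a spectrum dominated by a single bin gives a value near 0.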
Autocorrelation
Measure of similarity between a signal and its time-shifted versions, used to detect periodicity and estimate pitch.
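A simple pitch estimator based on this idea is sketched below: it computes the unnormalized autocorrelation and picks the lag with the highest value inside a plausible pitch range. The search range of 50 to 500 Hz and the function names are assumptions, and robust pitch trackers add normalization and peak interpolation that are omitted here.

```python
def autocorrelation(signal, max_lag):
    """Unnormalized autocorrelation for lags 0..max_lag."""
    n = len(signal)
    return [
        sum(signal[i] * signal[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]

def estimate_pitch(signal, sample_rate, f_min=50.0, f_max=500.0):
    """Estimate the fundamental frequency from the strongest autocorrelation peak."""
    min_lag = int(sample_rate / f_max)
    # Keep the longest lag shorter than the signal itself.
    max_lag = min(int(sample_rate / f_min), len(signal) - 1)
    r = autocorrelation(signal, max_lag)
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
    return sample_rate / best_lag
```

The best lag corresponds to one period of the signal in samples, so dividing the sample rate by it gives the frequency in Hz.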