Audio Feature Engineering - AI Glossary

MFCC (Mel-frequency Cepstral Coefficients)

Cepstral coefficients computed from a log-magnitude spectrum warped onto the Mel frequency scale, which approximates human pitch perception; widely used in speech recognition and music analysis.

Chroma Features

12-dimensional representation that projects the audio spectrum onto the 12 pitch classes of the chromatic scale, folding all octaves together; well suited to harmonic analysis.
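
As a rough illustration of this projection, the sketch below folds FFT bins into 12 pitch classes with NumPy. The function name `chroma_from_spectrum`, the 440 Hz reference for A4, and the 20 Hz cutoff are assumptions made for this example; established libraries generally use smoother filterbanks rather than hard bin assignment.

```python
import numpy as np

def chroma_from_spectrum(magnitudes, freqs, a4_hz=440.0):
    """Sum spectral energy into 12 pitch classes (index 0 = C, 9 = A)."""
    chroma = np.zeros(12)
    valid = freqs > 20.0  # assumed cutoff: skip DC and sub-audio bins
    # MIDI note number: 69 is A4, and each semitone is a factor of 2**(1/12).
    midi = 69 + 12 * np.log2(freqs[valid] / a4_hz)
    pitch_class = np.round(midi).astype(int) % 12
    # Accumulate power per pitch class, regardless of octave.
    np.add.at(chroma, pitch_class, magnitudes[valid] ** 2)
    return chroma / chroma.sum()

# Two tones one and two octaves apart (A3 = 220 Hz, A5 = 880 Hz).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 220 * t) + np.sin(2 * np.pi * 880 * t)
magnitudes = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
chroma = chroma_from_spectrum(magnitudes, freqs)
```

Both tones are the note A in different octaves, so nearly all of the energy should land in the same pitch class.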

Spectrogram

Visual representation of the frequency spectrum of an audio signal over time, obtained by applying the Fourier transform to successive sliding windows of the signal.

Short-Time Fourier Transform (STFT)

Time-frequency analysis technique that divides the signal into short segments and applies the Fourier transform to each segment to capture spectral evolution.
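
The segment-and-transform procedure can be sketched in a few lines of NumPy. The name `stft`, the Hann window, and the default frame and hop sizes are choices made for this example, not part of the definition.

```python
import numpy as np

def stft(signal, frame_length=1024, hop_length=256):
    """Return a complex array of shape (n_frames, frame_length // 2 + 1)."""
    window = np.hanning(frame_length)
    n_frames = 1 + (len(signal) - frame_length) // hop_length
    # Slice overlapping segments and taper each one before the FFT.
    frames = np.stack([
        signal[i * hop_length : i * hop_length + frame_length] * window
        for i in range(n_frames)
    ])
    return np.fft.rfft(frames, axis=1)

# One second of a 1000 Hz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
spec = stft(np.sin(2 * np.pi * 1000 * t))

# Strongest bin, averaged over all frames, converted to Hz.
peak_bin = int(np.abs(spec).mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 1024
```

Taking `np.abs(spec) ** 2` gives the values usually plotted as a spectrogram.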

Zero-Crossing Rate

Number of times the audio signal crosses the zero axis per unit time; a useful indicator for discriminating voiced sounds from unvoiced ones, which typically cross zero more often.
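
A minimal sketch of the measure, expressed here as the fraction of adjacent sample pairs whose signs differ. The function name and the test signals (a low tone standing in for a voiced sound, white noise for an unvoiced one) are assumptions for illustration.

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of consecutive sample pairs with opposite signs."""
    signs = np.signbit(frame)
    return float(np.mean(signs[1:] != signs[:-1]))

sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
rng = np.random.default_rng(0)

zcr_tone = zero_crossing_rate(np.sin(2 * np.pi * 100 * t))        # periodic, low pitch
zcr_noise = zero_crossing_rate(rng.standard_normal(sample_rate))  # noise-like
```

Multiplying the result by the sample rate gives crossings per second.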

Spectral Centroid

Magnitude-weighted mean frequency (center of gravity) of a signal's spectrum; it correlates with the perceived brightness of a sound.
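
The weighted mean can be computed directly from one FFT frame. This is a sketch assuming magnitude (not power) weighting and a Hann window; the function name is illustrative.

```python
import numpy as np

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of one frame, in Hz."""
    magnitudes = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    return float(np.sum(freqs * magnitudes) / np.sum(magnitudes))

sample_rate = 8000
t = np.arange(2048) / sample_rate
centroid_low = spectral_centroid(np.sin(2 * np.pi * 500 * t), sample_rate)
centroid_high = spectral_centroid(np.sin(2 * np.pi * 3000 * t), sample_rate)
```

For a pure tone the centroid sits close to the tone's frequency, so the 3000 Hz tone is the "brighter" of the two.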

Spectral Rolloff

Frequency below which a specified percentage (typically 85% or 95%) of the total spectral energy lies; an indicator of how strongly the energy is concentrated at low frequencies.
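
One way to sketch this is a cumulative sum over the power spectrum. The function name, the Hann window, and the `roll_percent` parameter name are assumptions for this example.

```python
import numpy as np

def spectral_rolloff(frame, sample_rate, roll_percent=0.85):
    """Lowest bin frequency at which cumulative energy reaches roll_percent."""
    power = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    cumulative = np.cumsum(power)
    # First bin where the running total reaches the target share of energy.
    idx = np.searchsorted(cumulative, roll_percent * cumulative[-1])
    return float(freqs[min(idx, len(freqs) - 1)])

# Two tones of equal amplitude: half the energy at 500 Hz, half at 3000 Hz.
sample_rate = 8000
t = np.arange(2048) / sample_rate
frame = np.sin(2 * np.pi * 500 * t) + np.sin(2 * np.pi * 3000 * t)

rolloff_85 = spectral_rolloff(frame, sample_rate, roll_percent=0.85)
rolloff_30 = spectral_rolloff(frame, sample_rate, roll_percent=0.30)
```

With this signal, the 30% threshold is reached inside the lower tone and the 85% threshold only inside the upper one.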

Spectral Flux

Measure of the change in spectral distribution between successive frames, used to detect transitions and attacks in audio signals.
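
A sketch of one common formulation, the Euclidean distance between the magnitude spectra of consecutive frames. Other variants exist (for example, keeping only positive differences); the function name and frame sizes are assumptions.

```python
import numpy as np

def spectral_flux(signal, frame_length=1024, hop_length=512):
    """L2 distance between magnitude spectra of successive frames."""
    window = np.hanning(frame_length)
    n_frames = 1 + (len(signal) - frame_length) // hop_length
    mags = np.array([
        np.abs(np.fft.rfft(
            signal[i * hop_length : i * hop_length + frame_length] * window))
        for i in range(n_frames)
    ])
    diff = np.diff(mags, axis=0)  # frame-to-frame change per frequency bin
    return np.sqrt(np.sum(diff ** 2, axis=1))

# Silence followed by a 1000 Hz tone that starts at sample 4096.
sample_rate = 8000
tone = np.sin(2 * np.pi * 1000 * np.arange(4096) / sample_rate)
signal = np.concatenate([np.zeros(4096), tone])
flux = spectral_flux(signal)
onset_index = int(np.argmax(flux))
```

The flux is zero during silence, spikes in the frames that straddle the onset, and falls back toward zero once the tone is steady.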

Spectral Bandwidth

Magnitude-weighted standard deviation of the spectrum around the spectral centroid, quantifying how widely the signal's energy is spread across frequency.
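
The spread can be sketched as a weighted standard deviation, treating the normalized magnitude spectrum as a distribution over frequency. The function name and the choice of magnitude weighting are assumptions for this example.

```python
import numpy as np

def spectral_bandwidth(frame, sample_rate):
    """Weighted standard deviation of frequency around the centroid, in Hz."""
    magnitudes = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    weights = magnitudes / np.sum(magnitudes)  # normalize to sum to 1
    centroid = np.sum(freqs * weights)
    return float(np.sqrt(np.sum(weights * (freqs - centroid) ** 2)))

sample_rate = 8000
t = np.arange(2048) / sample_rate
rng = np.random.default_rng(0)

bw_tone = spectral_bandwidth(np.sin(2 * np.pi * 500 * t), sample_rate)
bw_noise = spectral_bandwidth(rng.standard_normal(2048), sample_rate)
```

A pure tone concentrates its energy in a few bins and yields a small bandwidth, while white noise spreads energy over the whole band.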

Spectral Contrast

Measure of the energy difference between spectral peaks and valleys in different frequency bands, distinguishing harmonic sounds from noise.

Tonnetz

Geometric representation of harmonic relationships in a 6D space based on thirds and fifths, capturing the tonal structure of music.

Tempogram

Time-tempo representation, analogous to a spectrogram but with tempo in place of frequency, showing the evolution of predominant rhythmic speeds in an audio signal.

Wavelet Transform

Decomposition of the signal into wavelets of different scales and positions, offering finer temporal resolution at high frequencies and finer frequency resolution at low frequencies.

Constant-Q Transform

Time-frequency transform whose bins keep a constant ratio of center frequency to bandwidth (the Q factor), giving logarithmically spaced frequency bins suitable for musical analysis.

Mel-Spectrogram

Spectrogram with frequencies converted to the Mel scale, which corresponds more closely to human perception of pitch than a linear frequency axis.
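
A sketch of the conversion, assuming the widely used formula mel = 2595 * log10(1 + f / 700) and a bank of triangular filters. Several Mel-scale variants exist, and all function names and defaults here are illustrative.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + np.asarray(hz) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel) / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    # n_mels + 2 edge frequencies, equally spaced on the Mel scale.
    max_mel = float(hz_to_mel(sample_rate / 2))
    edges = mel_to_hz(np.linspace(0.0, max_mel, n_mels + 2))
    bank = np.zeros((n_mels, len(fft_freqs)))
    for m in range(n_mels):
        lower, center, upper = edges[m], edges[m + 1], edges[m + 2]
        rising = (fft_freqs - lower) / (center - lower)
        falling = (upper - fft_freqs) / (upper - center)
        bank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mel_spectrogram(power_spec, sample_rate, n_mels=40):
    """Map a power spectrogram (n_frames, n_fft // 2 + 1) to Mel bands."""
    n_fft = 2 * (power_spec.shape[1] - 1)
    return power_spec @ mel_filterbank(n_mels, n_fft, sample_rate).T

# Placeholder input: random values standing in for a real power spectrogram.
rng = np.random.default_rng(0)
power_spec = rng.random((5, 513))
bank = mel_filterbank(40, 1024, 16000)
mel_spec = mel_spectrogram(power_spec, 16000)
```

The filters are narrow at low frequencies and widen toward high frequencies, which is how the Mel warping is applied to a linear-frequency spectrogram.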

Gammatone Filterbank

Filter bank modeling the response of the human cochlea, used to extract relevant psychoacoustic features.

Harmonic-Percussive Separation

Source separation technique that distinguishes harmonic components, which are stable over time (horizontal ridges in a spectrogram), from percussive components, which are localized in time and spread across frequency (vertical ridges).

Pitch Tracking

Automatic estimation of the fundamental frequency (F0) of an audio signal over time, essential for melodic analysis.

Spectral Flatness

Ratio of the geometric mean to the arithmetic mean of the power spectrum, ranging from near 0 for tonal signals to near 1 for noise-like signals.
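
The ratio is short to compute from one frame. In this sketch the small constant `eps` (an assumption) guards against taking the logarithm of zero, and the function name is illustrative.

```python
import numpy as np

def spectral_flatness(frame, eps=1e-10):
    """Geometric mean over arithmetic mean of the power spectrum."""
    power = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2 + eps
    geometric_mean = np.exp(np.mean(np.log(power)))
    arithmetic_mean = np.mean(power)
    return float(geometric_mean / arithmetic_mean)

sample_rate = 8000
t = np.arange(2048) / sample_rate
rng = np.random.default_rng(0)

flatness_tone = spectral_flatness(np.sin(2 * np.pi * 500 * t))
flatness_noise = spectral_flatness(rng.standard_normal(2048))
```

The tone yields a value near 0; a single frame of white noise comes out well above that but below 1, because individual noise spectra fluctuate from bin to bin.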

Autocorrelation

Measure of similarity between a signal and its time-shifted versions, used to detect periodicity and estimate pitch.
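
As a sketch of the pitch-estimation use, the code below picks the lag with the strongest autocorrelation inside a plausible pitch range and converts it to a frequency. The function name and the 50 to 1000 Hz search range are assumptions; practical pitch trackers add normalization and peak interpolation.

```python
import numpy as np

def estimate_pitch_autocorr(frame, sample_rate, fmin=50.0, fmax=1000.0):
    """Estimate the fundamental frequency in Hz from the autocorrelation peak."""
    frame = frame - np.mean(frame)
    # Full autocorrelation; keep only the non-negative lags.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered
    best_lag = min_lag + int(np.argmax(acf[min_lag:max_lag + 1]))
    return sample_rate / best_lag

# A 200 Hz tone at 8 kHz has a period of 40 samples.
sample_rate = 8000
t = np.arange(2048) / sample_rate
pitch = estimate_pitch_autocorr(np.sin(2 * np.pi * 200 * t), sample_rate)
```

Restricting the lag range avoids the trivial peak at lag 0 and limits octave errors from multiples of the true period.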
