AI Glossary
Complete Artificial Intelligence Dictionary
Automatic Imputation
Technique that automatically replaces missing values in a dataset using statistical methods or predictive models. Automatic imputation adapts the replacement strategy according to the variable type and pattern of missing data.
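A minimal sketch of the simplest such strategy, mean imputation for a numeric column, using NumPy; the function name is illustrative, and a full system would also handle categorical columns (mode imputation) or use predictive models:

```python
import numpy as np

def impute_mean(column):
    """Fill NaNs in a 1-D numeric array with the mean of the observed values."""
    column = np.asarray(column, dtype=float)
    fill = np.nanmean(column)                 # mean ignores NaNs
    return np.where(np.isnan(column), fill, column)

filled = impute_mean([1.0, np.nan, 3.0])      # NaN replaced by mean of 1 and 3
```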
Automatic Normalization
Process that automatically adjusts the scale of numerical variables to bring them into a standardized range, typically between 0 and 1. This technique eliminates biases related to different measurement units and optimizes the convergence of learning algorithms.
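The [0, 1] rescaling described above is min-max normalization; a minimal sketch (function name illustrative, with a guard for constant columns):

```python
def min_max_normalize(values):
    """Rescale a numeric sequence linearly into the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:                          # constant column: map everything to 0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_normalize([10, 15, 20])  # -> [0.0, 0.5, 1.0]
```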
Automatic Categorical Encoding
Systematic method of automatically converting categorical variables into numerical representations suitable for machine learning algorithms. It selects and applies the most appropriate encoding technique based on the cardinality and nature of the categories.
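One way to sketch the cardinality-based choice: one-hot encode when there are few categories, fall back to ordinal codes otherwise. The function name and the `onehot_max` threshold are illustrative assumptions:

```python
def auto_encode(values, onehot_max=10):
    """One-hot encode low-cardinality columns, ordinal-encode the rest."""
    categories = sorted(set(values))
    if len(categories) <= onehot_max:
        return [[1 if v == c else 0 for c in categories] for v in values]
    index = {c: i for i, c in enumerate(categories)}   # ordinal fallback
    return [[index[v]] for v in values]

rows = auto_encode(["red", "blue", "red"])   # 2 categories -> one-hot vectors
```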
Automatic Outlier Detection
Algorithm that automatically identifies abnormal or extreme observations in a dataset using statistical methods or unsupervised learning. The detection dynamically adapts to multivariate distributions and specific data contexts.
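A classic statistical rule for this is Tukey's IQR fence; the sketch below uses crude quartiles without interpolation, which is enough to illustrate the idea:

```python
def iqr_outliers(values, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    xs = sorted(values)
    n = len(xs)
    q1, q3 = xs[n // 4], xs[(3 * n) // 4]   # crude quartiles, no interpolation
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

flagged = iqr_outliers([1, 2, 3, 4, 100])   # -> [100]
```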
Automated Data Pipeline
Orchestrated sequence of data transformations that executes automatically, taking raw data to the final modeling-ready format. This pipeline ensures reproducibility and continuous optimization of the preprocessing steps.
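At its core such a pipeline is ordered function composition; a minimal sketch with two hypothetical steps (drop missing values, then square each number):

```python
def make_pipeline(*steps):
    """Compose preprocessing steps into one callable, applied in order."""
    def run(data):
        for step in steps:
            data = step(data)
        return data
    return run

# Hypothetical two-step pipeline.
pipeline = make_pipeline(
    lambda xs: [x for x in xs if x is not None],   # drop missing values
    lambda xs: [x * x for x in xs],                # square the survivors
)
result = pipeline([1, None, 3])                    # -> [1, 9]
```

Running every row through the same composed callable is what gives the reproducibility the definition mentions.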
Automatic Feature Selection
Algorithmic process that automatically identifies and retains the most relevant variables for a given prediction problem. This technique uses importance metrics, statistical tests, or wrapper methods to optimize model performance.
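The simplest importance metric is column variance: a feature that barely varies cannot discriminate. A minimal variance-threshold sketch (names illustrative):

```python
import numpy as np

def select_by_variance(X, threshold=0.0):
    """Keep only the columns whose variance exceeds the threshold."""
    X = np.asarray(X, dtype=float)
    keep = np.var(X, axis=0) > threshold
    return X[:, keep], keep

X = np.array([[1.0, 7.0], [2.0, 7.0], [3.0, 7.0]])  # second column is constant
reduced, mask = select_by_variance(X)               # constant column dropped
```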
Automatic Logarithmic Transformation
Automatic application of logarithmic transformations to skewed variables to normalize their distribution. The algorithm detects variables requiring this transformation based on skewness and kurtosis measures.
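A sketch of the skewness gate: compute the third standardized moment and apply `log1p` only when the column is strongly right-skewed. The cutoff of 1.0 is an illustrative assumption:

```python
import numpy as np

def skewness(x):
    """Sample skewness (third standardized moment, population form)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return np.mean(centered ** 3) / np.std(x) ** 3

def auto_log(x, skew_cutoff=1.0):
    """Apply log1p only when the column is strongly right-skewed."""
    x = np.asarray(x, dtype=float)
    return np.log1p(x) if skewness(x) > skew_cutoff else x
```

A symmetric column like `[1, 2, 3]` passes through untouched; a heavy-tailed column like `[1, 1, 1, 1000]` gets compressed.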
Automatic Discretization
Process that automatically converts continuous variables into categorical variables by identifying optimal breakpoints. This technique uses methods like entropy-based binning or quantiles to maximize predictive power.
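A minimal sketch of the quantile approach: place the interior bin edges at equal-frequency quantiles, then assign each value to its bin:

```python
import numpy as np

def quantile_discretize(x, n_bins=3):
    """Assign each value to an equal-frequency bin via interior quantile edges."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(x, edges)

bins = quantile_discretize([1, 2, 3, 4, 5, 6], n_bins=3)  # -> bins 0, 1, 2
```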
Automatic Scaling
Automatic standardization of numerical features to eliminate scale differences between variables. The process adapts the scaling method according to the data distribution and target algorithm requirements.
Automatic Missing Values Handling
System that automatically analyzes missing-data patterns and applies the most appropriate processing strategy. This approach combines detection, classification, and adaptive imputation according to the missingness mechanism (MCAR, MAR, or MNAR).
Automatic Class Balancing
Technique that automatically adjusts the class distribution in imbalanced classification problems via oversampling, undersampling, or hybrid methods. The algorithm optimizes the bias-variance tradeoff to improve performance on minority classes.
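The simplest oversampling variant duplicates random minority-class rows until every class matches the majority count; a sketch (names illustrative, seeded for reproducibility):

```python
import random
from collections import Counter

def random_oversample(X, y, seed=0):
    """Duplicate random minority-class rows until all classes match the majority."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    Xb, yb = list(X), list(y)
    for label, n in counts.items():
        pool = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(target - n):
            i = rng.choice(pool)
            Xb.append(X[i])
            yb.append(label)
    return Xb, yb

Xb, yb = random_oversample([[1], [2], [3]], [0, 0, 1])  # class 1 duplicated once
```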
Automatic Dimensionality Reduction
Automatic application of techniques like PCA, t-SNE, or autoencoders to reduce the number of variables while preserving relevant information. The system selects the optimal method based on data structure and modeling objectives.
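Of the methods listed, PCA is compact enough to sketch directly: center the data and project onto the top-k right singular vectors:

```python
import numpy as np

def pca_reduce(X, k):
    """Project centered data onto its top-k principal components (via SVD)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

X = np.random.default_rng(0).normal(size=(20, 5))
Z = pca_reduce(X, k=2)     # 20 samples, reduced from 5 to 2 dimensions
```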
Automatic Feature Extraction
Automatic generation of informative features from raw data using deep learning algorithms or statistical methods. This transformation creates higher-level representations optimized for the target task.
Automatic Text Cleaning
Automated preprocessing pipeline applying normalization, tokenization, stop word removal, and stemming/lemmatization to text data. The process adapts according to the specific language and domain of documents.
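The first three stages can be sketched in a few lines; the tiny English stop-word set is purely illustrative, and stemming/lemmatization would require an NLP library:

```python
import re

STOPWORDS = {"the", "a", "an", "is", "are"}   # tiny illustrative list

def clean_text(text):
    """Lowercase, tokenize on letter runs, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

tokens = clean_text("The cat is on the mat!")   # -> ['cat', 'on', 'mat']
```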
Automatic Denoising
Process that automatically removes noise from data using filtering, smoothing, or unsupervised learning techniques. This method preserves the relevant signal while reducing artifacts that could harm modeling.
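The most basic smoothing filter is a centered moving average; a sketch (the window shrinks at the edges rather than padding):

```python
def moving_average(signal, window=3):
    """Smooth a 1-D signal with a centered moving average (edges use a shorter window)."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        chunk = signal[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out

smoothed = moving_average([0.0, 3.0, 0.0])   # the spike at index 1 is flattened
```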
Automatic Standardization
Automatic transformation of variables so that they have mean 0 and standard deviation 1 (z-scores); note that this rescales the distribution but does not make it normal. This technique identifies variables requiring standardization and applies the appropriate transformation.
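The z-score transformation itself is a two-line formula; a dependency-free sketch:

```python
def standardize(values):
    """Center to mean 0 and rescale to standard deviation 1 (z-scores)."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / std for v in values]

z = standardize([2.0, 4.0, 6.0])   # symmetric input: middle value maps to 0
```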
Automatic Feature Scaling
Adaptive process that automatically applies the most appropriate scaling technique (min-max, robust, quantile) based on the distribution of each variable. This optimization improves the convergence and performance of machine learning algorithms.