LightGBM - AI Glossary

Leaf-wise Growth

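Tree-growth strategy that always splits the leaf whose best split gives the largest reduction in loss, instead of splitting every leaf at the current depth as level-wise growth does. For the same number of leaves it typically reaches a lower loss, at the cost of deeper, less balanced trees.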
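
As a rough illustration of the idea, the sketch below grows a one-feature regression tree leaf-wise under squared error. It is a toy under stated assumptions, not LightGBM's implementation: the helper names `best_split` and `grow_leaf_wise` are hypothetical, and a heap keyed on split gain stands in for the "pick the best leaf" step.

```python
import heapq

def best_split(xs, ys):
    """Return (gain, threshold) of the best squared-error split, or None."""
    pairs = sorted(zip(xs, ys))
    n, total = len(pairs), sum(y for _, y in pairs)
    best = None
    left_sum = 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between equal feature values
        right_sum = total - left_sum
        # Reduction in the sum of squared errors obtained by splitting here.
        gain = left_sum**2 / i + right_sum**2 / (n - i) - total**2 / n
        if best is None or gain > best[0]:
            best = (gain, (pairs[i - 1][0] + pairs[i][0]) / 2)
    return best

def grow_leaf_wise(xs, ys, num_leaves):
    """Grow a 1-D regression tree leaf-wise; return the final leaves."""
    leaves = []   # leaves that cannot be split any further
    heap = []     # max-heap (negated gain) of splittable leaves
    counter = 0   # tie-breaker so the heap never compares lists

    def push(lx, ly):
        nonlocal counter
        split = best_split(lx, ly)
        if split is None or split[0] <= 1e-12:
            leaves.append((lx, ly))
        else:
            heapq.heappush(heap, (-split[0], counter, split[1], lx, ly))
            counter += 1

    push(list(xs), list(ys))
    while heap and len(leaves) + len(heap) < num_leaves:
        # Always split the leaf with the largest available gain.
        _, _, thr, lx, ly = heapq.heappop(heap)
        left = [(x, y) for x, y in zip(lx, ly) if x <= thr]
        right = [(x, y) for x, y in zip(lx, ly) if x > thr]
        push([x for x, _ in left], [y for _, y in left])
        push([x for x, _ in right], [y for _, y in right])
    leaves.extend((lx, ly) for _, _, _, lx, ly in heap)
    return leaves
```

A level-wise grower would instead split every leaf of the current depth before moving on, regardless of how little some of those splits gain.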

Feature Binning

Technique for discretizing continuous features into discrete intervals (bins) to speed up the calculation of split points and reduce memory footprint, at the cost of a slight loss in precision.
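
A minimal sketch of the discretization step, assuming simple equal-width bins (LightGBM's own bin-finding is more elaborate). The helper names are hypothetical; `max_bin` mirrors the name of the corresponding LightGBM parameter.

```python
from bisect import bisect_right

def make_bin_edges(values, max_bin):
    """Equal-width upper edges separating max_bin bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / max_bin
    return [lo + width * k for k in range(1, max_bin)]

def to_bins(values, edges):
    """Map each value to the index of the bin it falls into."""
    return [bisect_right(edges, v) for v in values]

values = [0.0, 0.4, 1.1, 2.5, 3.9, 4.0]
edges = make_bin_edges(values, max_bin=4)   # [1.0, 2.0, 3.0]
bins = to_bins(values, edges)               # [0, 0, 1, 2, 3, 3]
```

After binning, only the bin boundaries need to be considered as split points, and each value can be stored as a small integer instead of a float.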

Gradient-Based One-Side Sampling (GOSS)

Sampling method introduced by LightGBM that keeps all instances with large gradients and randomly samples those with small gradients, up-weighting the sampled ones so that the estimated split gain stays approximately unbiased. This speeds up training without significant loss of accuracy.
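
The sampling rule can be sketched in a few lines. This is an illustrative version, not LightGBM's code: `goss_sample` is a hypothetical helper, while `top_rate` and `other_rate` follow the names of the corresponding LightGBM parameters. The weight `(1 - top_rate) / other_rate` is the usual compensation factor for the sampled small-gradient rows.

```python
import random

def goss_sample(gradients, top_rate, other_rate, seed=0):
    """Return (indices, weights) chosen by a GOSS-style sampler."""
    n = len(gradients)
    # Rank rows by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)
    top, rest = order[:n_top], order[n_top:]
    # Keep every large-gradient row, sample the small-gradient rows.
    sampled = random.Random(seed).sample(rest, min(n_other, len(rest)))
    # Up-weight the sampled rows so their total contribution still
    # approximates that of all small-gradient rows.
    amplify = (1.0 - top_rate) / other_rate
    weights = [1.0] * len(top) + [amplify] * len(sampled)
    return top + sampled, weights
```

With `top_rate=0.2` and `other_rate=0.5`, each sampled small-gradient row counts 1.6 times when gradient statistics are accumulated.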

Exclusive Feature Bundling (EFB)

Dimensionality reduction algorithm that identifies and groups mutually exclusive features (rarely non-zero at the same time) into a single composite feature, thus reducing the number of features.
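
To make the bundling step concrete, here is a small sketch that merges two already-binned features, assuming bin 0 means "zero" and that the two features never conflict. The function name and the strict no-conflict check are assumptions made for illustration; the real algorithm also decides which features to bundle and tolerates a small conflict rate.

```python
def bundle_exclusive(feature_a, feature_b, bins_a):
    """Merge two mutually exclusive binned features into one column.

    bins_a is the largest bin index used by feature_a. Bins of
    feature_b are shifted by bins_a so both original features stay
    distinguishable inside the bundle.
    """
    bundled = []
    for a, b in zip(feature_a, feature_b):
        if a != 0 and b != 0:
            raise ValueError("features conflict: both non-zero in one row")
        if a != 0:
            bundled.append(a)            # bins 1..bins_a belong to feature_a
        elif b != 0:
            bundled.append(b + bins_a)   # shifted bins belong to feature_b
        else:
            bundled.append(0)            # both features are zero
    return bundled
```

For example, `bundle_exclusive([1, 0, 3, 0], [0, 2, 0, 0], bins_a=3)` gives `[1, 5, 3, 0]`: values 1 to 3 come from the first feature and values above 3 from the second.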

Gradient Histogram

Data structure used by LightGBM to store gradients and hessians in bins, allowing for fast calculation of statistics for each potential split point during tree construction.
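
The following sketch shows how such a histogram can drive split finding. It assumes the standard second-order gain formula used in gradient boosting and leaves out most details of a real implementation; both function names are hypothetical.

```python
def build_histogram(bin_ids, grads, hessians, num_bins):
    """Accumulate per-bin sums of gradients and hessians."""
    hist_g = [0.0] * num_bins
    hist_h = [0.0] * num_bins
    for b, g, h in zip(bin_ids, grads, hessians):
        hist_g[b] += g
        hist_h[b] += h
    return hist_g, hist_h

def best_split_bin(hist_g, hist_h, lambda_l2=0.0):
    """Scan bin boundaries; return (gain, last_bin_on_left_side)."""
    total_g, total_h = sum(hist_g), sum(hist_h)

    def score(g, h):
        return g * g / (h + lambda_l2)

    best = (0.0, None)
    left_g = left_h = 0.0
    for b in range(len(hist_g) - 1):
        left_g += hist_g[b]
        left_h += hist_h[b]
        right_g, right_h = total_g - left_g, total_h - left_h
        if left_h <= 0 or right_h <= 0:
            continue  # an empty side is not a valid split
        gain = (score(left_g, left_h) + score(right_g, right_h)
                - score(total_g, total_h))
        if gain > best[0]:
            best = (gain, b)
    return best
```

Building the histogram costs one pass over the rows, after which every candidate split is evaluated from running sums over the bins rather than over the data.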

Num Leaves

Main parameter of LightGBM that controls the maximum number of leaves in each tree, directly influencing model complexity and the bias-variance tradeoff; more important than `max_depth` for leaf-wise growth.

L1 and L2 Regularization

Regularization parameters (`lambda_l1`, `lambda_l2`) applied to leaf weights to control model complexity and prevent overfitting: L1 penalizes the absolute value of the weights and can shrink them to exactly zero, while L2 penalizes their squared magnitude and shrinks them smoothly.
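
The effect of the two penalties on a single leaf can be sketched with the usual closed-form leaf output of second-order boosting. The function name is hypothetical and the formula is the textbook one, given here as an illustration rather than a copy of LightGBM's code.

```python
def leaf_weight(sum_grad, sum_hess, lambda_l1=0.0, lambda_l2=0.0):
    """Regularized leaf output for second-order boosting."""
    # L1: soft-threshold the gradient sum, which can zero the leaf out.
    shrunk = max(abs(sum_grad) - lambda_l1, 0.0)
    if sum_grad < 0:
        shrunk = -shrunk
    # L2: enlarge the denominator, which shrinks the weight smoothly.
    return -shrunk / (sum_hess + lambda_l2)
```

With a gradient sum of -4 and a hessian sum of 2, the unregularized weight is 2.0; `lambda_l2=2` halves it to 1.0, and `lambda_l1=1` reduces it to 1.5.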

Min Data in Leaf

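Minimum number of samples required in a leaf (`min_data_in_leaf`), a key parameter for avoiding overly specific leaves and combating overfitting in LightGBM models. The companion parameter `min_sum_hessian_in_leaf` sets a minimum total hessian per leaf instead of a sample count.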

Categorical Feature Handling

LightGBM's ability to natively handle categorical features: categories are supplied as integer codes, and splits are found by ordering the categories by their gradient statistics and partitioning them into two groups. This avoids manual one-hot encoding and improves efficiency.
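
One way to picture a native categorical split is to order the categories by their gradient-to-hessian ratio and then scan that ordering as if it were a numeric feature. The sketch below does exactly that; the function name and the `stats` layout are assumptions for illustration, and the smoothing and category-count limits of a production implementation are omitted.

```python
def best_categorical_split(stats, lambda_l2=0.0):
    """stats maps category code -> (sum_grad, sum_hess).

    Returns (gain, set_of_categories_sent_left).
    """
    # Order categories by their gradient/hessian ratio ...
    ordered = sorted(stats, key=lambda c: stats[c][0] / stats[c][1])
    total_g = sum(g for g, _ in stats.values())
    total_h = sum(h for _, h in stats.values())

    def score(g, h):
        return g * g / (h + lambda_l2)

    best = (0.0, set())
    left_g = left_h = 0.0
    # ... then scan prefixes of that ordering like a numeric feature.
    for k, c in enumerate(ordered[:-1]):
        left_g += stats[c][0]
        left_h += stats[c][1]
        gain = (score(left_g, left_h)
                + score(total_g - left_g, total_h - left_h)
                - score(total_g, total_h))
        if gain > best[0]:
            best = (gain, set(ordered[:k + 1]))
    return best
```

A single split of this kind can separate an arbitrary group of categories from the rest, which one-hot encoding can only achieve through several consecutive splits.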

Leaf-wise Growth Overfitting

Specific risk of leaf-wise growth: the model can overfit by building very deep trees with highly specialized leaves. It is controlled by constraining the tree, e.g., lowering `num_leaves`, raising `min_data_in_leaf`, or capping `max_depth`.
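
A parameter set aimed at limiting this risk might look like the dictionary below. The keys are LightGBM parameter names; the values are illustrative assumptions, not recommendations, and should be tuned per dataset.

```python
params = {
    "objective": "regression",
    "max_depth": 6,            # hard cap on tree depth
    "num_leaves": 31,          # kept below 2 ** max_depth (64 here)
    "min_data_in_leaf": 50,    # refuse leaves with fewer samples
    "lambda_l1": 0.1,          # L1 penalty on leaf weights
    "lambda_l2": 1.0,          # L2 penalty on leaf weights
}
```

Such a dictionary would typically be passed as the first argument to `lightgbm.train` together with a training `Dataset`.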

DART (Dropouts meet Multiple Additive Regression Trees)

Boosting variant implemented in LightGBM that applies dropout to the ensemble: a random subset of the previously built trees is temporarily dropped when each new tree is fitted, which improves regularization and can improve performance on certain datasets.
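
One round of this procedure can be sketched as follows, assuming the normalization commonly described for DART (the new tree is scaled by 1/(k+1) and the k dropped trees by k/(k+1)) and ignoring the learning rate. `dart_step` and the `fit_tree` callback are hypothetical names used only for this illustration.

```python
import random

def dart_step(trees, fit_tree, drop_rate, rng):
    """One DART-style boosting round; returns the number of dropped trees.

    trees: list of [scale, predict_fn] pairs, mutated in place.
    fit_tree: hypothetical callback that fits a new tree to the
        residual of the ensemble formed by the kept trees only.
    """
    # Decide independently for each existing tree whether to drop it.
    flags = [rng.random() < drop_rate for _ in trees]
    dropped = [t for t, f in zip(trees, flags) if f]
    kept = [t for t, f in zip(trees, flags) if not f]
    new_predict = fit_tree(kept)
    k = len(dropped)
    # Rescale so the new tree plus the dropped trees together
    # contribute roughly what the dropped trees did alone.
    for t in dropped:
        t[0] *= k / (k + 1)
    trees.append([1.0 / (k + 1), new_predict])
    return k
```

When no tree is dropped (k = 0), the round reduces to an ordinary boosting step with the new tree added at full scale.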
