Gradient-based Neural Architecture Search

Terms

DARTS

Differentiable Architecture Search: a pioneering method that relaxes the discrete architecture search problem into a continuous optimization problem by assigning differentiable architecture weights to the candidate operations.

Relaxed architecture

A continuous representation of a discrete architecture space where candidate operations are combined with softmax weights, allowing optimization via gradient descent.

Architecture weights

Continuous parameters (often denoted alpha) that determine the relative importance of each candidate operation in the relaxed architecture and are optimized by gradient descent.

Mixed operations

A weighted combination of several candidate operations (convolution, pooling, etc.) in a relaxed architecture, where the weights determine the contribution of each operation.
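
A minimal sketch of a mixed operation, assuming scalar inputs and three toy candidate operations. The names `softmax`, `CANDIDATE_OPS`, and `mixed_op` are hypothetical and chosen only for illustration; real search spaces use convolutions, pooling, and similar layers instead.

```python
import math

def softmax(alphas):
    """Turn raw architecture weights into mixing weights that sum to 1."""
    m = max(alphas)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate operations on a scalar input (illustration only).
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}

def mixed_op(x, alphas):
    """Softmax-weighted sum of all candidate operations applied to x."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))

# Equal alphas give every operation the same contribution.
print(mixed_op(3.0, [0.0, 0.0, 0.0]))
```

As one alpha grows relative to the others, the output approaches that of the corresponding single operation, which is what makes the later discretization step plausible.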

Bi-level optimization

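A two-level optimization problem in which the network weights are optimized at the lower level (on the training loss) and the architecture parameters at the upper level (on the validation loss). The exact upper-level gradient involves second-order derivatives, which are usually approximated in practice.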
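
Written out, the two levels can be stated as follows. The notation is an assumption introduced here and does not appear in the glossary itself: $w$ denotes the network weights, $\alpha$ the architecture parameters, and $\mathcal{L}_{\text{train}}$, $\mathcal{L}_{\text{val}}$ the training and validation losses.

```latex
\min_{\alpha} \; \mathcal{L}_{\text{val}}\bigl(w^{*}(\alpha),\, \alpha\bigr)
\quad \text{s.t.} \quad
w^{*}(\alpha) = \arg\min_{w} \; \mathcal{L}_{\text{train}}(w,\, \alpha)
```

The dependence of $w^{*}$ on $\alpha$ is what introduces second-order terms when the upper-level objective is differentiated with respect to $\alpha$.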

Computational cell

A basic, repeatable block in the network architecture whose internal structure (connections and operations) is automatically discovered by NAS.

Architecture discretization

The final step in differentiable NAS, in which the relaxed continuous architecture is converted into a discrete architecture by selecting the operation with the highest alpha weight for each connection.
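
The selection rule can be sketched as an argmax per connection. The operation names, the edge keys, and the choice to exclude a "none" operation from the argmax are assumptions made for this example, not part of the definition above.

```python
def discretize(alphas, op_names, skip_op="none"):
    """For each edge, keep the operation with the largest alpha.

    alphas:   dict mapping an edge (from_node, to_node) to a list of alpha values
    op_names: operation names, in the same order as the alpha values
    skip_op:  operation excluded from selection (an assumption of this sketch)
    """
    chosen = {}
    for edge, edge_alphas in alphas.items():
        candidates = [
            (alpha, name)
            for alpha, name in zip(edge_alphas, op_names)
            if name != skip_op
        ]
        best_alpha, best_name = max(candidates, key=lambda pair: pair[0])
        chosen[edge] = best_name
    return chosen

# Hypothetical alpha values for two edges of a cell.
op_names = ["none", "skip_connect", "sep_conv_3x3", "max_pool_3x3"]
alphas = {
    (0, 2): [0.9, 0.1, 0.7, 0.2],
    (1, 2): [0.0, 0.5, 0.3, 0.4],
}
print(discretize(alphas, op_names))
```

Because softmax is monotonic, taking the argmax over raw alpha values selects the same operation as taking it over the softmax weights.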

Architecture gradient

The gradient of the validation loss with respect to the architecture weights, used to update the network structure during the architecture search.

Supercell

A basic structure larger than a simple cell, containing several interconnected cells to increase the complexity and expressiveness of the search space.

Path pruning

A technique that progressively removes the less important architecture paths, ranked by their architecture weights, to reduce computational cost during the search.
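
A rough sketch of progressive pruning on a single connection, assuming a fixed schedule of how many operations survive each stage. The function name, the operation names, and the schedule are hypothetical; in an actual search the alpha values would be trained further between stages rather than held fixed as they are here.

```python
def prune_weakest(active_ops, alphas, keep):
    """Return the `keep` active operations with the largest alpha values."""
    ranked = sorted(active_ops, key=lambda name: alphas[name], reverse=True)
    return ranked[:keep]

# Hypothetical alpha values for the candidate operations on one connection.
alphas = {"conv_3x3": 1.2, "conv_5x5": 0.4, "max_pool": -0.3, "skip": 0.9}

active = list(alphas)
history = []
for keep in (3, 2, 1):  # assumed pruning schedule: 4 -> 3 -> 2 -> 1 operations
    active = prune_weakest(active, alphas, keep)
    history.append(list(active))

print(history)
```

Each stage evaluates fewer candidate operations per forward pass, which is where the savings in computation come from.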

Differentiable skip connections

Skip connections with learnable weights in the relaxed architecture, allowing the model to dynamically decide whether to use these connections.

Continuous search space

Relaxation of the discrete search space into a continuous domain where each possible architecture corresponds to a point in this continuous space.

Alpha parameters

Continuous variables in differentiable NAS that control the mixing of operations on each connection and are optimized to find the best architecture.

Joint optimization

Optimization of both the network weights and the architecture parameters within a single search run. In differentiable NAS methods the two sets of parameters are typically updated in alternating steps rather than truly simultaneously.
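
The alternating update order can be illustrated on a toy problem with one scalar weight `w` and one scalar architecture parameter `alpha`. The quadratic losses below are hypothetical and were chosen only so that the gradients can be written by hand; the architecture step uses a first-order gradient, that is, `w` is held fixed while differentiating the validation loss.

```python
def train_loss_grad_w(w, alpha):
    """d/dw of the toy training loss (w - alpha)**2."""
    return 2.0 * (w - alpha)

def val_loss_grad_alpha(w, alpha):
    """d/dalpha of the toy validation loss (w - 1)**2 + (alpha - 1)**2,
    with w treated as a constant (first-order gradient)."""
    return 2.0 * (alpha - 1.0)

def alternate(steps, lr_w=0.1, lr_alpha=0.1):
    """Alternate one weight step and one architecture step per iteration."""
    w, alpha = 0.0, 0.0
    for _ in range(steps):
        w -= lr_w * train_loss_grad_w(w, alpha)              # weight step (training loss)
        alpha -= lr_alpha * val_loss_grad_alpha(w, alpha)    # architecture step (validation loss)
    return w, alpha

w, alpha = alternate(steps=200)
print(w, alpha)
```

In this toy setting both values settle near 1. The point of the sketch is the update order, not the particular losses.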

Approximate gradient

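A technique for approximating the computationally expensive second-order gradients that arise in bi-level optimization, usually by dropping or cheaply estimating certain terms. For example, a first-order approximation treats the network weights as constant when differentiating with respect to the architecture parameters.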
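
The effect of ignoring terms can be seen on a toy problem with a scalar weight `w` and a scalar architecture parameter `a`. The losses, the function names, and the step size `xi` are assumptions of this sketch. It compares a first-order gradient with a gradient taken through one virtual weight update (one-step unrolling), and checks the latter against a numerical derivative.

```python
def grad_w_train(w, a):
    """d/dw of the toy training loss (w - a)**2."""
    return 2.0 * (w - a)

def val_loss(w, a):
    """Toy validation loss."""
    return (w - 1.0) ** 2 + 0.1 * a ** 2

def first_order_grad(w, a):
    """d/da of val_loss with w treated as a constant."""
    return 0.2 * a

def unrolled_grad(w, a, xi):
    """d/da of val_loss(w', a), where w' = w - xi * grad_w_train(w, a)."""
    w_next = w - xi * grad_w_train(w, a)   # one virtual gradient step on the weights
    dw_next_da = 2.0 * xi                  # derivative of w' with respect to a
    return 2.0 * (w_next - 1.0) * dw_next_da + 0.2 * a

def numeric_unrolled_grad(w, a, xi, eps=1e-6):
    """Central-difference check of unrolled_grad."""
    def objective(a_):
        return val_loss(w - xi * grad_w_train(w, a_), a_)
    return (objective(a + eps) - objective(a - eps)) / (2.0 * eps)

w, a, xi = 0.0, 0.5, 0.1
print(first_order_grad(w, a))          # ignores how w reacts to a
print(unrolled_grad(w, a, xi))         # accounts for one weight step
print(numeric_unrolled_grad(w, a, xi))
```

At this point the two gradients even differ in sign, which shows what the cheaper approximation gives up. In a real network the extra term involves second derivatives of the training loss and cannot be written in closed form like this.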

Architecture parameters

The set of architecture weights that define the network's structure in differentiable NAS, distinct from the model weights which define the data transformations.

Continuous relaxation

Mathematical transformation that converts a discrete combinatorial optimization problem into a continuous one, allowing the use of gradient-based optimization methods.

Warm-up phase

Initial phase of differentiable NAS where training focuses on the network weights before starting the optimization of architecture parameters.
