AI Glossary
The Complete Dictionary of Artificial Intelligence
DARTS
Differentiable Architecture Search, a pioneering method that transforms the discrete architecture search problem into a continuous optimization problem using differentiable architecture weights.
Relaxed architecture
A continuous representation of a discrete architecture space where candidate operations are combined with softmax weights, allowing optimization via gradient descent.
Architecture weights
Continuous parameters (often denoted as alpha) that determine the relative importance of each candidate operation in the relaxed architecture and are optimized by gradient descent.
Mixed operations
A weighted combination of several candidate operations (convolution, pooling, etc.) in a relaxed architecture, where the weights determine the contribution of each operation.
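A minimal sketch of a mixed operation, assuming toy candidate operations on plain Python lists rather than real convolution or pooling layers. The names `CANDIDATE_OPS` and `mixed_op` are hypothetical and chosen only for illustration.

```python
import math

def softmax(alphas):
    """Numerically stable softmax over a list of floats."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical stand-ins for real candidate operations (convolution, pooling, ...).
CANDIDATE_OPS = {
    "identity": lambda x: list(x),
    "double": lambda x: [2.0 * v for v in x],
    "zero": lambda x: [0.0 for _ in x],
}

def mixed_op(x, alphas):
    """Return the softmax-weighted sum of all candidate operation outputs."""
    weights = softmax(alphas)
    outputs = [op(x) for op in CANDIDATE_OPS.values()]
    return [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(len(x))
    ]

x = [1.0, -2.0, 3.0]
alphas = [0.0, 0.0, 0.0]  # equal alphas give equal mixing weights of 1/3
y = mixed_op(x, alphas)
```

Because the output is a smooth function of `alphas`, a loss computed on `y` can be differentiated with respect to them, which is what makes gradient-based search possible.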
Bi-level optimization
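A two-level optimization problem where the network weights are optimized on the training loss at the lower level and the architecture parameters on the validation loss at the upper level. The exact gradient of the upper-level objective involves second-order derivatives, which are usually approximated in practice.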
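In a notation commonly used for DARTS-style methods (the symbols are assumptions, since this glossary does not fix a notation), with network weights $w$, architecture parameters $\alpha$, and training and validation losses $\mathcal{L}_{\text{train}}$ and $\mathcal{L}_{\text{val}}$, the problem can be written as:

```latex
\min_{\alpha} \; \mathcal{L}_{\text{val}}\bigl(w^{*}(\alpha), \alpha\bigr)
\quad \text{s.t.} \quad
w^{*}(\alpha) = \arg\min_{w} \; \mathcal{L}_{\text{train}}(w, \alpha)
```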
Computational cell
A basic, repeatable block in the network architecture whose internal structure (connections and operations) is automatically discovered by neural architecture search (NAS).
Architecture discretization
The final step of differentiable NAS, in which the relaxed continuous architecture is converted into a discrete one by keeping, on each connection, only the candidate operation with the highest alpha weight.
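A minimal sketch of this argmax selection, assuming one list of alpha values per connection. The operation names in `OPS` and the edge keys are hypothetical. Some methods apply further rules at this stage, such as limiting the number of incoming connections per node; those are omitted here.

```python
# Hypothetical candidate operations, in the same order as each alpha list.
OPS = ["skip_connect", "conv_3x3", "max_pool_3x3"]

def discretize(alpha_per_edge):
    """Pick, for every edge, the operation with the largest alpha."""
    chosen = {}
    for edge, alphas in alpha_per_edge.items():
        best = max(range(len(alphas)), key=lambda i: alphas[i])
        chosen[edge] = OPS[best]
    return chosen

# Example alphas for three edges of a small cell (made-up values).
alpha_per_edge = {
    (0, 1): [0.1, 1.3, -0.4],
    (0, 2): [0.9, 0.2, 0.5],
    (1, 2): [-1.0, 0.0, 0.7],
}
discrete = discretize(alpha_per_edge)
```

Since softmax is monotonic, taking the argmax over the raw alphas selects the same operation as taking it over the softmax weights.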
Architecture gradient
The gradient of the validation loss with respect to the architecture weights, used to update the network structure during the architecture search.
Supercell
A basic structure larger than a simple cell, containing several interconnected cells to increase the complexity and expressiveness of the search space.
Path pruning
Technique of progressively pruning less important architecture paths based on their architecture weights, reducing computational complexity during the search.
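One possible pruning rule, sketched under the assumption that each connection stores its alphas in a dictionary keyed by operation name: every round removes the weakest remaining operation per connection. The function `prune_weakest` and the example values are hypothetical; actual methods differ in when and how much they prune.

```python
def prune_weakest(edge_alphas, min_ops=1):
    """Drop the lowest-alpha operation from every edge that still
    has more than `min_ops` candidates; other edges are left unchanged."""
    pruned = {}
    for edge, alphas in edge_alphas.items():
        if len(alphas) <= min_ops:
            pruned[edge] = dict(alphas)
            continue
        weakest = min(alphas, key=alphas.get)
        pruned[edge] = {op: a for op, a in alphas.items() if op != weakest}
    return pruned

# Made-up alphas for two edges with three candidate operations each.
edge_alphas = {
    "0->1": {"skip": 0.2, "conv": 1.1, "pool": -0.3},
    "0->2": {"skip": 0.5, "conv": 0.4, "pool": 0.9},
}
round_1 = prune_weakest(edge_alphas)   # two operations left per edge
round_2 = prune_weakest(round_1)       # one operation left per edge
round_3 = prune_weakest(round_2)       # nothing left to prune
```

Each pruned operation no longer has to be evaluated in the forward pass, which is where the reduction in search cost comes from.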
Differentiable skip connections
Skip connections with learnable weights in the relaxed architecture, allowing the model to dynamically decide whether to use these connections.
Continuous search space
Relaxation of the discrete search space into a continuous domain where each possible architecture corresponds to a point in this continuous space.
Alpha parameters
Continuous variables in differentiable NAS that control the mixing of operations on each connection and are optimized to find the best architecture.
Joint optimization
Optimization of the network weights and the architecture parameters within a single search run. In differentiable NAS methods the two are typically updated in alternating gradient steps rather than strictly simultaneously.
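The alternating scheme can be illustrated on a scalar toy problem. Everything in the sketch below is an assumption made for illustration: the training loss is `(w - alpha)^2`, the validation loss is `(w - 3)^2`, and the architecture gradient is computed through one virtual weight step of size `xi`.

```python
def grad_w_train(w, alpha):
    """d/dw of the toy training loss (w - alpha)^2."""
    return 2.0 * (w - alpha)

def arch_grad(w, alpha, xi):
    """d/dalpha of the toy validation loss (w' - 3)^2, evaluated after one
    virtual weight step w' = w - xi * grad_w_train(w, alpha)."""
    w_prime = w - xi * grad_w_train(w, alpha)
    # Chain rule: dL_val/dw' = 2 * (w' - 3) and dw'/dalpha = 2 * xi.
    return 2.0 * (w_prime - 3.0) * 2.0 * xi

def search(steps=200, lr_w=0.1, lr_alpha=0.5):
    """Alternate one architecture step and one weight step per iteration."""
    w, alpha = 0.0, 0.0
    for _ in range(steps):
        alpha -= lr_alpha * arch_grad(w, alpha, xi=lr_w)  # upper level
        w -= lr_w * grad_w_train(w, alpha)                # lower level
    return w, alpha

w, alpha = search()
```

In this toy setting both values approach 3: the weight follows `alpha` on the training loss, and `alpha` moves to wherever the resulting weight minimizes the validation loss.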
Approximate gradient
Technique used to approximate the computationally expensive second-order gradients in bi-level optimization, for example by dropping the second-order term entirely (first-order approximation) or by estimating it with a finite difference.
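Both approximations can be compared on a scalar toy problem. The losses below are assumptions chosen so the result can be checked by hand: training loss `(w - alpha)^2` and validation loss `(w - 3)^2`. The finite-difference form follows the scheme described for DARTS, written here with hypothetical function names.

```python
def grad_w_train(w, alpha):
    return 2.0 * (w - alpha)        # d/dw of (w - alpha)^2

def grad_alpha_train(w, alpha):
    return -2.0 * (w - alpha)       # d/dalpha of (w - alpha)^2

def grad_w_val(w, alpha):
    return 2.0 * (w - 3.0)          # d/dw of (w - 3)^2

def grad_alpha_val(w, alpha):
    return 0.0                      # toy validation loss has no direct alpha term

def first_order_grad(w, alpha):
    """Ignore how the weights depend on alpha."""
    return grad_alpha_val(w, alpha)

def second_order_grad(w, alpha, xi, eps=1e-2):
    """One-step unrolled gradient with a finite-difference estimate of the
    mixed second-derivative term."""
    w_prime = w - xi * grad_w_train(w, alpha)     # virtual weight step
    g = grad_w_val(w_prime, alpha)
    # Central difference of grad_alpha_train along the direction g.
    hvp = (grad_alpha_train(w + eps * g, alpha)
           - grad_alpha_train(w - eps * g, alpha)) / (2.0 * eps)
    return grad_alpha_val(w_prime, alpha) - xi * hvp

fo = first_order_grad(1.0, 0.0)
so = second_order_grad(1.0, 0.0, xi=0.1)
```

Here the first-order estimate is exactly zero because `alpha` influences the validation loss only through the weights, while the finite-difference estimate matches the hand-derived value `4 * xi * (w' - 3) = -0.88`. That gap is a property of this toy problem; in real search spaces the first-order term is generally non-zero.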
Architecture parameters
The set of architecture weights that define the network's structure in differentiable NAS, as distinct from the model weights, which define the data transformations.
Continuous relaxation
Mathematical transformation that converts a discrete combinatorial optimization problem into a continuous one, allowing the use of gradient-based optimization methods.
Warm-up phase
Initial phase of differentiable NAS where training focuses on the network weights before starting the optimization of architecture parameters.
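A warm-up schedule can be sketched as a simple epoch check. The function name `trainable_groups` and the group labels are hypothetical, and the number of warm-up epochs is a tunable choice rather than a fixed value.

```python
def trainable_groups(epoch, warmup_epochs):
    """Return the parameter groups that are updated in a given epoch."""
    if epoch < warmup_epochs:
        return ["weights"]            # warm-up: alphas stay frozen
    return ["weights", "alphas"]      # regular search: both are updated

# Which groups are trained in epochs 0..3 with a two-epoch warm-up.
schedule = [trainable_groups(e, warmup_epochs=2) for e in range(4)]
```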