AI Glossary
The complete AI glossary
A/B Testing
Experimental methodology comparing two versions (A and B) of a model or service to determine which performs better according to predefined metrics, typically through random traffic distribution.
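A minimal sketch of how an A/B test might assign traffic: hashing a user ID keeps each user's variant stable across sessions while approximating a random split. The function and experiment names are illustrative, not from the text.

```python
import hashlib

def assign_variant(user_id: str, experiment: str, split: float = 0.5) -> str:
    # Hash user ID + experiment name: stable per user, uniform across users.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # uniform value in [0, 1]
    return "A" if bucket < split else "B"

print(assign_variant("user-42", "checkout-model-v2"))  # same answer every call
```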
Multivariate Testing
Advanced technique testing multiple variables and their combinations simultaneously to identify the best overall combination, allowing evaluation of interactions between the model's different factors.
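As an illustration, enumerating the full factorial of test cells is straightforward; the factors and values below are hypothetical.

```python
from itertools import product

factors = {
    "prompt_style": ["concise", "detailed"],
    "temperature": [0.2, 0.7],
    "reranker": ["on", "off"],
}

# Every combination of factor values becomes one test cell (2 x 2 x 2 = 8).
cells = [dict(zip(factors, combo)) for combo in product(*factors.values())]
for i, cell in enumerate(cells):
    print(f"cell {i}: {cell}")
```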
Blue-Green Deployment
Deployment pattern with two identical environments where traffic completely switches from the old version (Blue) to the new one (Green) after full validation, minimizing downtime.
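A toy sketch of the cutover logic; in practice the switch usually happens at the load balancer or DNS level, and the URLs below are made up.

```python
class BlueGreenRouter:
    def __init__(self, blue_url: str, green_url: str):
        self.envs = {"blue": blue_url, "green": green_url}
        self.active = "blue"  # current production environment

    def route(self) -> str:
        return self.envs[self.active]

    def cutover(self) -> None:
        # One atomic switch after validation; rollback is flipping back.
        self.active = "green" if self.active == "blue" else "blue"

router = BlueGreenRouter("http://model-v1.internal", "http://model-v2.internal")
router.cutover()       # green has passed full validation
print(router.route())  # all traffic now goes to the green environment
```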
Feature Flag
Control mechanism allowing dynamic activation/deactivation of specific features or models without redeployment, facilitating experiments and quick rollbacks.
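A minimal sketch, assuming an in-memory flag store; real systems read flags from a config service so values can change without redeployment. The flag names are hypothetical.

```python
FLAGS = {"use_reranker_v2": True, "new_embedding_model": False}

def is_enabled(flag: str) -> bool:
    return FLAGS.get(flag, False)  # unknown flags default to off

def retrieve(query: str) -> str:
    # Flip the flag to roll back instantly, with no redeployment.
    if is_enabled("use_reranker_v2"):
        return f"rerank_v2({query})"
    return f"rerank_v1({query})"

print(retrieve("pricing page"))
```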
Traffic Splitting
Intelligent routing technique proportionally distributing requests between different model versions according to configurable rules for A/B tests or gradual deployments.
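A small sketch of proportional routing using Python's standard library; the 90/10 weights are illustrative.

```python
import random

routes = {"model-v1": 0.9, "model-v2": 0.1}  # stable model vs. candidate

def pick_model() -> str:
    names = list(routes)
    return random.choices(names, weights=list(routes.values()), k=1)[0]

counts = {name: 0 for name in routes}
for _ in range(10_000):
    counts[pick_model()] += 1
print(counts)  # roughly {'model-v1': 9000, 'model-v2': 1000}
```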
Statistical Significance
Probabilistic measure determining whether observed differences between tested variants are due to real effects rather than chance, typically with a p-value threshold < 0.05.
P-value
Probability of observing results at least as extreme as those measured if the null hypothesis were true, serving as a decision criterion in hypothesis testing.
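A sketch tying the two previous entries together on simulated conversion data; the sample sizes and rates are made up, and the two-sample t-test stands in for the more usual two-proportion z-test (they behave similarly at this sample size).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.binomial(1, 0.10, size=5000)  # control: true rate 10%
b = rng.binomial(1, 0.12, size=5000)  # treatment: true rate 12%

# p-value: probability of a difference at least this large
# if both variants truly converted at the same rate (null hypothesis).
t_stat, p_value = stats.ttest_ind(a, b)
print(f"p = {p_value:.4f}, significant at 0.05: {p_value < 0.05}")
```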
Confidence Interval
Range of estimated values containing the true value of the measured parameter with a defined probability (typically 95%), quantifying the uncertainty of experimental estimates.
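A short sketch computing a 95% interval for a mean with the normal approximation; the latency data is simulated.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=120, scale=15, size=400)  # per-request latency (ms)

mean = sample.mean()
sem = stats.sem(sample)    # standard error of the mean
z = stats.norm.ppf(0.975)  # ~1.96 for a 95% interval
print(f"95% CI: [{mean - z * sem:.1f}, {mean + z * sem:.1f}] ms")
```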
Control Group
Population sample receiving the reference version (usually the current model) serving as a baseline for statistical comparison with experimental variants.
Treatment Group
Population segment exposed to the experimental variant of the model or treatment under test, enabling measurement of its impact relative to the control group.
Baseline Model
Reference model used as a point of comparison to evaluate improvements made by new versions, often the model currently in production.
Champion-Challenger
Continuous competition strategy where the current champion model is constantly challenged by challenger models, with the best performer progressively replacing the champion.
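A minimal promotion rule as a sketch; the scores, names, and threshold are hypothetical, and a production setup would also require statistical significance before promoting.

```python
scores = {"champion": 0.872, "challenger_a": 0.881, "challenger_b": 0.869}
MIN_GAIN = 0.005  # ignore improvements smaller than measurement noise

def promote(scores: dict) -> str:
    best = max(scores, key=scores.get)
    if best != "champion" and scores[best] - scores["champion"] >= MIN_GAIN:
        return best  # this challenger becomes the new champion
    return "champion"

print(promote(scores))  # challenger_a
```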
Progressive Rollout
Incremental deployment of a new model with a gradual increase in traffic percentage, allowing for continuous validation and minimization of the risk of negative impact.
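A sketch of sticky per-user ramping, assuming the same hashing trick as in the A/B Testing entry; the ramp stages are illustrative. Because each user's bucket is fixed and only the threshold grows, users already on the new model stay on it as the rollout widens.

```python
import hashlib

RAMP = [0.01, 0.05, 0.25, 0.50, 1.00]  # fraction of traffic per stage

def routes_to_new(user_id: str, stage: int) -> bool:
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # fixed value per user
    return bucket < RAMP[stage]

print(routes_to_new("user-42", stage=0))  # in the first 1%?
print(routes_to_new("user-42", stage=3))  # in the first 50%?
```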
Experimentation Platform
Centralized infrastructure managing the complete lifecycle of experiments, from variant creation to statistical analysis of results and decision automation.
Metric Drift
Phenomenon of gradual degradation of a model's performance metrics in production, detected through continuous monitoring and requiring periodic re-evaluations.
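A bare-bones monitoring check as a sketch; the window, tolerance, and accuracy series are invented, and real systems typically use statistical drift tests rather than a fixed threshold.

```python
import numpy as np

def detect_drift(daily_metric, baseline, window=7, tolerance=0.02):
    # Flag drift when the recent rolling mean falls too far below baseline.
    recent = float(np.mean(daily_metric[-window:]))
    return recent < baseline - tolerance

accuracy = [0.91, 0.90, 0.91, 0.89, 0.88, 0.88, 0.87, 0.86, 0.86]
print(detect_drift(accuracy, baseline=0.91))  # True: recent mean ~0.879
```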
Sample Size Calculation
Statistical process determining the minimum number of observations required to detect a significant difference with a given statistical power, essential for test planning.
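A sketch of the standard closed-form calculation for a two-sample comparison of means, expressed with a standardized effect size (Cohen's d):

```python
import math
from scipy.stats import norm

def n_per_group(effect_size, alpha=0.05, power=0.8):
    z_alpha = norm.ppf(1 - alpha / 2)  # ~1.96 for alpha = 0.05
    z_beta = norm.ppf(power)           # ~0.84 for 80% power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

print(n_per_group(0.2))  # 393 per group to detect a small (d = 0.2) effect
```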
Bayesian A/B Testing
Alternative approach using Bayesian probabilities to evaluate variants, enabling continuous decisions with smaller samples and intuitive interpretation of results.
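A compact sketch of the Beta-Binomial version for conversion rates; the counts are invented. A flat Beta(1, 1) prior plus binomial data yields a Beta posterior per variant, and Monte Carlo sampling turns that into the intuitive quantity "probability that B beats A".

```python
import numpy as np

conv_a, n_a = 120, 1000  # variant A: conversions / trials so far
conv_b, n_b = 145, 1000  # variant B

rng = np.random.default_rng(0)
post_a = rng.beta(1 + conv_a, 1 + n_a - conv_a, size=100_000)
post_b = rng.beta(1 + conv_b, 1 + n_b - conv_b, size=100_000)

print(f"P(B > A) = {(post_b > post_a).mean():.3f}")  # decide once this is high enough
```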
Sequential Testing
Analysis methodology that allows evaluating results at predefined intervals without inflating the Type I error risk, optimizing the duration and costs of experiments.
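As a sketch, the simplest way to keep the overall Type I error in check across interim looks is to split the alpha budget evenly, a Bonferroni-style correction; real group-sequential designs use sharper boundaries such as Pocock or O'Brien-Fleming. The data below is simulated.

```python
import numpy as np
from scipy import stats

LOOKS, ALPHA = 5, 0.05
alpha_per_look = ALPHA / LOOKS  # conservative alpha-spending scheme

rng = np.random.default_rng(2)
a = rng.binomial(1, 0.10, size=10_000)
b = rng.binomial(1, 0.13, size=10_000)

for look in range(1, LOOKS + 1):
    n = 2_000 * look  # data accumulated by this interim look
    _, p = stats.ttest_ind(a[:n], b[:n])
    if p < alpha_per_look:
        print(f"look {look}: p = {p:.5f} -> stop early, reject H0")
        break
else:
    print("no early stop; run to the planned full sample")
```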