AI Glossary
The Complete Dictionary of Artificial Intelligence
Model Versioning
A system for managing machine learning model versions, allowing for tracking iterations, comparing performance, and reverting to previous versions in case of regression.
Data Version Control (DVC)
An open-source tool that works alongside Git to version large datasets and models: small metadata files are tracked in Git, while the data and model files themselves are kept in remote storage such as a cloud bucket.
Git LFS
Git Large File Storage, a Git extension for versioning large files such as datasets and models. The repository holds small pointer files, while the file contents are stored on a separate LFS server.
Experiment Tracking
The systematic process of recording hyperparameters, metrics, artifacts, and results from ML experiments to ensure traceability and reproducibility.
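As a rough illustration of the idea, the sketch below appends each run to a JSON Lines file and looks up the best run by a chosen metric. The function names `log_run` and `best_run`, the file layout, and the assumption that a higher metric value is better are all hypothetical choices made for this example; dedicated tracking tools offer far more.

```python
import json
import time


def log_run(path, params, metrics):
    """Append one experiment run (hyperparameters and metrics) to a JSON Lines file."""
    record = {"timestamp": time.time(), "params": params, "metrics": metrics}
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def best_run(path, metric):
    """Return the logged run with the highest value for `metric` (assumes higher is better)."""
    with open(path, encoding="utf-8") as fh:
        runs = [json.loads(line) for line in fh if line.strip()]
    return max(runs, key=lambda run: run["metrics"][metric])
```

An append-only log keeps earlier runs intact, which is what makes later comparison and traceability possible.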
Model Registry
A centralized system for storing and managing ML model versions with their metadata, deployment statuses, and performance history.
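A minimal in-memory sketch of such a registry might look as follows. The class names, the stage labels ("staging", "production", "archived"), and the rule that only one version per model is in production are assumptions made for illustration, not the behavior of any particular registry product.

```python
from dataclasses import dataclass, field


@dataclass
class ModelVersion:
    version: int
    metrics: dict
    stage: str = "staging"  # hypothetical deployment status label


@dataclass
class ModelRegistry:
    # Maps a model name to its ordered list of registered versions.
    _models: dict = field(default_factory=dict)

    def register(self, name, metrics):
        """Store a new version of `name` with its metrics; versions count up from 1."""
        versions = self._models.setdefault(name, [])
        entry = ModelVersion(version=len(versions) + 1, metrics=dict(metrics))
        versions.append(entry)
        return entry

    def promote(self, name, version):
        """Mark one version as production and archive the previous production version."""
        versions = self._models[name]
        if not any(entry.version == version for entry in versions):
            raise KeyError(f"{name} has no version {version}")
        for entry in versions:
            if entry.version == version:
                entry.stage = "production"
            elif entry.stage == "production":
                entry.stage = "archived"

    def production(self, name):
        """Return the current production version of `name`, or None."""
        for entry in self._models.get(name, []):
            if entry.stage == "production":
                return entry
        return None
```

Keeping archived versions in the list, rather than deleting them, preserves the performance history the definition refers to.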
Reproducibility
The ability to recreate the exact same results from an ML experiment using the same data, code, parameters, and execution environment.
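One small piece of this, controlling randomness, can be sketched with a seeded generator. The function `run_experiment` is a hypothetical stand-in for a real training run; pinning the data, code version, and execution environment is still required and is not shown here.

```python
import random


def run_experiment(seed, n=5):
    """Stand-in for a training run: draws n pseudo-random values from a seeded generator."""
    rng = random.Random(seed)  # local generator, so no hidden global state is involved
    return [rng.random() for _ in range(n)]


# The same seed yields the same sequence on every call.
first = run_experiment(seed=42)
second = run_experiment(seed=42)
print(first == second)
```

Passing the generator or seed explicitly, instead of relying on a global seed, makes it easier to see which parts of a pipeline depend on randomness.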
Artifact Repository
A versioned storage system for ML artifacts, including trained models, preprocessed datasets, and other binary files with metadata management.
Model Drift Detection
The process of continuously monitoring performance and data distributions to identify model degradation caused by changes in data patterns.
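One common way to compare a feature's distribution between training data and production data is the Population Stability Index (PSI). The sketch below is a simplified version under stated assumptions: equal-width bins taken from the reference sample, out-of-range values clamped into the outer bins, and a small floor `eps` to avoid taking the logarithm of zero. What PSI value counts as drift depends on the application.

```python
import math


def population_stability_index(reference, current, bins=10, eps=1e-6):
    """PSI between two numeric samples, using equal-width bins from the reference range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all reference values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the reference range into the first or last bin.
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each proportion at eps so the log ratio below is always defined.
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))
```

Identical samples give a PSI of zero, and the value grows as the two distributions diverge, so a monitoring job can track it over time per feature.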
Continuous Integration for ML (CI/ML)
Automating code, data, and model validation tests on each commit to ensure quality and consistency in the ML pipeline.
Continuous Training
Automating the periodic retraining of models with new data to maintain their relevance in the face of evolving data patterns.
Model Monitoring
Continuous observation of a production model's predictions, performance, and input distributions to detect anomalies and confirm that the model is functioning correctly.
Data Provenance
Complete documentation of the origin, history, and transformations of data used in ML pipelines, essential for auditing and compliance.
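A lightweight way to record such a history is to fingerprint the data before and after each transformation. In the sketch below, `fingerprint`, `apply_step`, and the lineage record layout are hypothetical names chosen for illustration, and the data is assumed to be JSON-serializable.

```python
import hashlib
import json


def fingerprint(records):
    """SHA-256 hash of a JSON-serializable dataset, stable across dict key order."""
    payload = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def apply_step(records, step_name, transform, lineage):
    """Run one transformation and append its input and output hashes to the lineage log."""
    input_hash = fingerprint(records)  # hashed first, in case the transform mutates its input
    result = transform(records)
    lineage.append({
        "step": step_name,
        "input_sha256": input_hash,
        "output_sha256": fingerprint(result),
    })
    return result
```

Because each step's input hash should match the previous step's output hash, an auditor can check that the recorded chain of transformations is unbroken.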
Semantic Versioning
Version numbering convention (X.Y.Z) applied to models, where X marks major, incompatible changes, Y backward-compatible feature additions, and Z backward-compatible bug fixes.
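The X.Y.Z scheme can be sketched as a small value type. The `Version` class below is a hypothetical helper that handles only plain three-part versions; pre-release and build suffixes are out of scope.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text):
        """Parse a string such as '1.4.2' into a Version."""
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def bump(self, level):
        """Return the next version; lower-order parts reset to zero."""
        if level == "major":
            return Version(self.major + 1, 0, 0)
        if level == "minor":
            return Version(self.major, self.minor + 1, 0)
        if level == "patch":
            return Version(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown level: {level}")

    def __str__(self):
        return f"{self.major}.{self.minor}.{self.patch}"
```

Since the type is a tuple, versions compare numerically part by part, so `Version.parse("1.10.0")` sorts after `Version.parse("1.9.3")`, which plain string comparison would get wrong.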
Lakehouse Architecture
Hybrid architecture combining the flexibility of data lakes with the structured management of data warehouses to support ML data versioning and analysis.
Model Baseline
A reference model or initial version used as a point of comparison to evaluate improvements or detect regressions in new versions.
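The comparison against a baseline can be sketched as a per-metric check. The function name `compare_to_baseline`, the `tolerance` parameter, and the assumption that higher metric values are better are all illustrative choices.

```python
def compare_to_baseline(baseline, candidate, tolerance=0.01):
    """Label each baseline metric as improvement, regression, or unchanged.

    Assumes higher values are better and that `candidate` reports every baseline metric.
    """
    report = {}
    for metric, base_value in baseline.items():
        delta = candidate[metric] - base_value
        if delta < -tolerance:
            report[metric] = "regression"
        elif delta > tolerance:
            report[metric] = "improvement"
        else:
            report[metric] = "unchanged"
    return report


baseline = {"accuracy": 0.90, "f1": 0.80}
candidate = {"accuracy": 0.93, "f1": 0.75}
print(compare_to_baseline(baseline, candidate))
```

The tolerance keeps small run-to-run fluctuations from being reported as real changes; a suitable value depends on how noisy the evaluation is.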