AI Glossary
The complete dictionary of artificial intelligence
Data Warehouse
Centralized data repository optimized for analysis and decision-making, collecting operational and historical data from multiple sources. Designed to support complex analytical queries on massive volumes of structured data.
Data Mart
Subset of a data warehouse focused on a specific business domain or particular department. Facilitates access to relevant data for targeted analyses while reducing query complexity.
ETL (Extract, Transform, Load)
Data integration process extracting information from heterogeneous sources, transforming it according to business rules, then loading it into the data warehouse. Ensures data quality and consistency before analysis.
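The three ETL stages can be sketched as follows. This is a minimal illustration, not a reference implementation: the source rows, the `sales` table, and the cleaning rules are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical raw records from heterogeneous sources (illustrative data).
SOURCE_ROWS = [
    {"id": "1", "amount": " 19.90 ", "country": "nl"},
    {"id": "2", "amount": "5", "country": "BE"},
    {"id": "3", "amount": "n/a", "country": "NL"},  # fails validation
]

def extract():
    """Extract: read rows from the source (here, a constant list)."""
    return list(SOURCE_ROWS)

def transform(rows):
    """Transform: apply assumed business rules before loading."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"].strip())
        except ValueError:
            continue  # reject rows whose amount is not numeric
        clean.append((int(row["id"]), amount, row["country"].upper()))
    return clean

def load(conn, rows):
    """Load: write the already-cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(id INTEGER PRIMARY KEY, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
load(conn, transform(extract()))
loaded = conn.execute(
    "SELECT id, amount, country FROM sales ORDER BY id"
).fetchall()
```

Only validated, normalized rows ever reach the target table, which is the defining property of transforming before loading.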
ELT (Extract, Load, Transform)
Modern integration approach where raw data is first loaded into the target system and then transformed in place, using the target system's own compute. Suits cloud platforms and distributed architectures, where the target engine can scale the transformation work.
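A minimal sketch of the ELT ordering, assuming an in-memory SQLite database as the target system. The `raw_sales` staging table and the `sales` table are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the target platform

# Load: raw strings go into a staging table untouched.
conn.execute("CREATE TABLE raw_sales (id TEXT, amount TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO raw_sales VALUES (?, ?, ?)",
    [("1", " 19.90 ", "nl"), ("2", "5", "BE")],
)

# Transform: performed afterwards by the target engine itself, in SQL.
conn.execute(
    """
    CREATE TABLE sales AS
    SELECT CAST(id AS INTEGER)        AS id,
           CAST(TRIM(amount) AS REAL) AS amount,
           UPPER(country)             AS country
    FROM raw_sales
    """
)
elt_rows = conn.execute(
    "SELECT id, amount, country FROM sales ORDER BY id"
).fetchall()
```

The raw staging table remains available, so the transformation can be revised and rerun without extracting from the sources again.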
OLAP (Online Analytical Processing)
Multidimensional analysis technology enabling complex queries on large volumes of historical data. Supports drill-down, roll-up, slice and dice operations for data exploration.
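The four operations named above can be illustrated on a tiny in-memory cube. This is a sketch in plain Python, not how an OLAP engine is implemented; the fact rows and dimension names are made up for the example.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, region, sales).
FACTS = [
    (2023, "Q1", "EU", 100),
    (2023, "Q2", "EU", 150),
    (2023, "Q1", "US", 200),
    (2024, "Q1", "EU", 120),
]

def aggregate(rows, key):
    """Sum sales grouped by whatever dimensions `key` selects."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[3]
    return dict(totals)

# Roll-up: aggregate quarters up to the year level.
by_year = aggregate(FACTS, key=lambda r: r[0])
# Drill-down: return to the finer (year, quarter) grain.
by_year_quarter = aggregate(FACTS, key=lambda r: (r[0], r[1]))
# Slice: fix one dimension to a single value (region == "EU").
eu_slice = [r for r in FACTS if r[2] == "EU"]
# Dice: restrict several dimensions at once.
eu_2023_dice = [r for r in FACTS if r[2] == "EU" and r[0] == 2023]
```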
OLTP (Online Transaction Processing)
Real-time transaction management system optimized for CRUD operations (Create, Read, Update, Delete). Designed to handle a large number of short, atomic transactions with high concurrency.
Star Schema
Dimensional modeling pattern for data warehouses in which a central fact table is surrounded by denormalized dimension tables. Optimizes analytical query performance by keeping every dimension a single join away from the fact table.
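A small star schema can be sketched with SQLite as follows. The table and column names (`fact_sales`, `dim_product`, `dim_date`) and the sample rows are illustrative assumptions, not a prescribed layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Denormalized dimensions: descriptive attributes live in one table each.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);

    -- Fact table: numeric measures plus a foreign key per dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id    INTEGER REFERENCES dim_date(date_id),
        quantity   INTEGER,
        revenue    REAL
    );

    INSERT INTO dim_product VALUES
        (1, 'Laptop', 'Hardware'), (2, 'Mouse', 'Hardware'), (3, 'IDE license', 'Software');
    INSERT INTO dim_date VALUES (20240115, 2024, 1), (20240210, 2024, 2);
    INSERT INTO fact_sales VALUES
        (1, 20240115, 2, 2000.0), (2, 20240115, 5, 100.0), (3, 20240210, 1, 300.0);
    """
)

# One join per dimension: the fact table sits at the center of the star.
star_result = conn.execute(
    """
    SELECT p.category, d.year, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_date d    ON d.date_id = f.date_id
    GROUP BY p.category, d.year
    ORDER BY p.category
    """
).fetchall()
```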
Snowflake Schema
Variant of the star schema where dimension tables are normalized into hierarchies of related tables. Reduces data redundancy but adds joins, which increases analytical query complexity.
Fact Table
Central table in a dimensional schema containing numerical measures and foreign keys to dimensions. Stores quantitative business facts such as sales, transactions, or performance indicators.
Dimension Table
Table describing the context of measures in the fact table, containing qualitative descriptive attributes. Enables data analysis along different axes such as time, geography, or products.
Data Vault
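Hybrid modeling methodology combining the advantages of 3NF and star schema for data warehouses that must evolve over time. Separates hubs (business keys), links (relationships between hubs), and satellites (descriptive attributes and their history) to ensure auditability and scalability.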
Columnar Database
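Database storing data by columns rather than rows, optimizing analytical queries on column subsets. Reads only the columns a query needs and compresses well because values within a column tend to be similar, which can reduce response times and storage space for BI workloads.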
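The difference between the two layouts can be sketched in plain Python. This is only a conceptual illustration with made-up data; run-length encoding is used here as one simple example of column compression, and real columnar engines use a range of encodings.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"order_id": 1, "country": "NL", "amount": 10.0},
    {"order_id": 2, "country": "NL", "amount": 20.0},
    {"order_id": 3, "country": "BE", "amount": 5.0},
]

# Column-oriented layout: each column is stored contiguously.
columns = {name: [r[name] for r in rows] for name in rows[0]}

# An aggregate over one column touches only that column's values.
total_amount = sum(columns["amount"])

def run_length_encode(values):
    """Compress runs of repeated values into [value, count] pairs."""
    encoded = []
    for v in values:
        if encoded and encoded[-1][0] == v:
            encoded[-1][1] += 1
        else:
            encoded.append([v, 1])
    return encoded

# Repeated values sit next to each other in a column, so they compress well.
country_rle = run_length_encode(columns["country"])
```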
In-Memory Database
Database system that keeps its working data primarily in RAM rather than on disk, giving very low access latency. Accelerates complex analyses and interactive reports on data warehouse data.
Distributed Query Processing
Technique executing queries across multiple compute nodes in parallel to process massive data volumes. Divides processing into distributed tasks to optimize resource utilization and reduce response times.
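The split, compute, and merge pattern can be sketched as below. As an assumption for illustration, threads in a single process stand in for compute nodes, and the query is a simple average; `partial_sum_count` and `distributed_average` are hypothetical helper names.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum_count(partition):
    """Work done independently on one node: a local (sum, count)."""
    return sum(partition), len(partition)

def distributed_average(values, nodes=3):
    # Split the data into one partition per simulated node.
    partitions = [values[i::nodes] for i in range(nodes)]
    # Each partition is processed in parallel.
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partials = list(pool.map(partial_sum_count, partitions))
    # The coordinator merges the partial results into the final answer.
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count

avg = distributed_average(list(range(1, 101)))
```

An average cannot be merged from per-node averages directly, which is why each node returns a sum and a count instead.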
Data Federation
Virtual integration approach providing a unified view of data from heterogeneous sources without physical duplication. Enables near-real-time analysis across distributed systems while leaving the data in its source systems.
Aggregate Table
Pre-calculated table containing summarized data at different granularity levels to accelerate recurring queries. Essential optimization strategy for BI report performance on large volumes.
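A minimal sketch of building an aggregate table, assuming SQLite and hypothetical table names (`fact_sales` at daily grain, `agg_sales_monthly` at monthly grain):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE fact_sales (sale_date TEXT, region TEXT, revenue REAL);
    INSERT INTO fact_sales VALUES
        ('2024-01-05', 'EU', 100.0),
        ('2024-01-20', 'EU', 50.0),
        ('2024-02-03', 'EU', 70.0),
        ('2024-01-11', 'US', 200.0);

    -- Pre-compute the monthly grain once instead of on every report.
    CREATE TABLE agg_sales_monthly AS
    SELECT substr(sale_date, 1, 7) AS month, region, SUM(revenue) AS revenue
    FROM fact_sales
    GROUP BY substr(sale_date, 1, 7), region;
    """
)

# Reports now read the small summary table rather than scanning the facts.
monthly = conn.execute(
    "SELECT month, region, revenue FROM agg_sales_monthly ORDER BY month, region"
).fetchall()
```

The summary must be refreshed whenever the underlying fact table changes; how and when to do that depends on the platform and is outside this sketch.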
Slowly Changing Dimension (SCD)
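Technique for managing changes in dimension tables to track the historical evolution of attributes. Implements different strategies based on temporal data traceability needs: Type 1 overwrites the old value, Type 2 adds a new row for each version, and Type 3 keeps the previous value in an additional column.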
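As an illustration of the Type 2 strategy, the sketch below versions dimension rows with validity dates. The row layout (`key`, `attrs`, `valid_from`, `valid_to`) and the function name `apply_scd2` are assumptions made for the example; real implementations often add a surrogate key and a current-row flag.

```python
from datetime import date

def apply_scd2(dimension, key, new_attrs, today):
    """Expire the current row for `key` if its attributes changed, then append a new version."""
    current = next(
        (r for r in dimension if r["key"] == key and r["valid_to"] is None),
        None,
    )
    if current is not None:
        if current["attrs"] == new_attrs:
            return  # nothing changed; keep the current version
        current["valid_to"] = today  # close the old version, preserving history
    dimension.append(
        {"key": key, "attrs": new_attrs, "valid_from": today, "valid_to": None}
    )

customer_dim = []
apply_scd2(customer_dim, 42, {"city": "Utrecht"}, date(2023, 1, 1))
apply_scd2(customer_dim, 42, {"city": "Utrecht"}, date(2023, 6, 1))  # no change
apply_scd2(customer_dim, 42, {"city": "Gent"}, date(2024, 3, 1))     # new version
```

After these calls the dimension holds two rows for customer 42: the expired Utrecht version and the current Gent version, so facts recorded in 2023 can still be analyzed against the attribute values that were valid at the time.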
Data Pipeline
Sequence of automated processes capturing, transforming, and delivering data from source to final destination. Orchestrates the continuous flow of data to feed analytical systems and BI applications.
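A pipeline can be sketched as a chain of stages, each consuming the output of the previous one. The stage names and the record shape below are hypothetical, and scheduling, retries, and monitoring, which an orchestrator would normally handle, are left out.

```python
def capture(records):
    """Stage 1: yield records from a source one at a time."""
    yield from records

def normalize(records):
    """Stage 2: clean each record as it flows through."""
    for r in records:
        yield {"user": r["user"].lower(), "clicks": int(r["clicks"])}

def deliver(records, sink):
    """Final stage: append each record to the destination."""
    for r in records:
        sink.append(r)
    return sink

def run_pipeline(source, stages, sink):
    """Chain the stages lazily, then drain the stream into the sink."""
    stream = source
    for stage in stages:
        stream = stage(stream)
    return deliver(stream, sink)

delivered = run_pipeline(
    [{"user": "Ann", "clicks": "3"}, {"user": "BOB", "clicks": "7"}],
    [capture, normalize],
    [],
)
```

Because the stages are generators, records flow through one at a time rather than being materialized between steps.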