AI Glossary
The Complete Dictionary of Artificial Intelligence
Column Family
Logical grouping of related columns that are stored and accessed together, used in wide-column stores such as Apache HBase and Apache Cassandra to keep frequently co-accessed data physically close.
Row Group
Horizontal partition of a columnar file holding a set of rows whose values are stored column by column; it serves as the unit of I/O, compression, and parallel processing.
Column Chunk
Physical data fragment containing the values of a single column within a row group, compressed and stored independently to enable selective data access.
Parquet Format
Open-source columnar storage format (Apache Parquet) optimized for analytical workloads, combining per-column encodings such as dictionary and run-length encoding with block-level compression.
ORC Format
Optimized Row Columnar: a columnar format originally developed for Apache Hive, providing high compression and fast query performance with strict data typing.
Vectorized Execution
Processing technique where each operation is applied to a batch (vector) of column values at a time rather than to one row at a time, reducing per-row overhead and improving columnar query throughput.
Predicate Pushdown
Optimization pushing query filters to the data source, reducing the amount of data read and processed in columnar systems.
Column Pruning
Technique that skips reading columns a query does not reference, leveraging columnar organization to minimize disk access.
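As a minimal sketch of column pruning, the toy store below keeps each column in its own list and records which columns a query actually touches. The class `ColumnStore` and its `read` method are hypothetical names chosen for illustration, not the API of any real engine.

```python
class ColumnStore:
    """Toy column store: every column lives in its own list (hypothetical)."""

    def __init__(self, columns):
        self.columns = columns
        self.columns_read = []  # log of the columns that were actually read

    def read(self, names):
        # Only the requested columns are touched; the rest stay unread.
        result = {}
        for name in names:
            self.columns_read.append(name)
            result[name] = self.columns[name]
        return result


store = ColumnStore({
    "id": [1, 2, 3],
    "name": ["a", "b", "c"],
    "price": [9.5, 3.0, 7.25],
})

# Equivalent of SELECT SUM(price): only the "price" column is needed.
total = sum(store.read(["price"])["price"])
print(total, store.columns_read)
```

In a row-oriented layout the same query would have to read `id` and `name` along with every `price` value.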
Dictionary Encoding
Compression method replacing repeated values with short identifiers, particularly effective for categorical data in columnar systems.
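The idea can be sketched in a few lines, assuming the whole column fits in memory and that identifiers are assigned in first-seen order (real formats choose their own ordering and bit-pack the codes). The function names are illustrative.

```python
def dictionary_encode(values):
    """Return (dictionary, codes); dictionary holds distinct values in first-seen order."""
    index = {}       # value -> integer code
    dictionary = []  # code -> value
    codes = []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        codes.append(index[v])
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original column by looking each code up in the dictionary."""
    return [dictionary[c] for c in codes]


colors = ["red", "blue", "red", "red", "green", "blue"]
dictionary, codes = dictionary_encode(colors)
print(dictionary)  # ['red', 'blue', 'green']
print(codes)       # [0, 1, 0, 0, 2, 1]
```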
Zone Maps
Metadata indicating minimum and maximum values in data segments, allowing rapid elimination of irrelevant blocks during queries.
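A rough sketch of how such metadata can prune a range query, assuming fixed-size blocks and an inclusive range filter. `build_zone_map` and `blocks_to_scan` are hypothetical helper names.

```python
def build_zone_map(values, block_size):
    """Record (start_offset, min, max) for every block of block_size values."""
    zones = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        zones.append((start, min(block), max(block)))
    return zones


def blocks_to_scan(zones, lo, hi):
    """Return start offsets of blocks whose [min, max] overlaps [lo, hi]."""
    return [start for start, bmin, bmax in zones if bmax >= lo and bmin <= hi]


values = [3, 7, 5, 12, 15, 11, 40, 42, 38]
zones = build_zone_map(values, block_size=3)

# Filter "value BETWEEN 10 AND 20": only the middle block can contain matches.
print(blocks_to_scan(zones, 10, 20))  # [3]
```

A block that survives the check still has to be scanned, since its range only shows that a match is possible.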
Delta Encoding
Compression technique storing differences between successive values rather than absolute values, most effective for sorted or slowly changing data such as timestamps.
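A small sketch of the scheme on integer timestamps, assuming the first value is stored as-is and every later value as its difference from the predecessor. Function names are illustrative.

```python
def delta_encode(values):
    """Keep the first value, then store each value minus its predecessor."""
    if not values:
        return []
    deltas = [values[0]]
    for prev, cur in zip(values, values[1:]):
        deltas.append(cur - prev)
    return deltas


def delta_decode(deltas):
    """Rebuild the original values with a running sum."""
    out = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


timestamps = [1000, 1005, 1010, 1012, 1020]
encoded = delta_encode(timestamps)
print(encoded)  # [1000, 5, 5, 2, 8]
```

The small deltas need fewer bits than the absolute values, which is where the space saving comes from once they are bit-packed or further compressed.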
RLE Encoding
Run-length encoding, which compresses sequences of identical values by storing each value once together with its number of consecutive occurrences.
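As an illustration, a run-length encoder can be sketched with `itertools.groupby` from the Python standard library; the (value, count) pair layout is an assumption made for readability.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal adjacent values into a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(runs):
    """Expand (value, count) pairs back into the original sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


status = ["A", "A", "A", "B", "B", "A"]
runs = rle_encode(status)
print(runs)  # [('A', 3), ('B', 2), ('A', 1)]
```

Sorting a column before encoding lengthens its runs, which is why the technique pairs well with sorted columnar data.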
Bloom Filters
Probabilistic data structures allowing quick determination of a value's absence in a set: they may report false positives but never false negatives, so a negative answer lets a columnar system skip a block safely.
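A minimal sketch of the structure, assuming a fixed-size bit array and hash positions derived from seeded SHA-256 digests. The sizes and hashing scheme are illustrative choices, not what any particular columnar format uses.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive num_hashes positions by prefixing the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
for name in ["alice", "bob"]:
    bf.add(name)

print(bf.might_contain("alice"))  # True: added items are always reported
```

A `False` result is definitive, so a reader can skip the block; a `True` result only means the block must be checked.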
Skip Index
Metadata allowing direct skipping to relevant data blocks during sequential column reading, accelerating data scans.
Vertical Partitioning
Process of splitting a table by columns so that different columns or column groups are stored in separate partitions, enabling efficient distribution and parallelism in columnar clusters.
Pushdown Aggregation
Optimization moving aggregation calculations to the storage layer, reducing the volume of data transferred in columnar architectures.
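The effect can be sketched with a toy storage node that returns a partial (sum, count) instead of raw rows. `StorageNode` and `partial_sum_count` are hypothetical names, and a real system would exchange these partials over the network.

```python
class StorageNode:
    """Toy storage node holding one partition of a numeric column."""

    def __init__(self, rows):
        self.rows = rows

    def scan(self):
        # Without pushdown: every row is shipped to the query engine.
        return list(self.rows)

    def partial_sum_count(self):
        # With pushdown: only two numbers leave the storage layer.
        return sum(self.rows), len(self.rows)


nodes = [StorageNode([10, 20, 30]), StorageNode([40, 50])]

# The query engine merges the small partial results into the final AVG.
partials = [node.partial_sum_count() for node in nodes]
total = sum(s for s, _ in partials)
count = sum(c for _, c in partials)
average = total / count
print(partials, average)  # [(60, 3), (90, 2)] 30.0
```

Averages are merged from sums and counts rather than from per-node averages, because partitions of different sizes would otherwise be weighted incorrectly.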