AI Glossary
The complete glossary of AI
Apache Kafka
Open-source distributed streaming platform designed to handle real-time data streams with high throughput and low latency, used as a message broker and log storage system.
Apache Flink
Distributed stream and batch processing framework that offers complex event processing capabilities with state management and exactly-once semantics for real-time applications.
Windowing
Fundamental stream processing technique that divides the continuous data stream into time-based or count-based windows to perform aggregations and analyses on data subsets.
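As a rough illustration of time-based windowing, the sketch below assigns each event to a fixed, non-overlapping (tumbling) window and sums the values per window. The `(timestamp, value)` event shape and the function name are assumptions made for this example only, not part of any particular framework.

```python
from collections import defaultdict


def tumbling_window_sums(events, window_size):
    """Sum values per fixed, non-overlapping time window.

    events: iterable of (timestamp_seconds, value) pairs (hypothetical shape).
    Returns a dict mapping each window's start time to the sum of its values.
    """
    sums = defaultdict(int)
    for timestamp, value in events:
        # Integer division maps every timestamp to the start of its window.
        window_start = (timestamp // window_size) * window_size
        sums[window_start] += value
    return dict(sums)


events = [(1, 10), (4, 5), (11, 7), (12, 3), (25, 1)]
window_sums = tumbling_window_sums(events, window_size=10)
print(window_sums)  # {0: 15, 10: 10, 20: 1}
```

A count-based window would group by event position instead of by timestamp.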
Backpressure
Flow control mechanism that allows processing systems to regulate the speed of data producers when consumers cannot keep up, thus preventing system saturation.
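One simple way to picture backpressure is a bounded buffer between a producer and a consumer: when the buffer is full, the producer blocks until the consumer frees a slot. The sketch below uses Python's standard `queue.Queue` for this; the function name `run_pipeline` and the sentinel convention are assumptions of the example, and real streaming systems use more elaborate mechanisms.

```python
import queue
import threading


def run_pipeline(item_count, capacity):
    """Move items from a producer to a consumer through a bounded buffer."""
    buffer = queue.Queue(maxsize=capacity)  # bounded: put() blocks when full
    consumed = []

    def producer():
        for i in range(item_count):
            buffer.put(i)  # blocks while the consumer lags behind
        buffer.put(None)   # sentinel: signals that no more items will come

    def consumer():
        while True:
            item = buffer.get()
            if item is None:
                break
            consumed.append(item)

    threads = [
        threading.Thread(target=producer),
        threading.Thread(target=consumer),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return consumed


consumed = run_pipeline(item_count=50, capacity=4)
print(len(consumed))  # 50
```

Because the buffer never holds more than `capacity` items, memory use stays bounded no matter how fast the producer could run.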
Watermark
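Temporal marker embedded in the data stream that tracks the progress of event time: a watermark with timestamp t signals that no further events older than t are expected, which lets the system decide when to emit results and how to handle late data.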
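The following sketch shows one possible heuristic for generating a watermark: it trails the largest event time seen so far by a fixed tolerance, and any event that arrives with a timestamp below the current watermark is classified as late. The function name, the `(event_time, payload)` shape, and the tolerance parameter are assumptions for illustration.

```python
def split_late_events(events, max_out_of_orderness):
    """Classify events as on time or late using a trailing watermark.

    events: list of (event_time, payload) pairs in arrival order.
    The watermark is the largest event time seen so far minus the tolerance.
    """
    watermark = float("-inf")
    on_time, late = [], []
    for event_time, payload in events:
        if event_time < watermark:
            late.append(payload)      # arrived after the watermark passed it
        else:
            on_time.append(payload)
        # The watermark only ever moves forward.
        watermark = max(watermark, event_time - max_out_of_orderness)
    return on_time, late


events = [(10, "a"), (12, "b"), (11, "c"), (20, "d"), (13, "e")]
on_time, late = split_late_events(events, max_out_of_orderness=5)
print(on_time, late)  # ['a', 'b', 'c', 'd'] ['e']
```

Event "c" is out of order but still within the tolerance, while "e" arrives after the watermark has advanced to 15 and is therefore late.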
Stateful Processing
Processing paradigm where operations maintain a persistent state between events, essential for aggregations, joins, and complex pattern detection in data streams.
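A minimal sketch of keyed state, assuming a hypothetical `RunningAverage` operator that remembers a count and a total per key between events. Here the state lives in memory only; a production system would also persist or checkpoint it.

```python
from collections import defaultdict


class RunningAverage:
    """Keeps a per-key count and total between events (in-memory state)."""

    def __init__(self):
        self._count = defaultdict(int)
        self._total = defaultdict(float)

    def update(self, key, value):
        """Fold one event into the state and return the key's new average."""
        self._count[key] += 1
        self._total[key] += value
        return self._total[key] / self._count[key]


average = RunningAverage()
readings = [("s1", 10.0), ("s2", 4.0), ("s1", 20.0)]
results = [average.update(key, value) for key, value in readings]
print(results)  # [10.0, 4.0, 15.0]
```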
Exactly-Once Semantics
Processing guarantee under which each stream event affects the results exactly once, even when failures cause retries; it is typically achieved by combining at-least-once delivery with deduplication, idempotent writes, or transactional commits.
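One common way to approximate this guarantee is to let the transport redeliver events and have the consumer ignore identifiers it has already applied. The sketch below assumes each event carries a unique ID (a hypothetical `(event_id, amount)` shape) and keeps the set of seen IDs in memory; a real system would have to store that set durably together with the result.

```python
def apply_once(deliveries):
    """Apply each event's amount once, skipping redelivered duplicates.

    deliveries: iterable of (event_id, amount) pairs; an at-least-once
    transport may deliver the same event_id more than once.
    """
    seen_ids = set()
    balance = 0
    for event_id, amount in deliveries:
        if event_id in seen_ids:
            continue  # duplicate delivery: already applied
        seen_ids.add(event_id)
        balance += amount
    return balance


deliveries = [("e1", 100), ("e2", -30), ("e1", 100), ("e3", 5), ("e2", -30)]
balance = apply_once(deliveries)
print(balance)  # 75
```

Without the deduplication step, the redelivered events would have produced a balance of 145.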
CEP (Complex Event Processing)
Event processing technology that identifies meaningful patterns and complex correlations from multiple event streams in real-time to trigger immediate actions.
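As a small illustration of pattern detection, the sketch below raises an alert when one user accumulates a given number of failed logins within a time window. The event shape `(timestamp, user, outcome)`, the function name, and the rule itself are assumptions chosen for the example; CEP engines usually express such rules in a dedicated pattern language.

```python
from collections import defaultdict, deque


def detect_bruteforce(events, threshold, window):
    """Alert when a user has `threshold` failures within `window` seconds.

    events: (timestamp, user, outcome) tuples in timestamp order.
    Returns a list of (user, timestamp_of_alert) pairs.
    """
    recent_failures = defaultdict(deque)
    alerts = []
    for timestamp, user, outcome in events:
        if outcome != "fail":
            continue
        failures = recent_failures[user]
        failures.append(timestamp)
        # Drop failures that have fallen out of the time window.
        while failures and timestamp - failures[0] > window:
            failures.popleft()
        if len(failures) >= threshold:
            alerts.append((user, timestamp))
            failures.clear()  # start counting afresh after an alert
    return alerts


events = [
    (0, "ann", "fail"),
    (5, "bob", "fail"),
    (10, "ann", "fail"),
    (20, "ann", "fail"),
    (25, "bob", "ok"),
    (100, "bob", "fail"),
]
alerts = detect_bruteforce(events, threshold=3, window=60)
print(alerts)  # [('ann', 20)]
```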
Micro-batching
Hybrid approach that processes a data stream as a series of small batches collected over short intervals, trading slightly higher latency than per-event processing for the throughput and simpler recovery of batch processing.
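The grouping step can be sketched as follows, assuming events arrive as `(timestamp_ms, value)` pairs in timestamp order. Each batch covers at most `interval_ms` milliseconds counted from its first event; the names and the event shape are assumptions of the example.

```python
def micro_batches(events, interval_ms):
    """Group a stream into small batches spanning at most `interval_ms`.

    events: (timestamp_ms, value) pairs in timestamp order.
    Yields one list of values per batch.
    """
    batch, batch_end = [], None
    for timestamp, value in events:
        if batch_end is None:
            batch_end = timestamp + interval_ms
        if timestamp >= batch_end:
            yield batch                     # interval elapsed: emit the batch
            batch = []
            batch_end = timestamp + interval_ms
        batch.append(value)
    if batch:
        yield batch                         # flush the final partial batch


events = [(0, 1), (400, 2), (1100, 3), (1200, 4), (3000, 5)]
batches = list(micro_batches(events, interval_ms=1000))
print(batches)  # [[1, 2], [3, 4], [5]]
```

Each emitted list can then be handed to ordinary batch code, for example `sum(batch)`.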
Event Sourcing
Architectural pattern where all state changes are recorded as an immutable sequence of events, allowing reconstruction of past states and complete system audit.
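A minimal sketch of the idea, assuming a hypothetical account whose only events are "deposited" and "withdrawn": the balance is never stored directly but recomputed by replaying the log, and replaying only a prefix reconstructs a past state.

```python
def replay(events, upto=None):
    """Rebuild an account balance by replaying events from the start.

    events: immutable sequence of (kind, amount) pairs.
    upto: replay only the first `upto` events (None replays all of them).
    """
    balance = 0
    for kind, amount in events[:upto]:
        if kind == "deposited":
            balance += amount
        elif kind == "withdrawn":
            balance -= amount
    return balance


# A tuple stands in for the append-only, immutable event log.
log = (("deposited", 100), ("withdrawn", 30), ("deposited", 5))
current_balance = replay(log)
balance_after_two = replay(log, upto=2)
print(current_balance, balance_after_two)  # 75 70
```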
Apache Storm
Distributed real-time stream processing system designed for extremely low latencies, using a topology of spouts and bolts to transform and analyze data streams.
Change Data Capture (CDC)
Technique that captures and propagates data changes from transactional databases to real-time streaming systems, enabling continuous synchronization and analysis.
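On the consuming side, a stream of change events can be applied to keep a downstream copy in sync. The sketch below assumes a simplified, hypothetical event format with `op`, `key`, and `row` fields; actual CDC tools define their own, richer formats.

```python
def apply_changes(replica, changes):
    """Apply a sequence of change events to an in-memory replica.

    replica: dict mapping primary key to row.
    changes: dicts with "op" ("insert", "update" or "delete"), "key",
             and, for inserts and updates, the new "row".
    """
    for change in changes:
        if change["op"] in ("insert", "update"):
            replica[change["key"]] = change["row"]
        elif change["op"] == "delete":
            replica.pop(change["key"], None)  # ignore already-missing keys
    return replica


changes = [
    {"op": "insert", "key": 1, "row": {"name": "Ada"}},
    {"op": "insert", "key": 2, "row": {"name": "Bob"}},
    {"op": "update", "key": 1, "row": {"name": "Ada L."}},
    {"op": "delete", "key": 2},
]
replica = apply_changes({}, changes)
print(replica)  # {1: {'name': 'Ada L.'}}
```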
Event Time vs Processing Time
Two notions of time in stream processing: event time is when the event actually occurred at its source, while processing time is when the system processes it. The two can differ because of network delays, buffering, or out-of-order arrival.
Stream Analytics
Discipline that applies advanced analytical techniques on continuous data streams to extract insights, detect anomalies and make real-time decisions.
Streaming Data Pipeline
Data pipeline architecture designed for continuous processing, where data flows through multiple transformation and enrichment stages as it arrives instead of being collected in intermediate batch storage first.
Message Queue
Middleware component that provides asynchronous communication between message producers and consumers, buffering messages so that they can be delivered reliably even when consumers are slow or temporarily unavailable.
Real-time ETL
Extract, transform, load (ETL) process that runs continuously on real-time streams, unlike traditional batch ETL, which runs periodically.
Apache Beam
Unified programming model for batch and stream data processing whose pipelines can be executed by multiple runners, such as Flink, Spark, or Dataflow.