Distributed Computing Models - Yapay Zeka Sözlüğü

MapReduce

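Parallel programming model for processing large datasets on clusters. Work is divided into two main phases: Map, which filters and transforms input records into key-value pairs, and Reduce, which aggregates the values collected for each key. A shuffle step between the two groups the pairs by key.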
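
As a minimal single-process sketch of the model (not a cluster implementation), a word count can be split into the same phases. The function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are illustrative choices, not part of any framework's API.

```python
from collections import defaultdict

def map_phase(line):
    # Map: emit a (word, 1) pair for every word in the input line.
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    # Shuffle: group all values by key so each key reaches one reducer.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce: aggregate the values collected for a single key.
    return key, sum(values)

def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

On a real cluster each call to `map_phase` and `reduce_phase` would run on a different worker; here they run sequentially only to show the data flow.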

Lambda Architecture

Data processing architecture that combines a batch layer for comprehensive analysis with a speed layer for real-time results; a serving layer merges both views to answer queries.
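
The merge of the two views can be sketched as follows, assuming a page-view counter as the workload. The event shape (`{"page": ...}`) and the function names are hypothetical, chosen only for illustration.

```python
from collections import Counter

def batch_view(master_events):
    # Batch path: recompute counts from the full event history.
    return Counter(event["page"] for event in master_events)

def speed_view(recent_events):
    # Speed path: count only events not yet absorbed by a batch run.
    return Counter(event["page"] for event in recent_events)

def serve(batch, speed):
    # Merge step: combine both views at query time.
    return batch + speed
```

For example, `serve(batch_view(history), speed_view(recent))` returns totals that include events newer than the last batch run.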

Kappa Architecture

Simplification of the Lambda architecture that uses a single stream processing pipeline: data is processed in real time, and historical queries are satisfied by replaying events through the same pipeline.
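
A rough sketch of the idea, assuming the events are kept in an ordered, replayable log (modeled here as a plain list): one function handles both live updates and historical recomputation. The `process` function and the event fields `user` and `amount` are hypothetical.

```python
def process(log, from_offset=0, state=None):
    # Single code path: fold events into state, starting at an offset.
    state = dict(state or {})
    for event in log[from_offset:]:
        user = event["user"]
        state[user] = state.get(user, 0) + event["amount"]
    return state
```

Replaying from offset 0 rebuilds the full state, while passing a saved state and a later offset continues from where processing stopped; both paths give the same result.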

Batch Processing

Processing mode in which data is collected and processed in batches at predefined intervals. It is optimized for throughput rather than latency and is typical of traditional ETL jobs.
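
As a small illustration, records can be grouped into fixed-size batches before processing. This sketch batches by count rather than by time interval, which is a simplification; the helper name `batches` is hypothetical.

```python
from itertools import islice

def batches(records, size):
    # Yield consecutive lists of at most `size` records.
    iterator = iter(records)
    while chunk := list(islice(iterator, size)):
        yield chunk
```

Each yielded list would then be handed to the processing job as one unit of work.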

Stream Processing

Continuous processing of data in motion as it is generated, enabling real-time analysis with minimal latency between capture and processing.
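
In contrast to collecting data first, a streaming computation updates its result with every incoming event. A minimal sketch, using a Python generator to stand in for an unbounded stream (the name `running_mean` is illustrative):

```python
def running_mean(stream):
    # Emit an updated mean after each event instead of waiting for all data.
    count, total = 0, 0.0
    for value in stream:
        count += 1
        total += value
        yield total / count
```

Only the count and the running total are kept in memory, so the input never needs to be stored in full.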

Distributed File System

File system that stores data across multiple servers while appearing to users as a single system, using replication to provide fault tolerance.

HDFS

Hadoop Distributed File System: a distributed file system designed to store petabytes of data on commodity hardware, achieving high fault tolerance through block replication.
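
The cost of block replication can be estimated with simple arithmetic. The sketch below assumes a 128 MiB block size and a replication factor of 3, which are common HDFS defaults but are configurable per cluster; the function name `storage_footprint` is hypothetical.

```python
import math

def storage_footprint(file_bytes, block_bytes=128 * 1024**2, replication=3):
    # Number of blocks the file is split into (the last block may be partial).
    blocks = math.ceil(file_bytes / block_bytes)
    # Every block is stored `replication` times across the cluster.
    replicas = blocks * replication
    # A partial last block only occupies its actual size.
    raw_bytes = file_bytes * replication
    return blocks, replicas, raw_bytes
```

Under these assumptions a 1 GiB file is split into 8 blocks, stored as 24 block replicas, and occupies 3 GiB of raw disk space.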

YARN

Yet Another Resource Negotiator: Hadoop's resource manager, which separates resource management from data processing so that multiple frameworks can run on the same cluster.

RDD

Resilient Distributed Dataset: the fundamental data structure of Spark, an immutable, partitioned collection of objects that can be computed in parallel and that recovers from failures automatically by recomputing lost partitions from their lineage.
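
The following toy class is not Spark's API; it is a plain-Python sketch of the idea that a derived dataset stores how to compute each partition from its parent rather than the data itself, so any single partition can be rebuilt on demand. `ToyRDD` and its methods are hypothetical names.

```python
class ToyRDD:
    def __init__(self, num_partitions, compute):
        self.num_partitions = num_partitions
        self._compute = compute  # function: partition index -> tuple of items

    @classmethod
    def parallelize(cls, data, num_partitions):
        # Split the input round-robin into immutable partitions.
        chunks = [tuple(data[i::num_partitions]) for i in range(num_partitions)]
        return cls(num_partitions, lambda i: chunks[i])

    def map(self, fn):
        # Record the transformation; nothing is computed yet.
        parent = self
        return ToyRDD(
            self.num_partitions,
            lambda i: tuple(fn(x) for x in parent.compute(i)),
        )

    def compute(self, index):
        # Rebuild one partition by walking back through its parents.
        return self._compute(index)

    def collect(self):
        return [x for i in range(self.num_partitions) for x in self.compute(i)]
```

Calling `compute(1)` on a mapped dataset recomputes only partition 1, which is the behavior a scheduler would rely on after losing the node that held it.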

Data Locality

Distributed computing principle of executing tasks on the nodes that already hold the data they need, which minimizes network transfer and improves performance.
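
A scheduler applying this principle might look like the sketch below, which prefers a free node that already stores the block and falls back to any free node otherwise. The function `pick_node`, the block ID, and the node names are hypothetical, and real schedulers typically also consider intermediate levels such as rack locality.

```python
def pick_node(block_id, block_locations, free_nodes):
    # Prefer a free node that already holds the block (no network transfer).
    for node in block_locations.get(block_id, []):
        if node in free_nodes:
            return node, "node-local"
    # Otherwise run remotely; the block must be fetched over the network.
    return sorted(free_nodes)[0], "remote"
```

The sketch assumes `free_nodes` is a non-empty set.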

Speculative Execution

Fault tolerance mechanism that launches duplicate copies of slow tasks on other nodes and uses the result of whichever copy completes first, reducing the impact of faulty or overloaded nodes.
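
The "first result wins" behavior can be sketched with threads standing in for nodes. As a simplification, this example starts all copies at once, whereas the mechanism described above launches duplicates only for tasks detected as slow; `attempt`, `run_speculatively`, and the delays are hypothetical.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def attempt(node, delay, value):
    # Simulate the same task running on a node with a given slowdown.
    time.sleep(delay)
    return node, value * 2

def run_speculatively(value, node_delays):
    with ThreadPoolExecutor(max_workers=len(node_delays)) as pool:
        futures = [
            pool.submit(attempt, node, delay, value)
            for node, delay in node_delays.items()
        ]
        # Return as soon as any copy finishes; the others are ignored.
        done, _ = wait(futures, return_when=FIRST_COMPLETED)
        return next(iter(done)).result()
```

A real framework would also kill the losing copies; here the pool simply waits for them to finish before shutting down.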

DAG

Directed Acyclic Graph: in Spark, the representation of a workflow in which transformations are linked by their dependencies without cycles, allowing the scheduler to order the steps and run independent ones in parallel.
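
To illustrate how a dependency graph exposes parallelism, the sketch below groups steps into "waves" whose members have no unmet dependencies and could therefore run at the same time. It uses Python's standard `graphlib` module (Python 3.9+), not Spark's scheduler; the function name `execution_waves` and the step names in the usage example are hypothetical.

```python
from graphlib import TopologicalSorter

def execution_waves(dependencies):
    # `dependencies` maps each step to the set of steps it depends on.
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # steps with all inputs available
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For a job where `join` depends on `clean_a` and `clean_b`, which in turn depend on `read_a` and `read_b`, the two reads form the first wave, the two cleaning steps the second, and the join the third.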

Fault Tolerance

Ability of a distributed system to continue functioning correctly in case of component failures, typically through redundancy, replication, and automatic recovery mechanisms.
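
One simple form of redundancy with automatic recovery is failover across replicas. In this sketch each replica is modeled as a callable that either returns a value or raises `ConnectionError`; the function name `read_with_failover` and that failure model are assumptions made for illustration.

```python
def read_with_failover(key, replicas):
    # Try each replica in turn; a failed replica is skipped, not fatal.
    for replica in replicas:
        try:
            return replica(key)
        except ConnectionError:
            continue
    # Only when every redundant copy is unavailable does the read fail.
    raise RuntimeError(f"all {len(replicas)} replicas failed for {key!r}")
```

The caller sees a successful read as long as at least one replica is reachable.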

Consistency Model

Contract defining data consistency guarantees in a distributed system, ranging from strong consistency to eventual consistency based on application needs.
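
One common way to tune a replicated store along this range is with quorums: with N replicas, writes acknowledged by W of them, and reads consulting R of them, every read set shares at least one replica with every write set when R + W > N. The brute-force check below verifies that overlap condition by enumeration; the function name is hypothetical, and overlap alone does not cover everything a real system needs for strong consistency (for example, version ordering).

```python
from itertools import combinations

def quorums_always_overlap(n, w, r):
    # Check every possible write set against every possible read set.
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )
```

With 3 replicas, `W = 2, R = 2` always overlaps, while `W = 1, R = 1` does not, which corresponds to the weaker, eventually consistent end of the range.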

Combiner

MapReduce optimization function that runs locally on each mapper's output and pre-aggregates it before the reduce phase, reducing the volume of data transferred during the shuffle.
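
The effect on shuffle volume can be seen in a small word-count sketch. It assumes the aggregation is a sum, which is associative and commutative and therefore safe to apply partially on each mapper; `map_words` and `combine` are illustrative names, not a framework API.

```python
from collections import Counter

def map_words(split):
    # Mapper output: one (word, 1) pair per word in this input split.
    return [(word, 1) for word in split.split()]

def combine(pairs):
    # Combiner: sum counts locally so fewer pairs leave the mapper.
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return sorted(local.items())
```

For the split `"a a a b"` the mapper emits four pairs, but after combining only two, `("a", 3)` and `("b", 1)`, need to cross the network.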
