AI Glossary

A complete dictionary of artificial intelligence

162 categories · 2,032 subcategories · 23,060 terms

Apache Hadoop MapReduce

Programming model and distributed implementation for processing large datasets on clusters. Work is split into a Map phase, which turns input records into key-value pairs, and a Reduce phase, which aggregates the values for each key. MapReduce was one of the first widely adopted frameworks for large-scale batch processing.
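
The two phases can be illustrated with a minimal single-process sketch in plain Python. It does not use Hadoop; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`) are hypothetical and chosen only for illustration, with word counting as the example job.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would do between the phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

counts = word_count(["big data big clusters", "data pipelines"])
print(counts)
```

On a real cluster the map and reduce calls would run on different machines; here they run in one process so that the data flow is easy to follow.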

ETL (Extract, Transform, Load)

Data integration process consisting of extracting data from heterogeneous sources, transforming it according to defined business rules, then loading it into a target system. ETL pipelines are typically executed in batch to synchronize data.
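
As a rough illustration of the three steps, the sketch below runs a tiny batch ETL with the Python standard library only. The CSV content, the `sales` table, and the rule of dropping rows with a missing amount are all assumptions made up for this example.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real pipeline would read files, APIs, or databases.
RAW_CSV = "id,name,amount\n1, alice ,10.5\n2,BOB,\n3,carol,7\n"

def extract(text):
    """Extract: parse raw CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: apply example business rules (trim, normalize case, drop incomplete rows)."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # assumed rule: skip rows without an amount
        cleaned.append((int(row["id"]), row["name"].strip().title(), float(row["amount"])))
    return cleaned

def load(conn, records):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO sales VALUES (?, ?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute("SELECT id, name, amount FROM sales ORDER BY id").fetchall()
print(loaded)
```

Using `INSERT OR REPLACE` keeps the load step idempotent, so re-running the same batch does not duplicate rows.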

Job Scheduling

Mechanism that automatically runs batch processing tasks according to predefined schedules, dependencies, or event triggers. Modern schedulers also handle parallel execution, retries, and execution monitoring.
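
Dependency ordering and retries can be sketched with the standard-library `graphlib` module (Python 3.9+). This is a toy, sequential runner, not how any particular scheduler is implemented; `run_jobs` and the three job functions are hypothetical names.

```python
from graphlib import TopologicalSorter

def run_jobs(jobs, dependencies, max_retries=2):
    """Run each job after its dependencies, retrying a failed job up to max_retries times."""
    order = list(TopologicalSorter(dependencies).static_order())
    log = []
    for name in order:
        for attempt in range(1, max_retries + 2):
            try:
                jobs[name]()
                log.append((name, attempt))  # record which attempt succeeded
                break
            except Exception:
                if attempt == max_retries + 1:
                    raise  # retries exhausted: surface the failure

    return log

attempts = {"transform": 0}

def extract():
    pass

def transform():
    # Simulate a transient failure on the first attempt only.
    attempts["transform"] += 1
    if attempts["transform"] == 1:
        raise RuntimeError("transient failure")

def load():
    pass

log = run_jobs(
    {"extract": extract, "transform": transform, "load": load},
    {"transform": {"extract"}, "load": {"transform"}},  # job -> jobs it depends on
)
print(log)
```

A production scheduler would add time-based triggers, parallel execution of independent jobs, and monitoring on top of this ordering logic.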

Shuffling

Costly redistribution of data between cluster nodes during the grouping or aggregation phases of distributed processing, so that all records sharing a key reach the same node. Because it involves network transfer and disk I/O, shuffling is often the main bottleneck in MapReduce and Spark jobs.
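
The redistribution step can be sketched as hash partitioning: every key is assigned to one reducer, and all of its values are sent there. The code below is a single-process illustration with hypothetical names; `zlib.crc32` is used as an assumed stand-in for a framework's partitioner because Python's built-in `hash` for strings varies between runs.

```python
import zlib
from collections import defaultdict

def partition_for(key, num_reducers):
    """Pick the reducer responsible for a key using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def shuffle(mapper_outputs, num_reducers):
    """Route every (key, value) pair from every mapper to its key's reducer."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in mapper_outputs:
        for key, value in pairs:
            partitions[partition_for(key, num_reducers)][key].append(value)
    return [dict(p) for p in partitions]

# Hypothetical output of three mappers.
mapper_outputs = [
    [("apple", 1), ("banana", 1)],
    [("apple", 1), ("cherry", 1)],
    [("banana", 1)],
]
partitions = shuffle(mapper_outputs, num_reducers=2)
print(partitions)
```

In a real cluster each routed pair may cross the network, which is why reducing the volume of data before the shuffle (for example with a combiner) tends to pay off.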

HDFS

Distributed file system designed to store large files on commodity machines, with fault tolerance through block replication. HDFS provides high-throughput sequential access suitable for batch processing with MapReduce.
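
The idea of splitting a file into blocks and replicating each block can be sketched as follows. The round-robin placement is a deliberate simplification and not HDFS's actual rack-aware policy; `plan_blocks` and the node names are hypothetical, and the 128 MB block size and replication factor of 3 are used here as commonly cited defaults.

```python
import math

MB = 1024 * 1024

def plan_blocks(file_size, block_size, nodes, replication=3):
    """Return (block_index, replica_nodes) for each block of a file."""
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication factor")
    num_blocks = math.ceil(file_size / block_size)
    plan = []
    for block in range(num_blocks):
        # Simplified round-robin placement on distinct nodes.
        replicas = [nodes[(block + r) % len(nodes)] for r in range(replication)]
        plan.append((block, replicas))
    return plan

plan = plan_blocks(300 * MB, 128 * MB, ["node1", "node2", "node3", "node4"])
for block, replicas in plan:
    print(block, replicas)

# Fault tolerance check: every block is still readable if node1 is lost.
survives = all(any(n != "node1" for n in replicas) for _, replicas in plan)
```

Because each block lives on several distinct machines, losing a single node leaves at least one copy of every block available.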

YARN

Resource manager for the Hadoop ecosystem, responsible for allocating cluster compute resources, chiefly CPU (virtual cores) and memory, to distributed applications. YARN enables concurrent execution of multiple processing frameworks on the same Hadoop cluster.
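
Resource allocation across applications can be illustrated with a toy first-fit allocator. This is only a sketch of the general idea and not YARN's scheduling algorithm; the `allocate` function, node names, capacities, and requests are all made up for the example.

```python
def allocate(nodes, requests):
    """First-fit: place each (app, memory_mb, vcores) request on the first node with room."""
    free = {name: dict(capacity) for name, capacity in nodes.items()}
    placements = []
    for app, memory_mb, vcores in requests:
        for name, capacity in free.items():
            if capacity["memory_mb"] >= memory_mb and capacity["vcores"] >= vcores:
                capacity["memory_mb"] -= memory_mb
                capacity["vcores"] -= vcores
                placements.append((app, name))
                break
        else:
            placements.append((app, None))  # no node has room: request stays pending
    return placements

nodes = {
    "n1": {"memory_mb": 8192, "vcores": 4},
    "n2": {"memory_mb": 4096, "vcores": 2},
}
requests = [("spark", 6144, 2), ("mapreduce", 4096, 2), ("hive", 4096, 2)]
placements = allocate(nodes, requests)
print(placements)
```

Here two different frameworks share the cluster, while the third request waits until resources are released.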

Apache Sqoop

Bidirectional data transfer tool between Hadoop and relational databases, optimized for massively parallel imports and exports. Sqoop automatically generates the MapReduce jobs needed to move the data efficiently.
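
Parallel import typically relies on dividing a table into key ranges so that each worker reads its own slice. The sketch below shows that idea in plain Python; it is not Sqoop's code, and `split_ranges`, the `orders` table, and the `id` column are hypothetical.

```python
def split_ranges(min_id, max_id, num_mappers):
    """Divide the inclusive key range [min_id, max_id] into near-equal contiguous slices."""
    total = max_id - min_id + 1
    base, extra = divmod(total, num_mappers)
    ranges = []
    start = min_id
    for i in range(num_mappers):
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # more mappers than rows: skip empty slices
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges

ranges = split_ranges(1, 10, 4)

# One bounded query per worker (table and column names are illustrative only).
queries = [f"SELECT * FROM orders WHERE id BETWEEN {lo} AND {hi}" for lo, hi in ranges]
for query in queries:
    print(query)
```

Evenly sized ranges only balance the work when key values are spread evenly; a skewed key column would leave some workers with far more rows than others.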

Apache Hive

Data warehousing infrastructure built on Hadoop that provides a SQL-like language (HiveQL) for querying large volumes of data stored in HDFS. Hive compiles queries into batch jobs, originally MapReduce and, in later versions, also Tez or Spark.
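
To give an intuition for how a SQL-like query can become a batch job, the sketch below emulates one aggregate query by hand: the `WHERE` filter runs in the map step and the `SUM ... GROUP BY` runs in the reduce step. It is an illustration in plain Python, not Hive's query planner; the `orders` rows and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows of an `orders` table: (region, status, amount)
ORDERS = [
    ("north", "paid", 100.0),
    ("south", "paid", 40.0),
    ("north", "cancelled", 25.0),
    ("north", "paid", 60.0),
]

# Query being emulated:
#   SELECT region, SUM(amount) FROM orders WHERE status = 'paid' GROUP BY region

def map_row(row):
    """Map: apply the WHERE filter and emit (group key, value)."""
    region, status, amount = row
    if status == "paid":
        yield region, amount

def reduce_group(region, amounts):
    """Reduce: compute the aggregate for one group."""
    return region, sum(amounts)

def run_query(rows):
    groups = defaultdict(list)
    for row in rows:
        for key, value in map_row(row):
            groups[key].append(value)
    return sorted(reduce_group(key, values) for key, values in groups.items())

result = run_query(ORDERS)
print(result)
```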
