AI Glossary

The Complete Dictionary of Artificial Intelligence

162 categories
2,032 subcategories
23,060 terms
📖 Terms

HDFS

Hadoop's primary distributed file system, designed to store petabytes of data on clusters of commodity machines with automatic block replication and fault tolerance.
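
The replication idea can be illustrated with a toy model. This is a minimal sketch, not HDFS code: the function names and the round-robin placement are assumptions made for illustration (real HDFS placement is rack-aware), while the 128 MB block size and replication factor of 3 mirror common Hadoop defaults.

```python
import math

def place_blocks(file_size, nodes, block_size=128 * 1024**2, replication=3):
    """Split a file into fixed-size blocks and assign each block to
    `replication` distinct nodes (hypothetical round-robin policy)."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    n_blocks = math.ceil(file_size / block_size)
    placement = {}
    for block in range(n_blocks):
        # Consecutive nodes, wrapping around the cluster list.
        placement[block] = [nodes[(block + r) % len(nodes)]
                            for r in range(replication)]
    return placement

def readable_after_failure(placement, failed_node):
    """A file stays readable if every block keeps at least one live replica."""
    return all(any(node != failed_node for node in replicas)
               for replicas in placement.values())

placement = place_blocks(300 * 1024**2, ["n1", "n2", "n3", "n4"])
```

With a 300 MB file the sketch yields three blocks, and losing any single node still leaves two replicas of each block available.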

MapReduce

Programming paradigm and implementation for distributed processing of large datasets on clusters, dividing each job into a map phase and a reduce phase.
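
The two phases can be shown with the classic word count, run here in a single process. This is a sketch of the paradigm only, not of the Hadoop API: the function names are illustrative, and the `shuffle` step stands in for the grouping a real framework performs between the phases.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key between the two phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values)
                for key, values in shuffle(pairs).items())

counts = word_count(["big data", "big cluster"])
```

On a cluster, the map calls would run in parallel on different input splits and the reduce calls in parallel on different keys.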

YARN

Hadoop's resource manager that orchestrates the allocation of CPU and memory resources to applications while managing task lifecycles in the cluster.

HBase

Distributed, column-oriented, non-relational NoSQL database built on HDFS, offering real-time access to massive data with strong consistency.

Hive

Data warehouse infrastructure on Hadoop enabling querying of large datasets with a SQL-like language (HiveQL), with queries compiled into jobs for an execution engine such as MapReduce, Tez, or Spark.

Pig

High-level data analysis platform using the Pig Latin language to express complex data transformation programs executed on Hadoop.

Spark

Unified processing engine for Big Data that keeps intermediate data in memory where possible, offering APIs in Scala, Java, Python and R with support for SQL, streaming, machine learning and graph processing.

ZooKeeper

Centralized coordination service for distributed applications, used for maintaining configuration information, naming, distributed synchronization, and group services.

Flume

Distributed, reliable, and available service for collecting, aggregating, and moving large amounts of streaming data to HDFS with an agent-based architecture.

Sqoop

Tool designed to efficiently transfer bulk data between Hadoop and structured data stores such as relational databases.

Oozie

Workflow scheduler and coordinator for managing and executing complex Hadoop data processing pipelines with time-based and conditional dependencies.

Mahout

Library of distributed machine learning and data mining algorithms implemented on Hadoop MapReduce for processing large datasets.

Ambari

Platform with a web interface for provisioning, managing, and monitoring Hadoop clusters and the services of the Hadoop ecosystem.

HCatalog

Metadata and table management service for the Hadoop ecosystem, providing a unified view of data for tools like Pig, Hive, and MapReduce.

Avro

Data serialization system with support for schema evolution, providing a compact and fast binary format for data exchange between Hadoop services.
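
Schema evolution means a reader with a newer schema can still decode records written with an older one. The sketch below models that resolution rule on plain dictionaries. It does not use the Avro library, and `REQUIRED` and `read_with_schema` are hypothetical names introduced for illustration.

```python
REQUIRED = object()  # sentinel: field has no default in the reader's schema

def read_with_schema(record, reader_schema):
    """Resolve a written record against the reader's schema.

    reader_schema maps field name -> default value (or REQUIRED).
    Fields unknown to the reader are ignored; fields missing from the
    record take the reader's default, or fail if there is none.
    """
    result = {}
    for name, default in reader_schema.items():
        if name in record:
            result[name] = record[name]
        elif default is not REQUIRED:
            result[name] = default
        else:
            raise KeyError(f"missing required field: {name}")
    return result

old_record = {"id": 1, "name": "a", "legacy": True}
new_schema = {"id": REQUIRED, "name": REQUIRED, "email": None}
resolved = read_with_schema(old_record, new_schema)
```

Here the dropped `legacy` field is skipped and the newly added `email` field is filled from its default, so old data stays readable after the schema changes.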

Parquet

Columnar file format optimized for analytical query performance on Hadoop, with efficient compression and support for complex types.
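
Why a columnar layout helps analytics and compression can be seen in a small model. This is a sketch on in-memory lists rather than the Parquet format itself: the helper names are illustrative, every row is assumed to have the same keys, and run-length encoding stands in for the family of encodings a columnar format can apply.

```python
def to_columns(rows):
    """Pivot row-oriented records into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns

def run_length_encode(values):
    """Collapse runs of repeated values into (value, count) pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return [tuple(pair) for pair in encoded]

rows = [
    {"country": "DE", "amount": 10},
    {"country": "DE", "amount": 20},
    {"country": "FR", "amount": 5},
]
columns = to_columns(rows)
total = sum(columns["amount"])  # an aggregate touches only one column
packed = run_length_encode(columns["country"])
```

Because values of one column sit together, a query that sums `amount` never reads `country`, and repetitive columns compress well.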

Impala

Massively parallel SQL query engine for Hadoop providing low-latency interactive query performance on data stored in HDFS and HBase.

Tez

Execution framework for Hadoop YARN that models a job as a directed acyclic graph (DAG) of tasks, improving performance of complex processing by eliminating unnecessary MapReduce phases.
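
The DAG idea can be sketched with the Python standard library. This is not the Tez API: the stage names are made up, and `graphlib.TopologicalSorter` (Python 3.9+) is used only to show that a whole multi-stage job can be ordered as one graph instead of a chain of separate map and reduce jobs.

```python
from graphlib import TopologicalSorter

def execution_order(dag):
    """Return one valid run order for a DAG.

    dag maps each stage to the set of stages it depends on.
    """
    return list(TopologicalSorter(dag).static_order())

# Hypothetical query plan: two scans feed a join, which feeds an aggregate.
dag = {
    "join": {"scan_a", "scan_b"},
    "aggregate": {"join"},
}
order = execution_order(dag)
```

Both scans appear before the join, and the aggregate comes last. The two scans have no dependency on each other, so a scheduler is free to run them in parallel.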

Storm

Distributed real-time stream processing system, often deployed alongside Hadoop, capable of processing massive volumes of data with millisecond-level latencies.

Kafka

High-throughput, highly available distributed messaging platform for collecting and processing real-time data streams, commonly used as an ingestion layer in the Hadoop ecosystem.
