AI Glossary
Complete Dictionary of Artificial Intelligence
Apache Spark SQL
Apache Spark module that exposes SQL and DataFrame interfaces for distributed queries, using the Catalyst optimizer for query planning and the Tungsten engine for efficient execution.
Presto
Open-source distributed SQL query engine designed for interactive analysis of large-scale data, querying each source in place rather than first copying the data into a single system.
Apache Drill
Distributed SQL query engine that analyzes NoSQL data and structured files without requiring a predefined schema, with native JSON support.
HiveQL
SQL-like query language for Apache Hive. Hive compiles each query into MapReduce or Tez jobs for distributed data analysis in Hadoop.
Apache Impala
Massively parallel processing (MPP) SQL query engine for Hadoop that provides low-latency analytics through its own native execution engine, which reads data directly instead of going through MapReduce.
Trino
High-performance distributed SQL query engine, formerly PrestoSQL, optimized for federated data analysis across multiple sources with parallel execution.
Cost-Based Optimization
Optimization strategy using statistics on data volumes and distributions to evaluate and select the most efficient execution plan.
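As an illustration only, the idea can be sketched in a few lines of Python. The table names, row counts, selectivities, and the cost formula (the sum of estimated intermediate result sizes) are all hypothetical assumptions made for this example; real optimizers use much richer statistics and cost models.

```python
from itertools import permutations

# Hypothetical statistics: estimated row counts per table.
ROW_COUNTS = {"orders": 1_000_000, "customers": 50_000, "regions": 10}

# Hypothetical join selectivities: fraction of the cross product that survives.
SELECTIVITY = {
    frozenset({"orders", "customers"}): 1 / 50_000,
    frozenset({"customers", "regions"}): 1 / 10,
}


def estimate_cost(order):
    """Estimated cost of a left-deep join order: sum of intermediate sizes."""
    joined = {order[0]}
    rows = ROW_COUNTS[order[0]]
    cost = 0.0
    for table in order[1:]:
        # Combine the selectivity of every predicate linking the new table
        # to a table already joined; no predicate means a cross product.
        sel = 1.0
        for other in joined:
            sel *= SELECTIVITY.get(frozenset({table, other}), 1.0)
        rows = rows * ROW_COUNTS[table] * sel
        cost += rows
        joined.add(table)
    return cost


def choose_plan(tables):
    """Enumerate every join order and keep the one with the lowest estimate."""
    return min(permutations(tables), key=estimate_cost)
```

With these assumed statistics, the cheapest plans join the two small tables first and bring in the large `orders` table last, because that keeps the intermediate results small.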
Apache Calcite
Dynamic data management framework providing SQL parsing, validation, optimization, and query execution for many distributed database engines.
Vectorized Query Execution
Query execution technique that processes data in batches rather than row by row, improving CPU cache utilization and performance.
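The contrast with row-by-row processing can be sketched as follows. This is a structural illustration in plain Python, assuming a hypothetical query that sums `qty` where `price` exceeds a threshold; the function names and the tiny batch size are invented for the example, and pure Python will not show the cache benefits a native engine gets.

```python
BATCH_SIZE = 4  # hypothetical; kept tiny so the example is easy to trace


def sum_row_at_a_time(rows, threshold):
    """Row-by-row model: filter and aggregate are interleaved per row."""
    total = 0
    for row in rows:
        if row["price"] > threshold:
            total += row["qty"]
    return total


def sum_vectorized(price_col, qty_col, threshold):
    """Batch model: each operator runs a tight loop over a slice of a column."""
    total = 0
    for start in range(0, len(price_col), BATCH_SIZE):
        prices = price_col[start:start + BATCH_SIZE]
        qtys = qty_col[start:start + BATCH_SIZE]
        # Filter operator: produce a selection vector for the whole batch.
        selected = [i for i, p in enumerate(prices) if p > threshold]
        # Aggregate operator: consume the selection vector in one pass.
        total += sum(qtys[i] for i in selected)
    return total
```

Both functions return the same result; the difference is that the batch version touches contiguous column values one operator at a time, which is the access pattern that benefits CPU caches in a compiled engine.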
Distributed Join
Data join operation distributed across multiple nodes, requiring partitioning and shuffle strategies to efficiently combine distributed datasets.
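One common strategy, a hash-partitioned shuffle join, can be sketched as below. This is a single-process simulation under stated assumptions: "nodes" are just Python lists, rows are tuples whose first field is the join key, and all function names are hypothetical.

```python
from collections import defaultdict


def shuffle(rows, num_nodes):
    """Route each row to a node chosen by hashing its join key (field 0)."""
    partitions = [[] for _ in range(num_nodes)]
    for row in rows:
        partitions[hash(row[0]) % num_nodes].append(row)
    return partitions


def local_hash_join(left, right):
    """Join two co-located partitions: build a hash table, then probe it."""
    table = defaultdict(list)
    for row in left:
        table[row[0]].append(row)
    return [l + r[1:] for r in right for l in table.get(r[0], [])]


def distributed_join(left, right, num_nodes=3):
    """Shuffle both inputs by key, then join each pair of partitions locally."""
    left_parts = shuffle(left, num_nodes)
    right_parts = shuffle(right, num_nodes)
    result = []
    for left_part, right_part in zip(left_parts, right_parts):
        result.extend(local_hash_join(left_part, right_part))
    return result
```

Because both inputs are partitioned with the same hash function, rows with equal keys always land on the same node, so each node can join its partitions without seeing the rest of the data.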
Adaptive Query Execution
Dynamic optimization approach that adjusts the execution plan at runtime, using statistics collected from stages that have already run, to improve performance.
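A minimal sketch of one such adjustment, switching a join strategy once the true input size is known, is shown below. The threshold, the strategy names, and the functions are hypothetical and chosen only for illustration; they are not the API of any particular engine.

```python
BROADCAST_THRESHOLD_ROWS = 1_000  # hypothetical cutoff for a broadcast join


def plan_join(row_count):
    """Pick a join strategy from a row count (estimated or observed)."""
    if row_count <= BROADCAST_THRESHOLD_ROWS:
        return "broadcast_hash_join"
    return "shuffle_join"


def execute_adaptively(scan_stage, estimated_rows):
    """Plan from the estimate, run the upstream stage, then re-plan."""
    initial_plan = plan_join(estimated_rows)
    # Running the upstream stage yields its real output size.
    rows = list(scan_stage())
    # Re-optimize the remaining plan using the observed statistic.
    final_plan = plan_join(len(rows))
    return initial_plan, final_plan
```

If the planner assumed a million rows but a selective filter left only a handful, the initial plan would be a shuffle join and the re-planned strategy would be a broadcast join.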