AI Glossary

The complete dictionary of Artificial Intelligence

162 categories, 2,032 subcategories, 23,060 terms
Amazon S3

Highly scalable cloud object storage service from AWS, designed for 99.999999999% (eleven nines) durability and widely used as the primary repository for Big Data, with storage classes adapted to different access patterns.
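
To put the durability figure in perspective, here is a small arithmetic sketch. It assumes the figure is read as an annual, per-object design target and that objects are lost independently of each other, which is a simplification; the function names are illustrative only and are not part of any AWS API.

```python
from fractions import Fraction

# Durability figure from the entry above, kept as an exact fraction
# so the arithmetic is not blurred by floating-point rounding.
DURABILITY = Fraction("0.99999999999")


def expected_annual_losses(object_count: int, durability: Fraction = DURABILITY) -> Fraction:
    """Expected number of objects lost per year (assumes independent losses)."""
    return object_count * (1 - durability)


def years_per_single_loss(object_count: int, durability: Fraction = DURABILITY) -> Fraction:
    """Average number of years until one object is expected to be lost."""
    return 1 / expected_annual_losses(object_count, durability)


losses = expected_annual_losses(10_000_000)
years = years_per_single_loss(10_000_000)
print(f"Expected losses per year for 10,000,000 objects: {float(losses)}")
print(f"Roughly one lost object every {int(years):,} years")
```

Under these assumptions, storing ten million objects corresponds to an expected loss of about one object every 10,000 years.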

Amazon EMR

Managed AWS service for running Big Data frameworks like Apache Spark, Hadoop, and Presto on dynamic clusters, enabling large-scale distributed processing with simplified infrastructure management.

Amazon Redshift

Fully managed cloud data warehouse from AWS that uses a massively parallel processing (MPP) architecture to analyze petabytes of data, with performance optimized for complex analytical queries.

Amazon Athena

Serverless interactive query service from AWS allowing direct analysis of data in S3 using standard SQL, without requiring infrastructure management or prior data loading.

AWS Glue

Serverless ETL service from AWS that automates data discovery, preparation, and loading with a centralized data catalog and built-in transformation capabilities based on Apache Spark.

Azure Data Lake Storage

Massively scalable and secure data repository from Azure optimized for Big Data analytical workloads, combining the capacity of a data lake with file-system semantics such as directories and fine-grained access control.

Azure Synapse Analytics

Unified hybrid analytics platform from Azure integrating data warehousing, data integration, and Big Data analytics with SQL and Spark processing capabilities in the same environment.

Azure Databricks

Unified analytics service based on Apache Spark in Azure, offering a collaborative environment for Big Data processing, machine learning, and real-time analytics with optimized clusters.

Google Cloud Storage

Google Cloud's unified object storage service offering high availability, durability, and performance for Big Data, with storage classes optimized for different access frequencies.

Google BigQuery

Google Cloud's serverless data warehouse enabling interactive SQL analysis of petabytes of data, with compute capacity that scales automatically according to query needs.

Google Dataproc

Google Cloud's managed service for running Apache Spark and Hadoop with quickly provisioned clusters, offering native integration with the GCP ecosystem and optimized costs for Big Data processing.

Google Dataflow

Google Cloud's serverless stream and batch processing service based on Apache Beam, enabling execution of distributed data pipelines with autoscaling and simplified management.

Snowflake

Multi-cloud data platform offering a fully managed data warehouse in which compute is separated from storage, enabling independent scaling of each and secure data sharing.

ELT Pipeline

Modern data integration pattern where data is first loaded in raw form into a cloud warehouse and then transformed there using the warehouse's own compute, which suits massive volumes.
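
The load-then-transform order can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for a cloud warehouse; the table names, columns, and sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical raw extract: every field arrives as an uncleaned string.
RAW_ROWS = [
    ("2024-01-01", "eu", "1990"),
    ("2024-01-01", "us", "500"),
    ("2024-01-02", "eu", "1010"),
]


def run_elt(rows):
    """Load raw rows first, then transform them with SQL inside the engine."""
    conn = sqlite3.connect(":memory:")  # stand-in for a cloud warehouse

    # Load: store the data exactly as received, with no cleaning or typing.
    conn.execute(
        "CREATE TABLE raw_orders (order_date TEXT, region TEXT, amount_cents TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)

    # Transform: cast and aggregate using the engine's own compute.
    conn.execute(
        """
        CREATE TABLE revenue_by_region AS
        SELECT region, SUM(CAST(amount_cents AS INTEGER)) AS revenue_cents
        FROM raw_orders
        GROUP BY region
        """
    )

    result = conn.execute(
        "SELECT region, revenue_cents FROM revenue_by_region ORDER BY region"
    ).fetchall()
    conn.close()
    return result


print(run_elt(RAW_ROWS))
```

Because the raw table is kept, the transformation can be rewritten and rerun later without extracting the data again, which is one commonly cited reason for preferring this order.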

Auto-scaling Cluster

Capability of cloud Big Data platforms to dynamically adjust the number of compute nodes based on workload, optimizing costs and performance without manual intervention.
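
As a rough illustration of the scaling decision, the sketch below sizes a cluster from the number of pending tasks and clamps the result between a minimum and a maximum node count. The policy and its parameter names are assumptions made for this example; real platforms rely on their own metrics and rules.

```python
import math


def desired_nodes(pending_tasks: int, tasks_per_node: int,
                  min_nodes: int, max_nodes: int) -> int:
    """Return a node count for the current workload (hypothetical policy)."""
    if tasks_per_node <= 0:
        raise ValueError("tasks_per_node must be positive")
    if min_nodes > max_nodes:
        raise ValueError("min_nodes must not exceed max_nodes")

    # Nodes needed to absorb the backlog, rounded up to whole nodes.
    needed = math.ceil(pending_tasks / tasks_per_node)

    # Clamp: never drop below the floor, never exceed the cost ceiling.
    return max(min_nodes, min(max_nodes, needed))


print(desired_nodes(95, 10, min_nodes=2, max_nodes=20))    # moderate load
print(desired_nodes(0, 10, min_nodes=2, max_nodes=20))     # idle cluster
print(desired_nodes(1000, 10, min_nodes=2, max_nodes=20))  # heavy load
```

The floor keeps a baseline of capacity available when the cluster is idle, and the ceiling caps cost when the workload spikes.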

Serverless Analytics

Data analytics paradigm where the underlying infrastructure is fully managed by the cloud provider, allowing users to focus on analytical logic without managing servers or clusters.
