Advanced
Data Engineering ETL Pipelines
Designs scalable ETL/ELT pipelines with Airflow, dbt, or Spark.
📝 Prompt Content
You are a Data Engineering expert. I want to build data pipelines for [SOURCE TO DESTINATION].
Cover the complete ETL/ELT pipeline:
1. **Data Ingestion** : Batch vs streaming, change data capture, API connectors
2. **Data Transformation** : SQL transformations, Python/Spark jobs, dbt models
3. **Orchestration** : Apache Airflow DAGs, Prefect flows, Luigi pipelines
4. **Data Quality** : Validation rules, anomaly detection, data profiling
5. **Storage Architecture** : Data lakehouse, Delta Lake, Iceberg, Hudi
6. **Processing Frameworks** : Apache Spark, Flink, Beam for distributed processing
7. **Monitoring & Alerting** : Pipeline health checks, SLA monitoring, failure alerts
8. **Schema Management** : Schema evolution, data contracts, versioning
9. **Security & Governance** : Data encryption, access controls, data lineage
10. **Cost Optimization** : Resource allocation, spot instances, auto-scaling
Provide the Airflow/dbt configurations, the Spark scripts, the data schemas, and the monitoring dashboards.
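
To give a sense of the kind of output this prompt asks for, here is a minimal sketch of items 1, 2, and 4 (ingestion, transformation, data quality) for a single batch run. It uses only the Python standard library. The sample CSV, the `Order` type, the two validation rules, and the SQLite target are all assumptions made for this example; a real pipeline would replace them with the orchestrator, transformation tool, and storage layer named in the list.

```python
import csv
import io
import sqlite3
from dataclasses import dataclass

# Hypothetical sample source. A real pipeline would read from an API,
# a file drop, or a change data capture stream.
RAW_CSV = """order_id,customer,amount
1,Alice,19.90
2,bob,-5.00
3,,12.50
4,carol,7.25
"""


@dataclass(frozen=True)
class Order:
    order_id: int
    customer: str
    amount: float


def extract(raw):
    """Ingestion: parse the raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw)))


def validate(row):
    """Data quality: return the list of rule violations for one row."""
    errors = []
    if not row["customer"]:
        errors.append("customer is empty")
    try:
        if float(row["amount"]) < 0:
            errors.append("amount is negative")
    except ValueError:
        errors.append("amount is not numeric")
    return errors


def transform(rows):
    """Transformation: split rows into typed valid records and rejects."""
    valid, rejected = [], []
    for row in rows:
        errors = validate(row)
        if errors:
            rejected.append((row, errors))
        else:
            valid.append(
                Order(
                    order_id=int(row["order_id"]),
                    customer=row["customer"].strip().lower(),
                    amount=float(row["amount"]),
                )
            )
    return valid, rejected


def load(conn, orders):
    """Load: upsert by primary key so a rerun does not duplicate rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT NOT NULL, amount REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO orders VALUES (?, ?, ?)",
        [(o.order_id, o.customer, o.amount) for o in orders],
    )
    conn.commit()


def run_pipeline(conn, raw=RAW_CSV):
    """Run extract, transform, load once; return (loaded, rejected) counts."""
    valid, rejected = transform(extract(raw))
    load(conn, valid)
    return len(valid), len(rejected)
```

Calling `run_pipeline(sqlite3.connect(":memory:"))` loads the rows that pass validation and reports how many were rejected. Keying the load on `order_id` keeps the job idempotent, which matters once an orchestrator starts retrying failed runs.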