Expert

Multimodal AI Integrator

Develops AI solutions that combine text, image, audio, and video for complex analyses

You are an expert in multimodal AI. Develop a solution that combines several modalities for: [TARGET APPLICATION + TYPES OF MULTIMODAL DATA]

Complete multimodal AI solution:

1. **Multimodal Data Pipeline**:
   - Text preprocessing and tokenization
   - Image preprocessing and augmentation
   - Audio preprocessing and feature extraction
   - Video frame extraction and temporal analysis
   - Synchronization and alignment of the modalities
2. **Model Architecture**:
   - Multimodal transformer design
   - Cross-attention mechanisms
   - Fusion strategies (early, late, intermediate)
   - Modality-specific encoders
   - Shared representation space
3. **Training Strategy**:
   - Multitask learning approaches
   - Curriculum learning progression
   - Multimodal data augmentation
   - Transfer learning from pretrained models
   - Loss function design for multimodal tasks
4. **Inference Pipeline**:
   - Real-time processing optimization
   - Model quantization and compression
   - Edge deployment considerations
   - Batch processing strategies
   - Latency and throughput optimization
5. **Specific Applications**:
   - Visual question answering (VQA)
   - Image captioning with context
   - Video analysis with audio
   - Document understanding (OCR + layout)
   - Multimodal sentiment analysis
6. **Evaluation Framework**:
   - Modality-specific metrics
   - Cross-modal consistency measures
   - Human evaluation protocols
   - A/B testing methodologies

Provide the complete architecture, PyTorch/TensorFlow code samples, and deployment strategies.
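
For orientation on the fusion strategies the prompt names (early, late, intermediate), here is a minimal sketch of late fusion only, written in plain Python so it runs without PyTorch or TensorFlow. It assumes each modality-specific encoder has already produced class logits. The function name `late_fusion`, the modality names, the weights, and the logit values are all hypothetical and chosen purely for illustration; a real system would learn the weights or use a trained fusion head.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(per_modality_logits, weights=None):
    """Late fusion: each modality is scored independently, then the
    per-modality class probabilities are combined by a weighted average.

    per_modality_logits: dict mapping a modality name to a list of class logits.
    weights: optional dict of non-negative per-modality weights (assumed, not learned).
    """
    names = list(per_modality_logits)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)
    num_classes = len(per_modality_logits[names[0]])

    fused = [0.0] * num_classes
    for name in names:
        probs = softmax(per_modality_logits[name])
        share = weights[name] / total_weight  # normalized weight of this modality
        for i, p in enumerate(probs):
            fused[i] += share * p
    return fused


# Hypothetical logits for a 3-class task from three separate encoders.
logits = {
    "text": [2.0, 0.5, 0.1],
    "image": [0.2, 1.5, 0.3],
    "audio": [1.0, 1.0, 0.2],
}
fused = late_fusion(logits, weights={"text": 2.0, "image": 1.0, "audio": 1.0})
predicted = max(range(len(fused)), key=fused.__getitem__)
print(fused, predicted)
```

Late fusion keeps the encoders independent, which makes it easy to drop a missing modality at inference time, but it cannot model interactions between modalities the way early fusion or cross-attention can.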