Expert

Multimodal AI Integrator

Develops AI solutions that combine text, image, audio, and video for complex analyses

You are an expert in multimodal AI. Develop a solution combining several modalities for:

[TARGET APPLICATION + MULTIMODAL DATA TYPES]

Complete multimodal AI solution:

1. **Multimodal Data Pipeline**:
   - Text preprocessing and tokenization
   - Image preprocessing and augmentation
   - Audio preprocessing and feature extraction
   - Video frame extraction and temporal analysis
   - Synchronization and alignment of modalities

2. **Model Architecture**:
   - Multimodal transformer design
   - Cross-attention mechanisms
   - Fusion strategies (early, late, intermediate)
   - Modality-specific encoders
   - Shared representation space

3. **Training Strategy**:
   - Multitask learning approaches
   - Curriculum learning progression
   - Multimodal data augmentation
   - Transfer learning from pretrained models
   - Loss function design for multimodal tasks

4. **Inference Pipeline**:
   - Real-time processing optimization
   - Model quantization and compression
   - Edge deployment considerations
   - Batch processing strategies
   - Latency and throughput optimization

5. **Specific Applications**:
   - Visual question answering (VQA)
   - Image captioning with context
   - Video analysis with audio
   - Document understanding (OCR + layout)
   - Multimodal sentiment analysis

6. **Evaluation Framework**:
   - Modality-specific metrics
   - Cross-modal consistency measures
   - Human evaluation protocols
   - A/B testing methodologies

Provide the complete architecture, PyTorch/TensorFlow code samples, and deployment strategies.
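To make one item of the prompt concrete, here is a minimal sketch of the "late" fusion strategy it names: each modality is classified independently and the resulting class probabilities are combined by a weighted average. It is an illustration only, written in plain Python rather than PyTorch/TensorFlow so that it runs without dependencies. The function name `late_fusion`, the modality names, the weights, and the probability values are all hypothetical and do not come from the prompt.

```python
from typing import Dict, List


def late_fusion(
    probs_by_modality: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Combine per-modality class probabilities with a weighted average."""
    if not probs_by_modality:
        raise ValueError("need at least one modality")
    num_classes = len(next(iter(probs_by_modality.values())))
    # Normalize by the total weight so the fused vector still sums to 1.
    total_weight = sum(weights[name] for name in probs_by_modality)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    fused = [0.0] * num_classes
    for name, probs in probs_by_modality.items():
        if len(probs) != num_classes:
            raise ValueError(f"{name}: expected {num_classes} class probabilities")
        for i, p in enumerate(probs):
            fused[i] += weights[name] * p / total_weight
    return fused


# Hypothetical outputs of three independently trained classifiers
# over the classes (negative, neutral, positive).
predictions = {
    "text": [0.1, 0.2, 0.7],
    "audio": [0.3, 0.4, 0.3],
    "image": [0.2, 0.2, 0.6],
}
# Hypothetical per-modality trust weights (assumed, not learned here).
weights = {"text": 0.5, "audio": 0.2, "image": 0.3}

fused = late_fusion(predictions, weights)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

Early and intermediate fusion differ in where the combination happens: on raw or low-level features before any model, or on encoder outputs inside the model, rather than on final predictions as in this sketch.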