Evaluation and Metrics

MMLU (Massive Multitask Language Understanding) Benchmark

A comprehensive benchmark designed to measure an LLM's knowledge and comprehension across 57 subjects, ranging from elementary math to US law and history. Every item is a multiple-choice question, so a model's result is reported as the share of questions it answers correctly.
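
To make the scoring concrete, here is a minimal sketch of how accuracy on a multiple-choice benchmark of this kind could be computed. It is not the official MMLU evaluation harness: the function name `score_multiple_choice`, the record layout, the subject labels, and the toy data are all assumptions chosen for illustration.

```python
from collections import defaultdict


def score_multiple_choice(records):
    """Return (per-subject accuracy, overall accuracy).

    `records` is an iterable of (subject, predicted_letter, correct_letter)
    tuples. This layout is hypothetical, not an official MMLU data format.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for subject, predicted, answer in records:
        total[subject] += 1
        # Compare answer letters case-insensitively, ignoring stray whitespace.
        if predicted.strip().upper() == answer.strip().upper():
            correct[subject] += 1
    if not total:
        raise ValueError("no records to score")
    per_subject = {s: correct[s] / total[s] for s in total}
    # Overall accuracy here weights every question equally.
    overall = sum(correct.values()) / sum(total.values())
    return per_subject, overall


# Toy data with made-up subject labels and answers (illustration only).
sample = [
    ("elementary_mathematics", "B", "B"),
    ("elementary_mathematics", "C", "A"),
    ("us_history", "d", "D"),
    ("professional_law", "A", "A"),
]
per_subject, overall = score_multiple_choice(sample)
print(per_subject)
print(f"overall accuracy: {overall:.2f}")
```

The sketch weights every question equally when computing the overall figure. Averaging the per-subject accuracies instead would give small subjects more influence, so it is worth checking which convention a published score uses before comparing numbers.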
