intermediate

Data Cleaning and Preprocessing

#data cleaning #preprocessing #data analysis #quality control #data wrangling

This prompt guides users through the process of cleaning and preparing raw data for analysis.

You are a data cleaning and preprocessing specialist. Your task is to explain the key steps and techniques for cleaning and preparing raw data for analysis. In your response, cover the following aspects:

1. Identifying common data quality issues (missing values, outliers, duplicates, inconsistencies)
2. Techniques for handling missing data (imputation methods, deletion strategies)
3. Approaches to dealing with outliers (statistical methods, domain knowledge)
4. Methods for detecting and handling duplicate records
5. Data standardization and normalization techniques
6. Handling categorical data and text preprocessing
7. Data validation strategies
8. Tools and libraries commonly used for data cleaning

Provide practical examples and code snippets where applicable. Explain when to apply different techniques and the potential consequences of inappropriate data cleaning choices. Include real-world scenarios where proper data cleaning significantly impacted analysis outcomes. Conclude with a checklist that data analysts can follow when approaching a new dataset.
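For orientation, a response to this prompt might include a snippet along the following lines. This is only a minimal sketch covering three of the listed aspects (duplicates, missing values, outliers) using the Python standard library. The function name `clean_records`, the field names `id` and `value`, and the `sample` data are all hypothetical, and the choices of median imputation and the 1.5 × IQR rule are illustrative assumptions, not recommendations from this page. In practice a library such as pandas would usually handle these steps.

```python
from statistics import median, quantiles


def clean_records(records, key="id", field="value"):
    """Deduplicate by key, impute missing values with the median, flag IQR outliers.

    Hypothetical helper for illustration; `key` and `field` are assumed names.
    """
    # 1. Duplicates: keep the first record seen for each key.
    seen = set()
    unique = []
    for rec in records:
        if rec[key] not in seen:
            seen.add(rec[key])
            unique.append(dict(rec))  # copy so the caller's data is not mutated

    # 2. Missing data: replace None with the median of the observed values,
    #    and record which rows were imputed so the choice stays auditable.
    observed = [r[field] for r in unique if r[field] is not None]
    fill = median(observed)
    for r in unique:
        r["imputed"] = r[field] is None
        if r["imputed"]:
            r[field] = fill

    # 3. Outliers: flag (do not delete) values outside 1.5 * IQR,
    #    with quartiles computed from the observed values only.
    q1, _, q3 = quantiles(observed, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    for r in unique:
        r["outlier"] = not (low <= r[field] <= high)

    return unique


# Made-up sample data: one duplicate (id 2), one missing value (id 5),
# and one suspiciously large value (id 10).
sample = [
    {"id": 1, "value": 10.0},
    {"id": 2, "value": 11.0},
    {"id": 2, "value": 11.0},
    {"id": 3, "value": 12.0},
    {"id": 4, "value": 13.0},
    {"id": 5, "value": None},
    {"id": 6, "value": 11.0},
    {"id": 7, "value": 12.0},
    {"id": 8, "value": 10.0},
    {"id": 9, "value": 13.0},
    {"id": 10, "value": 500.0},
]

cleaned = clean_records(sample)
for row in cleaned:
    print(row)
```

The sketch flags outliers and imputed rows instead of silently dropping or overwriting them, so an analyst can still decide with domain knowledge whether a flagged value is an error or a genuine extreme.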