intermediate

Data Cleaning and Preprocessing

#data cleaning #preprocessing #data analysis #quality control #data wrangling

This prompt guides users through the process of cleaning and preparing raw data for analysis.

You are a data cleaning and preprocessing specialist. Your task is to explain the key steps and techniques for cleaning and preparing raw data for analysis. In your response, cover the following aspects:

1. Identifying common data quality issues (missing values, outliers, duplicates, inconsistencies)
2. Techniques for handling missing data (imputation methods, deletion strategies)
3. Approaches to dealing with outliers (statistical methods, domain knowledge)
4. Methods for detecting and handling duplicate records
5. Data standardization and normalization techniques
6. Handling categorical data and text preprocessing
7. Data validation strategies
8. Tools and libraries commonly used for data cleaning

Provide practical examples and code snippets where applicable. Explain when to apply different techniques and the potential consequences of inappropriate data cleaning choices. Include real-world scenarios where proper data cleaning significantly impacted analysis outcomes. Conclude with a checklist that data analysts can follow when approaching a new dataset.
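As a rough illustration of the kind of snippet this prompt asks for, the sketch below covers three of the listed aspects (duplicates, missing values, outliers) using only the Python standard library. The `clean_records` helper, the `id` and `value` field names, and the sample rows are all hypothetical. Median imputation and the 1.5 × IQR rule are just one reasonable choice among several, not a recommendation for every dataset.

```python
import statistics


def clean_records(rows, key="id", field="value"):
    """Hypothetical helper: deduplicate, impute with the median, flag IQR outliers."""
    # 1. Drop duplicate records, keeping the first occurrence of each key.
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(dict(row))  # copy so the input is not mutated

    # 2. Impute missing values (None) with the median of the observed values.
    observed = [r[field] for r in unique if r[field] is not None]
    median = statistics.median(observed)
    for r in unique:
        r["imputed"] = r[field] is None
        if r["imputed"]:
            r[field] = median

    # 3. Flag outliers with the 1.5 * IQR rule, computed on observed values only.
    q1, _, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    for r in unique:
        r["outlier"] = not (low <= r[field] <= high)
    return unique


# Made-up sample data for illustration only.
raw = [
    {"id": 1, "value": 10.0},
    {"id": 2, "value": 12.0},
    {"id": 2, "value": 12.0},   # duplicate of id 2
    {"id": 3, "value": None},   # missing value
    {"id": 4, "value": 11.0},
    {"id": 5, "value": 13.0},
    {"id": 6, "value": 500.0},  # suspiciously large
]
cleaned = clean_records(raw)
outlier_ids = [r["id"] for r in cleaned if r["outlier"]]
```

The sketch flags outliers rather than deleting them, so a reviewer with domain knowledge can decide whether a flagged value is an error or a genuine extreme. On very small samples the quartile method matters: a different `method` argument to `statistics.quantiles` can widen the bounds enough that an extreme value goes unflagged.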