Data Exploration & Cleaning
Prepare and understand your data with AI-assisted exploration and QA.
Analyze this dataset for quality issues: identify missing values, outliers, data type inconsistencies, and duplicate records. Provide a summary of issues and prioritized cleaning recommendations. [Insert data sample or describe dataset structure]
Perform exploratory data analysis on this dataset. Identify key statistics, distributions, correlations between variables, and potential patterns. Suggest which variables are most important for further analysis. [Insert data or dataset description]
I have raw data that needs transformation before analysis. Help me plan the transformation steps: normalization, feature engineering, aggregation, and encoding. Explain how each transformation improves the data for analysis.
Review this data schema and identify any potential data quality risks before I begin analysis: [paste schema or field list]. What assumptions should I validate before trusting these fields in calculations?
I'm noticing unexpected values in my [field name] column. Here are some sample values: [paste examples]. Help me identify whether this is a data entry issue, a formatting inconsistency, or a legitimate data pattern — and suggest how to handle it.
SQL & Database Analysis
Write efficient queries and optimize database performance.
Write a SQL query that: [describe your analysis requirement]. The query should handle [specify edge cases]. Include comments explaining the logic and return [specify desired output format].
Help me optimize this SQL query for performance: [paste your query]. It currently takes [timeframe] to execute. What indexing strategies, JOIN optimizations, or query restructuring would improve speed? Provide the optimized query with explanations.
I need to extract insights from this database schema: [describe tables, columns, relationships]. Help me identify the best way to query this for [specify analysis goal]. Suggest joins, aggregations, and filtering strategies.
Write a SQL query using window functions to calculate [specific metric: running total, rank, YoY change, moving average] by [dimension: region, product, date]. Explain each window function used.
I have two tables I need to join: [describe Table A] and [describe Table B]. The relationship is [one-to-many/many-to-many]. What's the correct JOIN type, and how do I avoid row duplication or data loss?
Python Data Analysis
Generate and optimize Python code for data manipulation and analysis.
Write a Python script using pandas to: [describe your data manipulation task]. The input is [describe data format]. I need the output to be [specify desired output]. Include error handling and comments.
Help me perform [specify statistical analysis: regression, correlation, hypothesis testing, ANOVA] on my dataset. [Describe your data and variables]. Provide Python code using scipy and statsmodels, including interpretation of results.
Review this Python data analysis code for efficiency, readability, and correctness: [paste your code]. Identify bugs, suggest optimizations, and recommend better approaches. Provide refactored code with explanations.
Write a Python function that automates this recurring data task: [describe what you do manually]. The function should accept [describe inputs], return [describe outputs], and handle edge cases like [describe edge cases].
My pandas script is running slowly on a dataset with [X million rows]. Here's the code: [paste code]. Suggest vectorization, chunking, or other optimizations to reduce memory usage and processing time.
Visualization & Dashboarding
Design effective visualizations that communicate insights clearly.
I have [describe your data: sales by region and time, customer segments, etc.]. I need to show [describe your analytical goal]. What visualization types would be most effective? Suggest 2-3 options with pros/cons of each.
Help me design a dashboard for [describe audience and purpose: executive KPIs, operational metrics, customer analytics]. Key metrics include [list metrics]. Suggest layout, visualization types, drill-down capabilities, and refresh frequency.
I have these key findings from my analysis: [describe findings]. Help me craft a compelling data story with: talking points, recommended visuals, key statistics, and recommendations for [describe audience: executive, team, investors].
Write Python code using matplotlib/seaborn to create a [chart type] that shows [describe what to visualize]. Include proper labels, color scheme for [audience: print, screen, colorblind-friendly], and a clear chart title that states the insight.
My dashboard is too busy and stakeholders find it overwhelming. Here are the current metrics: [list them all]. Help me prioritize to the 5-7 most critical KPIs for [audience role] and suggest how to structure the remaining detail in drill-throughs.
Modeling & Forecasting
Build predictive models and forecasts to drive data-driven decisions.
Help me build a predictive model to [describe prediction goal: forecast revenue, predict churn, identify high-value customers]. I have [describe data available]. Recommend an algorithm, suggest features, and outline model validation approach.
I need to forecast [describe what: monthly revenue, website traffic, demand] for the next [time period]. I have [describe historical data]. Suggest appropriate forecasting methods, explain seasonality/trends, and provide confidence intervals.
Help me engineer features for my predictive model. I have raw data including [describe available fields]. For predicting [describe target], what new features should I create? How should I handle categorical variables, missing values, and scaling?
My model's accuracy is [X%] on training data but only [Y%] on validation data. Here's my model setup: [describe]. How do I diagnose whether this is overfitting, underfitting, or data leakage? What steps should I take to improve generalization?
Walk me through how to interpret the results of my [model type: logistic regression, random forest, XGBoost]. Here are my model metrics: [paste metrics]. What does this mean in business terms, and what are the limitations of these results?
Stakeholder Reporting & Communication
Translate analytical findings into business language stakeholders act on.
Translate this technical analysis summary into an executive-level insight: [paste your findings]. Use plain language, lead with the business implication, and include one clear recommendation. Target length: 3 sentences.
I need to write a monthly analytics report for [audience: marketing team, C-suite, product team]. Key metrics this period: [list metrics with values]. Help me structure the report with: headline metric, trend narrative, key drivers, and 2-3 recommendations.
Write a slack message or email to [stakeholder role] explaining why [metric] changed [X%] last [period]. Include: what happened, why it happened, what we're doing about it, and what to expect next period. Tone: calm and factual.
My analysis shows [describe finding], but I anticipate pushback because [describe concern or skepticism]. Help me anticipate the 5 most likely objections and write confident, evidence-based responses to each.
Documentation & Analysis Best Practices
Document your work for reproducibility, audits, and team handoffs.
Write a methodology documentation for this analysis: [describe your analysis]. Include: objective, data sources, assumptions, transformations applied, statistical methods used, and limitations. Format for a technical audience.
Create a data dictionary for this dataset: [paste field names and sample values]. For each field, include: description, data type, valid value ranges, and notes on known data quality issues.
Review my analysis for potential methodological flaws or biases: [describe your approach]. What assumptions am I making that could be wrong? What alternative explanations exist for my findings? Where should I add caveats?