20 min · Lesson 15 of 18
Automation & Workflows

Advanced Data Analysis: CSV & Excel

Data Analysis with ChatGPT

ChatGPT handles data analysis in two modes: analyzing data you upload or paste directly into the conversation, and writing analysis code (Python, SQL, R) that you run yourself. Both are valuable, and they work best together: direct analysis for quick exploration, generated code for larger or repeatable work.

Direct Data Analysis (Code Interpreter / Advanced Data Analysis)

With ChatGPT's Advanced Data Analysis mode enabled, you can upload CSV, Excel, or JSON files and ChatGPT will analyze them directly by running Python code, generating charts, and performing statistical analysis.

When to use this: Quick exploratory analysis, generating visualizations, small to medium datasets, ad-hoc questions you'd normally open a spreadsheet for.

How to use it:

  1. Enable the tool icon in ChatGPT (or use the file upload button)
  2. Upload your file
  3. Ask specific questions

[Upload CSV file]

Analyze this dataset. Give me:
1. A summary of what the data contains (columns, row count, date range)
2. Key statistics for numerical columns (mean, median, min, max, any outliers)
3. The top 5 trends or patterns you notice
4. Questions this data raises that I should investigate further
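
To make the first two requests in that prompt concrete, here is a minimal sketch of the kind of summary code the tool might run behind the scenes. It is not ChatGPT's actual implementation: the sample data, the column names, and the summarize helper are all hypothetical, and the 1.5 x IQR rule is just one common way to flag outliers.

```python
import csv
import io
import statistics

# Hypothetical sample standing in for an uploaded CSV file.
SAMPLE_CSV = """date,revenue,orders
2024-01-01,1200,14
2024-01-02,1350,16
2024-01-03,1280,15
2024-01-04,9800,17
2024-01-05,1310,15
2024-01-06,1190,13
2024-01-07,1405,16
2024-01-08,1330,15
"""


def summarize(csv_text, numeric_columns, date_column):
    """Return columns, row count, date range, and basic stats per numeric column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    dates = sorted(row[date_column] for row in rows)  # ISO dates sort correctly as text
    summary = {
        "columns": list(rows[0].keys()),
        "row_count": len(rows),
        "date_range": (dates[0], dates[-1]),
        "stats": {},
    }
    for col in numeric_columns:
        values = [float(row[col]) for row in rows]
        # Quartiles feed the 1.5 x IQR outlier rule (an assumed convention).
        q1, _, q3 = statistics.quantiles(values, n=4)
        iqr = q3 - q1
        low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        summary["stats"][col] = {
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            "min": min(values),
            "max": max(values),
            "outliers": [v for v in values if v < low or v > high],
        }
    return summary


summary = summarize(SAMPLE_CSV, ["revenue", "orders"], "date")
print(summary["row_count"], summary["date_range"])
print(summary["stats"]["revenue"]["outliers"])
```

Running something like this yourself on a small extract is a quick way to sanity-check the summary ChatGPT reports back.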

Paste-In Analysis for Smaller Datasets

For smaller tables or aggregated data, paste directly:

Analyze this sales data. Identify trends, anomalies, and insights I should 
act on. Focus on what's actionable, not just descriptive.

Month | Revenue | New Customers | Churn Rate | NPS
Jan   | $45,200 | 23            | 2.1%       | 42
Feb   | $48,100 | 31            | 1.9%       | 44
Mar   | $44,800 | 19            | 2.8%       | 39
Apr   | $52,300 | 28            | 2.2%       | 41
May   | $58,100 | 35            | 2.0%       | 46
Jun   | $55,400 | 29            | 2.4%       | 43

Notice the prompt asks for actionable insights, not just a description of the numbers. The description is obvious — the insight is the value.
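
Because pasted numbers are processed as text, it helps to recompute a few figures yourself before acting on them. The sketch below uses the revenue and churn columns from the table above; the helper name month_over_month_growth is only illustrative.

```python
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
revenue = [45200, 48100, 44800, 52300, 58100, 55400]
churn_rate = [2.1, 1.9, 2.8, 2.2, 2.0, 2.4]  # percent


def month_over_month_growth(values):
    """Percent change from each value to the next, rounded to one decimal."""
    return [
        round((curr - prev) / prev * 100, 1)
        for prev, curr in zip(values, values[1:])
    ]


growth = month_over_month_growth(revenue)
for month, pct in zip(months[1:], growth):
    print(f"{month}: {pct:+.1f}%")

# The month with the highest churn is a natural place to start asking "why".
worst_churn_month = months[churn_rate.index(max(churn_rate))]
print("Highest churn:", worst_churn_month)
```

If ChatGPT's reply quotes growth figures that differ from these, treat that as a prompt to recheck before the numbers reach a decision.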

Writing Analysis Code

For larger datasets or repeatable analysis, ask ChatGPT to write the code:

Python/Pandas

Write Python code to analyze a CSV file of customer orders.

File: orders.csv
Columns: order_id, customer_id, product_id, quantity, price, date, status

Analysis to produce:
1. Monthly revenue totals (as a pandas DataFrame)
2. Top 10 customers by total spend
3. Products with highest return rate (status = 'returned')
4. Average order value by month

Requirements:
- Use pandas and matplotlib
- Include error handling for missing values
- Save each analysis result to a separate CSV in an /output folder
- Create a bar chart of monthly revenue and save as monthly_revenue.png

Add comments explaining each major step.
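
For orientation, the core of the code that comes back might look roughly like the sketch below. It assumes pandas is installed and uses a handful of made-up rows in place of orders.csv. It also assumes revenue is quantity times price and that returned orders still count toward revenue, which the prompt does not specify. File output and the chart are left out to keep the sketch short.

```python
import pandas as pd

# Hypothetical rows standing in for orders.csv; a real run would call pd.read_csv.
orders = pd.DataFrame(
    {
        "order_id": [1, 2, 3, 4, 5, 6],
        "customer_id": ["c1", "c2", "c1", "c3", "c2", "c1"],
        "product_id": ["p1", "p2", "p1", "p2", "p1", "p2"],
        "quantity": [2, 1, 1, 3, 2, 1],
        "price": [10.0, 25.0, 10.0, 25.0, 10.0, None],  # one missing value
        "date": ["2024-01-05", "2024-01-20", "2024-02-03",
                 "2024-02-10", "2024-02-15", "2024-02-28"],
        "status": ["completed", "returned", "completed",
                   "completed", "returned", "completed"],
    }
)

# Handle missing values: drop rows lacking the fields the revenue math needs.
clean = orders.dropna(subset=["quantity", "price", "date"]).copy()
clean["date"] = pd.to_datetime(clean["date"])
clean["revenue"] = clean["quantity"] * clean["price"]  # assumed definition
clean["month"] = clean["date"].dt.to_period("M")

# 1. Monthly revenue totals
monthly_revenue = clean.groupby("month")["revenue"].sum()

# 2. Top 10 customers by total spend
top_customers = clean.groupby("customer_id")["revenue"].sum().nlargest(10)

# 3. Return rate per product: share of orders with status == "returned"
return_rate = (
    clean.assign(returned=clean["status"].eq("returned"))
    .groupby("product_id")["returned"]
    .mean()
    .sort_values(ascending=False)
)

# 4. Average order value by month
avg_order_value = clean.groupby("month")["revenue"].mean()

print(monthly_revenue)
print(top_customers)
```

Comparing generated code against a skeleton like this makes it easier to spot where ChatGPT made a silent choice, such as how it treated returns or missing prices.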

SQL Analysis Queries

Write SQL queries to analyze our user behavior data.

Schema:
- users: id, email, created_at, plan (free/pro), country
- events: id, user_id, event_type, properties (jsonb), occurred_at
- subscriptions: id, user_id, plan, started_at, ended_at, mrr

Queries I need:
1. Monthly active users (users who fired any event in the last 30 days)
2. Conversion funnel from signup → first event → pro upgrade
3. Monthly churn rate (subscriptions ended / total active at start of month)
4. Top 10 events by frequency in the last 7 days
5. Average days from signup to first paid subscription

Database: PostgreSQL. Use CTEs for readability.
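
Before running generated SQL against production, it can be worth trying its shape on a throwaway database. The sketch below does that for query 4 (top events by frequency) with Python's built-in sqlite3 module standing in for PostgreSQL. The sample rows and the fixed as_of date are invented, and the date arithmetic uses SQLite syntax. In PostgreSQL the filter would more likely be written as occurred_at >= now() - interval '7 days'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        event_type TEXT,
        occurred_at TEXT
    );
    -- Hypothetical sample rows
    INSERT INTO events (user_id, event_type, occurred_at) VALUES
        (1, 'login',  '2024-03-10'),
        (1, 'export', '2024-03-10'),
        (2, 'login',  '2024-03-11'),
        (3, 'login',  '2024-03-12'),
        (2, 'export', '2024-03-12'),
        (3, 'invite', '2024-03-01');
    """
)

# CTEs keep each step readable: filter first, then count, then rank.
TOP_EVENTS_SQL = """
WITH recent_events AS (
    SELECT event_type
    FROM events
    WHERE occurred_at >= date(:as_of, '-7 days')
),
event_counts AS (
    SELECT event_type, COUNT(*) AS event_count
    FROM recent_events
    GROUP BY event_type
)
SELECT event_type, event_count
FROM event_counts
ORDER BY event_count DESC, event_type
LIMIT 10;
"""

# A fixed reference date keeps the example deterministic.
top_events = conn.execute(TOP_EVENTS_SQL, {"as_of": "2024-03-13"}).fetchall()
print(top_events)
```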

Excel Formula Analysis

I have this data in Excel. Write the formulas I need.

Sheet structure:
- Column A: Date
- Column B: Revenue
- Column C: Expenses
- Column D: New Customers
- Column E: Churned Customers

Formulas needed:
1. Running total of revenue (Column F)
2. Net profit (Column G)
3. Month-over-month revenue growth % (Column H)
4. Net customer change (new minus churned, Column I)
5. Revenue per customer this month, assuming Column J has total customer count

Use Excel syntax. Assume data starts at row 2.
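
One way to check the formulas ChatGPT returns is to reproduce them on a few rows outside Excel. The sketch below mirrors the five calculations in Python, with the Excel formula one would typically expect for each shown in a comment. The sample values are made up, and the formulas in the comments are plausible answers rather than ChatGPT's guaranteed output.

```python
from itertools import accumulate

# Hypothetical values for rows 2-4 of the sheet.
revenue = [1000.0, 1200.0, 900.0]   # Column B
expenses = [700.0, 800.0, 750.0]    # Column C
new_customers = [10, 12, 8]         # Column D
churned = [2, 3, 5]                 # Column E
total_customers = [100, 109, 112]   # Column J

# Column F, running total of revenue: =SUM($B$2:B2) filled down
running_total = list(accumulate(revenue))

# Column G, net profit: =B2-C2
net_profit = [r - e for r, e in zip(revenue, expenses)]

# Column H, month-over-month growth: =(B3-B2)/B2 from row 3 onward
# (the first data row has no previous month, so it stays empty).
mom_growth = [None] + [
    (curr - prev) / prev for prev, curr in zip(revenue, revenue[1:])
]

# Column I, net customer change: =D2-E2
net_customers = [n - c for n, c in zip(new_customers, churned)]

# Revenue per customer: =B2/J2
revenue_per_customer = [r / t for r, t in zip(revenue, total_customers)]

print(running_total, net_profit, mom_growth, net_customers)
```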

Interpreting Analysis Results

Once you have data or analysis output, ask for interpretation:

Here's the output of my analysis:
[paste data/results]

Interpret these results for a business audience. Specifically:
1. What's the most important thing this data is telling us?
2. What's unexpected or surprising?
3. What doesn't this data explain that we should investigate?
4. What action should we take based on this?

Context: [brief description of our business/situation]

Turning Data Into Narratives

For presentations and reports, data needs a story:

Turn this data into a narrative for my quarterly business review.

Data:
[paste data or analysis results]

Audience: Board of directors — they want the insight, not the numbers.

Structure:
- One-sentence headline that captures the key story
- 3 paragraphs: what happened, why it happened, what we're doing about it
- The 2-3 most important numbers mentioned in context (not a table)

Cite the numbers accurately, but the narrative is what matters here.

Identifying Patterns and Anomalies

Look at this time series data and identify:
1. Trends (sustained increases or decreases)
2. Seasonality (patterns that repeat on a weekly/monthly/annual cycle)
3. Anomalies (data points that don't fit the pattern)
4. Inflection points (where the trend changed)

For each anomaly or inflection point, suggest 2-3 possible explanations I should investigate.

Data:
[paste time series]
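
For the anomaly part of that prompt, a simple statistical check makes a useful cross-reference. The sketch below flags points that sit more than a chosen number of standard deviations from the mean. The series, the flag_anomalies helper, and the threshold of 2.0 are illustrative assumptions, and a plain z-score ignores trend and seasonality, so treat it as a rough first pass.

```python
import statistics


def flag_anomalies(series, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold."""
    mean = statistics.mean(series)
    spread = statistics.pstdev(series)
    if spread == 0:
        return []  # a flat series has no outliers by this measure
    return [
        (i, v)
        for i, v in enumerate(series)
        if abs(v - mean) / spread > threshold
    ]


daily_signups = [10, 11, 9, 10, 12, 10, 30, 11, 10, 9]  # hypothetical data
print(flag_anomalies(daily_signups))
```

Points flagged here are candidates for the "possible explanations" step in the prompt, not confirmed problems.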

Preparing for Data Conversations

Before meetings where you'll discuss data:

I'm presenting this data to our executive team in 30 minutes.

Data:
[paste]

Prepare me for the conversation:
1. What are the 3 most likely questions they'll ask?
2. What are the 3 most likely challenges to this data (methodology, interpretation)?
3. For each question/challenge, what's the right answer?
4. Is there anything in this data I should proactively address rather than waiting for them to notice?

Checking Your Analysis

After doing your own analysis, ask ChatGPT to review it:

I've done an analysis and reached these conclusions:
[paste your analysis]

Data I used:
[paste data]

Please:
1. Check my math on key calculations
2. Identify any logical errors in my reasoning
3. Point out any alternative explanations for the patterns I found
4. Note any statistical or analytical red flags (small sample size, correlation ≠ causation, etc.)

The Limits of AI Data Analysis

ChatGPT can be wrong about numbers. Always verify key calculations yourself, especially anything that will be used in decisions. The model can confidently produce arithmetic errors.

It doesn't know your business context. An anomaly in your data might have an obvious internal explanation (a system outage, a one-off campaign). ChatGPT doesn't know that. You need to bring the context.

Correlation is not causation. ChatGPT can often spot real patterns, but it may attach causal explanations that the data does not support. Always think critically about causation.
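
As a small illustration, the sketch below computes the Pearson correlation between revenue and new customers from the six-month sales table earlier in this lesson. The pearson helper is written out by hand so the arithmetic is visible.

```python
import math

# Values from the sales table earlier in this lesson (Jan to Jun).
revenue = [45200, 48100, 44800, 52300, 58100, 55400]
new_customers = [23, 31, 19, 28, 35, 29]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson(revenue, new_customers)
print(f"r = {r:.2f} across {len(revenue)} months")
```

The coefficient comes out strongly positive, but with only six data points it says nothing about whether new customers drive revenue, whether both follow a shared cause such as a campaign, or whether the pattern is chance.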

Use Advanced Data Analysis for real computation. When precision matters, use the file upload + code interpreter. When you paste data in text form, you're asking the model to do math in its head — less reliable.

Next lesson: Automation with Zapier and Make — connecting ChatGPT to your workflows.
