30 Data Analyst Interview Questions and Answers (SQL + Case Studies)

Getting a data analyst role in 2026 means clearing three distinct hurdles in the same loop: SQL technical rounds, open-ended business case problems, and behavioral stories. Most candidates over-prepare on one and bomb the other two.

This guide covers all three — with real questions, structured answers, and a live practice link so you can rehearse before the clock starts.

Practice Data Analyst Interview Free

Quick Cheat Sheet: What Every DA Interview Tests

Round	What They're Testing	Common Mistake
SQL Technical	Can you write accurate, optimized queries?	Forgetting NULLs, using wrong JOIN type
Business Case	Do you think in metrics, not just data?	Jumping to analysis before defining the metric
Behavioral	Have you driven decisions, not just reports?	Saying "I built a dashboard" without the impact
Stats / Probability	Do you understand your own outputs?	Confusing correlation with causation

The most common failure mode: treating data analyst interviews like SQL exams. They're business problem-solving interviews that happen to use SQL.

SQL Interview Questions for Data Analysts (2026)

SQL rounds are pass/fail in most companies. These are the 10 patterns that come up most often.

1. Calculate monthly retention rate

Question: Write a SQL query to calculate monthly user retention rate — i.e., what percentage of users who were active in month N are still active in month N+1?

What they're testing: Window functions, date arithmetic, self-joins or CTEs.

Strong answer structure:

WITH monthly_active AS (
  SELECT
    user_id,
    DATE_TRUNC('month', event_date) AS activity_month
  FROM events
  GROUP BY 1, 2
),
retention AS (
  SELECT
    m1.activity_month AS cohort_month,
    COUNT(DISTINCT m1.user_id) AS users_in_month,
    COUNT(DISTINCT m2.user_id) AS users_retained_next_month
  FROM monthly_active m1
  LEFT JOIN monthly_active m2
    ON m1.user_id = m2.user_id
    AND m2.activity_month = m1.activity_month + INTERVAL '1 month'
  GROUP BY 1
)
SELECT
  cohort_month,
  users_in_month,
  users_retained_next_month,
  ROUND(100.0 * users_retained_next_month / users_in_month, 2) AS retention_rate_pct
FROM retention
ORDER BY cohort_month;

Before/After: Most candidates write a subquery-based version that works but doesn't scale. The CTE version above is easier to debug and shows you think about readability.

2. INNER JOIN vs LEFT JOIN — when to use each

Question: Explain the difference between INNER JOIN and LEFT JOIN. When would you choose LEFT JOIN?

Strong answer: INNER JOIN returns only rows where the match exists in both tables. LEFT JOIN returns all rows from the left table, with NULLs for unmatched right-side columns.

Choose LEFT JOIN when: you need to preserve the full universe of your base table — e.g., showing all users regardless of whether they completed a purchase. If you use INNER JOIN there, users with no purchases disappear from the output.

Follow-up trap: "Which is faster?" — The answer depends on data distribution and indexes, not the JOIN type itself. Saying "LEFT JOIN is slower" shows misunderstanding.

3. Find duplicate rows

Question: You have a users table. Write a query to find all user_ids that appear more than once.

SELECT user_id, COUNT(*) AS occurrences
FROM users
GROUP BY user_id
HAVING COUNT(*) > 1;

Extension they'll add: "Now delete the duplicates, keeping the most recent." This requires ROW_NUMBER() + CTE + DELETE from ranked results.

4. Running total / cumulative sum

SELECT
  order_date,
  revenue,
  SUM(revenue) OVER (ORDER BY order_date) AS running_total
FROM orders;

Key detail: OVER (ORDER BY order_date) with no PARTITION gives a global running sum. Add PARTITION BY user_id for per-user running totals.

5. Identify a sudden drop in website traffic

Question: Our website traffic dropped 40% last Tuesday. Walk me through how you'd analyze it.

This is a diagnostic business case, not a SQL exercise. Structure your answer:

Confirm the signal: Is it across all sources or one channel? All pages or one section?
Segment: Break by source/medium, device type, geo, landing page.
Correlate with events: Deploys, campaigns, algorithm updates, competitor actions.
Hypothesize + test: Form 2-3 hypotheses, then pull SQL to validate each.
Communicate finding: What's the most likely cause? What's the confidence level?

This question separates candidates who pull data from candidates who think in hypotheses.

Business Case Interview Questions for Data Analysts

These require less SQL and more structured reasoning.

6. Design a dashboard to track sales performance

What they want: A structured approach, not a list of charts.

Strong answer:

Primary metric: Revenue MoM and vs target
Leading indicators: Pipeline value, conversion rate at each stage
Diagnostic slices: By region, product line, rep, channel
Alerts: Revenue below threshold, conversion rate drop ≥10% vs prior week

Common mistake: designing the dashboard before asking "who's the audience and what decision does this inform?"

7. Define a metric for user engagement

Common answers (weak): "Daily active users" or "time on site."

Better answer: Engagement metrics should map to business outcomes. For a SaaS product, a better engagement metric is "core action completion rate" — the percentage of sessions where the user completed the action the product promises (e.g., sent a message, ran a report). This metric predicts retention better than time-on-site.

8. How do you handle missing data in a dataset?

Structure your answer around the reason for missing data:

MCAR (Missing Completely at Random): Safe to drop rows or use mean imputation
MAR (Missing at Random): Use model-based imputation (predict missing values from other features)
MNAR (Missing Not at Random): The missingness itself is signal — encode it as a feature, don't just fill it

Never say "I drop missing rows" without explaining why it's safe to do so.

Behavioral Interview Questions for Data Analysts

9. Describe a time your analysis changed a business decision

Formula: Use STAR — but front-load the business outcome.

Weak: "I analyzed the churn data and made a dashboard." Strong: "We were about to launch a win-back email campaign targeting all churned users from the past year. I analyzed the underlying churn reasons and found that 40% had churned due to billing failures, not product dissatisfaction. I segmented these users and recommended a separate reactivation flow. The targeted campaign had 3.2× the reactivation rate of the generic blast."

The key structure: what decision was being made → what data revealed → what you recommended → what happened.

10. Tell me about a stakeholder who disagreed with your findings

Interviewers use this to assess how you handle pushback without abandoning rigorous analysis.

Framework:

Describe what your analysis showed
Describe the stakeholder's counter-position (show you understood their concern)
Describe how you tested or probed their concern
Share the outcome — even if they were partially right

Showing you can update your view based on new information is more impressive than "I stuck to my guns."

Stats & Probability Questions

11. What is a p-value and how do you interpret it?

A p-value is the probability of observing your result (or a more extreme one) if the null hypothesis is true. A p-value of 0.04 does NOT mean there's a 96% chance your hypothesis is correct — it means "if nothing were going on, observing this result by chance has 4% probability."

Common mistake interviewers catch: "A low p-value proves my hypothesis." It doesn't. It only rejects the null.

12. What's the difference between statistical and practical significance?

A result can be statistically significant (p < 0.05) but practically meaningless if the actual effect size is tiny. Always pair significance with effect size (Cohen's d, lift%, relative change). In business, 0.1% conversion lift that's statistically significant is not worth building a feature for.

Common Data Analyst Interview Questions — Quick Reference List

Write a SQL query to find the second-highest salary in each department
How would you detect seasonality in a time series?
What is the difference between mean, median, and mode — when does each matter?
Explain VLOOKUP vs INDEX/MATCH in Excel
What is data normalization and when would you NOT normalize?
How do you validate the quality of a new dataset you've never seen before?
What's an A/B test and how do you determine sample size?
Walk me through how you'd build a churn prediction model from scratch
What is a cohort analysis and when would you use it?
How do you communicate uncertainty in your findings to non-technical stakeholders?

How to Practice Data Analyst Interviews Online

Reading answers is not the same as saying them out loud under time pressure. Data analyst interviews have a pacing component — most SQL rounds are 30–45 minutes for 2–3 problems.

Start Free Data Analyst Mock Interview

ManyOffer's data analyst mock interview runs a full simulated round — SQL questions, a business case, and behavioral — with AI feedback on structure, completeness, and communication. No signup required.

FAQ

How hard are data analyst interviews in 2026? SQL rounds are standardized and have clear right/wrong answers — these are the most learnable part. Business case rounds are harder because there's no single correct answer; what matters is structured thinking. Behavioral rounds reward candidates with concrete impact stories (numbers, decisions, outcomes), not generic process descriptions.

Do I need Python for a data analyst interview? Depends on the role. For most analyst roles at non-tech companies, SQL + Excel is sufficient. For analyst roles at tech companies or data-heavy orgs, Python (pandas, matplotlib) is expected — especially for take-home projects.

What SQL level do data analyst interviews expect? At minimum: JOINs, GROUP BY + HAVING, subqueries, basic window functions (ROW_NUMBER, RANK, SUM OVER). Advanced roles add CTEs, complex window frames, performance-aware query writing.

How many rounds does a typical data analyst interview have? Usually 3–5: recruiter screen → SQL take-home or online assessment → technical SQL round → business case or product sense round → behavioral loop. Senior roles add a stakeholder presentation or case study.

30 Data Analyst Interview Questions and Answers (SQL + Case Studies)

30 Data Analyst Interview Questions and Answers (SQL + Case Studies)

Quick Cheat Sheet: What Every DA Interview Tests

SQL Interview Questions for Data Analysts (2026)

1. Calculate monthly retention rate

2. INNER JOIN vs LEFT JOIN — when to use each

3. Find duplicate rows

4. Running total / cumulative sum

5. Identify a sudden drop in website traffic

Business Case Interview Questions for Data Analysts

6. Design a dashboard to track sales performance

7. Define a metric for user engagement

8. How do you handle missing data in a dataset?

Behavioral Interview Questions for Data Analysts

9. Describe a time your analysis changed a business decision

10. Tell me about a stakeholder who disagreed with your findings

Stats & Probability Questions

11. What is a p-value and how do you interpret it?

12. What's the difference between statistical and practical significance?

Common Data Analyst Interview Questions — Quick Reference List

How to Practice Data Analyst Interviews Online

FAQ

Tags

Related Articles

15 Product Manager Interview Exercises (2026 Practice Guide)

Product Manager Interview Questions 2026: Answers & Examples

Google Behavioral Interview Questions 2026: Googleyness, Leadership, Examples & Strong Answer Patterns

Ready to Practice Your Interview Skills?