About Data Scientist interviews in India
Indian tech interviews typically run across multiple rounds: a technical screen, one or two deep-dive rounds, a system-design or practical round, and an HR and managerial discussion. Interviewers care as much about how you reason as about the final answer, so think aloud, state your assumptions, and use real examples from your own work.
🎯 Interview Success Tips
STAR Method: Situation → Task → Action → Result. Use for every behavioural question. Quantify the Result.
Research First: Read company news, the LinkedIn page, Glassdoor reviews and the interviewer's profile before the interview.
Salary Tip: Never give a number first. Always ask: "What is the budgeted range for this role?"
Virtual Interviews: Test camera and mic 30 min before. Use good lighting and a neutral background. Join 5 min early.
🔧 Technical Questions
Technical Question 1
Explain the bias-variance tradeoff.
💡 How to answer: High bias = underfitting (model too simple, misses patterns); high variance = overfitting (model memorises noise). The goal is the sweet spot. Mention how regularisation, more data, or simpler models shift the balance.
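If the interviewer asks for the formal version, the tradeoff is usually stated through the standard decomposition of expected squared error. The notation below is introduced here for illustration: it assumes y = f(x) + ε with zero-mean noise of variance σ², and f̂ is the model fitted on a random training set.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Simpler models and stronger regularisation tend to raise the first term and lower the second; more data mainly lowers the second.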
Technical Question 2
How do you handle an imbalanced dataset (e.g. fraud detection)?
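💡 How to answer: Don't rely on accuracy. Use precision, recall, F1 and ROC-AUC, and add PR-AUC when positives are very rare, because ROC-AUC can look optimistic under heavy imbalance. Techniques: resampling (SMOTE, undersampling), class weights, anomaly-detection framing, and choosing a threshold that fits the business cost of false negatives versus false positives.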
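A quick way to show why accuracy misleads is to compute the metrics by hand from confusion-matrix counts. The sketch below uses made-up numbers (10,000 transactions, 1% fraud); the function name and both scenarios are illustrative assumptions, not data from a real system.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical data: 10,000 transactions, 100 of them fraud.
# Model A predicts "not fraud" for every transaction.
always_negative = classification_metrics(tp=0, fp=0, fn=100, tn=9900)

# Model B catches 70 of the 100 frauds and raises 130 false alarms.
useful_model = classification_metrics(tp=70, fp=130, fn=30, tn=9770)

for name, m in [("always negative", always_negative), ("useful model", useful_model)]:
    print(name, {k: round(v, 3) for k, v in m.items()})
```

Model A scores higher on accuracy yet catches no fraud at all, which is the point to make in the interview.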
Technical Question 3
Difference between supervised, unsupervised and reinforcement learning?
💡 How to answer: Supervised uses labelled data (regression, classification). Unsupervised finds structure in unlabelled data (clustering, PCA). Reinforcement learns via reward signals from an environment. Give one real example of each.
Technical Question 4
What is regularisation? Compare L1 and L2.
💡 How to answer: Regularisation adds a penalty on large weights to the loss to reduce overfitting. L1 (Lasso) drives some weights exactly to zero, which acts as feature selection. L2 (Ridge) shrinks all weights towards zero without eliminating them. ElasticNet blends both.
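The difference is easiest to see on a single weight. The sketch below assumes the simplest possible setting, one coefficient with unpenalised estimate z and a squared-error loss 0.5*(w - z)**2; the function names and the example values are illustrative.

```python
def soft_threshold(z, lam):
    """Minimiser of 0.5*(w - z)**2 + lam*abs(w): the L1 (Lasso) solution."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # small estimates are set exactly to zero

def ridge_shrink(z, lam):
    """Minimiser of 0.5*(w - z)**2 + 0.5*lam*w**2: the L2 (Ridge) solution."""
    return z / (1.0 + lam)  # every estimate is scaled down, none hits zero

unpenalised = [3.0, 0.4, -0.2, -1.5]  # hypothetical unpenalised weights
lam = 0.5
l1_weights = [soft_threshold(z, lam) for z in unpenalised]
l2_weights = [ridge_shrink(z, lam) for z in unpenalised]
print("L1:", l1_weights)
print("L2:", [round(w, 3) for w in l2_weights])
```

L1 zeroes the two small weights and keeps the large ones, while L2 keeps all four at reduced size.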
Technical Question 5
How do you evaluate a regression model vs a classification model?
💡 How to answer: Regression: RMSE, MAE, R². Classification: accuracy, precision, recall, F1, ROC-AUC, confusion matrix. Always tie the metric to the business decision the model supports.
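Being able to write the regression metrics from their definitions is a common follow-up. A minimal sketch with made-up values (the function name and data are illustrative):

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R² computed from their definitions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "rmse": rmse, "r2": r2}

y_true = [10.0, 12.0, 14.0, 16.0]   # hypothetical actual values
y_pred = [11.0, 12.0, 13.0, 18.0]   # hypothetical predictions
metrics = regression_metrics(y_true, y_pred)
print({k: round(v, 3) for k, v in metrics.items()})
```

RMSE comes out above MAE here because squaring gives the single error of 2 more weight, which is worth mentioning when large misses are costly.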
Technical Question 6
Explain p-value and statistical significance in simple terms.
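💡 How to answer: The p-value is the probability of seeing a result at least this extreme if the null hypothesis were true. It is not the probability that the null hypothesis is true. p < 0.05 is a common convention, not a law. Warn against p-hacking and stress the difference between practical and statistical significance.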
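A coin-flip example makes the definition concrete. The sketch below computes an exact two-sided binomial p-value, taking "at least as extreme" to mean every outcome that is no more likely than the observed one under the null. The function name and the 60-heads-in-100 scenario are illustrative assumptions.

```python
from math import comb

def binomial_two_sided_p(successes, trials, p_null=0.5):
    """Exact two-sided p-value for a binomial count under the null p_null."""
    probs = [comb(trials, k) * p_null**k * (1 - p_null)**(trials - k)
             for k in range(trials + 1)]
    observed = probs[successes]
    # Sum the probability of every outcome no more likely than the observed
    # one; the tiny tolerance guards against floating-point ties.
    return sum(p for p in probs if p <= observed * (1 + 1e-9))

# Hypothetical experiment: 60 heads in 100 flips of a supposedly fair coin.
p_value = binomial_two_sided_p(60, 100)
print(round(p_value, 4))
```

The result lands just above 0.05, a useful prompt for discussing why a hard threshold should not be treated as a verdict.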
Technical Question 7
Walk me through how you'd build a customer-churn model.
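💡 How to answer: Define churn precisely (for example, no activity for 30 days after a cutoff date). Gather features such as usage, tenure and support tickets, using only data available before the cutoff so nothing leaks. Split into train, validation and test sets, preferably by time. Start with a baseline, try logistic regression, then gradient boosting. Evaluate on recall and precision at the chosen threshold, then translate the scores into a retention action.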
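The labelling step is where leakage usually creeps in, so it helps to sketch it. The example below is one possible definition, assumed for illustration: features come only from activity on or before a cutoff date, and the label is "no activity in the next 30 days". The function name, feature names and dates are all hypothetical.

```python
from datetime import date, timedelta

def build_churn_example(events, cutoff, outcome_days=30):
    """Turn one customer's activity dates into (features, label).

    Features use only events on or before `cutoff`; the label uses only
    events in the outcome window after it, so the two never overlap.
    """
    window_end = cutoff + timedelta(days=outcome_days)
    history = [d for d in events if d <= cutoff]
    future = [d for d in events if cutoff < d <= window_end]
    if not history:
        return None  # customer had no activity by the cutoff
    recent_start = cutoff - timedelta(days=30)
    features = {
        "tenure_days": (cutoff - min(history)).days,
        "events_last_30d": sum(1 for d in history if d > recent_start),
        "days_since_last_event": (cutoff - max(history)).days,
    }
    label = 0 if future else 1  # 1 = churned (silent in the outcome window)
    return features, label

cutoff = date(2024, 3, 31)
churned = build_churn_example(
    [date(2024, 1, 5), date(2024, 2, 20), date(2024, 3, 28)], cutoff)
retained = build_churn_example(
    [date(2024, 1, 5), date(2024, 3, 28), date(2024, 4, 10)], cutoff)
print(churned)
print(retained)
```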
Technical Question 8
What is overfitting and how do you detect and prevent it?
💡 How to answer: The model performs well on training data but poorly on unseen data. Detect it through a gap between training and validation scores. Prevent it with cross-validation, regularisation, early stopping, dropout (for neural networks), more data, or a simpler model.
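Early stopping is the easiest of these to sketch. The function below is a minimal illustration, assuming you already have one validation loss per epoch; the name, the patience value and the loss curve are made up.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (best_epoch, stop_epoch), both 0-indexed.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs; the model from best_epoch is kept.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(val_losses) - 1  # never triggered: ran to the end

# Hypothetical validation curve: improves, then rises as overfitting sets in.
val_losses = [0.90, 0.70, 0.60, 0.55, 0.57, 0.60, 0.66, 0.75]
best_epoch, stop_epoch = early_stopping_epoch(val_losses)
print(best_epoch, stop_epoch)
```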
Technical Question 9
Explain the difference between bagging and boosting.
💡 How to answer: Bagging trains models in parallel on bootstrapped samples and averages them (Random Forest) — reduces variance. Boosting trains sequentially, each correcting the last (XGBoost, LightGBM) — reduces bias.
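The sequential half of this answer can be sketched in a few lines. The code below is a toy version of boosting for regression, where each one-split stump is fitted to the residuals left by the previous rounds. It is not how XGBoost or LightGBM are implemented; the function names, the learning rate and the y = x² data are assumptions for illustration.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump on 1-D inputs (least squares)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip max so the right side is non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Fit stumps sequentially, each one on the current residuals."""
    predictions = [sum(ys) / len(ys)] * len(ys)  # start from the mean
    mse_history = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        mse_history.append(sum(r * r for r in residuals) / len(ys))
        stump = fit_stump(xs, residuals)
        predictions = [p + learning_rate * stump(x)
                       for x, p in zip(xs, predictions)]
    return predictions, mse_history

xs = [i / 10 for i in range(20)]
ys = [x * x for x in xs]  # hypothetical target a single stump cannot fit
predictions, mse_history = boost(xs, ys)
print(round(mse_history[0], 4), round(mse_history[-1], 4))
```

The training error falls round after round, which is the bias reduction; bagging would instead fit independent models on bootstrapped samples and average them.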
Technical Question 10
How would you design an A/B test and decide the winner?
💡 How to answer: State the hypothesis and primary metric, compute the required sample size from the baseline rate and minimum detectable effect, randomise, and run until the planned sample size is reached (avoid peeking and stopping early). Then check the p-value and confidence interval before rolling out.
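Both calculations fit in a short sketch using the normal approximation for two proportions. The 10% baseline, 2-point minimum detectable effect, 5% significance level and 80% power are assumed values, and the conversion counts are made up; the function names are illustrative.

```python
import math
from statistics import NormalDist

def sample_size_per_arm(baseline, mde, alpha=0.05, power=0.80):
    """Approximate users needed per arm to detect an absolute lift of `mde`."""
    p1, p2 = baseline, baseline + mde
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde ** 2)

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Plan: 10% baseline conversion, want to detect a lift to 12%.
n = sample_size_per_arm(baseline=0.10, mde=0.02)

# Hypothetical result after both arms reached 4,000 users.
z, p_value = two_proportion_z_test(conv_a=400, n_a=4000, conv_b=480, n_b=4000)
print(n, round(z, 2), round(p_value, 4))
```

Fixing the sample size before the test starts is what makes the final p-value trustworthy.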
🧠 Behavioural Questions
Behavioural Question 1
Tell me about a model you shipped to production. What was the impact?
💡 How to answer: Use STAR. Cover the business problem, your approach, how it was deployed and monitored, and the quantified outcome (revenue, cost, accuracy lift). Mention what you'd improve.
Behavioural Question 2
Describe a time stakeholders rejected your analysis. What did you do?
💡 How to answer: Show you listened, found the real objection (often trust or framing), re-presented with business language and clear visuals, and built buy-in. Avoid sounding defensive.
Behavioural Question 3
How do you keep your data-science skills current?
💡 How to answer: Mention Kaggle, papers/newsletters, reproducing techniques on real data, and learning the business domain — not just chasing new algorithms.
💡 Situational Questions
Situational Question 1
Your model's accuracy dropped suddenly in production. How do you debug?
💡 How to answer: Check for data drift, pipeline or feature breakage, label delay, and seasonality. Compare training and live input distributions, validate the feature store, and roll back if needed while you investigate.
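One common way to compare training and live distributions is the Population Stability Index (PSI). The sketch below is a minimal version for a single numeric feature; the equal-width binning, the 1e-6 floor and the data are assumptions made for illustration, and the widely quoted cut-offs (around 0.1 for "watch" and 0.25 for "investigate") are rules of thumb rather than formal tests.

```python
import math

def population_stability_index(expected, actual, bins=10):
    """PSI between a reference sample and a live sample of one feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # constant reference feature: everything lands in one bin

    def bucket_shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # A small floor keeps empty buckets from producing log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = bucket_shares(expected), bucket_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

train = [i % 100 for i in range(1000)]   # hypothetical training feature
live_same = list(train)                  # no drift
live_shifted = [v + 30 for v in train]   # the feature has shifted upwards

psi_same = population_stability_index(train, live_same)
psi_shifted = population_stability_index(train, live_shifted)
print(round(psi_same, 4), round(psi_shifted, 4))
```

Running this per feature and sorting by PSI gives a quick shortlist of inputs to inspect first.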
Situational Question 2
A stakeholder wants a model deployed in 3 days but data is messy. What do you do?
💡 How to answer: Set expectations, ship a simple, explainable baseline that delivers value, document data-quality risks, and plan an iteration. Communicate the tradeoff between speed and reliability clearly.
Situational Question 3
You discover a feature is leaking the target. What now?
💡 How to answer: Stop, remove the leaking feature, re-evaluate honestly, and explain why the earlier 'great' metric was misleading. Integrity over impressive numbers.
💰 Salary Questions
Salary Question 1
What are your salary expectations as a data scientist?
💡 How to answer: Deflect first: 'I'd like to understand the scope and team before numbers.' Anchor on market data — entry ₹8–14 LPA, mid ₹18–30 LPA, senior ₹35 LPA+ in India, varying by city and company tier.
Salary Question 2
We can match your current CTC but not more. How do you respond?
💡 How to answer: Quantify the value you add and cite market benchmarks from Glassdoor/AmbitionBox. If base is fixed, negotiate joining bonus, ESOPs, or an early review at 6 months.
🎤 Ask Interviewer Questions
Ask Interviewer Question 1
What does the data infrastructure and ML stack look like here?
💡 How to answer: Shows you care about how models actually reach production. Reveals maturity — feature store, MLOps, or notebooks-on-laptops.
Ask Interviewer Question 2
How is success measured for this role in the first year?
💡 How to answer: Surfaces whether the role is research, analytics, or production ML, and aligns expectations early.
Ask Interviewer Question 3
How do data scientists and engineers collaborate here?
💡 How to answer: Tells you whether you'll be blocked on deployment and how cross-functional the team really is.