Model Overview
This page orients you across the model families that show up most in interviews and in practice. Use it as a quick decision guide, then jump into the focused pages for details and pitfalls.
What to know cold:
Bias–variance trade‑off and how it manifests across families
Regularization knobs (L1/L2, early stopping, depth, learning rate)
Data/feature requirements (scaling, sparsity, linear separability)
Calibration and thresholding vs. raw scores
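The regularization point above can be made concrete with a small sketch: L2 (ridge) regression in closed form on synthetic, near-collinear data, showing how the penalty strength `lam` shrinks the weight vector. This is a minimal NumPy illustration; the data and `lam` values are invented for the demo, not taken from any particular library example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y depends on x1 only; x2 is a near-duplicate (collinear) feature.
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)  # near-collinear with x1
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(scale=0.5, size=n)

def ridge_fit(X, y, lam):
    """Closed-form L2-regularized least squares: w = (X^T X + lam*I)^-1 X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

for lam in (0.0, 1.0, 100.0):
    w = ridge_fit(X, y, lam)
    print(f"lam={lam:6.1f}  w={np.round(w, 2)}  ||w||={np.linalg.norm(w):.2f}")
```

With `lam = 0` the collinear columns let the solver split weight arbitrarily between the two features (high variance); increasing `lam` shrinks the weight norm, trading a little bias for much lower variance. The same knob-turning logic applies to early stopping, tree depth, and learning rate in the other families.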
Common families and when to reach for them:
Linear/Logistic Regression: strong baselines; fast and interpretable; scale features when regularizing or using gradient-based solvers, and expect roughly linear relationships.
Tree‑based Methods (DT/RF/GBM): minimal preprocessing; robust to nonlinearity and mixed feature types; watch depth/learning‑rate.
Naive Bayes: very fast text baseline; independence assumption often “good enough” with sparse features.
SVM: strong on medium‑sized tabular datasets; requires scaling; kernel choices matter.
PCA: unsupervised dimensionality reduction; fit the projection on training folds only, then transform validation/test data, to avoid leakage.
k‑Means: quick clustering; prefers spherical clusters; standardize features first.
Ensembles: bagging to cut variance, boosting to cut bias, stacking to combine complementary models and push performance.
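As a sketch of the variance reduction bagging buys, here is bootstrap aggregation from scratch: resample the training set, fit a high-variance regression stump on each resample, and average the predictions. Pure NumPy; the step-function data and stump learner are toy constructions for illustration, not a real library API.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_stump(x, y):
    """Regression stump: best single threshold split minimizing squared error."""
    best = (np.inf, 0.0, y.mean(), y.mean())
    for t in np.unique(x)[:-1]:
        left, right = y[x <= t], y[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return best[1], best[2], best[3]  # (threshold, left mean, right mean)

def predict_stump(stump, x):
    t, lo, hi = stump
    return np.where(x <= t, lo, hi)

# Noisy step function; stumps are high-variance learners on small samples.
x = rng.uniform(-1, 1, 30)
y = (x > 0).astype(float) + rng.normal(scale=0.3, size=30)
x_test = np.linspace(-1, 1, 200)
y_true = (x_test > 0).astype(float)

# Single stump vs. bagged ensemble of stumps on bootstrap resamples.
single = predict_stump(fit_stump(x, y), x_test)
preds = []
for _ in range(50):
    idx = rng.integers(0, len(x), len(x))
    preds.append(predict_stump(fit_stump(x[idx], y[idx]), x_test))
bagged = np.mean(preds, axis=0)

print("single-stump MSE:", ((single - y_true) ** 2).mean())
print("bagged MSE:      ", ((bagged - y_true) ** 2).mean())
```

Each stump commits to one hard threshold, so its predictions jump around from resample to resample; averaging 50 of them smooths those jumps out, which is the same mechanism a random forest uses at scale.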
Next steps:
See the dedicated pages for tuning, diagnostics, and interview prompts:
/docs/machine_learning/model/linear-and-logistic-regression/
/docs/machine_learning/model/tree-based-methods/
/docs/machine_learning/model/bayes-theorem-and-naive-bayes/
/docs/machine_learning/model/support-vector-machines/
/docs/machine_learning/model/principal-component-analysis-pca/
/docs/machine_learning/model/k-means-clustering/
/docs/machine_learning/model/ensemble-learning/