Data

Key topics:

  • Data leakage: identify common sources and how to prevent them.
  • Splits: IID vs. temporal splits; stratification.
  • Feature engineering: categorical handling, normalization, encoding.
  • Imbalance: resampling, class weights, thresholding.
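
A minimal sketch of how several of the topics above can fit together, using only the Python standard library. It is an illustration under stated assumptions, not a reference implementation: the function names (temporal_split, stratified_split, fit_standardizer, balanced_class_weights), the row keys "t", "x", "y", and the toy data are all hypothetical. It shows a temporal split, a stratified split, normalization statistics computed on the training split only (one common guard against leakage), and inverse-frequency class weights for imbalance.

```python
import random
import statistics
from collections import Counter, defaultdict


def temporal_split(rows, time_key, train_frac=0.8):
    """Train on the earliest rows, test on the latest (no shuffling)."""
    ordered = sorted(rows, key=lambda r: r[time_key])
    cut = int(len(ordered) * train_frac)
    return ordered[:cut], ordered[cut:]


def stratified_split(rows, label_key, train_frac=0.8, seed=0):
    """Shuffle within each class so both sides keep similar class ratios."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)
    train, test = [], []
    for group in by_label.values():
        rng.shuffle(group)
        cut = int(len(group) * train_frac)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


def fit_standardizer(train_values):
    """Compute mean/std on the training split only, to avoid leakage."""
    mean = statistics.fmean(train_values)
    std = statistics.pstdev(train_values) or 1.0  # guard against zero variance
    return lambda x: (x - mean) / std


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {label: n / (k * c) for label, c in counts.items()}


# Hypothetical toy data: 100 rows, 10% positives, one numeric feature "x",
# and an integer timestamp "t".
rows = [{"t": i, "x": float(i % 7), "y": 1 if i % 10 == 0 else 0}
        for i in range(100)]

# Stratified split keeps the 10% positive rate on both sides.
train, test = stratified_split(rows, "y")

# Scaling statistics come from train only and are reused, never refit, on test.
scale = fit_standardizer([r["x"] for r in train])
test_x = [scale(r["x"]) for r in test]

# Rare classes receive proportionally larger weights.
weights = balanced_class_weights([r["y"] for r in train])

# Temporal split: every training row precedes every test row in time.
early, late = temporal_split(rows, "t")
```

Which split to use depends on the data: the stratified split assumes rows are exchangeable (IID), while the temporal split is the safer choice when future rows must not inform predictions about the past. Thresholding is not shown here; it would operate on model scores rather than on the raw data.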