Section summary

    • Binary Cross-Entropy with Logits and Class Weights

      The BCEWithLogitsLoss function in PyTorch is a numerically stable implementation of the binary cross-entropy (BCE) loss

      It combines a Sigmoid activation and the BCE computation in a single operation, which avoids numerical instabilities for large positive or negative logits
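      A toy sketch of this point (not taken from the script): for large-magnitude logits, applying a Sigmoid and then BCELoss saturates, while BCEWithLogitsLoss computes the same quantity stably from the raw logits. The pos_weight value shown is purely illustrative.

```python
import torch
import torch.nn as nn

# Toy example: the fused loss takes raw logits; the two-step version saturates.
logits = torch.tensor([-50.0, 0.0, 50.0])
targets = torch.tensor([1.0, 1.0, 0.0])

fused = nn.BCEWithLogitsLoss()(logits, targets)        # stable, computed from logits
naive = nn.BCELoss()(torch.sigmoid(logits), targets)   # sigmoid(50.) rounds to exactly 1.0
print(fused.item(), naive.item())                      # the naive value is distorted
                                                       # (PyTorch clamps log(0) to avoid -inf)

# With class weighting: pos_weight multiplies the positive-class term of the loss.
# The value 10.0 is illustrative; the script derives #negatives / #positives instead.
weighted = nn.BCEWithLogitsLoss(pos_weight=torch.tensor(10.0))
```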

    • https://www.kaggle.com/mlg-ulb/creditcardfraud

      This Python script builds several classifiers for the highly imbalanced “creditcard” dataset and compares them on held-out data. It covers preprocessing, logistic regression (with different imbalance strategies), random forests, and two PyTorch models (a logistic regression and a small MLP) trained with a class-imbalance-aware loss.

      1. Setup and data loading

        – Resets variables (IPython “%reset”)

        – Imports NumPy, pandas, scikit-learn utilities, and later PyTorch

        – Loads the CSV file creditcard.csv into a pandas DataFrame

        – Splits columns into features X (all columns except “Class”) and target y = “Class”

        – Prints class counts to show the severe imbalance (very few frauds)
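      A minimal sketch of step 1, assuming creditcard.csv is in the working directory (variable names are illustrative):

```python
import pandas as pd

# Load the Kaggle credit card fraud data and separate features from the target.
data = pd.read_csv("creditcard.csv")
X = data.drop(columns=["Class"])   # 30 feature columns: Time, V1..V28, Amount
y = data["Class"]                  # 1 = fraud, 0 = legitimate
print(y.value_counts())            # shows the severe class imbalance
```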

      2. Feature scaling

        – Fits a StandardScaler on X (zero mean, unit variance), transforms X, and casts the result to float32

        Note: in a fully rigorous pipeline, the scaler should be fit on the training set only, then applied to the test set
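      A sketch of step 2 as described; the comment notes the leakage-free variant mentioned in the note above:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Standardize all of X and cast to float32, as the script does. The leakage-free
# variant would fit the scaler on the training rows only (after the split in step 4)
# and then apply scaler.transform to both splits.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X).astype(np.float32)
```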

      3. Random under-sampling to create a balanced subset

        – Counts the number of fraud cases and gets their indices

        – Randomly selects the same number of non-fraud indices

        – Concatenates both to form a balanced subset (under_sample_data)

        – Extracts X_undersample and y_undersample from that subset
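      A possible sketch of step 3 (the random seed and helper names are assumptions):

```python
import numpy as np

# Index-based under-sampling. Because the DataFrame keeps its default RangeIndex,
# the same indices can be used positionally on the scaled feature matrix.
fraud_idx = np.asarray(data[data["Class"] == 1].index)
normal_idx = np.asarray(data[data["Class"] == 0].index)

rng = np.random.default_rng(0)
picked_normal_idx = rng.choice(normal_idx, size=len(fraud_idx), replace=False)
under_idx = np.concatenate([fraud_idx, picked_normal_idx])

under_sample_data = data.loc[under_idx]      # balanced subset (frauds + sampled normals)
X_undersample = X_scaled[under_idx]          # scaled features of the balanced subset
y_undersample = y.iloc[under_idx]
```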

      4. Train/test splits

        – Splits the full data (X, y) into train and test (70/30)

        – Splits the balanced subset similarly

        – Prints class counts in each split for both the original and the undersampled data
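      A sketch of step 4, reusing the arrays from the previous sketches; random_state is an assumption added for reproducibility:

```python
from sklearn.model_selection import train_test_split

# 70/30 splits of the full data and of the balanced subset.
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.3, random_state=0)
(X_train_undersample, X_test_undersample,
 y_train_undersample, y_test_undersample) = train_test_split(
    X_undersample, y_undersample, test_size=0.3, random_state=0)

print(y_train.value_counts(), y_test.value_counts())
print(y_train_undersample.value_counts(), y_test_undersample.value_counts())
```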

      5. Logistic regression (scikit-learn)

        (a) Baseline on the original imbalanced data:

        – Trains LogisticRegression without regularization

        – Predicts on X_test and prints a confusion table (y_test vs predictions)

        (b) Class-weighted logistic regression:

        – Trains LogisticRegression with class_weight=“balanced” (the minority class is up-weighted automatically)

        – Predicts on X_test and prints the confusion table

        (c) Logistic regression on the undersampled (balanced) subset:

        – Trains LogisticRegression on X_train_undersample, y_train_undersample, with more iterations and a Newton solver

        – Predicts on X_test_undersample and prints the confusion table
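      A hedged sketch of the three variants in step 5; the iteration counts, penalty=None (recent scikit-learn API), and the newton-cg solver choice are assumptions beyond what the summary states, and the confusion table is printed with pandas crosstab:

```python
from sklearn.linear_model import LogisticRegression
import pandas as pd

# (a) baseline on the imbalanced data; "without regularization" is taken as penalty=None
lr_plain = LogisticRegression(penalty=None, max_iter=1000).fit(X_train, y_train)
print(pd.crosstab(y_test.to_numpy(), lr_plain.predict(X_test),
                  rownames=["true"], colnames=["pred"]))

# (b) automatic up-weighting of the minority class
lr_weighted = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X_train, y_train)
print(pd.crosstab(y_test.to_numpy(), lr_weighted.predict(X_test),
                  rownames=["true"], colnames=["pred"]))

# (c) trained on the balanced subset; newton-cg is one possible Newton solver
lr_under = LogisticRegression(solver="newton-cg", max_iter=5000).fit(
    X_train_undersample, y_train_undersample)
print(pd.crosstab(y_test_undersample.to_numpy(), lr_under.predict(X_test_undersample),
                  rownames=["true"], colnames=["pred"]))
```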

      6. Random forest (scikit-learn)

        (a) Trains a RandomForestClassifier on the original imbalanced training set (100 trees, max_features ≈ sqrt(30), OOB enabled, parallel jobs)

        – Predicts on X_test and prints the confusion table

        (b) Trains the same RandomForestClassifier on the undersampled training set

        – Predicts on X_test_undersample and prints the confusion table
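      A sketch of step 6 with the settings named above (the random seed is an assumption):

```python
from sklearn.ensemble import RandomForestClassifier
import pandas as pd

rf = RandomForestClassifier(
    n_estimators=100,        # 100 trees
    max_features="sqrt",     # ≈ sqrt(30) features considered at each split
    oob_score=True,          # out-of-bag estimate enabled
    n_jobs=-1,               # parallel jobs
    random_state=0,          # assumption, for reproducibility
)
rf.fit(X_train, y_train)     # (a) original imbalanced training set
print(pd.crosstab(y_test.to_numpy(), rf.predict(X_test),
                  rownames=["true"], colnames=["pred"]))

rf_under = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                  oob_score=True, n_jobs=-1, random_state=0)
rf_under.fit(X_train_undersample, y_train_undersample)   # (b) balanced subset
print(pd.crosstab(y_test_undersample.to_numpy(), rf_under.predict(X_test_undersample),
                  rownames=["true"], colnames=["pred"]))
```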

      7. PyTorch setup

        – Selects the Apple Silicon Metal backend device (“mps”) if available

        – Converts train/test arrays to PyTorch tensors; moves the test tensors to the device

        – Builds a TensorDataset and DataLoader for the training data (batch size 1024, shuffled)

        – Computes pos_weight = (#negatives / #positives) from y_train. This ratio is used by the loss to counter the imbalance
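      Step 7 could look roughly like this sketch (the CPU fallback and the exact tensor handling are assumptions):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Pick the Apple Silicon MPS device when available.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Training tensors stay on CPU (batches are moved in the loop); test tensors go to the device.
X_train_t = torch.tensor(X_train, dtype=torch.float32)
y_train_t = torch.tensor(y_train.to_numpy(), dtype=torch.float32).unsqueeze(1)
X_test_t = torch.tensor(X_test, dtype=torch.float32).to(device)
y_test_t = torch.tensor(y_test.to_numpy(), dtype=torch.float32).unsqueeze(1).to(device)

train_loader = DataLoader(TensorDataset(X_train_t, y_train_t), batch_size=1024, shuffle=True)

# Positive-class weight = #negatives / #positives in the training labels.
pos_weight = ((y_train_t == 0).sum() / (y_train_t == 1).sum()).to(device)
```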

      8. Generic PyTorch training loop (train_model)

        – Moves the model to device and defines Adam optimizer

        – Uses BCEWithLogitsLoss(pos_weight=pos_weight): this expects raw logits and internally applies a stable sigmoid + binary cross-entropy; pos_weight increases the penalty for misclassifying the positive class

        – For a given number of epochs:

        • Iterates over mini-batches: forward pass → compute loss → backpropagate → optimizer step

        • Tracks mean training loss per epoch

        • Periodically evaluates test loss on the full test set (still with BCEWithLogitsLoss)

        – Returns the trained model
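      A sketch of a train_model function matching the description in step 8; the signature and the evaluation cadence are assumptions:

```python
import torch
import torch.nn as nn

def train_model(model, train_loader, X_test_t, y_test_t, pos_weight, device,
                epochs=50, lr=1e-3, weight_decay=0.0):
    model = model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)
    criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)  # expects raw logits

    for epoch in range(epochs):
        model.train()
        running = 0.0
        for xb, yb in train_loader:
            xb, yb = xb.to(device), yb.to(device)
            optimizer.zero_grad()
            loss = criterion(model(xb), yb)   # forward pass on a mini-batch
            loss.backward()                   # backpropagation
            optimizer.step()
            running += loss.item()
        if (epoch + 1) % 10 == 0:             # periodic test-loss check
            model.eval()
            with torch.no_grad():
                test_loss = criterion(model(X_test_t), y_test_t).item()
            print(f"epoch {epoch+1}: train {running/len(train_loader):.4f}  test {test_loss:.4f}")
    return model
```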

      9. PyTorch logistic regression

        – Defines a single Linear layer with output size 1 (equivalent to logistic regression; no explicit sigmoid is needed on the output because the loss operates on raw logits)

        – Trains it with train_model (50 epochs, lr=1e-3)

        – Switches to eval mode, computes logits on X_test_t, applies sigmoid, thresholds at 0.5 to obtain class predictions, and prints the confusion table against y_test
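      A sketch of step 9, reusing the train_model sketch above:

```python
import torch
import torch.nn as nn
import pandas as pd

# A single Linear layer paired with BCEWithLogitsLoss is logistic regression:
# the sigmoid lives inside the loss during training.
logreg = nn.Linear(X_train_t.shape[1], 1)
logreg = train_model(logreg, train_loader, X_test_t, y_test_t, pos_weight, device,
                     epochs=50, lr=1e-3)

logreg.eval()
with torch.no_grad():
    probs = torch.sigmoid(logreg(X_test_t))             # logits -> probabilities
preds = (probs >= 0.5).int().cpu().numpy().ravel()      # threshold at 0.5
print(pd.crosstab(y_test.to_numpy(), preds, rownames=["true"], colnames=["pred"]))
```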

      10. PyTorch MLP

        – Defines a small feed-forward network: Linear → GELU → Linear → GELU → Linear → GELU → Linear(→1)

        – Trains it with weight decay for mild regularization (80 epochs, lr=1e-3, weight_decay=1e-5)

        – Evaluates as above: logits → sigmoid → threshold 0.5 → confusion table
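      A sketch of step 10; the hidden-layer widths are assumptions, since the summary only gives the Linear → GELU pattern:

```python
import torch.nn as nn

# Small feed-forward network ending in a single raw logit.
mlp = nn.Sequential(
    nn.Linear(X_train_t.shape[1], 64), nn.GELU(),
    nn.Linear(64, 32), nn.GELU(),
    nn.Linear(32, 16), nn.GELU(),
    nn.Linear(16, 1),
)
mlp = train_model(mlp, train_loader, X_test_t, y_test_t, pos_weight, device,
                  epochs=80, lr=1e-3, weight_decay=1e-5)
# Evaluation then follows the same logits -> sigmoid -> 0.5 threshold recipe as above.
```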

      What the script demonstrates

      – Several standard ways to handle class imbalance: raw training on imbalanced data, class weighting, and random under-sampling

      – Comparison of linear (logistic regression) and non-linear (random forest, MLP) models

      – How to train PyTorch classifiers for imbalanced binary classification using BCEWithLogitsLoss with a positive-class weight, and how to evaluate them via a confusion table on a held-out test set