Introduction to neural networks
Section summary
-
Binary Cross-Entropy with Logits and Class Weights
The BCEWithLogitsLoss function in PyTorch is a numerically stable implementation of the binary cross-entropy (BCE) loss
It combines a Sigmoid activation and the BCE computation in a single operation, which avoids numerical instabilities for large positive or negative logits
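As a minimal, illustrative sketch (PyTorch assumed; the specific numbers are arbitrary), the fused loss can be compared with the naive sigmoid-then-BCE composition, and pos_weight shows how the positive class can be up-weighted:

    import torch
    import torch.nn as nn

    logits = torch.tensor([8.0, -12.0, 0.5])   # raw model outputs (no sigmoid applied)
    targets = torch.tensor([1.0, 0.0, 1.0])    # binary labels as floats

    # Fused, numerically stable version: sigmoid + BCE computed in one operation
    stable = nn.BCEWithLogitsLoss()(logits, targets)

    # Naive composition: the saturating sigmoid can lose precision for large-magnitude logits
    naive = nn.BCELoss()(torch.sigmoid(logits), targets)

    # pos_weight > 1 penalizes errors on the positive (rare) class more heavily
    weighted = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([100.0]))(logits, targets)

    print(stable.item(), naive.item(), weighted.item())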
-
https://www.kaggle.com/mlg-ulb/creditcardfraud
This Python script builds several classifiers for the highly imbalanced “creditcard” dataset and compares them on held-out data. It covers preprocessing, logistic regression (with different imbalance strategies), random forests, and two PyTorch models (a logistic regression and a small MLP) trained with a class-imbalance-aware loss.
-
Setup and data loading
– Resets variables (IPython “%reset”)
– Imports NumPy, pandas, scikit-learn utilities, and later PyTorch
– Loads the CSV file creditcard.csv into a pandas DataFrame
– Splits columns into features X (all columns except “Class”) and target y = “Class”
– Prints class counts to show the severe imbalance (very few frauds)
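A sketch of this stage, assuming creditcard.csv is in the working directory and the column names of the Kaggle dataset:

    import pandas as pd

    # Load the credit card fraud dataset
    data = pd.read_csv("creditcard.csv")

    # Features: all columns except the label; target: the "Class" column (1 = fraud)
    X = data.drop(columns=["Class"])
    y = data["Class"]

    # Show the severe imbalance (frauds are a tiny fraction of all transactions)
    print(y.value_counts())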
-
Feature scaling
– Fits a StandardScaler on X (mean 0, variance 1) and transforms X to float32
Note: in a fully rigorous pipeline, the scaler should be fit on the training set only and then applied to the test set, so that test-set statistics do not leak into training (see the sketch below)
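A minimal sketch of that leakage-free variant, assuming X_train and X_test come from the train/test split described further down:

    import numpy as np
    from sklearn.preprocessing import StandardScaler

    scaler = StandardScaler()
    # Fit the mean/variance on the training features only...
    X_train = scaler.fit_transform(X_train).astype(np.float32)
    # ...then reuse those training statistics to transform the test features
    X_test = scaler.transform(X_test).astype(np.float32)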
-
Random under-sampling to create a balanced subset
– Counts the number of fraud cases and gets their indices
– Randomly selects the same number of non-fraud indices
– Concatenates both to form a balanced subset (under_sample_data)
– Extracts X_undersample and y_undersample from that subset
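A plain NumPy/pandas sketch of the under-sampling step; the variable names follow the description above, the rest is an assumption about the implementation:

    import numpy as np

    # Indices of the fraud and non-fraud rows
    fraud_idx = data[data["Class"] == 1].index
    normal_idx = data[data["Class"] == 0].index

    # Draw as many non-fraud rows as there are fraud rows, without replacement
    sampled_normal_idx = np.random.choice(normal_idx, size=len(fraud_idx), replace=False)

    # Balanced subset: every fraud plus the sampled non-frauds
    under_sample_data = data.loc[np.concatenate([fraud_idx, sampled_normal_idx])]
    X_undersample = under_sample_data.drop(columns=["Class"])
    y_undersample = under_sample_data["Class"]
    print(y_undersample.value_counts())   # 50/50 by construction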
-
Train/test splits
– Splits the full data (X, y) into train and test (70/30)
– Splits the balanced subset similarly
– Prints class counts in each split for both the original and the undersampled data
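Both splits can be sketched with train_test_split; the 70/30 ratio comes from the description, the random_state is an assumption:

    from sklearn.model_selection import train_test_split

    # 70/30 split of the full, imbalanced data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0)

    # 70/30 split of the balanced, undersampled subset
    (X_train_undersample, X_test_undersample,
     y_train_undersample, y_test_undersample) = train_test_split(
        X_undersample, y_undersample, test_size=0.3, random_state=0)

    # Class counts in every split
    for name, labels in [("train", y_train), ("test", y_test),
                         ("train_us", y_train_undersample), ("test_us", y_test_undersample)]:
        print(name, labels.value_counts().to_dict())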
-
Logistic regression (scikit-learn)
(a) Baseline on the original imbalanced data:
– Trains LogisticRegression without regularization
– Predicts on X_test and prints a confusion table (y_test vs predictions)
(b) Class-weighted logistic regression:
– Trains LogisticRegression with class_weight=“balanced” (the minority class is up-weighted automatically)
– Predicts on X_test and prints the confusion table
(c) Logistic regression on the undersampled (balanced) subset:
– Trains LogisticRegression on X_train_undersample, y_train_undersample, with more iterations and a Newton solver.
– Predicts on X_test_undersample and prints the confusion table
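A sketch of the three variants; the Newton solver and the larger iteration budget for (c) follow the description, while the exact solver name and other settings are assumptions (penalty=None requires scikit-learn ≥ 1.2, and the confusion tables here use scikit-learn's confusion_matrix):

    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import confusion_matrix

    # (a) Baseline on the imbalanced data, regularization disabled
    lr_plain = LogisticRegression(penalty=None, max_iter=1000).fit(X_train, y_train)
    print(confusion_matrix(y_test, lr_plain.predict(X_test)))

    # (b) Class weighting: the rare fraud class is up-weighted automatically
    lr_balanced = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X_train, y_train)
    print(confusion_matrix(y_test, lr_balanced.predict(X_test)))

    # (c) Trained on the balanced, undersampled subset with a Newton solver
    lr_under = LogisticRegression(solver="newton-cg", max_iter=5000).fit(
        X_train_undersample, y_train_undersample)
    print(confusion_matrix(y_test_undersample, lr_under.predict(X_test_undersample)))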
-
Random forest (scikit-learn)
(a) Trains a RandomForestClassifier on the original imbalanced training set (100 trees, max_features ≈ sqrt(30), out-of-bag (OOB) scoring enabled, parallel fitting jobs)
– Predicts on X_test and prints the confusion table
(b) Trains the same RandomForestClassifier on the undersampled training set
– Predicts on X_test_undersample and prints the confusion table
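A sketch matching the described random forest settings; the random_state and the confusion-table helper are assumptions:

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import confusion_matrix

    def make_rf():
        return RandomForestClassifier(
            n_estimators=100,       # 100 trees
            max_features="sqrt",    # ~sqrt(30) features tried per split
            oob_score=True,         # out-of-bag generalization estimate
            n_jobs=-1,              # fit trees in parallel
            random_state=0)

    # (a) Original imbalanced training set
    rf = make_rf().fit(X_train, y_train)
    print(confusion_matrix(y_test, rf.predict(X_test)))

    # (b) Balanced, undersampled training set
    rf_us = make_rf().fit(X_train_undersample, y_train_undersample)
    print(confusion_matrix(y_test_undersample, rf_us.predict(X_test_undersample)))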
-
PyTorch setup
– Selects the Apple Silicon Metal Performance Shaders backend device (“mps”) if available
– Converts train/test arrays to PyTorch tensors; moves the test tensors to the device
– Builds a TensorDataset and DataLoader for the training data (batch size 1024, shuffled)
– Computes pos_weight = (#negatives / #positives) from y_train. This ratio is used by the loss to counter the imbalance
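A sketch of the device/tensor setup, assuming X_train and X_test are the scaled float32 arrays and y_train, y_test the pandas Series from the splits above; the CPU fallback is an assumption:

    import torch
    from torch.utils.data import TensorDataset, DataLoader

    # Apple Silicon Metal Performance Shaders backend when available, else CPU
    device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

    # Float32 tensors; targets reshaped to (N, 1) to match the model's single logit
    X_train_t = torch.tensor(X_train, dtype=torch.float32)
    y_train_t = torch.tensor(y_train.values, dtype=torch.float32).unsqueeze(1)
    X_test_t = torch.tensor(X_test, dtype=torch.float32).to(device)
    y_test_t = torch.tensor(y_test.values, dtype=torch.float32).unsqueeze(1).to(device)

    # Shuffled mini-batches of 1024 training examples
    train_loader = DataLoader(TensorDataset(X_train_t, y_train_t),
                              batch_size=1024, shuffle=True)

    # pos_weight = #negatives / #positives, used by the loss to counter the imbalance
    n_pos = float((y_train == 1).sum())
    n_neg = float((y_train == 0).sum())
    pos_weight = torch.tensor([n_neg / n_pos], device=device)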
-
Generic PyTorch training loop (train_model)
– Moves the model to device and defines Adam optimizer
– Uses BCEWithLogitsLoss(pos_weight=pos_weight): this expects raw logits and internally applies a stable sigmoid + binary cross-entropy; pos_weight increases the penalty for misclassifying the positive class
– For a given number of epochs:
• Iterates over mini-batches: forward pass → compute loss → backpropagate → optimizer step
• Tracks mean training loss per epoch
• Periodically evaluates test loss on the full test set (still with BCEWithLogitsLoss)
– Returns the trained model
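A rough version of such a loop; the function signature, the evaluation interval, and the Adam defaults are assumptions:

    import torch
    import torch.nn as nn

    def train_model(model, train_loader, X_test_t, y_test_t, pos_weight,
                    epochs=50, lr=1e-3, weight_decay=0.0, device="cpu"):
        model = model.to(device)
        optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)
        # Expects raw logits; applies a stable sigmoid + BCE internally
        criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

        for epoch in range(epochs):
            model.train()
            batch_losses = []
            for xb, yb in train_loader:
                xb, yb = xb.to(device), yb.to(device)
                optimizer.zero_grad()
                loss = criterion(model(xb), yb)   # forward pass on a mini-batch
                loss.backward()                   # backpropagation
                optimizer.step()                  # parameter update
                batch_losses.append(loss.item())

            if epoch % 10 == 0:                   # periodic evaluation on the full test set
                model.eval()
                with torch.no_grad():
                    test_loss = criterion(model(X_test_t), y_test_t).item()
                print(f"epoch {epoch}: train {sum(batch_losses) / len(batch_losses):.4f}, "
                      f"test {test_loss:.4f}")
        return model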
-
PyTorch logistic regression
– Defines a single Linear layer with output size 1 (equivalent to logistic regression; no explicit sigmoid is applied at the output because the loss operates on raw logits)
– Trains it with train_model (50 epochs, lr=1e-3)
– Switches to eval mode, computes logits on X_test_t, applies sigmoid, thresholds at 0.5 to obtain class predictions, and prints the confusion table against y_test
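A sketch of the model and its evaluation, reusing the tensors and train_model sketched above; using pandas crosstab for the confusion table is an assumption:

    import pandas as pd
    import torch
    import torch.nn as nn

    # Single linear layer producing one raw logit per sample
    logreg = nn.Linear(X_train_t.shape[1], 1)
    logreg = train_model(logreg, train_loader, X_test_t, y_test_t, pos_weight,
                         epochs=50, lr=1e-3, device=device)

    # Evaluation: logits -> sigmoid -> threshold at 0.5 -> confusion table
    logreg.eval()
    with torch.no_grad():
        preds = (torch.sigmoid(logreg(X_test_t)) >= 0.5).int().cpu().numpy().ravel()
    print(pd.crosstab(y_test.values, preds, rownames=["actual"], colnames=["predicted"]))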
-
PyTorch MLP
– Defines a small feed-forward network: Linear → GELU → Linear → GELU → Linear → GELU → Linear(→1)
– Trains it with weight decay for mild regularization (80 epochs, lr=1e-3, weight_decay=1e-5)
– Evaluates as above: logits → sigmoid → threshold 0.5 → confusion table
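The MLP could be sketched like this; the hidden-layer widths are assumptions, while the GELU pattern and the training settings follow the description:

    import pandas as pd
    import torch
    import torch.nn as nn

    mlp = nn.Sequential(
        nn.Linear(X_train_t.shape[1], 64), nn.GELU(),   # hidden widths assumed, not from the source
        nn.Linear(64, 32), nn.GELU(),
        nn.Linear(32, 16), nn.GELU(),
        nn.Linear(16, 1),                               # single logit output
    )

    mlp = train_model(mlp, train_loader, X_test_t, y_test_t, pos_weight,
                      epochs=80, lr=1e-3, weight_decay=1e-5, device=device)

    mlp.eval()
    with torch.no_grad():
        preds = (torch.sigmoid(mlp(X_test_t)) >= 0.5).int().cpu().numpy().ravel()
    print(pd.crosstab(y_test.values, preds, rownames=["actual"], colnames=["predicted"]))

-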
What the script demonstrates
– Several standard ways to handle class imbalance: raw training on imbalanced data, class weighting, and random under-sampling
– Comparison of linear (logistic regression) and non-linear (random forest, MLP) models
– How to train PyTorch classifiers for imbalanced binary classification using BCEWithLogitsLoss with a positive-class weight, and how to evaluate them via a confusion table on a held-out test set
-