Score-based Label Classification

Taster Score Classification using the script train_test_labels.py

This script trains and evaluates a Ridge classifier to predict categorical labels (e.g. taster identity, wine variety, cave, age) based on averaged sensory evaluation scores of Champagne wines. The features correspond to numerical scores given by tasters on different attributes (e.g. acid, balance), and the labels are drawn from associated metadata.
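As a rough illustration of the setup described above, the following sketch fits scikit-learn's `RidgeClassifier` on a small synthetic score matrix. It is not taken from train_test_labels.py: the attribute names, the taster labels `T1` to `T3`, the simulated scores, and `alpha=1.0` are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import RidgeClassifier

rng = np.random.default_rng(0)

# Hypothetical score attributes (features) and taster labels (targets).
attributes = ["acid", "balance", "finish", "aroma"]
taster_means = {
    "T1": [7.0, 5.0, 5.0, 5.0],  # assumed: T1 tends to score acid higher
    "T2": [5.0, 7.0, 5.0, 5.0],  # assumed: T2 tends to score balance higher
    "T3": [5.0, 5.0, 7.0, 5.0],  # assumed: T3 tends to score finish higher
}

# Simulate 20 averaged score rows per taster (one row per wine).
X_parts, y_parts = [], []
for taster, means in taster_means.items():
    X_parts.append(rng.normal(loc=means, scale=0.5, size=(20, len(attributes))))
    y_parts.extend([taster] * 20)
X = np.vstack(X_parts)
y = np.array(y_parts)

# Ridge classifier: regularized least squares on encoded class targets.
clf = RidgeClassifier(alpha=1.0)
clf.fit(X, y)

train_accuracy = clf.score(X, y)
print(f"Training accuracy on synthetic data: {train_accuracy:.2f}")
```

Swapping `y` for another metadata column (for instance wine variety or age) would change the classification target without touching the feature matrix.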

Data are preprocessed by collapsing replicates for each (wine, taster) pair and cleaning non-numeric values. Stratified K-Fold cross-validation is repeated multiple times so that the reported classification accuracy depends less on any single split. For each classification target, a normalized confusion matrix is plotted at the end to show how the model performs on each class.
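The preprocessing and evaluation steps could look roughly like the sketch below. It is an assumption-laden illustration rather than the script's actual code: the column names, the synthetic data, the choice of 5 folds and 10 repeats, averaging as the way replicates are collapsed, and row-wise (per true class) normalization of the confusion matrix are all hypothetical. Plotting is left out. scikit-learn's `ConfusionMatrixDisplay` is one way to render the resulting matrix.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import RidgeClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import RepeatedStratifiedKFold

rng = np.random.default_rng(0)
score_cols = ["acid", "balance", "finish"]  # hypothetical attribute names

# Hypothetical raw table: 3 tasters x 10 wines x 2 replicates.
taster_means = {"T1": [7, 5, 5], "T2": [5, 7, 5], "T3": [5, 5, 7]}
rows = []
for taster, means in taster_means.items():
    for wine in range(10):
        for _ in range(2):
            scores = rng.normal(means, 0.5)
            rows.append({"wine": f"W{wine}", "taster": taster,
                         **dict(zip(score_cols, scores))})
rows[0]["acid"] = "n/a"  # simulate a non-numeric entry
raw = pd.DataFrame(rows)

# Clean: coerce non-numeric values to NaN.
raw[score_cols] = raw[score_cols].apply(pd.to_numeric, errors="coerce")

# Collapse replicates: mean score per (wine, taster) pair (NaN is skipped).
collapsed = raw.groupby(["wine", "taster"], as_index=False)[score_cols].mean()
collapsed = collapsed.dropna(subset=score_cols)

X = collapsed[score_cols].to_numpy()
y = collapsed["taster"].to_numpy()
labels = np.unique(y)

# Repeated stratified K-fold: each fold keeps the class proportions.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
accuracies = []
cm_total = np.zeros((len(labels), len(labels)))
for train_idx, test_idx in cv.split(X, y):
    model = RidgeClassifier(alpha=1.0).fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    accuracies.append(model.score(X[test_idx], y[test_idx]))
    cm_total += confusion_matrix(y[test_idx], pred, labels=labels)

# Normalize each row so it sums to 1 (fraction of each true class).
cm_normalized = cm_total / cm_total.sum(axis=1, keepdims=True)

print(f"Mean accuracy over {len(accuracies)} folds: {np.mean(accuracies):.2f}")
print(pd.DataFrame(cm_normalized, index=labels, columns=labels).round(2))
```

Accumulating raw counts across all folds and normalizing once at the end gives every test prediction equal weight. Averaging per-fold normalized matrices would be an alternative, with slightly different results when folds are small.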