Cake Classification Features Analysis
Machine Learning

Evaluating handcrafted versus deep features for fine-grained food classification

Deep convolutional descriptors slice through frosting-level nuances; handcrafted stats simply can’t keep up.

Project Information

Course
Machine Learning
Authors
Andrea Alberti
Date
February 2024
Pages
5

Technologies

Python · pvml library · PVMLNet · NumPy · OpenCV · Matplotlib

Abstract

We compared handcrafted descriptors and CNN-derived features for classifying 15 cake categories (1,800 images). Low-level statistics (color histogram, edge direction, co-occurrence) fed to an MLP plateaued at 31% test accuracy, while PVMLNet feature maps (layer −5) coupled with an MLP achieved 90%. Transfer learning by fine-tuning PVMLNet reached 80%, so frozen deep features remained the strongest approach, highlighting the importance of deep representations.

About

The dataset of 15 cake types (chocolate, tiramisu, cheesecake, etc.) is split 100/20 images per class for training/testing. Handcrafted descriptors (color histograms, edge-direction histograms, grey-level co-occurrence matrices) are concatenated and normalised (mean-variance, min-max, or max-abs) before feeding an MLP; despite tuning, performance stagnates around 31% because of high intra-class variability. Switching to PVMLNet, intermediate activations from layers −1 to −7 are compared, with the flattened layer −5 delivering 90% accuracy and converging in under 100 epochs. For transfer learning, PVMLNet’s final layer is replaced with the trained MLP head, but full fine-tuning settles at 80%, still below the feature-extraction approach. Error analysis via confusion matrices flags persistent confusions (e.g., chocolate-mousse vs. ice-cream cake) and motivates future data-augmentation ideas.
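The three handcrafted descriptors can be sketched in plain NumPy. This is a minimal illustration, not the project's exact pipeline: the bin counts, edge threshold, and luminance conversion below are assumptions, and the project used OpenCV rather than the toy grayscale proxy here.

```python
import numpy as np

def color_histogram(img, bins=8):
    # Quantise each RGB channel into `bins` levels and count joint occurrences.
    q = (img.astype(np.int64) * bins) // 256            # per-channel level 0..bins-1
    idx = (q[..., 0] * bins + q[..., 1]) * bins + q[..., 2]
    h = np.bincount(idx.ravel(), minlength=bins ** 3).astype(float)
    return h / h.sum()

def edge_direction_histogram(gray, bins=16, threshold=10.0):
    # Histogram of gradient orientations over sufficiently strong edges.
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)                            # orientation in [-pi, pi]
    strong = mag > threshold
    h, _ = np.histogram(ang[strong], bins=bins, range=(-np.pi, np.pi))
    return h / max(h.sum(), 1)

def cooccurrence_matrix(gray, levels=8):
    # Grey-level co-occurrence counts for horizontally adjacent pixels.
    g = (gray.astype(np.int64) * levels) // 256
    pairs = g[:, :-1] * levels + g[:, 1:]
    m = np.bincount(pairs.ravel(), minlength=levels ** 2).astype(float)
    return m / m.sum()

def handcrafted_features(img):
    # Concatenate the three descriptors into one vector, as done before the MLP.
    gray = img.mean(axis=2)                             # simple luminance proxy
    return np.concatenate([
        color_histogram(img),
        edge_direction_histogram(gray),
        cooccurrence_matrix(gray),
    ])

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3)).astype(np.uint8)
feats = handcrafted_features(img)
print(feats.shape)  # (592,) = 8**3 + 16 + 8**2
```

With these illustrative bin counts the concatenated vector has 592 dimensions; the project's actual dimensionality depends on its chosen bins.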

Key Results

  • Neural features (PVMLNet layer −5): 90% accuracy
  • Low-level features: 31% accuracy
  • Transfer learning: 80% accuracy

Key Findings

  • Color-histogram-only MLP stabilised near 21% test accuracy after 5,000 epochs; adding edge-direction histograms raised it to 31%.
  • PVMLNet layer −5 activations flattened into an MLP delivered 90% test accuracy with <100 training epochs.
  • Fine-tuning PVMLNet via transfer learning achieved 80% accuracy, 10 percentage points below the feature-extraction pipeline.
  • Confusion analysis showed chocolate-mousse, apple-pie and tiramisù frequently misclassified as ice-cream, carrot or chocolate cakes because of visual similarity.
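The confusion analysis behind these findings can be reproduced with a small NumPy helper. The class names and label arrays below are toy placeholders, not the project's actual results:

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    # cm[i, j] = number of class-i samples predicted as class j.
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

def top_confusions(cm, class_names, k=3):
    # The largest off-diagonal cells are the most frequent mix-ups.
    off = cm.copy()
    np.fill_diagonal(off, 0)
    flat = np.argsort(off, axis=None)[::-1][:k]
    return [(class_names[i], class_names[j], off[i, j])
            for i, j in zip(*np.unravel_index(flat, off.shape))]

names = ["chocolate_mousse", "ice_cream", "apple_pie", "carrot"]
y_true = np.array([0, 0, 0, 1, 2, 2, 3, 3])
y_pred = np.array([1, 1, 0, 1, 3, 3, 3, 3])
cm = confusion_matrix(y_true, y_pred, len(names))
print(top_confusions(cm, names, k=2))
```

Sorting the off-diagonal cells this way surfaces pairs like chocolate-mousse predicted as ice-cream, which is exactly the kind of confusion the report flags.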

Methodology

  • Extract color histograms, edge-direction histograms and GLCM features for MLP baselines
  • Normalise handcrafted features with mean-variance, min-max and max-abs scaling
  • Compare PVMLNet intermediate activations (layers −1…−7) and flatten layer −5
  • Train MLPs on handcrafted and CNN-derived features in runs ≤100 epochs
  • Replace PVMLNet’s final layer with the trained MLP head for transfer-learning experiments
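The three normalisation schemes from the second step can be sketched as below. Function names are illustrative; the key point, as in any such pipeline, is that the statistics come from the training set only and are reused on the test set:

```python
import numpy as np

def meanvar_normalise(Xtrain, Xtest):
    # Zero mean, unit variance per feature (statistics from the training set).
    mu = Xtrain.mean(axis=0)
    sigma = Xtrain.std(axis=0) + 1e-8     # avoid division by zero
    return (Xtrain - mu) / sigma, (Xtest - mu) / sigma

def minmax_normalise(Xtrain, Xtest):
    # Rescale each feature to [0, 1] using the training range.
    lo, hi = Xtrain.min(axis=0), Xtrain.max(axis=0)
    span = (hi - lo) + 1e-8
    return (Xtrain - lo) / span, (Xtest - lo) / span

def maxabs_normalise(Xtrain, Xtest):
    # Scale to [-1, 1] by the largest absolute training value per feature.
    m = np.abs(Xtrain).max(axis=0) + 1e-8
    return Xtrain / m, Xtest / m

rng = np.random.default_rng(1)
Xtr = rng.normal(2.0, 5.0, (100, 4))
Xte = rng.normal(2.0, 5.0, (20, 4))
A, B = meanvar_normalise(Xtr, Xte)
print(A.mean(axis=0).round(6))
```

Note that the test split may fall slightly outside [0, 1] under min-max scaling, since its extremes can exceed the training range.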