Deep Learning for Object Recognition

Python · PyTorch · Deep Learning · Computer Vision · Segmentation · Classification · Data Augmentation


Overview

For the IAPR course, our group developed a two-stage deep learning pipeline for instance segmentation and classification of 13 chocolate types in images. The training data was only weakly annotated, and the project had to respect two strict constraints: a limited parameter count (~9.7M parameters in total across both models) and no pre-trained models.
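To make the parameter constraint concrete, here is a minimal sketch of the arithmetic behind a parameter count. The helper names and the example layer are illustrative assumptions, not taken from the project's models; only the rounded per-model totals come from this write-up.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Learnable parameters in one 2D convolution with a square kernel."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


def total_params(counts):
    """Sum per-model parameter counts given as a {name: count} dict."""
    return sum(counts.values())


# Illustrative layer (hypothetical, not from the project):
# a 3x3 convolution mapping 64 channels to 128 channels.
example_layer = conv2d_params(64, 128, 3)

# Rounded per-model counts quoted in this write-up.
pipeline = {"attention_unet": 8_000_000, "feature_extractor": 1_700_000}
pipeline_total = total_params(pipeline)

print(example_layer)   # 64 * 128 * 9 weights + 128 biases
print(pipeline_total)  # combined size of both stages
```

For a real PyTorch model, the same total is usually obtained with `sum(p.numel() for p in model.parameters())`.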


My Role

  • Contributed to the architectural design of the two-stage pipeline (Attention U-Net for segmentation, custom CNN for classification).
  • Helped train the Attention U-Net (8M parameters) for precise binary segmentation of chocolates against diverse backgrounds, using a combined BCE and Dice loss (BCEDiceLoss).
  • Implemented the watershed algorithm for instance separation of touching chocolates.
  • Contributed to developing and training the custom 'FeatureExtractor' CNN (1.7M parameters) for classifying the segmented chocolate instances, using Label Smoothing Loss.
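The instance-separation step can be sketched as a distance-transform watershed on the binary mask. This is a common recipe built on SciPy and scikit-image, not necessarily the project's implementation; the function names, the `min_distance` value, and the synthetic two-disc mask are assumptions for illustration.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed


def split_touching_objects(mask, min_distance=10):
    """Label each object in a binary mask, splitting blobs that touch."""
    mask = np.asarray(mask, dtype=bool)
    # Distance from every foreground pixel to the nearest background pixel;
    # it peaks near the centre of each roughly convex object.
    distance = ndi.distance_transform_edt(mask)
    # Use local maxima of the distance map as seeds, ideally one per object.
    peaks = peak_local_max(distance, min_distance=min_distance)
    markers = np.zeros(mask.shape, dtype=np.int32)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    # Flood the inverted distance map from the seeds, restricted to the mask.
    return watershed(-distance, markers, mask=mask)


def make_two_discs(shape=(80, 120), radius=20):
    """Synthetic mask (assumption for the demo): two slightly overlapping discs."""
    yy, xx = np.ogrid[:shape[0], :shape[1]]
    left = (yy - 40) ** 2 + (xx - 40) ** 2 <= radius ** 2
    right = (yy - 40) ** 2 + (xx - 76) ** 2 <= radius ** 2
    return left | right


mask = make_two_discs()
labels = split_touching_objects(mask)
print("instances found:", labels.max())
```

Each positive label in `labels` marks one instance, so cropping the image to a label's bounding box yields the region passed on to the classifier. `min_distance` typically needs tuning to the object size: too small a value over-splits a single object, too large a value merges neighbours.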

Challenges

  • Operating under strict model parameter limits and without pre-trained networks.
  • Achieving robust segmentation and classification with weakly annotated data and class imbalance.
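On the segmentation side, one common way to get a robust training signal is to combine binary cross-entropy with a soft Dice term, which is what a loss named BCEDiceLoss usually denotes. The sketch below is one such formulation in PyTorch; the equal weighting, the smoothing constant, and the choice to take raw logits as input are assumptions, and the project's own loss may differ.

```python
import torch
import torch.nn as nn


class BCEDiceLoss(nn.Module):
    """Weighted sum of BCE (on logits) and soft Dice loss for binary masks."""

    def __init__(self, bce_weight=0.5, smooth=1.0):
        super().__init__()
        self.bce = nn.BCEWithLogitsLoss()
        self.bce_weight = bce_weight  # assumed weighting, not from the project
        self.smooth = smooth          # avoids division by zero on empty masks

    def forward(self, logits, targets):
        bce = self.bce(logits, targets)
        probs = torch.sigmoid(logits)
        # Sum over every dimension except the batch dimension.
        dims = tuple(range(1, logits.dim()))
        intersection = (probs * targets).sum(dim=dims)
        denom = probs.sum(dim=dims) + targets.sum(dim=dims)
        dice = (2.0 * intersection + self.smooth) / (denom + self.smooth)
        dice_loss = 1.0 - dice.mean()
        return self.bce_weight * bce + (1.0 - self.bce_weight) * dice_loss


# Tiny demo: a confident correct prediction versus a confident wrong one.
targets = torch.zeros(2, 1, 4, 4)
targets[:, :, 1:3, 1:3] = 1.0
good_logits = (targets * 2.0 - 1.0) * 10.0  # +10 on foreground, -10 elsewhere
criterion = BCEDiceLoss()
good_loss = criterion(good_logits, targets)
bad_loss = criterion(-good_logits, targets)
```

The BCE term gives a per-pixel gradient, while the Dice term scores the overlap of the whole mask and is less dominated by the many easy background pixels.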

Outcomes

  • Successfully developed a pipeline that accurately identified and counted all 13 chocolate types.
  • Achieved a validation Dice score of ~0.98 for segmentation and a classification F1-score of ~97% on the extracted regions.
  • The solution effectively handled various backgrounds and some object occlusions.
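For reference, the two metrics quoted above can be computed as in the sketch below. It assumes binary masks flattened to 0/1 sequences and macro averaging for the F1-score; the write-up does not state which averaging the project used, so treat that choice as an assumption.

```python
def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # eps keeps the score defined (and equal to 1) when both masks are empty.
    return (2.0 * intersection + eps) / (total + eps)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging, an assumption)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy inputs (made up for illustration, not project data).
print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
print(macro_f1(["a", "a", "b"], ["a", "b", "b"]))
```

Macro averaging weights every class equally, which makes rare classes count as much as common ones when classes are imbalanced.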

Figures

Fig. 1: Architecture diagram of our two-stage pipeline with Attention U-Net for segmentation and CNN for classification
Fig. 2: Segmentation results showing the binary masks generated by our Attention U-Net model
Fig. 3: Training curves for the FeatureExtractor CNN showing loss and accuracy over epochs
Fig. 4: Example of chocolate classification results with detected instances and their predicted classes