Traffic Sign Detection and Classification on Jetson Nano using TensorFlow

Jan 31, 2022

https://github.com/dekaio/TrafficSignDetClass/tree/dev

Dataset

The image dataset, the German Traffic Sign Recognition Benchmark (GTSRB), consists of 43 classes. The training set has 34,799 images, the test set has 12,630 images, and the validation set has 4,410 images.

The dataset does have some issues, such as low image contrast and class bias, since not all classes contain enough images.
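One way to put a number on the class bias is to count the labels per class and derive inverse-frequency weights. The snippet below is only a sketch: the label list is a made-up toy example (not the real GTSRB labels) and the helper name `class_weights` is hypothetical.

```python
from collections import Counter


def class_weights(labels, num_classes):
    """Return inverse-frequency weights so that rare classes count more.

    weight[c] = total / (num_classes * count[c]); a class with no samples
    gets weight 0.0 because there is nothing to reweight.
    """
    counts = Counter(labels)
    total = len(labels)
    weights = {}
    for c in range(num_classes):
        n = counts.get(c, 0)
        weights[c] = total / (num_classes * n) if n else 0.0
    return weights


# Toy labels (illustrative only): class 0 is heavily over-represented.
toy_labels = [0] * 6 + [1] * 3 + [2] * 1
weights = class_weights(toy_labels, num_classes=3)
for cls, w in sorted(weights.items()):
    print(f"class {cls}: weight {w:.2f}")
```

A dictionary of this shape could, for example, be passed as the `class_weight` argument of Keras `model.fit` so that under-represented classes contribute more to the loss.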

Model Architecture

Two models are used: one for traffic sign detection (YOLO) and another for traffic sign classification (TrafficSignNet).

YOLO Traffic Sign Detection

YOLO, short for You Only Look Once, is a convolutional neural network architecture designed for object detection. More details are available on its documentation page.

YOLOv3-Tiny was used because of its smaller model size.

In this project, the original YOLO architecture is modified and trained to fit the dataset. The GTSRB training dataset in YOLO format obtained from Kaggle had 4 output classes: prohibitory, mandatory, warning and others. Detected signs are therefore assigned to one of these four classes. Since relabeling the images would take considerable time, the 4-class labels were kept.
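For readers unfamiliar with the YOLO label format: each line of a label file holds a class index followed by the box center and size, all normalized to the image dimensions. The sketch below converts one such line to pixel coordinates. The class order in `CLASS_NAMES`, the sample line, and the image size are assumptions for illustration; the real order is defined by the dataset's own names file.

```python
# Class order is an assumption for illustration only.
CLASS_NAMES = ["prohibitory", "mandatory", "warning", "others"]


def parse_yolo_label(line, img_w, img_h):
    """Convert 'class xc yc w h' (normalized) to (name, x_min, y_min, x_max, y_max) in pixels."""
    class_id, xc, yc, w, h = line.split()
    # Scale the normalized center and size back to pixel units.
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    x_min, y_min = round(xc - w / 2), round(yc - h / 2)
    x_max, y_max = round(xc + w / 2), round(yc + h / 2)
    return CLASS_NAMES[int(class_id)], x_min, y_min, x_max, y_max


# Hypothetical label line and image size.
box = parse_yolo_label("2 0.5 0.25 0.1 0.2", img_w=1360, img_h=800)
print(box)
```

The cropped region given by such a box is what would then be handed to the classification model.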

Traffic Sign Classification

The model used for classification has the following architecture.

The model has 5 convolutional layers and 3 dense (fully connected) layers, in which every node connects to all the nodes of the previous layer.

In addition, batch normalization (which standardizes the inputs to each layer) and max pooling (which selects the maximum element from the region of the feature map covered by the filter) were performed on each layer.

The output layer gives a prediction over the 43 classes of the GTSRB dataset.
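To make the description concrete, here is a minimal Keras sketch of a network with that overall shape: 5 convolutional layers, each followed by batch normalization and max pooling, then 3 dense layers ending in a 43-way softmax. It is not the exact model from the project; the input size, filter counts, kernel size, dense widths, and the name `build_classifier` are all assumptions.

```python
import tensorflow as tf

layers = tf.keras.layers


def build_classifier(input_shape=(32, 32, 3), num_classes=43):
    """5 conv blocks (Conv -> BatchNorm -> MaxPool) followed by 3 dense layers.

    Input size, filter counts, kernel size, and dense widths are assumptions.
    """
    model = tf.keras.Sequential([layers.Input(shape=input_shape)])
    for filters in (16, 32, 64, 128, 256):
        model.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
        model.add(layers.BatchNormalization())
        model.add(layers.MaxPooling2D(pool_size=2))  # halves height and width
    model.add(layers.Flatten())
    model.add(layers.Dense(128, activation="relu"))
    model.add(layers.Dense(64, activation="relu"))
    # Output layer: one probability per traffic sign class.
    model.add(layers.Dense(num_classes, activation="softmax"))
    return model


model = build_classifier()
model.summary()
```

With a 32x32 input, the five pooling steps shrink the feature map to 1x1 before the dense layers, which keeps the parameter count small.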

Training and Evaluation

When a histogram equalization pre-processing step (CLAHE) was added to improve the dataset, the classification model accuracy during training reached 99%. Predictions also became significantly more accurate and more stable across consecutive frames.
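The idea behind histogram equalization can be sketched in a few lines of plain Python. Note that this is global equalization, a deliberate simplification: CLAHE applies the same kind of remapping per image tile and clips the histogram first to limit noise amplification. The function name and the sample pixel values are made up for illustration.

```python
def equalize_histogram(pixels, levels=256):
    """Global histogram equalization for a flat list of integer gray values.

    CLAHE applies this remapping per tile and clips the histogram first;
    this simplified sketch does neither.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:  # constant image: nothing to stretch
        return list(pixels)

    # Stretch the occupied part of the CDF over the full gray range.
    scale = (levels - 1) / (total - cdf_min)
    return [round((cdf[p] - cdf_min) * scale) for p in pixels]


# Low-contrast toy "image": all values squeezed into 100..103.
low_contrast = [100, 100, 101, 102, 102, 103, 103, 103]
print(equalize_histogram(low_contrast))
```

After equalization the values span the whole 0 to 255 range instead of four neighboring gray levels, which is the contrast boost the pre-processing step relies on.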

Hardware implementation

The NVIDIA Jetson Nano was chosen as the implementation platform because of its GPU and optimization capabilities.

The 2 GB RAM variant was chosen to explore the limits of the implementation. The combined detection and classification models were larger than the hardware could handle at run time, so deep compression was applied to reduce latency and increase the frames per second.

Versions used

JetPack 4.461 and TensorFlow 2.7.0

Deep Compression

The entire process of pruning, quantization, and encoding is referred to as deep compression.

The following steps were applied:

Fine-tune the pre-trained model with pruning: the whole classification model was pruned, starting at 50% sparsity and ending at 80% sparsity.
Compression with TFLite.
Compression with gzip.
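Why pruning and gzip work well together can be shown without any ML framework. The sketch below ramps a sparsity target from 50% to 80%, zeroes the smallest-magnitude weights, and compares gzip sizes. The ramp follows the polynomial-decay formula used by `tfmot.sparsity.keras.PolynomialDecay` in the TensorFlow Model Optimization toolkit; whether the project used that particular schedule is an assumption, as are the step counts and the random weights. The TFLite conversion step is not shown.

```python
import gzip
import random
import struct


def polynomial_sparsity(step, begin_step, end_step, initial=0.5, final=0.8, power=3):
    """Target sparsity at a training step, ramping from `initial` to `final`."""
    if step <= begin_step:
        return initial
    if step >= end_step:
        return final
    progress = (step - begin_step) / (end_step - begin_step)
    return final + (initial - final) * (1.0 - progress) ** power


def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    num_to_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:num_to_prune]:
        pruned[i] = 0.0
    return pruned


def gzipped_size(weights):
    """Size in bytes of the weights packed as float32 and gzip-compressed."""
    raw = struct.pack(f"{len(weights)}f", *weights)
    return len(gzip.compress(raw))


random.seed(0)
dense = [random.gauss(0.0, 1.0) for _ in range(10_000)]  # stand-in weights
final_sparsity = polynomial_sparsity(step=1000, begin_step=0, end_step=1000)
sparse = magnitude_prune(dense, final_sparsity)

print("sparsity reached:", sparse.count(0.0) / len(sparse))
print("gzip bytes, dense :", gzipped_size(dense))
print("gzip bytes, pruned:", gzipped_size(sparse))
```

Pruning by itself does not shrink the stored tensor, since zeros take as much space as any other float. The size reduction only appears once a compressor such as gzip can exploit the long runs of zeros.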

Performance

Frames per second (FPS) was used to evaluate how fast the combined classification and detection model ran on different hardware.
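A simple way to obtain such a number is to time the whole per-frame pipeline over a batch of frames and divide. The sketch below assumes this averaging approach; `measure_fps` and `fake_pipeline` are hypothetical names, and the sleep merely stands in for the detection and classification work.

```python
import time


def measure_fps(process_frame, frames):
    """Average frames per second of `process_frame` over an iterable of frames."""
    count = 0
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


def fake_pipeline(frame):
    """Stand-in for detection + classification; sleeps to simulate work."""
    time.sleep(0.01)


fps = measure_fps(fake_pipeline, range(20))
print(f"{fps:.1f} FPS")
```

Averaging over many frames smooths out per-frame jitter, which matters on a small device where background processes can stall individual frames.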

An FPS of 1.5 was obtained on the 2 GB Jetson Nano (without pruning).

An FPS of 1.7 was obtained on the 2 GB Jetson Nano (with pruning).

An FPS of 9.76 was obtained on an AMD Ryzen 7.

A non-pruned model run, with recorded video detection and classification, took about 1500 MB with the GUI running on the Jetson Nano.

A pruned model run, with recorded video detection and classification, took about 1340 MB with the GUI running on the Jetson Nano.

References

https://www.pyimagesearch.com/2019/11/04/traffic-sign-classification-with-keras-and-deep-learning/

Pruning comprehensive guide | TensorFlow Model Optimization

TensorFlow Lite for mobile and embedded devices

https://alzaibkarovalia.medium.com/traffic-signs-detection-using-tensorflow-and-yolov3-yolov4-49