Vanilla UNet vs UNet with EfficientNetB0 as Encoder
In this blog, I will compare the performance of a vanilla (basic/simple) UNet architecture with a UNet that uses EfficientNetB0 as its encoder.
Recently, I participated in a Kaggle competition in which a large number of submissions used EfficientNet architectures. I was surprised to see that almost every submission I viewed implemented an EfficientNet model. Later, I came across EfficientDet, an object detection architecture with EfficientNet as its backbone! So I thought: why not use EfficientNet as the encoder in a UNet and try it out?
To learn about EfficientNet, you can refer to this link, and for EfficientDet, this link.
To take a look at the UNet with EfficientNetB0 as encoder architecture, click here.
Vanilla UNet vs UNet with EfficientNetB0 as encoder
So, coming to the experiment I performed: I trained two models, one using the simple UNet architecture and the other using EfficientNetB0 as the encoder in a UNet. The dataset I used was Person Segmentation, from Kaggle.
Both models were trained for 20 epochs on Google Colab. Why 20 epochs? Because I was performing transfer learning on the EfficientNetB0-UNet, it reached quite good values on the chosen performance metrics by 20 epochs. Since this blog compares the two architectures, both had to be trained for the same number of epochs. The results are as follows.
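For readers who want to try this themselves, here is a minimal sketch of how an EfficientNetB0 encoder can be wired into a UNet with tf.keras. The skip-connection layer names, decoder filter sizes, and 224×224 input are my assumptions, not details from my training code; pass weights="imagenet" to get the pretrained encoder used for transfer learning (weights=None here just keeps the sketch self-contained, with no download):

```python
import tensorflow as tf
from tensorflow.keras import layers

def decoder_block(x, skip, filters):
    # Upsample, merge with the encoder's skip connection, then refine.
    x = layers.Conv2DTranspose(filters, 2, strides=2, padding="same")(x)
    x = layers.Concatenate()([x, skip])
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def unet_efficientnetb0(input_shape=(224, 224, 3), weights=None):
    encoder = tf.keras.applications.EfficientNetB0(
        include_top=False, weights=weights, input_shape=input_shape)
    # Encoder feature maps used as skips (deepest first), at
    # 1/16, 1/8, 1/4, and 1/2 of the input resolution.
    skip_names = [
        "block6a_expand_activation",  # 1/16
        "block4a_expand_activation",  # 1/8
        "block3a_expand_activation",  # 1/4
        "block2a_expand_activation",  # 1/2
    ]
    x = encoder.get_layer("top_activation").output  # 1/32 bottleneck
    for name, filters in zip(skip_names, [256, 128, 64, 32]):
        x = decoder_block(x, encoder.get_layer(name).output, filters)
    # Final upsample back to full resolution, then a 1x1 sigmoid head
    # for the binary person mask.
    x = layers.Conv2DTranspose(16, 2, strides=2, padding="same")(x)
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)
    return tf.keras.Model(encoder.input, outputs)
```

The vanilla UNet is the same idea, except the encoder is a plain stack of Conv-Conv-MaxPool blocks trained from scratch instead of a pretrained backbone.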
The Loss Values (Training Loss and Validation Loss)
In the left graph (Vanilla UNet), we can see a smooth curve for training loss but an unstable curve for validation loss. In the right graph (UNet_EfficientNetB0), both the training-loss and validation-loss curves are almost smooth. The vanilla UNet shows lower loss during training but a higher loss during validation, whereas the UNet_EfficientNetB0 model has almost the same loss for both training and validation.
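The post does not state which loss these graphs plot; a common choice for binary person segmentation is binary cross-entropy, so here is a minimal NumPy sketch of it (my assumption, not necessarily the exact loss used in my training runs):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean pixel-wise binary cross-entropy between a ground-truth
    mask (0/1) and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(y_true * np.log(y_pred)
                           + (1.0 - y_true) * np.log(1.0 - y_pred))))

y_true = np.array([1.0, 0.0, 1.0, 0.0])
good = binary_cross_entropy(y_true, np.array([0.9, 0.1, 0.8, 0.2]))
bad = binary_cross_entropy(y_true, np.array([0.4, 0.6, 0.5, 0.5]))
print(good < bad)  # prints True: confident correct predictions give lower loss
```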
Recall values
Precision Values
In the recall and precision graphs, we can see that Vanilla UNet performs better during training, but on the validation set, UNet_EfficientNetB0 performs better.
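If you want to reproduce these metrics yourself, pixel-wise precision and recall for binary masks are simple to compute. A minimal NumPy sketch (not the exact Keras metric implementations used during training):

```python
import numpy as np

def precision_recall(pred_mask, true_mask):
    """Pixel-wise precision and recall for binary segmentation masks."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    tp = np.logical_and(pred, true).sum()   # person pixels predicted correctly
    fp = np.logical_and(pred, ~true).sum()  # background predicted as person
    fn = np.logical_and(~pred, true).sum()  # person pixels missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy 2x2 example: one true positive, one false positive, one false negative.
pred = np.array([[1, 1], [0, 0]])
true = np.array([[1, 0], [1, 0]])
print(precision_recall(pred, true))  # (0.5, 0.5)
```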
Inference
After training both models on Google Colab, I tested them on my local machine. The results are as follows:
Original Images:-
Output:-
From the above inference, we can see that the UNet with EfficientNetB0 as encoder gives better output than the Vanilla UNet. I think experimenting with the encoder part of the UNet, say using EfficientNetB1 or B2, may result in an even better model.
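For anyone reproducing the inference step: both models output a per-pixel probability map, and the binary masks shown above come from thresholding it. A small sketch (the 0.5 threshold is my assumption, a common default for sigmoid outputs):

```python
import numpy as np

def probs_to_mask(probs, threshold=0.5):
    """Turn a sigmoid probability map into a binary person mask."""
    return (probs >= threshold).astype(np.uint8)

probs = np.array([[0.92, 0.10],
                  [0.61, 0.47]])
print(probs_to_mask(probs))
# [[1 0]
#  [1 0]]
```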
In my next blog, I will compare the performance of EfficientNetB0 through B7 as encoders in UNet.
Connect with me on LinkedIn :-D. Thanks for reading my blog!