Output of a UNet with EfficientNetB0 as the encoder (original image — link)

Vanilla UNet vs UNet with EfficientNetB0 as Encoder

Sahil Chachra

--

In this blog, I will compare the performance of the vanilla (basic/simple) UNet architecture with that of a UNet using EfficientNetB0 as its encoder.

Recently, I participated in a Kaggle competition in which a large number of submissions used EfficientNet architectures. I was surprised to see that almost every submission I viewed implemented an EfficientNet model. Later, I came across EfficientDet, an object detection architecture that uses EfficientNet as its backbone. So I thought, why not use EfficientNet as the encoder in UNet and try it out?

To learn about EfficientNet, you can refer to this link; for EfficientDet, this link.

To take a look at the UNet with EfficientNetB0 as encoder architecture, click here.

Vanilla UNet vs UNet with EfficientNetB0 as encoder

Coming to the experiment I performed: I trained two models, one using the simple UNet architecture and the other using EfficientNetB0 as the encoder in the UNet architecture. The dataset I used was the Person Segmentation dataset from Kaggle.

Both models were trained for 20 epochs on Google Colab. Why 20 epochs? Because I was performing transfer learning on EfficientNetB0-UNet, it reached quite good values on the performance metrics by epoch 20. Since this blog compares the two architectures, both had to be trained for the same number of epochs. The results are as follows.

The Loss Values (Training Loss and Validation Loss)

Left — Vanilla UNet Loss and Validation loss & Right — UNetEfficientNetB0 Loss and Validation loss

In the left graph (Vanilla UNet), training loss is a smooth curve whereas validation loss is unstable. In the right graph (UNet_EfficientNetB0), both the training and validation loss curves are (almost) smooth. The vanilla UNet shows lower loss during training but higher loss during validation, whereas the UNet_EfficientNetB0 model has nearly the same loss for both, suggesting the vanilla model is overfitting.

Recall values

Left — Comparison of Recall values during training & Right — Comparison of Recall values during validation.

Precision Values

Left — Comparison of Precision values during training & Right — Comparison of Precision values during validation.

For the recall and precision graphs, we can see that the vanilla UNet performs better during training, but on the validation set UNet_EfficientNetB0 performs better.
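For binary segmentation, recall and precision are computed pixel-wise from the predicted and ground-truth masks. A minimal pure-Python sketch (the tiny masks below are illustrative, not from the dataset):

```python
def precision_recall(pred, target):
    """Pixel-wise precision and recall for flat binary (0/1) masks."""
    tp = sum(p == 1 and t == 1 for p, t in zip(pred, target))  # true positives
    fp = sum(p == 1 and t == 0 for p, t in zip(pred, target))  # false positives
    fn = sum(p == 0 and t == 1 for p, t in zip(pred, target))  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

pred   = [1, 1, 0, 1, 0, 0]
target = [1, 0, 0, 1, 1, 0]
p, r = precision_recall(pred, target)
# tp=2, fp=1, fn=1, so precision = recall = 2/3 here
```

High training precision/recall with lower validation values, as seen for the vanilla UNet, is the classic signature of overfitting.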

Inference

After training both models on Google Colab, I tested them on my local machine. The results are as follows:

Original Images:-

Left Image Source — link & Right image source — link

Output:-

Left — UNet with EfficientNetB0 as encoder output & Right — Vanilla UNet output

From these inference results, we can see that UNet with EfficientNetB0 as the encoder gives better output than the vanilla UNet. I think playing around with the encoder part of UNet, say using EfficientNetB1 or B2, may result in an even better model.

In my next blog, I will compare the performance of EfficientNetB0 through B7 as encoders in UNet.

Connect with me on LinkedIn :-D. Thanks for reading my blog!


Written by Sahil Chachra

AI Engineer @ SparkCognition| Applied Deep Learning & Computer Vision | Nvidia Jetson AI Specialist
