Comparative study — Using EfficientNetB0 to EfficientNetB7 as the encoder in UNet (a comparison of 8 architectures)

Sahil Chachra
6 min read · Jul 20, 2021


In this blog, I will compare the performance of EfficientNetB0 through EfficientNetB7 (all 8 architectures) as the encoder in a UNet architecture. I will discuss the data used, the performance metrics, and the final results.

Almost a month ago, I published a blog comparing the performance of the vanilla (basic) UNet architecture with UNet using EfficientNetB0 as the encoder. The link to that blog is here.

Dataset

The dataset I used for this experiment was the Carla Semantic Segmentation Dataset (1,000 images). It was built using a simulator, meaning it is a synthetic dataset, and it belongs to the family of autonomous-vehicle datasets. 800 images were used for training and 200 for validation.
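As a small illustration, a train/validation split like this could be set up as below. The directory layout and file extensions are assumptions, not from the original experiment; only the 800/200 split is stated in the post.

```python
import glob

# Hypothetical directory layout for the Carla Semantic Segmentation Dataset;
# the post only specifies the 800/200 train/validation split.
image_paths = sorted(glob.glob("carla/images/*.png"))
mask_paths = sorted(glob.glob("carla/masks/*.png"))

# First 800 pairs for training, the remaining 200 for validation.
train_images, val_images = image_paths[:800], image_paths[800:]
train_masks, val_masks = mask_paths[:800], mask_paths[800:]
```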

Training

Each of these models was set to train for 100 epochs with EarlyStopping enabled, so each model's training stopped somewhere between 30 and 50 epochs. The quantity monitored during training was the validation loss.

The batch size and learning rate were chosen by running several experiments on one architecture and were then kept constant for the whole study (that is, the batch size and learning rate were the same for every model trained). The best batch size and learning rate could well differ per architecture, but to maintain uniformity across the experiment, each model was trained with the same hyperparameters.

You may now wonder why the number of epochs was not kept the same. If the number of epochs were fixed, we would not learn which model converges faster, or whether fewer parameters mean faster convergence (fewer epochs needed for training). So I allowed each model to take its own time to converge, which gives better insight into the training.
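A minimal sketch of this training setup in Keras is shown below. The optimizer, learning rate, loss, EarlyStopping patience, and 13-class count are illustrative assumptions (the post only states the 100-epoch cap, the validation-loss monitor, and that the same batch size and learning rate were reused for every model); compile_and_train, train_ds, and val_ds are hypothetical names.

```python
import tensorflow as tf

# MeanIoU expects class indices, so take the argmax of the softmax output first.
class SparseMeanIoU(tf.keras.metrics.MeanIoU):
    def update_state(self, y_true, y_pred, sample_weight=None):
        return super().update_state(y_true, tf.argmax(y_pred, axis=-1), sample_weight)

def compile_and_train(model, train_ds, val_ds, num_classes=13):
    """Compile and fit one UNet variant with the shared hyperparameters."""
    # Stop on validation loss; each model therefore trains for a different
    # number of epochs (30-50 in the experiment) despite the 100-epoch cap.
    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=10, restore_best_weights=True  # assumed patience
    )
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),  # assumed lr, fixed for all models
        loss="sparse_categorical_crossentropy",                  # assumed loss
        metrics=["accuracy", SparseMeanIoU(num_classes=num_classes)],
    )
    return model.fit(
        train_ds,                 # batch size is set when the dataset is built, same for all models
        validation_data=val_ds,
        epochs=100,               # upper bound; EarlyStopping usually ends training earlier
        callbacks=[early_stop],
    )
```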

EfficientNet architectures as Encoder in UNet

Figure 1: Number of parameters vs. ImageNet top-1 accuracy for different architectures (image source — link)

In the image above, we can see a comparison of different architectures with respect to the number of parameters and top-1 accuracy on the ImageNet dataset. EfficientNetB0 has far fewer parameters yet achieves better accuracy than ResNet-50, which has a significantly larger parameter count. This shows how promising the EfficientNet family is: fewer parameters mean faster training and a lighter model.

All of the numbers above apply when these models are used directly for a standard classification problem. What about the number of parameters when we use them as encoders in UNet?

Keeping the encoder frozen (encoder.trainable = False in Keras), we get the following parameter counts.

Figure 2: Parameters in each model
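For reference, here is a minimal sketch of what freezing an EfficientNet encoder inside a UNet-style model can look like in Keras. The skip-connection layer names, decoder filter sizes, input resolution, and the 13-class output are illustrative assumptions (the post does not show its model-building code), and build_unet_efficientnetb0 is a hypothetical helper name.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import EfficientNetB0

def build_unet_efficientnetb0(input_shape=(256, 256, 3), num_classes=13):
    inputs = layers.Input(shape=input_shape)

    # Encoder: ImageNet-pretrained EfficientNetB0 without the classification head.
    encoder = EfficientNetB0(include_top=False, weights="imagenet", input_tensor=inputs)
    encoder.trainable = False  # freeze the encoder, as in the experiment

    # Feature maps reused as skip connections (layer names from keras.applications
    # EfficientNetB0; the exact skips used in the original experiment are not stated).
    skip_names = [
        "block2a_expand_activation",   # shallow, high-resolution features
        "block3a_expand_activation",
        "block4a_expand_activation",
        "block6a_expand_activation",   # deep, low-resolution features
    ]
    skips = [encoder.get_layer(name).output for name in skip_names]
    x = encoder.get_layer("top_activation").output  # bottleneck

    # Decoder: upsample and concatenate with progressively shallower skips.
    for skip, filters in zip(reversed(skips), [256, 128, 64, 32]):
        x = layers.Conv2DTranspose(filters, 3, strides=2, padding="same")(x)
        x = layers.Concatenate()([x, skip])
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

    # Final upsample back to the input resolution and per-pixel class scores.
    x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
    outputs = layers.Conv2D(num_classes, 1, activation="softmax")(x)
    return Model(inputs, outputs, name="UNet_EfficientNetB0")
```

Swapping in B1 to B7 follows the same pattern, since the same block-naming scheme is available across the keras.applications EfficientNet variants; only the channel counts and depths change.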

Coming to the training details of each architecture, the following table shows the metrics and the epoch at which the lowest validation loss was achieved.

Figure 3: Performance of each model

The image above shows, for each model, the epoch at which the lowest validation loss was achieved, the total number of epochs the model was trained for, and other relevant metrics.

What do we learn from the tables above (Figure 2 and Figure 3)?

  1. From Figure 3, we see that from UNet_EfficientNetB0 to UNet_EfficientNetB6 the validation MeanIOU remains the same (except for UNet_EfficientNetB1 and UNet_EfficientNetB2), the difference in validation loss is just 0.01 or 0.02, and the validation accuracy is also nearly identical (a difference of 0.01). The average time per epoch for these architectures was 1 to 1.5 minutes.
  2. Referring to Figure 2, from UNet_EfficientNetB0 to UNet_EfficientNetB6 (except UNet_EfficientNetB2), the number of trainable parameters is almost the same. UNet_EfficientNetB2 has the highest number of trainable parameters (even more than UNet_EfficientNetB7, the deepest of the architectures compared here), and its overall performance is the worst of all the models.
  3. Coming to UNet_EfficientNetB7, we see a significant increase in validation MeanIOU and the lowest validation loss of all. Even I was surprised to see this. If you refer to Figure 2, UNet_EfficientNetB7 has the highest number of parameters, approximately 2 times that of UNet_EfficientNetB0. UNet_EfficientNetB7 took about 2.5–3 minutes per epoch.
  4. So if you want a light model that trains very fast and still performs well, you can confidently go with UNet_EfficientNetB0. From the tables above, you can see that the performance metrics are essentially the same from UNet_EfficientNetB0 to UNet_EfficientNetB6, so why waste resources?
  5. If training time and model size are not a concern, then definitely go with UNet_EfficientNetB7. There is a significant jump in the performance metrics as well as in the quality of the output.

Performance Graphs and Output

Note: The images are kept large to avoid blur.

1. UNet_EfficientNetB0

UNet_EfficientNetB0 — Epochs vs Validation Loss
UNet_EfficientNetB0 — Epochs vs Validation MeanIOU
UNet_EfficientNetB0 Output

2. UNet_EfficientNetB1

UNet_EfficientNetB1 — Epochs vs Validation Loss
UNet_EfficientNetB1 — Epochs vs Validation MeanIOU
UNet_EfficientNetB1 Output

3. UNet_EfficientNetB2

UNet_EfficientNetB2 — Epochs vs Validation Loss
UNet_EfficientNetB2 — Epochs vs Validation MeanIOU
UNet_EfficientNetB2 Output

4. UNet_EfficientNetB3

UNet_EfficientNetB3 — Epochs vs Validation Loss
UNet_EfficientNetB3 — Epochs vs Validation MeanIOU
UNet_EfficientNetB3 Output

5. UNet_EfficientNetB4

UNet_EfficientNetB4 — Epochs vs Validation Loss
UNet_EfficientNetB4 — Epochs vs Validation MeanIOU
UNet_EfficientNetB4 Output

6. UNet_EfficientNetB5

UNet_EfficientNetB5 — Epochs vs Validation Loss
UNet_EfficientNetB5 — Epochs vs Validation MeanIOU
UNet_EfficientNetB5 Output

7. UNet_EfficientNetB6

UNet_EfficientNetB6 — Epochs vs Validation Loss
UNet_EfficientNetB6 — Epochs vs Validation MeanIOU
UNet_EfficientNetB6 Output

8. UNet_EfficientNetB7

UNet_EfficientNetB7 — Epochs vs Validation Loss
UNet_EfficientNetB7 — Epochs vs Validation MeanIOU
UNet_EfficientNetB7 Output

Conclusion

The choice of model is based not only on the performance metrics but also on the actual output of the model. In production, what matters most are the quality of the output and the inference time of the model.
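As a rough illustration of that last point, inference time can be estimated with a simple timing loop. This is a sketch of how one might measure it, not something from the original experiment; average_inference_time and the input resolution are assumptions.

```python
import time
import numpy as np

def average_inference_time(model, input_shape=(256, 256, 3), runs=50):
    """Return the average time (seconds) for one forward pass of a Keras model."""
    # Random input at an assumed resolution; warm up once so graph tracing and
    # memory allocation do not skew the measurement.
    dummy = np.random.rand(1, *input_shape).astype("float32")
    model.predict(dummy, verbose=0)

    start = time.perf_counter()
    for _ in range(runs):
        model.predict(dummy, verbose=0)
    return (time.perf_counter() - start) / runs  # average seconds per image
```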

Thank you :).
Connect with me on LinkedIn.
