Building an efficient convolution neural network from scratch: A case study on detecting and localizing slums

Bibliographic Details
Published in:Scientific African 2023-07, Vol.20, p.e01612, Article e01612
Main Authors: Moudden, Tarik El, Amnai, Mohamed
Format: Article
Language:English
Description
Summary:Designing a convolutional neural network from scratch is one of the biggest challenges facing the creation of reproducible models. Despite feeding the model an adequate amount of labeled data to mitigate fluctuations during training, the model still suffers from high variance in final overall accuracy and loss across identical training runs. Among the reasons behind this are the randomness in data shuffling and augmentation, the behavior of the gradient descent function, and the non-determinism of convnet layers and of floating-point computation on the GPU. The method used to address these issues, specifically in the case of negative transfer learning, is divided into three steps: first, designing an efficient lightweight convnet architecture with respect to the available resources; second, mitigating oscillations during training; third, after fixing the random seed across training runs, selecting an appropriate weight initialization. Our extensive experiments on the use case of binary slum localization and detection show that our method improves the reproducibility of our model trained from scratch, achieving an accuracy of 98.88±1.15% and a loss of 0.03±0.05 at a confidence level of 99.73%. These results make this model a strong competitor to pre-trained models using transfer learning.
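The abstract's third step, fixing the random seed across training runs so that data shuffling and augmentation become repeatable, can be sketched as follows. The paper's framework and implementation details are not stated in this record, so the helper names below are illustrative only; a real convnet setup would additionally seed the deep-learning framework (e.g. torch.manual_seed or tf.random.set_seed) and enable deterministic GPU kernels.

```python
import random
import numpy as np

def set_seed(seed: int) -> None:
    """Fix the random sources that drive data shuffling and augmentation.

    Illustrative sketch: in a full convnet pipeline, framework-specific
    seeding and deterministic-kernel flags would also be required to tame
    the GPU floating-point non-determinism the abstract mentions.
    """
    random.seed(seed)
    np.random.seed(seed)

def shuffled_indices(n: int, seed: int) -> list:
    # Simulate one "training run": the shuffle order the data loader sees.
    set_seed(seed)
    idx = np.arange(n)
    np.random.shuffle(idx)
    return idx.tolist()

# Two runs with the same seed see the data in the same order, removing
# one source of run-to-run variance in accuracy and loss.
run_a = shuffled_indices(10, seed=42)
run_b = shuffled_indices(10, seed=42)
print(run_a == run_b)
```

With the shuffle order pinned, the remaining variance between identical runs comes from weight initialization and hardware non-determinism, which the abstract's method addresses separately.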
ISSN:2468-2276
DOI:10.1016/j.sciaf.2023.e01612