Investigating the Impact of SOLID Design Principles on Machine Learning Code Understanding

[Context] Applying design principles has long been acknowledged as beneficial for understanding and maintainability in traditional software projects. These benefits may similarly hold for Machine Learning (ML) projects, which involve iterative experimentation with data, models, and algorithms. Howev...

Bibliographic Details
Published in:	arXiv.org 2024-02
Main Authors:	Cabral, Raphael; Kalinowski, Marcos; Baldassarre, Maria Teresa; Villamizar, Hugo; Escovedo, Tatiana; Lopes, Hélio
Format:	Article
Language:	English
Subjects:	Algorithms; Best practice; Data science; Design engineering; Iterative methods; Machine learning; Maintainability; Principles; Scientists; Software; Software engineering

container_title	arXiv.org
creator	Cabral, Raphael ; Kalinowski, Marcos ; Baldassarre, Maria Teresa ; Villamizar, Hugo ; Escovedo, Tatiana ; Lopes, Hélio
description	[Context] Applying design principles has long been acknowledged as beneficial for understanding and maintainability in traditional software projects. These benefits may similarly hold for Machine Learning (ML) projects, which involve iterative experimentation with data, models, and algorithms. However, ML components are often developed by data scientists with diverse educational backgrounds, potentially resulting in code that doesn't adhere to software design best practices. [Goal] In order to better understand this phenomenon, we investigated the impact of the SOLID design principles on ML code understanding. [Method] We conducted a controlled experiment with three independent trials involving 100 data scientists. We restructured real industrial ML code that did not use SOLID principles. Within each trial, one group was presented with the original ML code, while the other was presented with ML code incorporating SOLID principles. Participants of both groups were asked to analyze the code and fill out a questionnaire that included both open-ended and closed-ended questions on their understanding. [Results] The study results provide statistically significant evidence that the adoption of the SOLID design principles can improve code understanding within the realm of ML projects. [Conclusion] We put forward that software engineering design principles should be spread within the data science community and considered for enhancing the maintainability of ML code.
format	article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2924067241</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2924067241</sourcerecordid><originalsourceid>FETCH-proquest_journals_29240672413</originalsourceid><addsrcrecordid>eNqNjt0KgkAUhJcgSMp3ONC1oKv2c61FQlFQ3XQjix51xc7a7trzZ9ADdDUw8w0zE-bwMAy8TcT5jLnGtL7v89Wax3HosEdGbzRW1sJKqsE2CNmzF4UFVcH1fMxSSNHImuCiJRWy79CAIjiJopGEcESh6dtMVIlwpxK1sYLK0VqwaSU6g-5P52y5392Sg9dr9RrG0bxVg6YxyvmWR_54KQrC_6gPuehB2g</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2924067241</pqid></control><display><type>article</type><title>Investigating the Impact of SOLID Design Principles on Machine Learning Code Understanding</title><source>Publicly Available Content Database</source><creator>Cabral, Raphael ; Kalinowski, Marcos ; Baldassarre, Maria Teresa ; Villamizar, Hugo ; Escovedo, Tatiana ; Lopes, Hélio</creator><creatorcontrib>Cabral, Raphael ; Kalinowski, Marcos ; Baldassarre, Maria Teresa ; Villamizar, Hugo ; Escovedo, Tatiana ; Lopes, Hélio</creatorcontrib><description>[Context] Applying design principles has long been acknowledged as beneficial for understanding and maintainability in traditional software projects. These benefits may similarly hold for Machine Learning (ML) projects, which involve iterative experimentation with data, models, and algorithms. However, ML components are often developed by data scientists with diverse educational backgrounds, potentially resulting in code that doesn't adhere to software design best practices. [Goal] In order to better understand this phenomenon, we investigated the impact of the SOLID design principles on ML code understanding. [Method] We conducted a controlled experiment with three independent trials involving 100 data scientists. We restructured real industrial ML code that did not use SOLID principles. 
Within each trial, one group was presented with the original ML code, while the other was presented with ML code incorporating SOLID principles. Participants of both groups were asked to analyze the code and fill out a questionnaire that included both open-ended and closed-ended questions on their understanding. [Results] The study results provide statistically significant evidence that the adoption of the SOLID design principles can improve code understanding within the realm of ML projects. [Conclusion] We put forward that software engineering design principles should be spread within the data science community and considered for enhancing the maintainability of ML code.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Algorithms ; Best practice ; Data science ; Design engineering ; Iterative methods ; Machine learning ; Maintainability ; Principles ; Scientists ; Software ; Software engineering</subject><ispartof>arXiv.org, 2024-02</ispartof><rights>2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.proquest.com/docview/2924067241?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>780,784,25753,37012,44590</link.rule.ids></links><search><creatorcontrib>Cabral, Raphael</creatorcontrib><creatorcontrib>Kalinowski, Marcos</creatorcontrib><creatorcontrib>Baldassarre, Maria Teresa</creatorcontrib><creatorcontrib>Villamizar, Hugo</creatorcontrib><creatorcontrib>Escovedo, Tatiana</creatorcontrib><creatorcontrib>Lopes, Hélio</creatorcontrib><title>Investigating the Impact of SOLID Design Principles on Machine Learning Code Understanding</title><title>arXiv.org</title><description>[Context] Applying design principles has long been acknowledged as beneficial for understanding and maintainability in traditional software projects. These benefits may similarly hold for Machine Learning (ML) projects, which involve iterative experimentation with data, models, and algorithms. However, ML components are often developed by data scientists with diverse educational backgrounds, potentially resulting in code that doesn't adhere to software design best practices. [Goal] In order to better understand this phenomenon, we investigated the impact of the SOLID design principles on ML code understanding. [Method] We conducted a controlled experiment with three independent trials involving 100 data scientists. We restructured real industrial ML code that did not use SOLID principles. Within each trial, one group was presented with the original ML code, while the other was presented with ML code incorporating SOLID principles. 
Participants of both groups were asked to analyze the code and fill out a questionnaire that included both open-ended and closed-ended questions on their understanding. [Results] The study results provide statistically significant evidence that the adoption of the SOLID design principles can improve code understanding within the realm of ML projects. [Conclusion] We put forward that software engineering design principles should be spread within the data science community and considered for enhancing the maintainability of ML code.</description><subject>Algorithms</subject><subject>Best practice</subject><subject>Data science</subject><subject>Design engineering</subject><subject>Iterative methods</subject><subject>Machine learning</subject><subject>Maintainability</subject><subject>Principles</subject><subject>Scientists</subject><subject>Software</subject><subject>Software engineering</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNqNjt0KgkAUhJcgSMp3ONC1oKv2c61FQlFQ3XQjix51xc7a7trzZ9ADdDUw8w0zE-bwMAy8TcT5jLnGtL7v89Wax3HosEdGbzRW1sJKqsE2CNmzF4UFVcH1fMxSSNHImuCiJRWy79CAIjiJopGEcESh6dtMVIlwpxK1sYLK0VqwaSU6g-5P52y5392Sg9dr9RrG0bxVg6YxyvmWR_54KQrC_6gPuehB2g</recordid><startdate>20240208</startdate><enddate>20240208</enddate><creator>Cabral, Raphael</creator><creator>Kalinowski, Marcos</creator><creator>Baldassarre, Maria Teresa</creator><creator>Villamizar, Hugo</creator><creator>Escovedo, Tatiana</creator><creator>Lopes, Hélio</creator><general>Cornell University Library, 
arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240208</creationdate><title>Investigating the Impact of SOLID Design Principles on Machine Learning Code Understanding</title><author>Cabral, Raphael ; Kalinowski, Marcos ; Baldassarre, Maria Teresa ; Villamizar, Hugo ; Escovedo, Tatiana ; Lopes, Hélio</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_29240672413</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Best practice</topic><topic>Data science</topic><topic>Design engineering</topic><topic>Iterative methods</topic><topic>Machine learning</topic><topic>Maintainability</topic><topic>Principles</topic><topic>Scientists</topic><topic>Software</topic><topic>Software engineering</topic><toplevel>online_resources</toplevel><creatorcontrib>Cabral, Raphael</creatorcontrib><creatorcontrib>Kalinowski, Marcos</creatorcontrib><creatorcontrib>Baldassarre, Maria Teresa</creatorcontrib><creatorcontrib>Villamizar, Hugo</creatorcontrib><creatorcontrib>Escovedo, Tatiana</creatorcontrib><creatorcontrib>Lopes, Hélio</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>Technology 
Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Cabral, Raphael</au><au>Kalinowski, Marcos</au><au>Baldassarre, Maria Teresa</au><au>Villamizar, Hugo</au><au>Escovedo, Tatiana</au><au>Lopes, Hélio</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Investigating the Impact of SOLID Design Principles on Machine Learning Code Understanding</atitle><jtitle>arXiv.org</jtitle><date>2024-02-08</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>[Context] Applying design principles has long been acknowledged as beneficial for understanding and maintainability in traditional software projects. These benefits may similarly hold for Machine Learning (ML) projects, which involve iterative experimentation with data, models, and algorithms. However, ML components are often developed by data scientists with diverse educational backgrounds, potentially resulting in code that doesn't adhere to software design best practices. [Goal] In order to better understand this phenomenon, we investigated the impact of the SOLID design principles on ML code understanding. [Method] We conducted a controlled experiment with three independent trials involving 100 data scientists. We restructured real industrial ML code that did not use SOLID principles. 
Within each trial, one group was presented with the original ML code, while the other was presented with ML code incorporating SOLID principles. Participants of both groups were asked to analyze the code and fill out a questionnaire that included both open-ended and closed-ended questions on their understanding. [Results] The study results provide statistically significant evidence that the adoption of the SOLID design principles can improve code understanding within the realm of ML projects. [Conclusion] We put forward that software engineering design principles should be spread within the data science community and considered for enhancing the maintainability of ML code.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2024-02
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2924067241
source	Publicly Available Content Database
subjects	Algorithms; Best practice; Data science; Design engineering; Iterative methods; Machine learning; Maintainability; Principles; Scientists; Software; Software engineering
title	Investigating the Impact of SOLID Design Principles on Machine Learning Code Understanding
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T23%3A23%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Investigating%20the%20Impact%20of%20SOLID%20Design%20Principles%20on%20Machine%20Learning%20Code%20Understanding&rft.jtitle=arXiv.org&rft.au=Cabral,%20Raphael&rft.date=2024-02-08&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2924067241%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_29240672413%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2924067241&rft_id=info:pmid/&rfr_iscdi=true