The Effect of Imbalanced Data and Parameter Selection via Genetic Algorithm Long Short-Term Memory (LSTM) for Financial Distress Prediction

Bibliographic Details
Published in: IAENG international journal of applied mathematics 2023-09, Vol.53 (3), p.25-38
Main Authors: Adisa, Juliana Adeola, Ojo, Samuel, Owolawi, Pius Adewale, Pretorius, Agnieta, Ojo, Sunday Olusegun
Format: Article
Language: English
container_end_page 38
container_issue 3
container_start_page 25
container_title IAENG international journal of applied mathematics
container_volume 53
creator Adisa, Juliana Adeola
Ojo, Samuel
Owolawi, Pius Adewale
Pretorius, Agnieta
Ojo, Sunday Olusegun
description Bankruptcy prediction is a pressing problem for financial companies. Many methods exist for it, including statistical models and machine learning. Real-life datasets are often imbalanced and high-dimensional, which makes it challenging to train a robust bankruptcy prediction model. We therefore first applied an oversampling technique, the Synthetic Minority Oversampling Technique (SMOTE), to reduce the skewness of the data. The baseline models, the ensemble classifiers using different combination methods, and the long short-term memory (LSTM) model were then trained on the balanced data. In addition, we employed a genetic algorithm (GA) to optimize and determine the learning parameters of the LSTM network. We further determined the effects of using different training/testing ratios on the developed models. An autoencoder-LSTM model was developed to extract the best feature representation of the input data, and a comparative analysis was carried out between LSTM-GA and autoencoder-LSTM. The results show that the improved LSTM-GA model, with an accuracy of 98.11%, performs better than the other models. Overall, the study concludes that all models, including the LSTM, perform well, while the LSTM optimized via genetic algorithm outperforms the classical machine learning models.
format article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2856542737</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2856542737</sourcerecordid><originalsourceid>FETCH-LOGICAL-p98t-29876e4681a452efb615c162db5398af130a3681e3d8c32d40588848e6cc035c3</originalsourceid><addsrcrecordid>eNo9jc1Kw0AUhYMoWGrf4YIbXQSSmczkZln6ZyHFQrMvk8lNO5LM1MlU8Bl8aYOKq3PgfHznJpqkRcHiokB5-99zvI9mw2DqJMtyjijYJPqqzgSrtiUdwLWw7WvVKaupgaUKCpRtYK-86imQhwN1I2echQ-jYEOWgtEw707Om3DuoXT2BIez8yGuyPewo975T3gqD9XuGVrnYW3saDeqg6UZgqdhgL2nxvxYH6K7VnUDzf5yGlXrVbV4icvXzXYxL-NLgSFmBeaSMompygSjtpap0KlkTS14gapNeaL4uBJvUHPWZIlAxAxJap1wofk0evzVXrx7v9IQjm_u6u34eGQopMhYznP-DWwHYA4</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2856542737</pqid></control><display><type>article</type><title>The Effect of Imbalanced Data and Parameter Selection via Genetic Algorithm Long Short-Term Memory (LSTM) for Financial Distress Prediction</title><source>Publicly Available Content Database</source><source>Coronavirus Research Database</source><creator>Adisa, Juliana Adeola ; Ojo, Samuel ; Owolawi, Pius Adewale ; Pretorius, Agnieta ; Ojo, Sunday Olusegun</creator><creatorcontrib>Adisa, Juliana Adeola ; Ojo, Samuel ; Owolawi, Pius Adewale ; Pretorius, Agnieta ; Ojo, Sunday Olusegun</creatorcontrib><description>Financial companies are grappling with a burning issue about bankruptcy prediction. There are many methods for bankruptcy prediction, including statistical models and machine learning. Real-life datasets are often imbalanced with high dimensionality. Therefore, it is challenging to train a robust model to predict bankruptcy. Thus, we first applied an oversampling technique known as the Synthetic Minority Oversampling Technique (SMOTE) to reduce the skewness of the data. 
The balanced data was trained with the baseline models, the ensemble classifiers using different combination methods and the long short-term memory (LSTM) model. In addition, we employed an optimization technique called a genetic algorithm (GA) to optimize and determine the learning parameters of an LSTM network. We further determine the effects of using different training/testing ratios on the developed models. An autoencoder long short-term memory (LSTM) model was developed to extract the best feature representation of the input data. A comparative analysis was carried out between the LSTM-GA and autoencoder-LSTM. The results show that the improved LSTM-GA model with an accuracy of 98.11% performs better than other models. Overall, the research work concluded that all models and LSTM have good performances, while the optimized LSTM model via genetic algorithm outperforms the classical machine learning models.</description><identifier>ISSN: 1992-9978</identifier><identifier>EISSN: 1992-9986</identifier><language>eng</language><publisher>Hong Kong: International Association of Engineers</publisher><subject>Bankruptcy ; Classification ; Decision trees ; Financial analysis ; Genetic algorithms ; Machine learning ; Model accuracy ; Neural networks ; Optimization ; Optimization techniques ; Oversampling ; Parameters ; Securities markets ; Statistical models ; Support vector machines</subject><ispartof>IAENG international journal of applied mathematics, 2023-09, Vol.53 (3), p.25-38</ispartof><rights>2023. This work is published under https://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.proquest.com/docview/2856542737/fulltextPDF?pq-origsite=primo$$EPDF$$P50$$Gproquest$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2856542737?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,25751,37010,38514,43893,44588,74182,74896</link.rule.ids></links><search><creatorcontrib>Adisa, Juliana Adeola</creatorcontrib><creatorcontrib>Ojo, Samuel</creatorcontrib><creatorcontrib>Owolawi, Pius Adewale</creatorcontrib><creatorcontrib>Pretorius, Agnieta</creatorcontrib><creatorcontrib>Ojo, Sunday Olusegun</creatorcontrib><title>The Effect of Imbalanced Data and Parameter Selection via Genetic Algorithm Long Short-Term Memory (LSTM) for Financial Distress Prediction</title><title>IAENG international journal of applied mathematics</title><description>Financial companies are grappling with a burning issue about bankruptcy prediction. There are many methods for bankruptcy prediction, including statistical models and machine learning. Real-life datasets are often imbalanced with high dimensionality. Therefore, it is challenging to train a robust model to predict bankruptcy. Thus, we first applied an oversampling technique known as the Synthetic Minority Oversampling Technique (SMOTE) to reduce the skewness of the data. The balanced data was trained with the baseline models, the ensemble classifiers using different combination methods and the long short-term memory (LSTM) model. 
In addition, we employed an optimization technique called a genetic algorithm (GA) to optimize and determine the learning parameters of an LSTM network. We further determine the effects of using different training/testing ratios on the developed models. An autoencoder long short-term memory (LSTM) model was developed to extract the best feature representation of the input data. A comparative analysis was carried out between the LSTM-GA and autoencoder-LSTM. The results show that the improved LSTM-GA model with an accuracy of 98.11% performs better than other models. Overall, the research work concluded that all models and LSTM have good performances, while the optimized LSTM model via genetic algorithm outperforms the classical machine learning models.</description><subject>Bankruptcy</subject><subject>Classification</subject><subject>Decision trees</subject><subject>Financial analysis</subject><subject>Genetic algorithms</subject><subject>Machine learning</subject><subject>Model accuracy</subject><subject>Neural networks</subject><subject>Optimization</subject><subject>Optimization techniques</subject><subject>Oversampling</subject><subject>Parameters</subject><subject>Securities markets</subject><subject>Statistical models</subject><subject>Support vector machines</subject><issn>1992-9978</issn><issn>1992-9986</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>COVID</sourceid><sourceid>PIMPY</sourceid><recordid>eNo9jc1Kw0AUhYMoWGrf4YIbXQSSmczkZln6ZyHFQrMvk8lNO5LM1MlU8Bl8aYOKq3PgfHznJpqkRcHiokB5-99zvI9mw2DqJMtyjijYJPqqzgSrtiUdwLWw7WvVKaupgaUKCpRtYK-86imQhwN1I2echQ-jYEOWgtEw707Om3DuoXT2BIez8yGuyPewo975T3gqD9XuGVrnYW3saDeqg6UZgqdhgL2nxvxYH6K7VnUDzf5yGlXrVbV4icvXzXYxL-NLgSFmBeaSMompygSjtpap0KlkTS14gapNeaL4uBJvUHPWZIlAxAxJap1wofk0evzVXrx7v9IQjm_u6u34eGQopMhYznP-DWwHYA4</recordid><startdate>20230901</startdate><enddate>20230901</enddate><creator>Adisa, Juliana 
Adeola</creator><creator>Ojo, Samuel</creator><creator>Owolawi, Pius Adewale</creator><creator>Pretorius, Agnieta</creator><creator>Ojo, Sunday Olusegun</creator><general>International Association of Engineers</general><scope>7SC</scope><scope>7TB</scope><scope>7X5</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BEZIV</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>COVID</scope><scope>DWQXO</scope><scope>FR3</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K6~</scope><scope>K7-</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P62</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope></search><sort><creationdate>20230901</creationdate><title>The Effect of Imbalanced Data and Parameter Selection via Genetic Algorithm Long Short-Term Memory (LSTM) for Financial Distress Prediction</title><author>Adisa, Juliana Adeola ; Ojo, Samuel ; Owolawi, Pius Adewale ; Pretorius, Agnieta ; Ojo, Sunday Olusegun</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p98t-29876e4681a452efb615c162db5398af130a3681e3d8c32d40588848e6cc035c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Bankruptcy</topic><topic>Classification</topic><topic>Decision trees</topic><topic>Financial analysis</topic><topic>Genetic algorithms</topic><topic>Machine learning</topic><topic>Model accuracy</topic><topic>Neural networks</topic><topic>Optimization</topic><topic>Optimization techniques</topic><topic>Oversampling</topic><topic>Parameters</topic><topic>Securities markets</topic><topic>Statistical models</topic><topic>Support vector machines</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Adisa, Juliana 
Adeola</creatorcontrib><creatorcontrib>Ojo, Samuel</creatorcontrib><creatorcontrib>Owolawi, Pius Adewale</creatorcontrib><creatorcontrib>Pretorius, Agnieta</creatorcontrib><creatorcontrib>Ojo, Sunday Olusegun</creatorcontrib><collection>Computer and Information Systems Abstracts</collection><collection>Mechanical &amp; Transportation Engineering Abstracts</collection><collection>Entrepreneurship Database</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Business Premium Collection</collection><collection>Technology Collection (ProQuest)</collection><collection>ProQuest One Community College</collection><collection>Coronavirus Research Database</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Business Collection</collection><collection>Computer Science Database</collection><collection>Civil Engineering Abstracts</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One 
Academic</collection><collection>ProQuest One Academic UKI Edition</collection><jtitle>IAENG international journal of applied mathematics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Adisa, Juliana Adeola</au><au>Ojo, Samuel</au><au>Owolawi, Pius Adewale</au><au>Pretorius, Agnieta</au><au>Ojo, Sunday Olusegun</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>The Effect of Imbalanced Data and Parameter Selection via Genetic Algorithm Long Short-Term Memory (LSTM) for Financial Distress Prediction</atitle><jtitle>IAENG international journal of applied mathematics</jtitle><date>2023-09-01</date><risdate>2023</risdate><volume>53</volume><issue>3</issue><spage>25</spage><epage>38</epage><pages>25-38</pages><issn>1992-9978</issn><eissn>1992-9986</eissn><abstract>Financial companies are grappling with a burning issue about bankruptcy prediction. There are many methods for bankruptcy prediction, including statistical models and machine learning. Real-life datasets are often imbalanced with high dimensionality. Therefore, it is challenging to train a robust model to predict bankruptcy. Thus, we first applied an oversampling technique known as the Synthetic Minority Oversampling Technique (SMOTE) to reduce the skewness of the data. The balanced data was trained with the baseline models, the ensemble classifiers using different combination methods and the long short-term memory (LSTM) model. In addition, we employed an optimization technique called a genetic algorithm (GA) to optimize and determine the learning parameters of an LSTM network. We further determine the effects of using different training/testing ratios on the developed models. An autoencoder long short-term memory (LSTM) model was developed to extract the best feature representation of the input data. A comparative analysis was carried out between the LSTM-GA and autoencoder-LSTM. 
The results show that the improved LSTM-GA model with an accuracy of 98.11% performs better than other models. Overall, the research work concluded that all models and LSTM have good performances, while the optimized LSTM model via genetic algorithm outperforms the classical machine learning models.</abstract><cop>Hong Kong</cop><pub>International Association of Engineers</pub><tpages>14</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1992-9978
ispartof IAENG international journal of applied mathematics, 2023-09, Vol.53 (3), p.25-38
issn 1992-9978
1992-9986
language eng
recordid cdi_proquest_journals_2856542737
source Publicly Available Content Database; Coronavirus Research Database
subjects Bankruptcy
Classification
Decision trees
Financial analysis
Genetic algorithms
Machine learning
Model accuracy
Neural networks
Optimization
Optimization techniques
Oversampling
Parameters
Securities markets
Statistical models
Support vector machines
title The Effect of Imbalanced Data and Parameter Selection via Genetic Algorithm Long Short-Term Memory (LSTM) for Financial Distress Prediction
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-14T10%3A58%3A02IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20Effect%20of%20Imbalanced%20Data%20and%20Parameter%20Selection%20via%20Genetic%20Algorithm%20Long%20Short-Term%20Memory%20(LSTM)%20for%20Financial%20Distress%20Prediction&rft.jtitle=IAENG%20international%20journal%20of%20applied%20mathematics&rft.au=Adisa,%20Juliana%20Adeola&rft.date=2023-09-01&rft.volume=53&rft.issue=3&rft.spage=25&rft.epage=38&rft.pages=25-38&rft.issn=1992-9978&rft.eissn=1992-9986&rft_id=info:doi/&rft_dat=%3Cproquest%3E2856542737%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-p98t-29876e4681a452efb615c162db5398af130a3681e3d8c32d40588848e6cc035c3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2856542737&rft_id=info:pmid/&rfr_iscdi=true