The Effect of Imbalanced Data and Parameter Selection via Genetic Algorithm Long Short-Term Memory (LSTM) for Financial Distress Prediction
Financial companies are grappling with a burning issue about bankruptcy prediction. There are many methods for bankruptcy prediction, including statistical models and machine learning. Real-life datasets are often imbalanced with high dimensionality. Therefore, it is challenging to train a robust mo...
Published in: | IAENG international journal of applied mathematics 2023-09, Vol.53 (3), p.25-38 |
---|---|
Main Authors: | Adisa, Juliana Adeola; Ojo, Samuel; Owolawi, Pius Adewale; Pretorius, Agnieta; Ojo, Sunday Olusegun |
Format: | Article |
Language: | English |
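Subjects: | Bankruptcy; Classification; Decision trees; Financial analysis; Genetic algorithms; Machine learning; Model accuracy; Neural networks; Optimization; Optimization techniques; Oversampling; Parameters; Securities markets; Statistical models; Support vector machines |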
Online Access: | Get full text |
cited_by | |
---|---|
cites | |
container_end_page | 38 |
container_issue | 3 |
container_start_page | 25 |
container_title | IAENG international journal of applied mathematics |
container_volume | 53 |
creator | Adisa, Juliana Adeola; Ojo, Samuel; Owolawi, Pius Adewale; Pretorius, Agnieta; Ojo, Sunday Olusegun |
description | Predicting bankruptcy is a pressing problem for financial companies. Existing approaches include statistical models and machine learning. Real-life datasets are often imbalanced and high-dimensional, which makes it challenging to train a robust bankruptcy prediction model. We therefore first applied an oversampling technique, the Synthetic Minority Oversampling Technique (SMOTE), to reduce the skewness of the data. The baseline models, the ensemble classifiers using different combination methods, and the long short-term memory (LSTM) model were trained on the balanced data. In addition, we employed a genetic algorithm (GA) to optimize and determine the learning parameters of an LSTM network. We further determined the effects of different training/testing ratios on the developed models. An autoencoder-LSTM model was developed to extract the best feature representation of the input data, and a comparative analysis was carried out between the LSTM-GA and the autoencoder-LSTM. The results show that the improved LSTM-GA model, with an accuracy of 98.11%, performs better than the other models. Overall, all models, including the LSTM, perform well, while the LSTM optimized via genetic algorithm outperforms the classical machine learning models. |
format | article |
fullrecord | Source record: ProQuest, record ID TN_cdi_proquest_journals_2856542737 (ProQuest ID 2856542737); type: article; publisher: Hong Kong: International Association of Engineers; ISSN: 1992-9978; EISSN: 1992-9986; publication date: 2023-09-01; volume 53, issue 3, pages 25-38 (14 pages); peer reviewed; free to read; rights: 2023. This work is published under https://creativecommons.org/licenses/by-nc-nd/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.; PDF: https://www.proquest.com/docview/2856542737/fulltextPDF?pq-origsite=primo; HTML: https://www.proquest.com/docview/2856542737?pq-origsite=primo; collections: Computer and Information Systems Abstracts; Mechanical & Transportation Engineering Abstracts; Entrepreneurship Database; Technology Research Database; ProQuest SciTech Collection; ProQuest Technology Collection; ProQuest Central (Alumni Edition); ProQuest Central UK/Ireland; Advanced Technologies & Aerospace Collection; ProQuest Central Essentials; ProQuest Central; Business Premium Collection; Technology Collection (ProQuest); ProQuest One Community College; Coronavirus Research Database; ProQuest Central Korea; Engineering Research Database; ProQuest Central Student; SciTech Premium Collection; ProQuest Computer Science Collection; ProQuest Business Collection; Computer Science Database; Civil Engineering Abstracts; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts Academic; Computer and Information Systems Abstracts Professional; ProQuest Advanced Technologies & Aerospace Collection; Publicly Available Content Database; ProQuest One Academic Eastern Edition (DO NOT USE); ProQuest One Academic; ProQuest One Academic UKI Edition |
fulltext | fulltext |
identifier | ISSN: 1992-9978 |
ispartof | IAENG international journal of applied mathematics, 2023-09, Vol.53 (3), p.25-38 |
issn | 1992-9978; 1992-9986 |
language | eng |
recordid | cdi_proquest_journals_2856542737 |
source | Publicly Available Content Database; Coronavirus Research Database |
subjects | Bankruptcy; Classification; Decision trees; Financial analysis; Genetic algorithms; Machine learning; Model accuracy; Neural networks; Optimization; Optimization techniques; Oversampling; Parameters; Securities markets; Statistical models; Support vector machines |
title | The Effect of Imbalanced Data and Parameter Selection via Genetic Algorithm Long Short-Term Memory (LSTM) for Financial Distress Prediction |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-14T10%3A58%3A02IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20Effect%20of%20Imbalanced%20Data%20and%20Parameter%20Selection%20via%20Genetic%20Algorithm%20Long%20Short-Term%20Memory%20(LSTM)%20for%20Financial%20Distress%20Prediction&rft.jtitle=IAENG%20international%20journal%20of%20applied%20mathematics&rft.au=Adisa,%20Juliana%20Adeola&rft.date=2023-09-01&rft.volume=53&rft.issue=3&rft.spage=25&rft.epage=38&rft.pages=25-38&rft.issn=1992-9978&rft.eissn=1992-9986&rft_id=info:doi/&rft_dat=%3Cproquest%3E2856542737%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-p98t-29876e4681a452efb615c162db5398af130a3681e3d8c32d40588848e6cc035c3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2856542737&rft_id=info:pmid/&rfr_iscdi=true |
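
The description above mentions SMOTE as the oversampling step used to reduce class skew. This record does not include the authors' implementation, so the following is only a minimal sketch of the general SMOTE idea (each synthetic point is placed on the line segment between a minority sample and one of its nearest minority neighbours). The function name `smote`, its parameters, and the toy data are hypothetical and chosen purely for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Illustrative sketch only; not the implementation used in the article.
    """
    rng = random.Random(seed)
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Made-up toy data: 3 minority samples against an assumed majority of 8.
minority = [[0.10, 0.90], [0.20, 0.80], [0.15, 0.70]]
majority_count = 8
new_points = smote(minority, majority_count - len(minority), k=2)
balanced_minority = minority + new_points
print(len(balanced_minority))  # prints 8
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already covered by the minority class. In practice, oversampling of this kind is normally applied to the training split only, so that no synthetic points leak into the test data.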