
Random Forest as a promising application to predict basic-dye biosorption process using orange waste

Bibliographic Details
Published in: Journal of environmental chemical engineering 2020-08, Vol.8 (4), p.103952, Article 103952
Main Authors: de Miranda Ramos Soares, Arthur Pontes; de Oliveira Carvalho, Frede; de Farias Silva, Carlos Eduardo; da Silva Gonçalves, Andreza Heloiza; de Souza Abud, Ana Karla
Format: Article
Language: English
Description
Summary:
• The machine learning methods Artificial Neural Networks and Random Forest were used to predict dye adsorption.
• A total of 7 variables were tested in more than 200 independent experiments.
• Random Forest showed good performance in adsorption process prediction.
• The machine learning procedure was carried out using Python as the programming language.

In the present study, the adsorption of methylene blue dye on residual agricultural biomass (orange bagasse) was modelled using the machine learning algorithm Random Forest (RF) and compared with the traditional Artificial Neural Networks (ANN) approach. The machine learning was performed using Python, a free and open source programming language. The models were built and validated with a combination of 202 independent experiments, and aimed at separately predicting the final concentration of methylene blue (Cf), the adsorption capacity (Q) and the adsorbate percentage removal (R%). The input variables were temperature, pH, adsorbent dosage, contact time, salinity, initial methylene blue concentration and rotation. The models were validated using the Coefficient of Determination (R²) and the Mean Squared Error (MSE). According to the obtained results, the RF and ANN models exhibited similar performances, as shown by their respective R² values (RF and ANN, in that order) of 0.9739 and 0.9734 for Cf; 0.9932 and 0.9919 for Q; and 0.9318 and 0.9257 for R%, as well as their respective MSE values of 0.0012 and 0.0016 for Cf; 0.0005 and 0.0007 for Q; and 0.0015 and 0.0019 for R%. However, RF stood out due to its capacity to better capture data variation. Finally, both methods resulted in models able to satisfactorily predict all three response variables, thereby reducing the experimental effort required.
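
The two validation metrics named in the summary, R² and MSE, can be sketched in plain Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' procedure: the function names and the sample values are hypothetical, and the record does not say how the data were scaled or split.

```python
# Illustrative sketch of the two validation metrics named in the summary.
# Function names and sample values are hypothetical, not from the study.

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical observed and predicted responses (for example, a scaled Cf).
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.37]

mse = mean_squared_error(observed, predicted)
r2 = r2_score(observed, predicted)
print(f"MSE = {mse:.5f}, R2 = {r2:.4f}")
```

An R² close to 1 together with a small MSE, as reported for both RF and ANN, indicates that the predictions track the measured responses closely.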
ISSN: 2213-3437
DOI: 10.1016/j.jece.2020.103952