Optimizing Recurrent Neural Network Architectures for De Novo Drug Design

Bibliographic Details
Main Authors: Santos, Beatriz P., Abbasi, Maryam, Pereira, Tiago, Ribeiro, Bernardete, Arrais, Joel P.
Format: Conference Proceeding
Language: English
container_end_page 177
container_start_page 172
creator Santos, Beatriz P.
Abbasi, Maryam
Pereira, Tiago
Ribeiro, Bernardete
Arrais, Joel P.
description In drug discovery, deep learning algorithms are emerging as a method for generating novel chemical structures, since they can speed up the traditional process and reduce its cost. Recurrent architectures are among the most promising methods for computational de novo drug design. One current challenge is finding the optimal architecture and parameters for the recurrent network that ensure the generation of valid molecules spanning the chemical space. In this work we evaluate Recurrent Neural Networks that can learn the syntax of molecular representations in SMILES notation. We optimize the computational framework with respect to the recurrent architecture and its hyperparameters. Moreover, we evaluate two types of encoding, Embedding and One-hot Encoding, and datasets with and without stereochemical information. The proposed model showed improved performance compared with the current literature: for the ChEMBL dataset, 98.7% of generated SMILES were valid and diversity was 0.88. Even for the ZINC biogenic library, which includes stereochemical information, the values were 94.5% and 0.90. The results reveal the potential of recurrent architectures to learn the SMILES syntax and add novelty when generating promising compounds.
doi_str_mv 10.1109/CBMS52027.2021.00067
format conference_proceeding
recordtype conference_proceeding
eisbn 9781665441216
1665441216
coden IEEPAD
publisher IEEE
date 2021-06
tpages 6
orcidid https://orcid.org/0000-0003-2487-0097
https://orcid.org/0000-0002-9770-7672
https://orcid.org/0000-0002-9011-0734
https://orcid.org/0000-0002-7986-8421
https://orcid.org/0000-0003-4937-2334
linktohtml https://ieeexplore.ieee.org/document/9474742
fulltext fulltext_linktorsrc
identifier EISSN: 2372-9198
ispartof 2021 IEEE 34th International Symposium on Computer-Based Medical Systems (CBMS), 2021, p.172-177
issn 2372-9198
language eng
recordid cdi_ieee_primary_9474742
source IEEE Xplore All Conference Series
subjects Biological system modeling
Computer architecture
Deep Learning
Drug Design
Drug Generation
Drugs
Encoding
GRU
Libraries
LSTM
Recurrent neural networks
RNN
SMILES
Syntactics
title Optimizing Recurrent Neural Network Architectures for De Novo Drug Design
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T11%3A16%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Optimizing%20Recurrent%20Neural%20Network%20Architectures%20for%20De%20Novo%20Drug%20Design&rft.btitle=2021%20IEEE%2034th%20International%20Symposium%20on%20Computer-Based%20Medical%20Systems%20(CBMS)&rft.au=Santos,%20Beatriz%20P.&rft.date=2021-06&rft.spage=172&rft.epage=177&rft.pages=172-177&rft.eissn=2372-9198&rft.coden=IEEPAD&rft_id=info:doi/10.1109/CBMS52027.2021.00067&rft.eisbn=9781665441216&rft.eisbn_list=1665441216&rft_dat=%3Cieee_CHZPO%3E9474742%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i203t-ae55bcca0edf8eecfddebb1574d8bf07f2f8862564bc10772b27719b8d205d083%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=9474742&rfr_iscdi=true
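
The description above compares two ways of feeding SMILES strings to a recurrent network: One-hot Encoding and Embedding. The following is a minimal sketch of the difference in input representation only, not the authors' implementation. The function names (`build_vocab`, `to_indices`, `to_one_hot`) and the three-molecule corpus are hypothetical, and character-level tokenization is an assumption made for brevity; it would split multi-character atoms such as Cl or Br, which a real tokenizer would need to handle.

```python
# Illustrative sketch only: all names and the toy corpus are hypothetical.
# Assumption: one token per character (a simplification for SMILES).

def build_vocab(smiles_list):
    """Map each distinct character in the corpus to an integer index."""
    chars = sorted({ch for smi in smiles_list for ch in smi})
    return {ch: i for i, ch in enumerate(chars)}


def to_indices(smiles, vocab):
    """Integer sequence: the kind of input an embedding layer looks up."""
    return [vocab[ch] for ch in smiles]


def to_one_hot(smiles, vocab):
    """One row per character, with a single 1 at that character's index."""
    size = len(vocab)
    return [[1 if j == vocab[ch] else 0 for j in range(size)] for ch in smiles]


corpus = ["CCO", "C=O", "c1ccccc1"]   # ethanol, formaldehyde, benzene
vocab = build_vocab(corpus)
indices = to_indices("CCO", vocab)    # compact: one integer per token
one_hot = to_one_hot("CCO", vocab)    # sparse: one vocab-sized row per token

print(vocab)
print(indices)
print(one_hot)
```

With one-hot input, each token is a fixed sparse vector as wide as the vocabulary; with an embedding, the network receives the integer indices and learns a dense vector per token during training.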