LTmatch: A Method to Abstract Pattern from Unstructured Log

Logs record valuable data from different software and systems. Execution logs are widely available and are helpful in monitoring, examination, and system understanding of complex applications. However, log files usually contain too many lines of data for a human to deal with, therefore it is importa...

Bibliographic Details
Published in: Applied sciences 2021-06, Vol.11 (11), p.5302
Main Authors: Wang, Xiaodong; Zhao, Yining; Xiao, Haili; Wang, Xiaoning; Chi, Xuebin
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c364t-d50a3f48cae31ce6b717c1e04d9bb6c1f04063e674d93768c987acdc407e5be13
cites cdi_FETCH-LOGICAL-c364t-d50a3f48cae31ce6b717c1e04d9bb6c1f04063e674d93768c987acdc407e5be13
container_end_page
container_issue 11
container_start_page 5302
container_title Applied sciences
container_volume 11
creator Wang, Xiaodong
Zhao, Yining
Xiao, Haili
Wang, Xiaoning
Chi, Xuebin
description Logs record valuable data from different software and systems. Execution logs are widely available and are helpful for monitoring, examining, and understanding complex applications. However, log files usually contain too many lines for a human to handle, so it is important to develop methods for processing logs by computer. Logs are usually unstructured, which hinders automatic analysis, so categorizing logs and turning them into structured data automatically is of great practical significance. In this paper, the LTmatch algorithm is proposed, which extracts log patterns based on a weighted word matching rate. Compared with our previous work, this algorithm not only classifies logs according to the longest common subsequence (LCS) but also obtains and updates the log template in real time. In addition, the pattern warehouse of the algorithm uses a fixed-depth tree to store the log patterns, which improves the matching efficiency of log pattern extraction. To verify the advantages of the algorithm, we applied it to an open-source data set containing different kinds of labeled log data, and compared it with a variety of state-of-the-art log pattern extraction algorithms. The results show that our method improves average accuracy by 2.67% over the best result among the other methods.
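
The description above mentions classifying logs by their longest common subsequence (LCS) and updating a log template as new lines arrive. The record does not give the paper's actual procedure, so the following is only a minimal sketch of the general idea, not the LTmatch implementation. The function names, the `<*>` wildcard token, and the plain (unweighted) matching rate are all assumptions made for illustration; the paper's weighted word matching rate is not reproduced here.

```python
from typing import List

WILDCARD = "<*>"  # hypothetical placeholder for variable parts of a log line


def lcs_tokens(a: List[str], b: List[str]) -> List[str]:
    """Return one longest common subsequence of two token lists."""
    m, n = len(a), len(b)
    # dp[i][j] = LCS length of a[i:] and b[j:]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk forward through the table to recover the subsequence itself.
    out: List[str] = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


def matching_rate(a: List[str], b: List[str]) -> float:
    """Unweighted stand-in for a word matching rate (assumption, not the paper's formula)."""
    if not a or not b:
        return 0.0
    return len(lcs_tokens(a, b)) / max(len(a), len(b))


def merge_template(template: List[str], tokens: List[str]) -> List[str]:
    """Keep the common tokens; collapse every differing stretch into one wildcard."""
    common = lcs_tokens(template, tokens)
    merged: List[str] = []
    i = j = 0
    for tok in common + [None]:  # None is a sentinel that flushes the tails
        gap = False
        while i < len(template) and template[i] != tok:
            i += 1
            gap = True
        while j < len(tokens) and tokens[j] != tok:
            j += 1
            gap = True
        if gap and (not merged or merged[-1] != WILDCARD):
            merged.append(WILDCARD)
        if tok is not None:
            merged.append(tok)
            i += 1
            j += 1
    return merged
```

For example, merging the tokens of `Connection from 10.0.0.1 closed` with those of `Connection from 10.0.0.2 closed` gives the template `Connection from <*> closed`, and the unweighted matching rate of the two lines is 0.75. A streaming parser built along these lines would compare each incoming line with the stored templates, merge it into the best match above some threshold, and otherwise store it as a new template.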
doi_str_mv 10.3390/app11115302
format article
fullrecord <record><control><sourceid>proquest_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_86b3aeb4dcca4e44a814f7604ce188a9</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><doaj_id>oai_doaj_org_article_86b3aeb4dcca4e44a814f7604ce188a9</doaj_id><sourcerecordid>2635405886</sourcerecordid><originalsourceid>FETCH-LOGICAL-c364t-d50a3f48cae31ce6b717c1e04d9bb6c1f04063e674d93768c987acdc407e5be13</originalsourceid><addsrcrecordid>eNpNkM1LAzEQxYMoWGpP_gMBj7KaNNkkq6dS_Cis6KE9h9nZ2X7QNms2Pfjfu1qRvssMPx5vhsfYtRR3ShXiHtpW9sqVGJ-xwVhYkykt7fnJfslGXbcRvQqpnBQD9ljOd5Bw9cAn_I3SKtQ8BT6puhQBE_-AlCjueRPDji_2PT1gOkSqeRmWV-yigW1Ho785ZIvnp_n0NSvfX2bTSZmhMjpldS5ANdohkJJIprLSoiSh66KqDMpGaGEUGdsDZY3DwlnAGrWwlFck1ZDNjrl1gI1v43oH8csHWPtfEOLSQ0xr3JJ3plJAla4RQZPW4KRurBEaSToHRZ91c8xqY_g8UJf8Jhzivn_fj43KtcidM73r9ujCGLouUvN_VQr_U7Y_KVt9A4Pgb_U</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2635405886</pqid></control><display><type>article</type><title>LTmatch: A Method to Abstract Pattern from Unstructured Log</title><source>Publicly Available Content Database</source><creator>Wang, Xiaodong ; Zhao, Yining ; Xiao, Haili ; Wang, Xiaoning ; Chi, Xuebin</creator><creatorcontrib>Wang, Xiaodong ; Zhao, Yining ; Xiao, Haili ; Wang, Xiaoning ; Chi, Xuebin</creatorcontrib><description>Logs record valuable data from different software and systems. Execution logs are widely available and are helpful in monitoring, examination, and system understanding of complex applications. However, log files usually contain too many lines of data for a human to deal with, therefore it is important to develop methods to process logs by computers. Logs are usually unstructured, which is not conducive to automatic analysis. How to categorize logs and turn into structured data automatically is of great practical significance. 
In this paper, LTmatch algorithm is proposed, which implements a log pattern extracting algorithm based on a weighted word matching rate. Compared with our preview work, this algorithm not only classifies the logs according to the longest common subsequence(LCS) but also gets and updates the log template in real-time. Besides, the pattern warehouse of the algorithm uses a fixed deep tree to store the log patterns, which optimizes the matching efficiency of log pattern extraction. To verify the advantages of the algorithm, we applied the proposed algorithm to the open-source data set with different kinds of labeled log data. A variety of state-of-the-art log pattern extraction algorithms are used for comparison. The result shows our method is improved by 2.67% in average accuracy when compared with the best result in all the other methods.</description><identifier>ISSN: 2076-3417</identifier><identifier>EISSN: 2076-3417</identifier><identifier>DOI: 10.3390/app11115302</identifier><language>eng</language><publisher>Basel: MDPI AG</publisher><subject>Algorithms ; Automation ; Clustering ; Computers ; Data mining ; Decision trees ; LCS ; log pattern extraction ; log template ; Methods ; Neural networks ; Software ; Structured data ; word matching rate</subject><ispartof>Applied sciences, 2021-06, Vol.11 (11), p.5302</ispartof><rights>2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c364t-d50a3f48cae31ce6b717c1e04d9bb6c1f04063e674d93768c987acdc407e5be13</citedby><cites>FETCH-LOGICAL-c364t-d50a3f48cae31ce6b717c1e04d9bb6c1f04063e674d93768c987acdc407e5be13</cites><orcidid>0000-0003-1336-8800</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.proquest.com/docview/2635405886/fulltextPDF?pq-origsite=primo$$EPDF$$P50$$Gproquest$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2635405886?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>314,776,780,25731,27901,27902,36989,44566,74869</link.rule.ids></links><search><creatorcontrib>Wang, Xiaodong</creatorcontrib><creatorcontrib>Zhao, Yining</creatorcontrib><creatorcontrib>Xiao, Haili</creatorcontrib><creatorcontrib>Wang, Xiaoning</creatorcontrib><creatorcontrib>Chi, Xuebin</creatorcontrib><title>LTmatch: A Method to Abstract Pattern from Unstructured Log</title><title>Applied sciences</title><description>Logs record valuable data from different software and systems. Execution logs are widely available and are helpful in monitoring, examination, and system understanding of complex applications. However, log files usually contain too many lines of data for a human to deal with, therefore it is important to develop methods to process logs by computers. Logs are usually unstructured, which is not conducive to automatic analysis. How to categorize logs and turn into structured data automatically is of great practical significance. 
In this paper, LTmatch algorithm is proposed, which implements a log pattern extracting algorithm based on a weighted word matching rate. Compared with our preview work, this algorithm not only classifies the logs according to the longest common subsequence(LCS) but also gets and updates the log template in real-time. Besides, the pattern warehouse of the algorithm uses a fixed deep tree to store the log patterns, which optimizes the matching efficiency of log pattern extraction. To verify the advantages of the algorithm, we applied the proposed algorithm to the open-source data set with different kinds of labeled log data. A variety of state-of-the-art log pattern extraction algorithms are used for comparison. The result shows our method is improved by 2.67% in average accuracy when compared with the best result in all the other methods.</description><subject>Algorithms</subject><subject>Automation</subject><subject>Clustering</subject><subject>Computers</subject><subject>Data mining</subject><subject>Decision trees</subject><subject>LCS</subject><subject>log pattern extraction</subject><subject>log template</subject><subject>Methods</subject><subject>Neural networks</subject><subject>Software</subject><subject>Structured data</subject><subject>word matching rate</subject><issn>2076-3417</issn><issn>2076-3417</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><sourceid>DOA</sourceid><recordid>eNpNkM1LAzEQxYMoWGpP_gMBj7KaNNkkq6dS_Cis6KE9h9nZ2X7QNms2Pfjfu1qRvssMPx5vhsfYtRR3ShXiHtpW9sqVGJ-xwVhYkykt7fnJfslGXbcRvQqpnBQD9ljOd5Bw9cAn_I3SKtQ8BT6puhQBE_-AlCjueRPDji_2PT1gOkSqeRmWV-yigW1Ho785ZIvnp_n0NSvfX2bTSZmhMjpldS5ANdohkJJIprLSoiSh66KqDMpGaGEUGdsDZY3DwlnAGrWwlFck1ZDNjrl1gI1v43oH8csHWPtfEOLSQ0xr3JJ3plJAla4RQZPW4KRurBEaSToHRZ91c8xqY_g8UJf8Jhzivn_fj43KtcidM73r9ujCGLouUvN_VQr_U7Y_KVt9A4Pgb_U</recordid><startdate>20210601</startdate><enddate>20210601</enddate><creator>Wang, 
Xiaodong</creator><creator>Zhao, Yining</creator><creator>Xiao, Haili</creator><creator>Wang, Xiaoning</creator><creator>Chi, Xuebin</creator><general>MDPI AG</general><scope>AAYXX</scope><scope>CITATION</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0003-1336-8800</orcidid></search><sort><creationdate>20210601</creationdate><title>LTmatch: A Method to Abstract Pattern from Unstructured Log</title><author>Wang, Xiaodong ; Zhao, Yining ; Xiao, Haili ; Wang, Xiaoning ; Chi, Xuebin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c364t-d50a3f48cae31ce6b717c1e04d9bb6c1f04063e674d93768c987acdc407e5be13</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Automation</topic><topic>Clustering</topic><topic>Computers</topic><topic>Data mining</topic><topic>Decision trees</topic><topic>LCS</topic><topic>log pattern extraction</topic><topic>log template</topic><topic>Methods</topic><topic>Neural networks</topic><topic>Software</topic><topic>Structured data</topic><topic>word matching rate</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wang, Xiaodong</creatorcontrib><creatorcontrib>Zhao, Yining</creatorcontrib><creatorcontrib>Xiao, Haili</creatorcontrib><creatorcontrib>Wang, Xiaoning</creatorcontrib><creatorcontrib>Chi, Xuebin</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central 
Korea</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>Applied sciences</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wang, Xiaodong</au><au>Zhao, Yining</au><au>Xiao, Haili</au><au>Wang, Xiaoning</au><au>Chi, Xuebin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>LTmatch: A Method to Abstract Pattern from Unstructured Log</atitle><jtitle>Applied sciences</jtitle><date>2021-06-01</date><risdate>2021</risdate><volume>11</volume><issue>11</issue><spage>5302</spage><pages>5302-</pages><issn>2076-3417</issn><eissn>2076-3417</eissn><abstract>Logs record valuable data from different software and systems. Execution logs are widely available and are helpful in monitoring, examination, and system understanding of complex applications. However, log files usually contain too many lines of data for a human to deal with, therefore it is important to develop methods to process logs by computers. Logs are usually unstructured, which is not conducive to automatic analysis. How to categorize logs and turn into structured data automatically is of great practical significance. In this paper, LTmatch algorithm is proposed, which implements a log pattern extracting algorithm based on a weighted word matching rate. Compared with our preview work, this algorithm not only classifies the logs according to the longest common subsequence(LCS) but also gets and updates the log template in real-time. Besides, the pattern warehouse of the algorithm uses a fixed deep tree to store the log patterns, which optimizes the matching efficiency of log pattern extraction. 
To verify the advantages of the algorithm, we applied the proposed algorithm to the open-source data set with different kinds of labeled log data. A variety of state-of-the-art log pattern extraction algorithms are used for comparison. The result shows our method is improved by 2.67% in average accuracy when compared with the best result in all the other methods.</abstract><cop>Basel</cop><pub>MDPI AG</pub><doi>10.3390/app11115302</doi><orcidid>https://orcid.org/0000-0003-1336-8800</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2076-3417
ispartof Applied sciences, 2021-06, Vol.11 (11), p.5302
issn 2076-3417
2076-3417
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_86b3aeb4dcca4e44a814f7604ce188a9
source Publicly Available Content Database
subjects Algorithms
Automation
Clustering
Computers
Data mining
Decision trees
LCS
log pattern extraction
log template
Methods
Neural networks
Software
Structured data
word matching rate
title LTmatch: A Method to Abstract Pattern from Unstructured Log
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T07%3A49%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=LTmatch:%20A%20Method%20to%20Abstract%20Pattern%20from%20Unstructured%20Log&rft.jtitle=Applied%20sciences&rft.au=Wang,%20Xiaodong&rft.date=2021-06-01&rft.volume=11&rft.issue=11&rft.spage=5302&rft.pages=5302-&rft.issn=2076-3417&rft.eissn=2076-3417&rft_id=info:doi/10.3390/app11115302&rft_dat=%3Cproquest_doaj_%3E2635405886%3C/proquest_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c364t-d50a3f48cae31ce6b717c1e04d9bb6c1f04063e674d93768c987acdc407e5be13%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2635405886&rft_id=info:pmid/&rfr_iscdi=true