H2Hadoop: Improving Hadoop Performance Using the Metadata of Related Jobs
Cloud Computing leverages the Hadoop framework for processing BigData in parallel. Hadoop has certain limitations that could be addressed to execute jobs more efficiently. These limitations stem mostly from data locality in the cluster, job and task scheduling, and resource allocation in Hadoop. E...
Published in: | IEEE transactions on cloud computing 2018-10, Vol.6 (4), p.1031-1040 |
---|---|
Main Authors: | Alshammari, Hamoud ; Lee, Jeongkyu ; Bajwa, Hassan |
Format: | Article |
Language: | English |
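Subjects: | BigData ; Cloud computing ; Clusters ; Computational efficiency ; Computer architecture ; Cost analysis ; Data mining ; Deoxyribonucleic acid ; DNA ; File systems ; H2Hadoop ; Hadoop ; Hadoop Performance ; MapReduce ; Metadata ; Resource allocation ; Resource management ; Resource scheduling ; Task scheduling ; Text Data ; Text mining |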
cited_by | cdi_FETCH-LOGICAL-c291t-94da5c2ded94618f2fbfc93a604ce9bb05cbc63301fa7f0f780d4890eff740803 |
---|---|
cites | cdi_FETCH-LOGICAL-c291t-94da5c2ded94618f2fbfc93a604ce9bb05cbc63301fa7f0f780d4890eff740803 |
container_end_page | 1040 |
container_issue | 4 |
container_start_page | 1031 |
container_title | IEEE transactions on cloud computing |
container_volume | 6 |
creator | Alshammari, Hamoud ; Lee, Jeongkyu ; Bajwa, Hassan |
description | Cloud Computing leverages the Hadoop framework for processing BigData in parallel. Hadoop has certain limitations that could be addressed to execute jobs more efficiently. These limitations stem mostly from data locality in the cluster, job and task scheduling, and resource allocation in Hadoop. Efficient resource allocation remains a challenge in Cloud Computing MapReduce platforms. We propose H2Hadoop, an enhanced Hadoop architecture that reduces the computation cost associated with BigData analysis. The proposed architecture also addresses the issue of resource allocation in native Hadoop. H2Hadoop provides a better solution for "text data", such as finding a DNA sequence and the motif of a DNA sequence. H2Hadoop also provides an efficient Data Mining approach for Cloud Computing environments. The H2Hadoop architecture leverages the NameNode's ability to assign jobs to the TaskTrackers (DataNodes) within the cluster. By adding control features to the NameNode, H2Hadoop can intelligently direct and assign tasks to the DataNodes that contain the required data, without sending the job to the whole cluster. Compared with native Hadoop, H2Hadoop reduces CPU time, the number of read operations, and other Hadoop factors. |
doi_str_mv | 10.1109/TCC.2016.2535261 |
format | article |
publisher | Piscataway: IEEE Computer Society |
rights | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2018 |
coden | ITCCF6 |
fulltext | fulltext |
identifier | ISSN: 2168-7161 |
ispartof | IEEE transactions on cloud computing, 2018-10, Vol.6 (4), p.1031-1040 |
issn | 2168-7161 2168-7161 2372-0018 |
language | eng |
recordid | cdi_proquest_journals_2151464050 |
source | IEEE Electronic Library (IEL) Journals |
subjects | BigData ; Cloud computing ; Clusters ; Computational efficiency ; Computer architecture ; Cost analysis ; Data mining ; Deoxyribonucleic acid ; DNA ; File systems ; H2Hadoop ; Hadoop ; Hadoop Performance ; MapReduce ; Metadata ; Resource allocation ; Resource management ; Resource scheduling ; Task scheduling ; Text Data ; Text mining |
title | H2Hadoop: Improving Hadoop Performance Using the Metadata of Related Jobs |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T13%3A42%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=H2Hadoop:%20Improving%20Hadoop%20Performance%20Using%20the%20Metadata%20of%20Related%20Jobs&rft.jtitle=IEEE%20transactions%20on%20cloud%20computing&rft.au=Alshammari,%20Hamoud&rft.date=2018-10-01&rft.volume=6&rft.issue=4&rft.spage=1031&rft.epage=1040&rft.pages=1031-1040&rft.issn=2168-7161&rft.eissn=2168-7161&rft.coden=ITCCF6&rft_id=info:doi/10.1109/TCC.2016.2535261&rft_dat=%3Cproquest_ieee_%3E2151464050%3C/proquest_ieee_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c291t-94da5c2ded94618f2fbfc93a604ce9bb05cbc63301fa7f0f780d4890eff740803%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2151464050&rft_id=info:pmid/&rft_ieee_id=7420665&rfr_iscdi=true |