The research of decision tree mining based on Hadoop

Bibliographic Details
Main Authors: Qiu Lu, Xiao-hui Cheng
Format: Conference Proceeding
Language: English
Description
Summary: On a single node, decision-tree mining over massive data is computationally very expensive. To solve this problem, this paper proposes HF_SPRINT, a parallel algorithm based on the Hadoop platform. The algorithm optimizes and improves the SPRINT algorithm and parallelizes it. Experimental results show that it achieves a high speedup ratio and good scalability.
DOI: 10.1109/FSKD.2012.6234264