A study and characterization of chemical properties of soil surface data using K-means algorithm
Main Authors: | , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Summary: | Soil is a vital natural resource; a country's life-support system and the socio-economic development of its people depend on its proper use. Clustering in agricultural soil datasets is a relatively novel research field. This paper characterizes the chemical properties of soil surface data from the Bhanapur micro-watershed in Koppal District, Karnataka, using the K-means algorithm. The work computes the average silhouette width, which evaluates clustering validity and can be used to select an appropriate number of clusters for the soil dataset. Clustering the soil dataset into two clusters with K-means and Euclidean distance gives an average silhouette value of 0.7736. The work also shows high intra-class similarity (clusters are cohesive) and low inter-class similarity (clusters are distinct) in the soil dataset: K-means reassigns points among clusters to decrease the sum of point-to-centroid distances, then recomputes the cluster centroids for the new assignments. With Euclidean distance, the total sum of distances for K-means clustering is 1.14402 after the soil data points are reassigned. By default, K-means begins the clustering process from a randomly selected set of initial centroid locations. K-means repeats the clustering process from different randomly selected centroids; the sum of distances within each cluster for the best solution is 1.1440 with Euclidean distance. The three-cluster K-means solution gives an average silhouette value of 0.6052 with Euclidean distance and 0.6219 with Cosine distance. The hierarchical clustering dendrogram gives a value of 0.7935 with the Euclidean distance measure and 0.6678 with the Cosine distance measure. The results from the hierarchical clustering dendrogram with Cosine distance are qualitatively similar to the results from K-means using three clusters. 
K-means cluster analysis with Euclidean and Cosine distance measures is compared with the hierarchical clustering dendrogram. Based on this study and characterization of the chemical properties of soil surface data using K-means, Cosine might be a good choice of distance measure. |
---|---|
DOI: | 10.1109/ICPRIME.2013.6496484 |
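
The summary describes K-means as alternating point reassignment and centroid recomputation, restarted from several random initial centroids, with the average silhouette width used to compare cluster counts. A minimal Python sketch of that procedure follows. It is not the authors' implementation: the data are synthetic stand-ins (the soil measurements are not part of this record), the function names `kmeans` and `average_silhouette` are illustrative, and the objective is assumed to be the sum of squared Euclidean point-to-centroid distances.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, n_restarts=5, max_iter=100, seed=0):
    """Lloyd-style K-means with random restarts.

    Returns the labels and the total within-cluster sum of squared
    distances of the best restart (assumed objective, see lead-in).
    """
    rng = random.Random(seed)
    best_labels, best_total = None, float("inf")
    for _ in range(n_restarts):
        # Start from k randomly selected data points as initial centroids.
        centroids = rng.sample(points, k)
        labels = None
        for _ in range(max_iter):
            # Reassign every point to its nearest centroid.
            new_labels = [
                min(range(k), key=lambda c: euclidean(p, centroids[c]))
                for p in points
            ]
            if new_labels == labels:
                break  # assignments are stable: converged
            labels = new_labels
            # Recompute each centroid as the mean of its current members.
            for c in range(k):
                members = [p for p, l in zip(points, labels) if l == c]
                if members:
                    centroids[c] = tuple(
                        sum(col) / len(members) for col in zip(*members)
                    )
        total = sum(euclidean(p, centroids[l]) ** 2
                    for p, l in zip(points, labels))
        if total < best_total:
            best_labels, best_total = labels, total
    return best_labels, best_total


def average_silhouette(points, labels, dist=euclidean):
    """Mean silhouette width s = (b - a) / max(a, b) over all points."""
    clusters = {}
    for i, l in enumerate(labels):
        clusters.setdefault(l, []).append(i)
    scores = []
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(p, points[j]) for j in other) / len(other)
            for l, other in clusters.items()
            if l != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Synthetic two-group data, NOT the soil dataset from the paper.
data_rng = random.Random(42)
points = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(20)]
points += [(data_rng.gauss(10, 0.5), data_rng.gauss(10, 0.5)) for _ in range(20)]

labels2, total2 = kmeans(points, k=2)
labels3, total3 = kmeans(points, k=3)
sil2 = average_silhouette(points, labels2)
sil3 = average_silhouette(points, labels3)
print(f"k=2: average silhouette {sil2:.4f}, total {total2:.4f}")
print(f"k=3: average silhouette {sil3:.4f}, total {total3:.4f}")
```

On data with two well-separated groups, the two-cluster solution should score the higher average silhouette, which is the kind of comparison the summary uses to choose the number of clusters. Passing a different `dist` function to `average_silhouette` changes only the scoring, not the clustering itself.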