ydata-profiling: Accelerating data-centric AI with high-quality data

Bibliographic Details
Published in: Neurocomputing (Amsterdam), 2023-10, Vol. 554, p. 126585, Article 126585
Main Authors: Clemente, Fabiana; Ribeiro, Gonçalo Martins; Quemy, Alexandre; Santos, Miriam Seoane; Pereira, Ricardo Cardoso; Barros, Alex
Format: Article
Language: English
Description
Summary: ydata-profiling is an open-source Python package for advanced exploratory data analysis that enables users to generate data profiling reports in a simple, fast, and efficient manner, fostering a standardized and visual understanding of the data. Beyond traditional descriptive properties and statistics, ydata-profiling follows a Data-Centric AI approach to exploratory analysis, as it focuses on the automatic detection and highlighting of complex data characteristics often associated with potential data quality issues, such as high ratios of missing or imbalanced data, infinite, unique, or constant values, skewness, high correlation, high cardinality, non-stationarity, seasonality, duplicate records, and other inconsistencies. The source code, documentation, and examples are available in the GitHub repository: https://github.com/ydataai/ydata-profiling.
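To make a few of the data characteristics named in the summary concrete, the following is a minimal standard-library sketch of checks for a high ratio of missing values, constant values, unique values, and duplicate records. It is not ydata-profiling's implementation or API: the function names, the `missing_threshold` parameter, and the sample rows are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def profile_columns(rows, missing_threshold=0.5):
    """Return a dict mapping each column name to a list of alert labels.

    rows: list of dicts sharing the same keys; None marks a missing value.
    missing_threshold: hypothetical cutoff above which the ratio of
    missing values is reported as high.
    """
    alerts = {}
    n = len(rows)
    columns = list(rows[0].keys()) if rows else []
    for col in columns:
        values = [row[col] for row in rows]
        present = [v for v in values if v is not None]
        distinct = set(present)
        col_alerts = []
        # High ratio of missing values.
        if (n - len(present)) / n > missing_threshold:
            col_alerts.append("missing")
        # Constant: several observed values, all identical.
        if len(present) > 1 and len(distinct) == 1:
            col_alerts.append("constant")
        # Unique: several observed values, none repeated.
        elif len(present) > 1 and len(distinct) == len(present):
            col_alerts.append("unique")
        alerts[col] = col_alerts
    return alerts


def count_duplicate_rows(rows):
    """Count rows that repeat an earlier row exactly."""
    columns = list(rows[0].keys()) if rows else []
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    return sum(c - 1 for c in counts.values())


# Illustrative data only.
sample = [
    {"id": 1, "country": "PT", "score": None},
    {"id": 2, "country": "PT", "score": None},
    {"id": 3, "country": "PT", "score": None},
    {"id": 4, "country": "PT", "score": 0.7},
]
report = profile_columns(sample)
duplicates = count_duplicate_rows(sample + [sample[0]])
```

Here `report` maps each column to the alerts it triggered, and `duplicates` counts the one repeated row appended to the sample. A real profiling tool would cover many more characteristics (skewness, correlation, cardinality, seasonality) and would present them in a report rather than a dictionary.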
ISSN: 0925-2312, 1872-8286
DOI: 10.1016/j.neucom.2023.126585