Loading…
ModelSet: A labelled dataset of software models for machine learning
Curated collections of models are essential for the success of Machine Learning (ML) and Data Analytics in Model-Driven Engineering (MDE). However, current datasets are either too small or not properly curated. In this paper, we present ModelSet, a dataset composed of 5,466 Ecore models and 5,120 UM...
Saved in:
Published in: | Science of computer programming 2024-01, Vol.231, p.103009, Article 103009 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Curated collections of models are essential for the success of Machine Learning (ML) and Data Analytics in Model-Driven Engineering (MDE). However, current datasets are either too small or not properly curated. In this paper, we present ModelSet, a dataset composed of 5,466 Ecore models and 5,120 UML models which have been manually labelled to support ML tasks. We describe the structure of the dataset and explain how to use the associated library to develop ML applications in Python. Finally, we present some applications which can be addressed using ModelSet.
Tool Website: https://github.com/modelset
•We propose a labelled dataset of software models.•The dataset contains 5,466 Ecore models and 5,120 UML models.•We provide a Python library to facilitate applying ML techniques for MDE.•The software is available at https://github.com/modelset. |
---|---|
ISSN: | 0167-6423 1872-7964 |
DOI: | 10.1016/j.scico.2023.103009 |