
Dataset Information


Software reusability dataset based on static analysis metrics and reuse rate information.


ABSTRACT: The widely adopted component-based development paradigm considers the reuse of proper software components a primary criterion for successful software development. As a result, various research efforts are directed towards evaluating the extent to which a software component is reusable. Prior efforts follow expert-based approaches; however, the continuously growing open-source software initiative allows the introduction of data-driven alternatives. In this context, we have generated a dataset that harnesses information residing in online code hosting facilities and introduces the actual reuse rate of software components as a measure of their reusability. To do so, we analyzed the most popular projects in the Maven registry and, using the SourceMeter tool [2], computed a large number of static analysis metrics at both class and package levels. These metrics quantify six major source code properties: complexity, cohesion, coupling, inheritance, documentation, and size. For these projects we additionally computed the reuse rate using our self-developed code search engine, AGORA [5]. The generated dataset contains analysis information on more than 24,000 classes and 2,000 packages and can thus serve as the information basis for the design and development of data-driven reusability evaluation methodologies. The dataset is related to the research article entitled "Measuring the Reusability of Software Components using Static Analysis Metrics and Reuse Rate Information" [1].

SUBMITTER: Papamichail MD 

PROVIDER: S-EPMC6838442 | biostudies-literature | 2019 Dec

REPOSITORIES: biostudies-literature


Publications

Software reusability dataset based on static analysis metrics and reuse rate information.

Papamichail MD, Diamantopoulos T, Symeonidis AL

Data in Brief, 2019-10-19



Similar Datasets

| S-EPMC7691392 | biostudies-literature
| S-EPMC8205299 | biostudies-literature
| S-EPMC8419856 | biostudies-literature
| S-EPMC7359296 | biostudies-literature
| S-EPMC8775317 | biostudies-literature
| S-EPMC6245488 | biostudies-other
2009-12-18 | PRD000081 | Pride
| S-EPMC6958438 | biostudies-literature