
Dataset Information

Can we predict firms' innovativeness? The identification of innovation performers in an Italian region through a supervised learning approach.


ABSTRACT: The study shows that firms' expenditures on innovation, as reported in the Community Innovation Survey, can be predicted by applying a supervised machine-learning approach to a sample of Italian firms. The integrated dataset combines administrative records and balance sheet data, and was designed to include all informative variables related to innovation while remaining easily accessible for most of the cohort. A random forest algorithm is trained on it to obtain a classification model that identifies firms that are potential innovation performers. The performance of the classifier, measured as AUC, is 0.794. Although innovation investments do not always result in patenting, the model identifies 71.92% of firms with patents. More encouraging results emerge from an analysis of the inner workings of the model: the predictors identified as most important (such as firm size, sector, and investment in intangible assets) confirm previous findings in the literature, but within a completely different framework. The outcomes of this study are relevant both for economic analysts, because they demonstrate the potential of data-driven models for understanding the nature of innovation behaviour, and for practitioners, such as policymakers or venture capitalists, who can benefit from evidence-based tools in the decision-making process.
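The two figures quoted in the abstract, an AUC and a share of patent-holding firms identified, can be illustrated with a small sketch. This is not the authors' code or data: the function names `roc_auc` and `share_identified`, the 0.5 threshold, and the six toy firms are all assumptions made for illustration. The scores stand in for the class probabilities a trained classifier such as a random forest would output.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (pairwise definition)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def share_identified(flags: Sequence[int], scores: Sequence[float],
                     threshold: float) -> float:
    """Share of flagged firms (e.g. patent holders) whose predicted score
    reaches the decision threshold."""
    flagged = [s for f, s in zip(flags, scores) if f == 1]
    if not flagged:
        raise ValueError("no flagged firms")
    return sum(s >= threshold for s in flagged) / len(flagged)


# Hypothetical toy data: 1 = firm reports innovation expenditure.
labels = [1, 1, 1, 0, 0, 0]
# Hypothetical classifier scores (predicted probability of being an innovator).
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
# Hypothetical patent indicator, used only for the second check.
has_patent = [1, 0, 1, 0, 0, 0]

auc = roc_auc(labels, scores)
patent_share = share_identified(has_patent, scores, threshold=0.5)
print(f"AUC = {auc:.3f}, patent holders identified = {patent_share:.2%}")
```

The pairwise definition is quadratic in the number of firms, which is fine for a toy example; on a real cohort a rank-based routine from a statistics library would be the usual choice.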

SUBMITTER: Gandin I 

PROVIDER: S-EPMC6559647 | biostudies-literature | 2019

REPOSITORIES: biostudies-literature

Publications

Can we predict firms' innovativeness? The identification of innovation performers in an Italian region through a supervised learning approach.

Gandin Ilaria, Cozza Claudio

PLoS One, 2019-06-11, issue 6


Similar Datasets

| S-EPMC5480604 | biostudies-other
| S-EPMC7667644 | biostudies-literature
| S-EPMC8500439 | biostudies-literature
| S-EPMC2909217 | biostudies-literature
| S-EPMC6620704 | biostudies-literature
| S-EPMC2889937 | biostudies-literature
| S-EPMC7671830 | biostudies-literature
2019-11-13 | GSE140262 | GEO
| S-EPMC7013885 | biostudies-literature
| S-EPMC4718658 | biostudies-literature