Studia Informatica Pomerania
WNEiZ
Author: Łukasz Radliński
Pages: 107-120

A REVIEW OF PUBLICLY AVAILABLE DATABASES OF SOFTWARE PROJECTS

Summary
Scientific research often requires a set of reliable empirical data. For many years, such datasets in the software engineering domain were rarely publicly available. Recently, however, several repositories have been established and opened for public access. This paper focuses on the analysis of databases of software projects. I have analyzed such databases grouped in four major repositories: ISBSG, PROMISE, NASA, and the databases based on Bugzilla. The databases differ in content: they use different parameters to describe projects, offer different levels of data granularity, and contain different numbers of observations. Consequently, no single database suits every type of research analysis. Rather, the aim of the research and the need for a specific type of data determine which database can be used. These datasets support several types of analysis: estimation of software size, effort, and defects. A few of them allow trade-off analysis between these factors. Some also contain data about process and people quality. The comparison of the databases may serve as a useful guide when a database needs to be chosen.