mdh.sePublications
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
The limited impact of individual developer data on software defect prediction
AT and T Labs Research, United States .
AT and T Labs Research, United States .
AT and T Labs Research, United States .ORCID iD: 0000-0002-1660-199X
2013 (English)In: Journal of Empirical Software Engineering, ISSN 1382-3256, E-ISSN 1573-7616, Vol. 18, no 3, 478-505 p.Article in journal (Refereed) Published
Abstract [en]

Previous research has provided evidence that a combination of static code metrics and software history metrics can be used to predict with surprising success which files in the next release of a large system will havethe largest numbers of defects. In contrast, very little research exists to indicate whether information about individual developers can profitably be used to improve predictions. We investigate whether files in a large system that are modified by an individual developer consistently contain either more or fewer faults than the average of all files in the system. The goal of the investigation is to determine whether information about which particular developer modified a file is able to improve defect predictions. We also extend earlier research evaluating use of counts of the number of developers who modified a file as predictors of the file's future faultiness. We analyze change reports filed for three large systems, each containing 18 releases, with a combined total of nearly 4 million LOC and over 11,000 files. A buggy file ratio is defined for programmers, measuring the proportion of faulty files in Release R out of all files modified by the programmer in Release R-1. We assess the consistency of the buggy file ratio across releases for individual programmers both visually and within the context of a fault prediction model. Buggy file ratios for individual programmers often varied widely across all the releases that they participated in. A prediction model that takes account of the history of faulty files that were changed by individual developers shows improvement over the standard negative binomial model of less than 0.13% according to one measure, and no improvement at all according to another measure. In contrast, augmenting a standard model with counts of cumulative developers changing files in prior releases produced up to a 2% improvement in the percentage of faults detected in the top 20% of predicted faulty files. The cumulative number of developers interacting with a file can be a useful variable for defect prediction. However, the study indicates that adding information to a model about which particular developermodified a file is not likely to improve defect predictions.

Place, publisher, year, edition, pages
2013. Vol. 18, no 3, 478-505 p.
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:mdh:diva-23675DOI: 10.1007/s10664-011-9178-4ISI: 000319774300003Scopus ID: 2-s2.0-84878737201OAI: oai:DiVA.org:mdh-23675DiVA: diva2:680217
Available from: 2013-12-17 Created: 2013-12-17 Last updated: 2014-06-30Bibliographically approved

Open Access in DiVA

No full text

Other links

Publisher's full textScopus

Search in DiVA

By author/editor
WEYUKER, ELAINE
In the same journal
Journal of Empirical Software Engineering
Engineering and Technology

Search outside of DiVA

GoogleGoogle Scholar

Altmetric score

Total: 124 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf