Cross-Version Software Defect Prediction Considering Concept Drift and Chronological Splitting
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. Department of Electrical and Computer Engineering, Pak-Austria Fachhochschule Institute of Applied Sciences and Technology, Haripur, Pakistan.
Intelligent Systems Research Centre, University of Ulster, Londonderry, United Kingdom.
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. ORCID iD: 0000-0002-3875-812X
2023 (English). In: Symmetry, E-ISSN 2073-8994, Vol. 15, no 10, article id 1934. Article in journal (Refereed). Published.
Abstract [en]

Concept drift (CD) refers to a phenomenon where the data distribution within datasets changes over time, and this can have adverse effects on the performance of prediction models in software engineering (SE), including those used for tasks like cost estimation and defect prediction. Detecting CD in SE datasets is difficult, but important, because it identifies the need for retraining prediction models and in turn improves their performance. If the concept drift is caused by symmetric changes in the data distribution, the model adaptation process might need to account for this symmetry to maintain accurate predictions. This paper explores the impact of CD within the context of cross-version defect prediction (CVDP), aiming to enhance the reliability of prediction performance and to make the data more symmetric. A concept drift detection (CDD) approach is further proposed to identify data distributions that change over software versions. The proposed CDD framework consists of three stages: (i) data pre-processing for CD detection; (ii) notification of CD by triggering one of the three flags (i.e., CD, warning, and control); and (iii) providing guidance on when to update an existing model. Several experiments on 30 versions of seven software projects reveal the value of the proposed CDD. Some of the key findings of the proposed work include: (i) An exponential increase in the error-rate across different software versions is associated with CD. (ii) A moving-window approach to train defect prediction models on chronologically ordered defect data results in better CD detection than using all historical data with a large effect size (Formula presented.).
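
The three-flag notification stage described in the abstract can be illustrated with a minimal sketch. The abstract does not state the detection rule or its thresholds, so everything here is an assumption made for illustration: the function name `flag_versions`, the moving window of three earlier versions, and the rule that compares each version's error rate against the window mean plus a multiple of its standard deviation. It is not the authors' method.

```python
from statistics import mean, pstdev


def flag_versions(error_rates, window=3, warn_k=2.0, drift_k=3.0, min_std=0.01):
    """Assign a flag ("control", "warning", or "CD") to each software version.

    Hypothetical rule, not taken from the paper: a version's error rate is
    compared against the mean of the previous `window` versions, and flagged
    when it exceeds that mean by `warn_k` or `drift_k` standard deviations.
    """
    flags = []
    for i, err in enumerate(error_rates):
        # Moving window: only the most recent earlier versions are used.
        history = error_rates[max(0, i - window):i]
        if len(history) < 2:
            # Too little history to estimate a baseline.
            flags.append("control")
            continue
        baseline = mean(history)
        # Floor the spread so a perfectly flat history does not flag noise.
        spread = max(pstdev(history), min_std)
        if err > baseline + drift_k * spread:
            flags.append("CD")
        elif err > baseline + warn_k * spread:
            flags.append("warning")
        else:
            flags.append("control")
    return flags


# Chronologically ordered error rates, one per version (made-up numbers).
rates = [0.10, 0.11, 0.10, 0.13, 0.30]
print(flag_versions(rates))
```

Under this sketch, a "CD" flag would be the point at which the existing model is retrained, and a "warning" flag a cue to monitor the next version more closely.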

Place, publisher, year, edition, pages
Multidisciplinary Digital Publishing Institute (MDPI), 2023. Vol. 15, no 10, article id 1934
Keywords [en]
chronological splitting, concept drift, cross-version defect prediction, software defect prediction
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:mdh:diva-64695
DOI: 10.3390/sym15101934
ISI: 001095251100001
Scopus ID: 2-s2.0-85175426760
OAI: oai:DiVA.org:mdh-64695
DiVA, id: diva2:1810945
Available from: 2023-11-09. Created: 2023-11-09. Last updated: 2023-11-29. Bibliographically approved.

Authority records

Kabir, Md Alamgir; Rehman, Atiq Ur; Ali, Nazakat
