Evaluation of the correlation between test cases dependency and their semantic text similarity
2020 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Abstract [en]
Thorough testing is an important step in software development. Testing requires generating test cases, which can be numerous and must be executed in the correct order. Certain information that is critical for scheduling the test cases in the correct order is not always available. As a result, getting the order right takes considerable manual work and valuable resources. Analyzing the test specifications instead could make it possible to detect the functional dependencies between test cases. This study presents a natural language processing (NLP) based approach and performs cluster analysis on a set of test cases to evaluate the correlation between test case dependencies and their semantic similarities. After an initial feature selection, the similarities between test cases are calculated using the cosine distance function. The results of the similarity calculation are then clustered using the HDBSCAN clustering algorithm. The clusters represent relations between test cases: test cases with close similarities are placed in the same cluster because they are expected to share dependencies. The clusters are then validated against a ground truth containing the correct dependencies, yielding an F-score of 0.7741. The approach is applied to an industrial testing project at Bombardier Transportation in Sweden.
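The similarity and validation steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the bag-of-words representation, the pairwise definition of the F-score, the function names, and the sample test case texts are all assumptions made for illustration, and the HDBSCAN step is replaced by hand-written cluster labels (with -1 marking noise points, as HDBSCAN implementations conventionally do).

```python
import math
from collections import Counter
from itertools import combinations
from typing import Dict, FrozenSet, Set


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity of two texts using raw term counts (an assumed representation)."""
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pairwise_f_score(labels: Dict[str, int],
                     truth_pairs: Set[FrozenSet[str]]) -> float:
    """F-score over test case pairs: a pair counts as predicted when both
    test cases share a cluster label other than -1 (noise)."""
    predicted = {
        frozenset((a, b))
        for a, b in combinations(sorted(labels), 2)
        if labels[a] == labels[b] and labels[a] != -1
    }
    true_positives = len(predicted & truth_pairs)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth_pairs)
    return 2 * precision * recall / (precision + recall)


# Hypothetical test specifications and cluster labels, for illustration only.
specs = {
    "tc1": "open the door and check the door signal",
    "tc2": "close the door and check the door signal",
    "tc3": "apply the brake and measure the speed",
}
print(cosine_similarity(specs["tc1"], specs["tc2"]))
print(cosine_similarity(specs["tc1"], specs["tc3"]))

labels = {"tc1": 0, "tc2": 0, "tc3": -1}
ground_truth = {frozenset(("tc1", "tc2"))}
print(pairwise_f_score(labels, ground_truth))
```

The cosine distance mentioned in the abstract is one minus this similarity, so a clustering algorithm that works on distances would receive `1 - cosine_similarity(a, b)` for each pair.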
Place, publisher, year, edition, pages
2020. p. 16
Keywords [en]
Software Testing, Test optimization, NLP, Dependency, Semantic Similarity, Clustering, Cosine Similarity, HDBSCAN
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:mdh:diva-48942
OAI: oai:DiVA.org:mdh-48942
DiVA, id: diva2:1445069
External cooperation
Bombardier Transportation Sweden AB
Subject / course
Computer Science
Supervisors
Examiners
2020-06-24, 2020-06-22, 2020-06-24. Bibliographically approved