All-uses vs mutation testing: An experimental comparison of effectiveness

Phyllis G. Frankl, Stewart N. Weiss, Cang Hu

    Research output: Contribution to journal, Article, peer-review


    The effectiveness of a test data adequacy criterion for a given program and specification is the probability that a test set satisfying the criterion will expose a fault. Experiments were performed to compare the effectiveness of the mutation testing and all-uses test data adequacy criteria at various coverage levels, for randomly generated test sets. Large numbers of test sets were generated and executed, and for each, the proportion of mutants killed or def-use associations covered was measured. This data was used to estimate and compare the effectiveness of the criteria. The results were mixed: at the highest coverage levels considered, mutation was more effective than all-uses for five of the nine subjects, all-uses was more effective than mutation for two subjects, and there was no clear winner for two subjects. However, mutation testing was much more expensive than all-uses. The relationship between coverage and effectiveness for fixed-sized test sets was also explored and was found to be nonlinear and, in many cases, nonmonotonic.
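    The definition of effectiveness above suggests a direct empirical estimate: among generated test sets that reach a given coverage level, take the fraction that expose a fault. The following is a minimal sketch under stated assumptions, not the authors' experimental tooling. The names `random_test_sets` and `estimate_effectiveness` are invented for illustration, and `coverage` and `exposes_fault` are hypothetical callables standing in for a coverage-measurement tool (proportion of mutants killed or def-use associations covered) and a failure oracle.

```python
import random
from typing import Callable, List, Optional, Sequence

TestSet = List[int]  # assumption: a test case is represented by an integer input


def random_test_sets(domain: Sequence[int], size: int, count: int,
                     rng: random.Random) -> List[TestSet]:
    """Draw `count` fixed-size test sets uniformly at random (with replacement)."""
    return [rng.choices(domain, k=size) for _ in range(count)]


def estimate_effectiveness(test_sets: Sequence[TestSet],
                           coverage: Callable[[TestSet], float],
                           exposes_fault: Callable[[TestSet], bool],
                           threshold: float) -> Optional[float]:
    """Fraction of test sets with coverage >= threshold that expose a fault.

    Returns None when no test set reaches the threshold, since the
    estimate is undefined in that case.
    """
    adequate = [ts for ts in test_sets if coverage(ts) >= threshold]
    if not adequate:
        return None
    exposing = sum(1 for ts in adequate if exposes_fault(ts))
    return exposing / len(adequate)
```

    Running the estimator once per criterion, each with its own `coverage` function and the same pool of test sets, gives two estimates that can be compared at matching coverage levels. Returning `None` rather than 0 when no test set is adequate keeps "no data" distinct from "never exposes a fault".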

    Original language: English (US)
    Pages (from-to): 235-253
    Number of pages: 19
    Journal: Journal of Systems and Software
    Issue number: 3
    State: Published - Sep 1997

    ASJC Scopus subject areas

    • Software
    • Information Systems
    • Hardware and Architecture

