Another really cool talk from OOPSLA: "An Empirical Evaluation of Property-Based Testing in Python" youtube.com/live/zoE2w2hueYQ?s…

They collect a corpus of Python projects that use Hypothesis for property-based testing. Then they check how good those tests are, compared to the project's regular unit tests, at finding random artificial bugs (mutations) injected into the project's source code.
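
To make that kind of comparison concrete, here is a minimal sketch. It is not the paper's tooling and it does not use Hypothesis itself: `clamp`, `clamp_mutant`, `unit_test` and `property_test` are hypothetical names, and the generator is a hand-rolled stand-in built on the standard library's `random` module. It injects one artificial bug and asks which style of test notices it.

```python
import random


def clamp(x, lo, hi):
    """Original code: restrict x to the interval [lo, hi]."""
    return max(lo, min(x, hi))


def clamp_mutant(x, lo, hi):
    """Artificial bug (assumed for illustration): upper bound is off by one."""
    return max(lo, min(x, hi - 1))


def unit_test(f):
    """Two hand-picked examples, as a typical unit test might have."""
    return f(5, 0, 10) == 5 and f(-3, 0, 10) == 0


def property_test(f, trials=200, seed=0):
    """Check two properties on generated inputs (stand-in for a Hypothesis test)."""
    rng = random.Random(seed)
    for _ in range(trials):
        lo, hi = sorted((rng.randint(-20, 20), rng.randint(-20, 20)))
        # Bias generation toward the boundaries, as PBT generators tend to do.
        x = rng.choice([lo, hi, rng.randint(-40, 40)])
        r = f(x, lo, hi)
        if not lo <= r <= hi:          # property 1: result stays in range
            return False
        if lo <= x <= hi and r != x:   # property 2: in-range values are unchanged
            return False
    return True


if __name__ == "__main__":
    for name, f in [("original", clamp), ("mutant", clamp_mutant)]:
        print(name, "unit:", unit_test(f), "property:", property_test(f))
```

In this toy case the two fixed examples pass on both versions, while the property check rejects the mutant once an input lands on the upper boundary. A single function says nothing about how often that happens across real projects.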

in reply to CF Bolz-Tereick

Interesting, although I’m not convinced by the methodology. Mutation testing is great at generating changes which don’t actually create bugs, many of the mutations it generates are functionally equivalent to one another, etc. I will have to read their actual paper to see if/how they controlled for this kind of potential inflation of the number of ‘bugs’ found by Hypothesis.
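
A small illustration of a change that does not actually create a bug, assuming a mutation operator that swaps comparison operators (the function names here are hypothetical and not taken from the paper):

```python
def total(xs):
    """Original: sum a list with an explicit index loop."""
    i, s = 0, 0
    while i < len(xs):
        s += xs[i]
        i += 1
    return s


def total_mutant(xs):
    """Mutant: `<` replaced by `!=` in the loop condition."""
    i, s = 0, 0
    while i != len(xs):
        s += xs[i]
        i += 1
    return s


def killed_by(inputs):
    """Return the inputs on which the mutant's behaviour differs from the original."""
    return [xs for xs in inputs if total(xs) != total_mutant(xs)]
```

Because `i` starts at 0 and only ever increases by 1, both loops stop at exactly the same point, so `killed_by` returns an empty list for any collection of lists. No test of either kind can tell the two versions apart, which is why counts of caught mutants need some control for cases like this.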
in reply to Daphne Preston-Kendal

@dpk I agree! But I still think that the comparison with the mutations caught by the non-randomized "regular" unit tests is meaningful.
in reply to CF Bolz-Tereick

very interesting! The test categorization definitions remind me of this blog post via @ScottWlaschin

fsharpforfunandprofit.com/post…

in reply to CF Bolz-Tereick

As keen as I am on property-based testing, I think the 50x result is fake (even separately from critiques of mutation testing).

1. They're comparing projects with both property-based tests and unit tests, so the unit tests are often ones written assuming the property-based tests do the heavy lifting.
2. They say 55% of properties catch a bug on the first example. That means there's a unit test 55% as effective as that PBT.
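
Point 2 can be sketched as follows, under loud assumptions: `buggy_abs` is a hypothetical mutant, and the generator is a plain `random`-based stand-in rather than Hypothesis. When a bug breaks a large share of the input space, the property fails within the first example or two, and that one input, pinned as a fixed example, would have caught the bug as well.

```python
import random


def buggy_abs(x):
    """Hypothetical mutant of abs(): wrong for every negative input."""
    return x


def examples_until_failure(f, seed=0, max_examples=100):
    """Generate inputs until the property `f(x) >= 0` fails.

    Returns (number of examples tried, failing input), or (None, None)
    if no failure was seen within max_examples.
    """
    rng = random.Random(seed)
    for n in range(1, max_examples + 1):
        x = rng.randint(-1000, 1000)
        if f(x) < 0:
            return n, x
    return None, None


def pinned_unit_test(f, x):
    """The same check on one fixed input, i.e. an ordinary unit test."""
    return f(x) >= 0


if __name__ == "__main__":
    n, x = examples_until_failure(buggy_abs)
    print("failed after", n, "example(s) on input", x)
    print("pinned unit test passes:", pinned_unit_test(buggy_abs, x))
```

Whether a human author would have picked such an input in advance is a separate question, and this sketch does not settle it.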

in reply to David R. MacIver

@DRMacIver Hm, is 2. really a counter-argument? It's unlikely that human unit-test authors would have picked exactly the right unit test, after all. Also, does 55% really match your gut feeling? I am sure that I have written a lot of Hypothesis tests that are much better at bug finding than any number of standard unit tests I could ever have written.

I find argument 1. more convincing. Of course, it's quite hard to avoid this problem.

in reply to CF Bolz-Tereick

It seems plausible to me that 55% of property-based tests in the wild could be replaced with a unit test without much loss.

I think it's hard to come up with a single number capturing the effectiveness of property-based tests. I wouldn't be shocked if there were a reasonable experiment that really shows 50x on some metric; I just don't think this one is it.
