Others have pointed out that QuickCheck doesn't shrink automatically. But in addition: QuickCheck's shrinking also doesn't preserve invariants (in general).
QuickCheck's shrinking is type based. There are lots of different ways to generate, e.g., integers: perhaps you want them in a specific range, or only prime numbers, or only even numbers, etc. To make QuickCheck's shrinker preserve these invariants, you'd have to make a typed wrapper for each of them and explicitly write a new shrinking strategy. It's annoying and complicated.
QuickCheck won't preserve invariants, since its shrinkers are separate from its generators. For example:
data Rat = Rat Int Nat deriving (Eq, Show)

genRat = do
  (num, den) <- arbitrary
  pure (Rat num (1 + den))
`genRat` is a QuickCheck generator. It cannot do shrinking, because that's a completely separate thing in QuickCheck.
We can write a shrinker for `Rat`, but it will have nothing to do with our generator, e.g.
shrinkRat (Rat num den) = do
  (num', den') <- shrink (num, den)
  pure (Rat num' den')
Sure, we can stick these in an `Arbitrary` instance, but they're still independent values. The generation process is essentially state-passing with a random number generator; it has nothing to do with the shrinking process, which is a form of search without backtracking.
instance Arbitrary Rat where
  arbitrary = genRat
  shrink = shrinkRat
In particular, `genRat` satisfies the invariant that values will have non-zero denominator; whereas `shrinkRat` does not satisfy that invariant (since it shrinks the denominator as an ordinary `Nat`, which could give 0). In fact, we can't even think about QuickCheck's generators and shrinkers as different interpretations of the same syntax. For example, here's a shrinker that follows the syntax of `genRat` more closely:
shrinkRat2 (Rat n d) = do
  (num, den) <- shrink (n, d)
  pure (Rat num (1 + den))
This does have the invariant that its outputs have non-zero denominators; however, it will get stuck in an infinite loop! That's because the incoming `d` will be non-zero, so when `shrink` tries to shrink `(n, d)`, one of the outputs it tries will be `(n, 0)`; that will lead to `Rat n 1`, which will also shrink to `Rat n 1`, and so on.
In contrast, in Hypothesis, Hedgehog, falsify, etc. a "generator" is just a parser from numbers to values, and shrinking is applied to those numbers, not to the output of a generator. Not only does this not require separate shrinkers, but it also guarantees that the generator's invariants hold for all of the shrunken values, since those shrunken values were also produced by the generator (when it was given smaller inputs).
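A minimal sketch of that idea in plain Python (hypothetical names, not any library's real API): a generator is a function from a sequence of choices to a value, and shrinking rewrites the choices and re-runs the generator:

```python
def gen_rat(choices):
    # Decode a (numerator, denominator) pair from raw choices.
    num = choices[0] - 128   # toy signed decoding of the first choice
    den = 1 + choices[1]     # denominator is >= 1 by construction
    return (num, den)

def shrink_choices(choices):
    # Shrinking operates on the choice sequence, never on the value:
    # try halving or zeroing each choice in turn.
    for i, c in enumerate(choices):
        if c > 0:
            yield choices[:i] + [c // 2] + choices[i + 1:]
            yield choices[:i] + [0] + choices[i + 1:]

# Every shrunk candidate is re-decoded through gen_rat, so the
# non-zero-denominator invariant holds for every shrunken value too.
```

Even the most aggressive shrink, all-zero choices, decodes to a denominator of 1, so the invariant can't be violated by shrinking.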
No, QuickCheck very importantly does not shrink automatically. You have to write the shrinker yourself. Hypothesis, Hedgehog, proptest and a few others shrink automatically.
Good point. I suppose we should add "number of input elements equals number of output elements" and "every input element is present in the output". Translated into a straightforward test, that still allows my_sort([1,1,2]) to return [1,2,2], but we have to draw the line somewhere.
Just use `Counter`, and if the objects aren't hashable, use the count of IDs. Grab this before calling the function, in case the function is destructive. Check it against the output.
Add in checking each item is less than or equal to its successor and you have the fundamental sort properties. You might have more, like stability.
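Putting those two comments together, a sketch of such a checker (with `sort_fn` standing in for whatever sort is under test):

```python
from collections import Counter

def check_sort(sort_fn, xs):
    # Capture the multiset before the call, in case sort_fn is destructive.
    # (For unhashable items, use Counter(map(id, xs)) instead.)
    before = Counter(xs)
    ys = sort_fn(list(xs))
    # Same elements with the same counts: this rules out
    # my_sort([1, 1, 2]) == [1, 2, 2].
    assert Counter(ys) == before
    # Each item is <= its successor: the ordering property.
    assert all(a <= b for a, b in zip(ys, ys[1:]))

check_sort(sorted, [3, 1, 1, 2])
```

Stability would need a separate check (e.g. sorting decorated pairs and comparing against the original order of equal keys).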
Zooming in to cartoonish levels might drive the point home a bit. Suppose you have sound waves
|---------|---------|---------|
What is the frequency exactly 1/3 of the way between the first two wave peaks? It's a nonsensical question. The frequency relates to the time delta between peaks, and looking locally at a sufficiently small region of time gives no information about that phenomenon.
Let's zoom out a bit. What's the frequency over a longer period of time, capturing a few peaks?
Well... if you know there is only one frequency, then you can do some math to figure it out; but as soon as you might be describing a mix of frequencies you suddenly, again, potentially don't have enough information.
That lack of information manifests in a few ways. The exact math (Shannon's theorems?) suggests some things, but the language involved mismatches with human perception sufficiently that people get burned trying to apply it too directly. E.g., a bass beat with a bit of clock skew is very different from a bass beat as far as a careless decomposition is concerned, but it's likely not observable by a human listener.
Not being localized in time means* you look at longer horizons, considering more and more of those interactions. Instead of the beat of a 4/4 song meaning that the frequency changes at discrete intervals, it means that there's a larger, over-arching pattern capturing "the frequency distribution" of the entire song.
*Truly time-nonlocalized sound is of course impossible, so I'm giving some reasonable interpretation.
The 50-cycle hum of the transformer outside your house. Tinnitus. The ≈15kHz horizontal scanning frequency whine of a CRT TV you used to be able to hear when you were a kid.
Of course, none of these are completely nonlocalized in time. Sooner or later there will be a blackout and the transformer will go silent. But it's a lot less localized than the chirp of a bird.
Imagine the dissonant sound of hitting a trashcan.
Now imagine the sound of pressing down all 88 keys on a piano simultaneously.
Do they sound similar in your head?
The localization occurs where the phases of all the frequency components align coherently and construct a pulse; further away in time their phases are misaligned and cancel each other out.
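A quick numerical sketch of that in plain Python (a toy signal, not either of the sounds above): summing many cosines whose phases all align at t = 0 produces a sharp pulse there, while elsewhere the components largely cancel:

```python
import math

N = 50  # number of frequency components

def signal(t):
    # All components share phase 0, so they align coherently at t = 0.
    return sum(math.cos(2 * math.pi * k * t) for k in range(1, N + 1))

peak = signal(0.0)    # every component at its maximum: they sum to N
away = signal(0.137)  # phases disagree: heavy cancellation, small magnitude
```

Randomize each component's phase instead and the pulse at t = 0 disappears; the same frequency content, but no time localization.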
Of course, if you are happy with spaces instead of newlines, as in your C64 code, then you can even do it in one big string so you don't have to pay for concatenation.
Precalcing the entire thing and delivering absolutely nothing but a bunch of print statements is certainly an option that's completely in the spirit of the kinds of stuff one does to speed up an 8-bit computer(1), but the toy problem of "how can I speed up this for-next loop and the various logic inside it" was more fun. :)
(Also, the spaces-instead-of-newlines choice was made to avoid scrolling the screen; the C64's slow enough that having the ROM routines copy ~2k bytes of screen/color memory to scroll every second line of output makes changes in text-generation speed pretty much inconsequential. If that code's optimized for anything, it's size, not speed. (2))
(now that I think about it, there's also the fact that while the c64's BASIC interpreter is limited to about 250 tokens per line, the screen editor limits you to 80 characters per line.)
and now I am wondering how fast some form of
for i=1 to 100 step 15
print i;i+1;"fizz";i+3;"buzz fizz";i+6;i+7;"fizz buzz";i+10;"fizz";i+12;i+13;"fizzbuzz";:next
would work. Okay, I'm gonna fire up VICE: 49 jiffies, a ton faster than the 134 jiffies my previous best took. And if not for the 80-character input limit it could be one line. Hah. Thanks! Finding the right level to unroll a loop at is always your best bet in the 8-bit world. :)
(with the caveat that this generates fizzbuzz for 1-105 instead of 1-100, and logic to stop at 100 and/or logic to stop at any arbitrary limit would probably slow it down)
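(As a cross-check of the 15-step unroll, in Python rather than BASIC: the fixed pattern that PRINT statement emits per iteration can be compared against a naive fizzbuzz over 1-105. The "buzz fizz" and "fizz buzz" strings are split into their two outputs here.)

```python
def naive_fizzbuzz(n):
    if n % 15 == 0: return "fizzbuzz"
    if n % 3 == 0:  return "fizz"
    if n % 5 == 0:  return "buzz"
    return str(n)

def unrolled():
    out = []
    for i in range(1, 101, 15):
        # Mirrors the PRINT statement: one fixed pattern per block of 15.
        out += [str(i), str(i + 1), "fizz", str(i + 3), "buzz", "fizz",
                str(i + 6), str(i + 7), "fizz", "buzz", str(i + 10),
                "fizz", str(i + 12), str(i + 13), "fizzbuzz"]
    return out

# Matches the naive version for 1..105 (the unroll overshoots 100).
```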
1: just this morning I was reading a series of blog posts where one of the original authors of Lemmings wrote a port of it to the ZX Spectrum Next and ended up writing some code that would read the data files from the PC version and generate lengthy chunks of hilariously unrolled assembly source that contained one routine to write every image frame to the screen in the fastest way possible, because a more traditional software sprite routine could only handle about ten lemmings at once; that's absolutely insane and absolutely beautiful: https://lemmings.info/porting-lemmings-to-the-zx-spectrum-ne...