Mundane number golf

27 November, 2024

There was an online discussion regarding ‘magic numbers’ in software – numeric constants serving an important role with no description or justification for their value – and one static analysis tool’s assertion that the only numeric constants permissible in statements are -1, 0, and 1. Although the significance of undocumented numbers in written software is often not readily apparent, sometimes the function of a number really is nothing beyond its literal value, and this sort of overzealous rule is guaranteed to complain about expressions whose intent actually is readily apparent – but none of that actually matters because someone soon ported the classic random number algorithm to its compliant form:

int getRandomNumber()
{
    return 1+1+1+1;
}

A cunning approach but it clearly can’t scale to larger numbers. This had me thinking about what other operators could be used to express bigger values – for instance, prime factors could be assembled by addition then combined with multiplication, so something like thirty becomes just (1+1)*(1+1+1)*(1+1+1+1+1) instead of an elongated trail of 1s. Then soon I realised that as ~1 is -2, and because unary operators bind most tightly, the function above could be concisely expressed as:

int getRandomNumber()
{
    return ~1*~1;
}

Thus yielding a monumental 29% saving of two whole characters!

At this point I thoroughly nerd-sniped myself. Perhaps a program could find concise expressions for arbitrary constants comprised entirely out of these supposed non-magic, henceforth ‘mundane’, numbers? The shortest expressions? Well, of course it could.

Terminal emulator is cool-retro-term.

0 doesn’t end up getting used so much, being useful entirely to represent 0 with one character and nothing else, so I soon changed the goal to expressing everything in terms of 1 only. The script easily supports any combination of terminating literals, though. But right now I can confidently say that the current year is (~1-1^-1<<(1+1<<1+1))<<1-~1, and there’s nothing magic about that. The basic approach is to build successively taller abstract syntax trees of operators eventually terminating in some set of constants, then to evaluate those trees and construct expressions using standard operator precedences[1] to minimise use of parentheses, and keep a record of whatever is shortest for its value.

Even now I’m still fiddling things, but as of writing the script is here, complete with excessive object hierarchy. The inner loop isn’t entirely efficient and definitely repeats earlier work, but is still far from brute-force and does guarantee optimal expressions given enough iterations. It also permits pretty progress bars (powered by tqdm) that wouldn’t work so well with a hypothetically tighter loop. In all, despite being written in Python, it can find hundreds of thousands of expressions somewhat faster than the ‘fractal deaths of the universe’ typical of combinatorial explosion.