A farewell to self-checksumming

When the first public preview of Synergy Advance came out in May 2005 it was the first Wincent product to incorporate a self-checksumming feature. Call it "anti-corruption" if you will, or call it "anti-piracy"; the basic premise is that the executable uses a checksum to guard against corruption, modifications or tampering.

It was something I’d started working on in late 2003 in response to the large number of Synergy cracks floating around. It’s no secret that there is no such thing as uncrackable software, but it’s also true that some protection is better than none. And in this age of trojans, spyware, and viruses, I felt that anything I could do to reduce the likelihood of patched and cracked versions floating around in unofficial channels would be a good investment. I figured that a little effort spent increasing the complexity of an attack would be worthwhile. It turned out to be a very complex task and I must admit to being privately relieved when it was finally released to the public and it worked on multiple versions of the OS.

Now such a checksum isn’t really useful if it is trivial for an attacker to replace it. For example, if one were to place the checksum in the application’s Info.plist file then an attacker could patch the binary, calculate a new checksum and easily stick it in the Info.plist file. You make the attacker’s task more difficult by incorporating the checksum into the executable itself. But there’s a feedback loop: in doing so you modify the executable, thus changing its checksum and rendering the embedded checksum invalid.

The workaround for this Catch-22 situation is to embed a known placeholder (for example, all zeros) in your executable which will be used to hold the checksum. You then overwrite the placeholder with the calculated checksum. In order to verify the checksum at runtime you operate on an in-memory copy of the executable in which the stored checksum has been extracted and then zeroed out, thus reverting the executable to its initial state. This is still not unbreakable security, but it is considerably more secure than the Info.plist solution; to break it the attacker needs to reverse engineer the checksum embedding process and then make a patch that not only patches the executable to change the way it works but also applies a correct, newly-calculated checksum as well. The other alternative is to make a patch which removes the checksumming procedures entirely, but this is evidently more difficult than it would be to produce a patch in the absence of such procedures.
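The placeholder trick can be sketched in a few lines of Python. This is only an illustration of the idea, not the code Synergy Advance used: the marker string, the choice of SHA-256 and the function names are all assumptions made for the example, and a real implementation would work on the executable file rather than on a byte string.

```python
import hashlib

# Hypothetical marker that lets the embedding step find the slot (an assumption).
MARKER = b"CHECKSUM-SLOT:"
DIGEST_LEN = hashlib.sha256().digest_size  # SHA-256 is an assumed choice of hash
PLACEHOLDER = bytes(DIGEST_LEN)            # the known placeholder: all zeros


def _slot_offset(image: bytes) -> int:
    """Return the offset of the checksum slot that follows the marker."""
    index = image.find(MARKER)
    if index < 0 or index + len(MARKER) + DIGEST_LEN > len(image):
        raise ValueError("no checksum slot found")
    return index + len(MARKER)


def _with_slot(image: bytes, offset: int, value: bytes) -> bytes:
    """Return a copy of the image with the slot contents replaced by value."""
    return image[:offset] + value + image[offset + DIGEST_LEN:]


def embed_checksum(image: bytes) -> bytes:
    """Hash the image while the slot is still zeroed, then store the digest."""
    offset = _slot_offset(image)
    if image[offset:offset + DIGEST_LEN] != PLACEHOLDER:
        raise ValueError("slot already holds a checksum")
    return _with_slot(image, offset, hashlib.sha256(image).digest())


def verify_checksum(image: bytes) -> bool:
    """Extract the stored digest, zero the slot in a copy, and re-hash."""
    offset = _slot_offset(image)
    stored = image[offset:offset + DIGEST_LEN]
    original = _with_slot(image, offset, PLACEHOLDER)
    return hashlib.sha256(original).digest() == stored
```

Calling embed_checksum() on a build product and verify_checksum() at launch reproduces the round trip described above: verification only succeeds if every byte outside the slot is unchanged since the checksum was embedded.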

For added security Synergy Advance implemented a two-phase checksum check. Firstly, the framework containing the checksumming code was protected via a checksum. Secondly, the application itself was protected by a checksum. This meant that any effective attack would have to worry about neutralizing both measures. Like any good security scheme, there was scope for evolving the protection in response to successful attacks. If an effective patch was released into the wild then a subsequent version of the software could replicate the checksumming code to the application or to other frameworks, requiring increasingly complex patches. It all worked quite nicely.

But the fun doesn’t end there. There are two more problems to be overcome. Firstly, you need to be able to handle Universal Binaries, and that means your placeholder/checksum will appear twice in the executable, once for each architecture. Secondly, you need to cope with the dastardly abomination that is prebinding.
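To illustrate the Universal Binary half of the problem, here is a rough Python sketch that walks the header of a universal ("fat") file and finds one slot per architecture. The header layout it reads (a big-endian magic number and count, followed by five 32-bit fields per architecture) is the standard one; the marker argument and the function names are assumptions made for the example.

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # big-endian magic number at the start of a universal binary


def architecture_slices(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, size) for each architecture; a thin file counts as one slice."""
    if len(data) < 8 or struct.unpack_from(">I", data, 0)[0] != FAT_MAGIC:
        return [(0, len(data))]
    (count,) = struct.unpack_from(">I", data, 4)
    slices = []
    for i in range(count):
        # Each entry holds: cputype, cpusubtype, offset, size, align.
        _cputype, _cpusubtype, offset, size, _align = struct.unpack_from(
            ">iiIII", data, 8 + 20 * i
        )
        slices.append((offset, size))
    return slices


def slot_offsets(data: bytes, marker: bytes) -> list[int]:
    """Find the (hypothetical) marker once inside every architecture slice."""
    offsets = []
    for start, size in architecture_slices(data):
        index = data.find(marker, start, start + size)
        if index < 0:
            raise ValueError(f"slice at offset {start} has no checksum slot")
        offsets.append(index)
    return offsets
```

With the slices known, each architecture can be treated as its own executable, with its own placeholder and its own checksum.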

Prebinding is the worst thing that happened to code-signing in the history of personal computers. The operating system "prebinds" an executable so as to make it launch faster, but in doing so it actually modifies the data of the executable! This means that if you generate a checksum for an executable, prebind it, then re-generate the checksum you very likely will find two different checksums. You can generate a checksum on one machine, copy the executable to another (perhaps running a different version of Mac OS X), and after prebinding you can only hope that the checksums are the same.

The solution, then, for the purposes of checksum generation or executable signing, is to remove all prebinding information from the executable before generating the checksum. When you later want to verify the checksum you repeat the same process, stripping the prebinding information and then calculating the checksum.
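The important property is that exactly the same normalization runs before both generation and verification. The Python sketch below shows only that shape. Stripping real prebinding information means understanding the Mach-O format, so as a loudly labelled stand-in this example just zeroes a caller-supplied list of byte ranges; the ranges, the function names and the use of SHA-256 are all assumptions.

```python
import hashlib


def zero_ranges(image: bytes, ranges: list[tuple[int, int]]) -> bytes:
    """Stand-in for prebinding removal: zero each (offset, length) range.

    A real tool would locate and undo the prebinding data itself; the
    ranges passed in here are purely hypothetical.
    """
    buffer = bytearray(image)
    for offset, length in ranges:
        if offset < 0 or length < 0 or offset + length > len(buffer):
            raise ValueError("range falls outside the image")
        buffer[offset:offset + length] = bytes(length)
    return bytes(buffer)


def stable_checksum(image: bytes, ranges: list[tuple[int, int]]) -> bytes:
    """Normalize first, then hash, so volatile bytes cannot affect the result."""
    return hashlib.sha256(zero_ranges(image, ranges)).digest()
```

Two copies of an executable that differ only inside the normalized ranges produce the same stable_checksum(), while a change anywhere else still produces a different one.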

I came up with a tool, "pbt" (prebinding tool) to do this stripping. It was a small tool that could be easily bundled inside my application and which didn’t rely on any external libraries or tools being present on the users’ systems. It worked very well for a while but somewhere along the way something broke. What’s changed since then? There have been several updates to Mac OS X (from 10.4 through to 10.4.5), the first Intel-based machines have been released, and the Xcode Tools have received a few updates. I can’t be sure but I think it was one of the Xcode updates that broke things.

The problem was that "pbt" didn’t work properly in some cases anymore. It would work on the framework but not on the application, inexplicably crashing when I tried to feed the binary to it. I never investigated too deeply why this was occurring; I just started looking for another solution. Apple’s redo_prebinding tool, however, did work on both, so I switched to using that. Nothing’s ever easy though, so to guarantee consistent results I had to manipulate the environment variables before running the tool. And to make matters worse, I later discovered that redo_prebinding is not part of the standard install on all versions of Mac OS X, breaking checksumming entirely for users who didn’t have it installed (at least on my 10.4.6 PowerPC system it is listed as belonging to the Developer Tools package).

So that put an end to my self-checksumming efforts. Too many variables. Too many unknowns. Worse still, problems like the failure to launch due to a missing redo_prebinding tool (bug #387) and the abort due to a checksum mismatch (bug #389) were perceived by users as crashes. Doh! So I removed the checksumming and put out a new release.

I admit that I’d prefer it if Apple hadn’t gone down the prebinding path. Or, if they insisted, that they had implemented it in such a way that it did not alter an executable’s binary data, thus rendering things like checksumming and code signing awkward at best and downright impossible at worst. Perhaps they could have stored the prebinding information elsewhere, in a central registry. Ironically, in Tiger Apple says prebinding is no longer necessary for better performance. In Leopard (Mac OS X 10.5) I wouldn’t be surprised if all traces of it are swept from the system entirely.

I didn’t speak publicly about this stuff before because I was employing a dose of security-through-obscurity. But now that there’s no self-checksumming in my products I guess there’s no harm in writing about it. I’ve gone down a different path now with respect to anti-tampering. I would’ve preferred it if my checksumming could’ve remained as one additional layer of protection, but it was not to be.