The unbearable fragility of modern Mac OS X development

There I’ve done it again: I shipped a broken A Better Finder Rename release despite doubling down on build system verification, code signing requirements validation and gatekeeper acceptance checks, automation, quality assurance measures, etc.

Only in October, I had a similar issue. Luckily that time around it only took a few minutes to become aware of the problem and a few hours to ship a fix so very few users were affected. Right now I don’t know how many users were affected by the “botched” A Better Finder Rename 10.01 release.

This didn’t use to happen. Despite the fact that I did not spend nearly as much time ensuring that everything worked properly with the release management. Nor am I alone in this situation. Lots of big as well as small developers have recently shipped similarly compromised releases.

The situation on the Mac App Store is much, much worse. Nobody other than Apple knows how many Mac App Store customers were affected by the recent MAS certificate fiasco that had the distinction of making it all the way into the pages of Fortune magazine.

The truth is that Mac OS X development has become so very fragile.

The reasons for this are manifold and diverse but boil down to: too much changetoo little communicationtoo much complexity and finally too little change management and quality control at Apple.

The recent Mac App Store (MAS) fiasco that left many (1% of Mac App Store users? 100%? Nobody knows) users unable to use their apps purchased from the Mac App Store was down to Apple’s root certificate expiring. This was a planned event: certificates are used for digitally signing applications and they are only valid for a particular period of time, after which they need to be replaced with new certificates.

When the Mac App Store certificate expired, it was replaced with a new certificate but there were two problems. First, the now expired certificate was still cached by some of Apple’s servers: when Mac OS X opens an application it checks its signature, which in the end is guaranteed by Apple’s root certificate. Since this was no longer valid, Mac OS X refused to launch them and reported them as “broken”, leaving users and developers equally baffled. After far too long, Apple investigated the problem and emptied their caches which made the problem go away.

The second problem which was not solved by updating the caches, was due to Apple also replacing the certificate with a new, higher security version; of course without telling anybody. The new certificate could not be verified with the old version of OpenSSL that was used in the receipt checking code of many shipping apps.

When Apple created the Mac App Store, it provided a “receipt” that each application should check to see whether it has been properly bought on the Mac App Store. This is just a signed file that contains details about what was bought and when. Instead of doing the obvious thing, which would have been to provide developers with an API for checking the validity of the receipt against Apple’s own rules, they just publishing snippets of sample code so that each developer could “roll their own” verification code. Supposedly this was for added security (by not providing a single point of failure), but it seems more likely that they couldn’t be bothered to ship an API just for the Mac App Store.

This decision came back to haunt them, because most developers are not crypto experts and so had to rely on developer contributed code to check their app’s receipts. Once this worked properly, most developers wouldn’t dream of touching the code again.. which is how it came to pass that many, quite possibly a majority, of Mac App Store apps shipped with the same old receipt checking code in 2015 that they originally shipped with in 2010(?). This was fixed by Apple revoking the new style certificate and downgrading it to the old standard.

For once, I had been ahead of the curve and had recently updated all the receipt code in my applications (no small feat) and I have yet to hear from any customers who had problems.

Just before the Mac App Store fiasco, however, many non-MAS had also shipped with broken auto-update functionality.

Apple does not offer any auto-update facility for applications that are not on the Mac App Store, which lead to Andy Matuschak’s “Sparkle” framework becoming the de-facto standard for adding an auto-update features to Mac applications.

Driven by abuse of some HTTP communications in iOS apps, Apple decided that in iOS 9 it would by default opt all developers into using only (more secure) HTTPS connections within their applications. What is good for iOS 9 can’t be bad for Mac OS X 10.11 El Capitan, so Mac applications also got opted into this scheme.

Unfortunately, that broke Sparkle for applications which do not point to HTTPS “app casts” such as mine. I have long resisted installing my own HTTPS certificates because I was worried about messing up the expiry periods, etc.. apparently just the way that Apple did with the Mac App Store certificates.

Most developers will have been unaware of the change since Apple never announced it, but I had happened to see the WWDC conference videos that mentioned this in passing. Unfortunately, nothing “clicked” in my head when I heard this. My applications do not communicate with a server anywhere and I thus thought that this was not something I had to worry about. I had forgotten that Sparkle might use this internally.

Fortunately, I caught this at 6AM when I released A Better Finder Rename 10 final. I was just doing a normally completely redundant check through all the features of the program when I noticed that the new version failed when trying to check for updates. By 8AM, I had identified and fixed the problem so that very few people indeed could have been caught out by it. That was luck though.

The nefarious element here was that applications were opted in automatically and silently. Before 10.11 El Capitan was installed on your Mac, my applications updated just fine. Afterwards, they no longer did. Just because they were on El Capitan. Gee thanks!

Of course, this would not have happened if I hadn’t built A Better Finder Rename 10 with the Mac OS X 10.11 SDK (Software Development Kit) at the last moment.

It is somewhat crazy for a developer to change the SDK that s/he builds a brand-new version of their software against in the middle of the beta phase. Changing the SDK always introduces errors because the entire environment in which the code executes is changed. This may bring out bugs that were already present; things that should never have worked, but worked just because the API happened not to trigger the bug. It also introduces bugs that are just part of the new SDK and that you now have to work around. Changing SDKs makes existing programs fragile.

I’m very conservative when it comes to changing SDKs because I’m well aware of the risks. That’s why I’ve been building my code against older SDKs for the past 15 years. A Better Finder Rename 10 was built against the Mac OS X 10.7 SDK which is forwards-compatible with newer versions of Mac OS X.

The main reason for doing so, is that I wanted to be certain that I didn’t accidentally break A Better Finder Rename on older systems, which brings us to the next problem with Mac OS X development.

Xcode lets you specify a “deployment target”, for instance 10.7, while building with a newer SDK. This is the recommended way of developing on Mac OS X and keeping backwards compatibility. Xcode will, however, happily let you use APIs that are not compatible with your deployment target and thereby ensure that your application will crash on anything other than the latest Mac OS X.

In fact, Xcode encourages you to use the latest features that are not backwards compatible and will rewrite your code for you if you let it, so that it will crash. It will give you “deprecation warnings” for any API usage that is not in the latest SDK and resolving those warnings is likely to break backwards compatibly as well. Of course, you won’t know this until you run it on the old Mac OS X version.

Now which developer can afford to keep testing rigs with 10.7, 10.8, 10.9 and 10.10? Never mind spend the time re-testing every change on multiple platforms for each change?

Thus I happily built with the 10.7 SDK. Apple did not make this easy by not shipping the old SDKs with Xcode, but you could manually install them and they would work just fine.

Imagine my surprise after installing Xcode 7 and finding out that this no longer worked. The only workable solution was to build against the 10.11 SDK, so jumping forwards not one but 4 SDK versions. A bunch of code wouldn’t compile any longer because the libraries were gone. Luckily the receipt checking code was amongst those, so it got modernised just in time to avoid the Mac App Store receipt fiasco.

Nonetheless, now my entire code base had become fragile and largely un-tested between the last beta release and the final shipping product. Nightmare!

On top of that was it still even 10.7 compatible? or indeed 10.10 compatible? Just quickly running it on older systems wouldn’t provide more than a little additional confidence since it’s impossible to go through every code path of a complex product.

After installing virtual machines to test on, I still couldn’t be 100% certain. The solution came in the form of deploymate, a now truly essential developer tool which does what Xcode can’t do: check that API usage is compatible with the deployment target.

I have since spent many weeks trying to ensure that I won’t run into the same problems again by adding (additional) automated verification processes to my build system. My build system now runs the built product through SDK compatibility checking courtesy of deploymate, code signing validation and gatekeeper verifications on each build. I’m still working though deprecation warnings and the like and my code base will soon be bullet proofed at least until the next forced changes arrive.

You’d think that this was a long enough list of problems for one year, but this still does not account for Apple also changing the code signing rules (once again) earlier in the year (in a point update of 10.10 no less). This time it affected how resources and frameworks are signed. So applications that were signed correctly for years, now suddenly became incorrectly signed and Mac OS X would refuse to launch them because they were “broken”.

All this points to the underlying issues with the current spade of fragility of Mac applications: Apple keeps changing the status quo and neither it, nor developers have any chance of keeping up.

Apple’s own applications are full of bugs now. None more so than Xcode, which is both the lynch pin of all Mac OS X, iOS, watchOS and tvOS development and no doubt Apple most fragile app offering. Xcode is in beta at least 6 months a year and never really stabilises in between. Each new version has new “improvements” to code signing, app store uploading, verification code, etc.. and each new version breaks existing code and introduces its very own new bugs and crashes. From one day to the next, you don’t know as a developer whether your code works or not. Existing code that had worked fine on Friday evening, no longer works on Monday morning. Worse, chances are that you are not hunting for your own bugs, but those in your development tools, the operating system or Apple supplied SDKs.

All this is driven by the one-release-a-year schedule that Apple has imposed on itself. This leaves all of Apple’s software in various stages of brokenness. When Apple’s own staff cannot deal with this constantly shifting environment, how are third party developers supposed to?

Case in point: Apple’s own apps are not all iOS 9 compatible yet. Many don’t support the iPad Pro’s new native resolution yet. Some have gained Apple Watch extensions, but most haven’t.

Reliability is a property of a system that is changed slowly and deliberately and where all constitute parts are themselves reliable. The Mac and all other Apple platforms are currently undergoing the worst dip in reliability since Mac OS X was introduced.

Apple is pushing out half-baked annual releases of all its software products, as well as apparently completely unmanaged changes to policies, external rules and cloud services at an ever more frenetic pace.

These could be written off as temporary “growing pains”, but the big question is: Do all these annual updates equate to real progress?

When I switch on my Mac today, I use it for much the same things that I used it for 10 years ago. A lot has changed. Cumulatively Mac OS X 10.11 El Capitan is somewhat better than 10.6 Snow Leopard.. and yet if you discount cosmetic changes and new hardware, nothing much has changed. Certainly nothing much has actually improved.

I can’t help thinking that if we had had 2 or possibly 3 Mac OS X updates instead of 5 over those last 5 years, we’d be in a much better shape now. Apple and developers would have time to provide user benefits and rock solid reliability rather than just endlessly chasing their own tail.

The beauty of the Mac used to be that it just worked. I want to get back to that.