Ferrite's App Architecture

There’s a lot of chatter online about different application architecture-patterns (MVC, MVVM, VIPER, etc). I want to talk today about some architectural decisions that are kinda orthogonal to those, that also have pretty big effects on your application.

Previously, on Freshly Squozen Wooji Juice...

This is a follow-up to my article on building a large pro app in Swift. Reading that first, if you haven’t already, will give you some context about the requirements for this particular app.

I’m going to discuss this in the context of Ferrite Recording Studio’s document & application model. By which I mean: how a user’s document is represented, persisted and updated, how undo & redo are managed, and how that’s all connected to the views that represent it on screen.

This stuff can, with poor architecture, become a very tangled mess where it can be difficult to keep track of who is responsible for updating who, or follow the flow of changes in an app. If you’ve ever messed up writing some code and (for example):

  • had undo/redo get out of sync and corrupt a document
  • had two widgets, representing different views onto the same underlying data, get out of sync
  • got an infinite loop or stack smash from a view updating a model updating a view updating a model updating a view...

…then you know what I mean.

There are lots of ways to avoid this, of course. This is just one particular strategy I’ve found useful, because it simultaneously addresses these issues, and a bunch of other stuff besides.

The Stupidest Thing That Could Possibly Work

Purely as a thought-experiment, what would be the stupidest way of handling all this, that would make life easy? If we didn’t care about speed, storage space or efficiency at all?

Well, how about this:

  • When the user creates a new project, write a blank document as “Version 1.your-document-format”
  • Every time the user makes any change to the document, write an entirely new file, identical to the old one except for the change, with the next version number in the filename.
  • Then load that new document — just as if the user had opened it for the first time
  • Any time the user hits undo/redo, just load the previous/next version
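
To make the thought-experiment concrete, here’s a minimal sketch of it. Everything in it is hypothetical: an in-memory array stands in for the numbered files on disk, and `VersionHistory` is an invented name. Throwing away the redo versions when you edit after an undo is also an assumption; the list above doesn’t say what should happen there.

```swift
// Hypothetical sketch of the "straw architecture": every edit stores a whole
// new copy of the document, and undo/redo just pick which copy is loaded.
struct VersionHistory<Document> {
    private var versions: [Document]   // stands in for "Version N" files on disk
    private var currentIndex = 0

    init(blank: Document) {
        versions = [blank]             // "Version 1" is the blank document
    }

    /// The document, exactly as if it had just been opened from the current version.
    var current: Document { versions[currentIndex] }

    var canUndo: Bool { currentIndex > 0 }
    var canRedo: Bool { currentIndex < versions.count - 1 }

    /// Save a complete new version. Assumption: any redo versions beyond the
    /// current one are discarded first.
    mutating func commit(_ newDocument: Document) {
        versions.removeSubrange((currentIndex + 1)...)
        versions.append(newDocument)
        currentIndex += 1
    }

    // No per-command undo logic: undo/redo only move the "current" pointer.
    mutating func undo() { if canUndo { currentIndex -= 1 } }
    mutating func redo() { if canRedo { currentIndex += 1 } }
}
```

Note that no command ever needs its own undo implementation here: a command only has to produce the next complete document.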

This is, of course, horrible in all sorts of ways: it’s slow, wastes storage space, wastes time unnecessarily rewriting things, wastes time unnecessarily reloading things.

But it also has some really nice properties:

  • The document is always saved — you can’t lose work due to a crash
  • Changes are atomic — you can’t lose or corrupt an existing document due to a crash mid-save
  • undo/redo is persistent, and never gets out of sync or causes corruption
  • Indeed, you never actually have to write undo/redo implementations for any command
  • There’s no code-tangle between views and models or anything like that: when the user activates a command, that triggers a new document save, which triggers a reload, which causes everything to update, clean and in-sync
  • In fact, you have a single code path to handle opening a document, undo/redo, and reflecting the user’s editing in the UI — greatly simplifying your app’s structure and removing lots of places for bugs to lurk.

Can we get these same properties, with an architecture that doesn’t suck? That’s actually fast, compact and efficient?

Intrinsically Versioned Documents

In my previous article, I mentioned, as a loose inspiration, Wil Shipley’s talk on Git as a Document Format.

Although I linked to that talk as a good overview, I think I first heard Wil’s idea on an episode of Debug — it’s long enough ago that I’m not sure, but the timeline seems right, since I started Ferrite in 2014 and that particular idea had been rattling around in my head for at least a year. Wil, in turn, credits it to Sean O’Brien.

The high-level overview is that your document format is simply a Git repository, and every time the user changes the document, you commit a new version, and undo/redo just becomes a case of checking out a particular commit. In this way, versioning becomes intrinsic to the document itself, rather than glommed on by NSUndoManager stacks or something like that.

And, turns out, it’s not too dissimilar to my “straw architecture” above, but better: depending on how you structure your data (e.g. if each object in your model graph was a different “file” in the repo; see footnote 1), you don’t have to write an entirely new document each time, so it’s faster, and consumes less space.

The Thing About NSUndoManager

NSUndoManager is the traditional way of handling undo/redo on the Mac, and (particularly with prepareWithInvocationTarget:) it’s really, really clever. The kind of thing that wins over new recruits to Cocoa, and one of the reasons Mac apps have always had excellent undo/redo support.

But it’s not persistent.

Now, that makes sense given a desktop document model where “save” makes changes permanent. You can’t usually “undo”, when you reload a document at a later date — for that, you’ve got Time Machine.

But on an iOS device, you don’t have Time Machine. And your app can be backgrounded at any time. And the OS can kill your app at any time when it’s in the background, if it needs the memory for something else. And there goes your undo stack!

As a user, you basically just got press-ganged into having your changes made permanent, whether you wanted them or not. This sucks, and does not encourage experimentation, or a feeling of security when editing.

Well, you could go and rewrite your own NSUndoManager that’s serialisable. But that’s probably non-trivial, given that it bottles up method invocations — I bet there’s a lot of edge cases to get right, figuring out how to track the object instance you’re actually calling it on, etc.

And you have to write all that out when you get backgrounded, and of course you still have to write code to actually register the undo-behaviours on a command-by-command basis, making sure you don’t mess them up and corrupt the document.

Added up, it just doesn’t seem like a very desirable solution to me for modern iOS apps. And it seems notable that many apps don’t support undo/redo. Which is a sucky state of affairs.

So, this Git technique seemed pretty interesting.

But I Didn’t Actually Use Git

I still think using Git is an interesting idea, and may use it for other apps, but for Ferrite, it didn’t seem like quite the right choice.

The reason is to do with how the data is structured and changed. Git is designed primarily for source-code trees, with relatively-infrequent commits (compared to “potentially every single touch/gesture”!). Any change requires rebuilding not only the changed node, but every parent node up to the root of the graph.

When I thought about ways to structure audio editing projects, there didn’t seem to be good ways to represent them, that were efficient in the ways I needed for in-memory processing and rendering, and didn’t involve having to touch unnecessarily-large chunks of the tree when making changes. If your tree is deep, you have lots of layers of nodes to rewrite, and if your tree is shallow, you have much wider indexes to rewrite at each layer.

This doesn’t really matter much for general Git usage (usually...), but I didn’t think it would be ideal for realtime audio editing use.

There’s also the issue of cleaning out redo data when you undo a few steps then make a new change: you’ll be left with a bunch of orphaned commits you need to go and scrub out. Solvable, but another hassle to deal with.

And anyway, in Ferrite’s case, we’re almost always either writing a new version with a tiny change, or moving from one version to the previous or next. Meaning, at any given moment, we only really need the current state of the document. We’re not merging branches. We’re not diffing this version against last week’s. The full repo model seems like overkill (see footnote 2).

Ferrite does support scrubbing through history — and that one (generally pretty rare) operation probably runs slower than it would using Git (at least, for large scrubbing distances). But what I went with seems to be more efficient in the situations you’re running into 99% of the time. Optimise for the common case!

What I Used Instead

I still sometimes second-guess this decision, but I instead went with an SQL database (via YapDatabase), for which I wrote a plug-in that automatically tracks changes: essentially, on any database commit, it automatically inserts (as part of the same atomic transaction) information on how to reverse the change.

Ironically, it’s a bit like NSUndoManager in some respects (including the fact that performing an undo will automatically build the matching redo information through the same mechanisms), but tucked into the database layer. Essentially, it tracks (and will undo/redo) writes to rows, so it has a tiny surface area to cover. Being hooked into the DB means it never misses anything, and it can be tested thoroughly across that surface area, and then relied upon.

(You can configure it with categories of stuff to ignore — this avoids the problem Wil mentions in his talk, where if you maintain things like cursor position in the document, it would spam the undo stack. YapDatabase is designed around “collections”, so you just specify collections that shouldn’t support undo/redo, and only ever use those collections for transient state like the cursor or scroll position.)

So despite being delta-based under the covers, externally to the DB, the result is much more like the Git model. From the app’s viewpoint, it simply commits atomic transactions as it would normally with any database. Each transaction is a version. Twiddling the panning dial of a track, say, or adjusting the fade-in time of an audio clip, is just writing a single value to a single row of the DB, and undo/redo is taken care of automatically.
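
As a rough illustration of the idea (not of the actual plug-in), here’s a sketch of a tiny in-memory key/value store that records how to reverse each commit. All of the names (`RowKey`, `UndoTrackingStore`, `untrackedCollections`) are invented for the example; none of this is YapDatabase API, and a real database would persist the reversal data inside the same transaction as the write.

```swift
// Hypothetical in-memory stand-in for a key/value database with an undo
// plug-in. None of these names are YapDatabase API.
struct RowKey: Hashable {
    let collection: String
    let key: String
}

final class UndoTrackingStore {
    // What a set of rows held before a transaction (nil = the row was absent).
    private typealias Reversal = [RowKey: String?]

    private var rows: [RowKey: String] = [:]
    private var undoStack: [Reversal] = []
    private var redoStack: [Reversal] = []
    private let untrackedCollections: Set<String>

    init(untrackedCollections: Set<String> = []) {
        self.untrackedCollections = untrackedCollections
    }

    func value(_ key: String, in collection: String) -> String? {
        rows[RowKey(collection: collection, key: key)]
    }

    /// Commit a transaction: `writes` maps rows to new values (nil deletes).
    func commit(_ writes: [RowKey: String?]) {
        let reversal = apply(writes)
        if !reversal.isEmpty {
            undoStack.append(reversal)
            redoStack.removeAll()      // a fresh edit clears the redo stack
        }
    }

    /// Undoing is just another write, so it produces its own redo information.
    func undo() {
        guard let reversal = undoStack.popLast() else { return }
        redoStack.append(apply(reversal))
    }

    func redo() {
        guard let reversal = redoStack.popLast() else { return }
        undoStack.append(apply(reversal))
    }

    /// Writes the rows and returns how to reverse the tracked ones.
    private func apply(_ writes: [RowKey: String?]) -> Reversal {
        var reversal: Reversal = [:]
        for (rowKey, newValue) in writes {
            if !untrackedCollections.contains(rowKey.collection) {
                reversal.updateValue(rows[rowKey], forKey: rowKey)
            }
            rows[rowKey] = newValue
        }
        return reversal
    }
}
```

In this sketch, a commit that only touches an untracked collection (scroll position, say) produces an empty reversal, so it never shows up as an undo step.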

It’s also a lot simpler to deal with maintenance tasks like clearing out redo stacks, plus you still get all the usual SQL DB benefits: atomic transactions, fast queries for specific rows, nice clean asynchronous writes in the background (see footnote 3), that kind of thing. It also has its own change-notification system, which will come in handy later in this article…

Chain Reaction

I first heard about Facebook React in a video presentation, almost certainly one of the ones on this page. I don’t remember which, but the gist of it was that they were having a hell of a time trying to get Facebook Chat to work reliably.

Most of their problems seemed to be due to a really complex web of state-changes, with local client and remote server state getting mixed up, and (in effect) what amounted to merge conflicts stomping on changes, causing messages to get lost. In the end, they decided to make things much, much simpler:

  • All changes applied on the client, became idempotent/stateless commands sent to the database
  • Database updates were pushed out to those components that care about them
  • The UI reflected this not by describing changes that needed to be made, but by describing the new state it wanted to be in
  • The client can then either tear down the old state and replace it with that new state, or do fancy diffing stuff if it prefers, but it can never get out of sync due to a delta update getting applied to the wrong initial state.

This reminded me strongly of two things:

  • John Carmack’s interesting networking model for Quake 3
  • The part of the “straw architecture” from the start of this post where everything is a document write and updates are handled by a full reload.

All these concepts connect together pretty neatly to make the final application structure. They also tie in with Swift’s gentle push in the direction of immutable values, value-type semantics, copy-on-write, etc.

Again, I didn’t actually use React Native: over the years, I’ve grown a distaste for cross-platform layers that get slopped over the native UI. Plus, well… you know. JavaScript.

Wat.

But, this general principle? Detangling the app architecture and data model by making all updates to the database through a single clearly-defined API, having the rest of the app receive update notifications from the DB, and concentrating on final desired state rather than deltas? Yeah. That stuff is nice.

So What Actually Happens in Ferrite?

Document data, when it needs to be in-memory, is represented by immutable types. You can make mutable copies, and tinker with those as much as you like, but you can’t do anything else with those except commit them to the database. So there’s none of the usual stuff where View objects and Model objects are tied together: nothing like Cocoa Bindings, or RxSwift, or some manual thing like updating the model based on a UIControl’s .ValueChanged events.

(If you go back into the blog archives, you’ll find a post where I hinted that I was tinkering with my own Swift-native Bindings system. Why did I never follow up on that? Because, although I did get that system up and running, and it was kinda cute, between this alternative app architecture and the use of Futures, it was mostly unnecessary — I completely took it out of the codebase in the end.)

The Document object in Ferrite provides a single method for updating the contents: applyChanges(). It takes a single parameter: a block. It constructs a temporary proxy object, which it passes into the block. The proxy has all the methods for actually changing document state. You might use it something like this:

Example: Cutting a clip in two (Swift)

    // Start from two editable copies of the clip being cut.
    var firstHalf = originalClip.mutableCopy()
    var secondHalf = originalClip.mutableCopy()
    firstHalf.endTime = cutAtTime
    secondHalf.startTime = cutAtTime

    // Replace the original with the two halves in a single change.
    document.applyChanges { proxy in
        proxy.deleteClip(originalClip)
        proxy.insertClip(firstHalf)
        proxy.insertClip(secondHalf)
    }

This is simplified — but only slightly. Ferrite’s objects for clips on the timeline are more complex than this because of things like fades, and (un)cropping. It supports cutting multiple clips at once on different tracks, and we’d need to update the selection too. But still: the real code is actually very much like this.

So, the proxy object is responsible for creating, and then committing, an atomic write transaction to the database. The insertClip()/deleteClip() and other methods it provides, add the actual changes to the transaction.
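
For a feel of how such a proxy could hang together, here’s a minimal sketch. It is not Ferrite’s code: the type names, the integer clip IDs and the `storedClips` dictionary (standing in for the database) are all assumptions made for the example, and a real version would open a database write transaction instead of swapping a dictionary.

```swift
// Hypothetical sketch: names and structure are assumptions, not Ferrite's code.

/// Immutable value, as handed out to the rest of the app.
struct Clip: Equatable {
    let id: Int
    let startTime: Double
    let endTime: Double

    func mutableCopy() -> MutableClip {
        MutableClip(startTime: startTime, endTime: endTime)
    }
}

/// Editable scratch copy; it only becomes "real" once committed.
struct MutableClip {
    var startTime: Double
    var endTime: Double
}

/// Collects the changes requested inside an applyChanges block.
final class ChangeProxy {
    enum Operation {
        case insert(MutableClip)
        case delete(clipID: Int)
    }
    private(set) var operations: [Operation] = []

    func insertClip(_ clip: MutableClip) { operations.append(.insert(clip)) }
    func deleteClip(_ clip: Clip) { operations.append(.delete(clipID: clip.id)) }
}

final class Document {
    /// Stands in for the database contents.
    private(set) var storedClips: [Int: Clip] = [:]
    private var nextID = 1

    func applyChanges(_ block: (ChangeProxy) -> Void) {
        let proxy = ChangeProxy()
        block(proxy)

        // Build the new state on a copy, then swap it in with one assignment,
        // so the whole batch lands together, mimicking an atomic transaction.
        var updated = storedClips
        for operation in proxy.operations {
            switch operation {
            case .insert(let clip):
                updated[nextID] = Clip(id: nextID, startTime: clip.startTime, endTime: clip.endTime)
                nextID += 1
            case .delete(let clipID):
                updated[clipID] = nil
            }
        }
        storedClips = updated
    }
}
```

The point of the shape is that callers can only describe changes through the proxy; they never get a handle that lets them mutate stored state directly.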

Nothing here updates the in-memory model, view state, or anything like that.

Instead, everything is driven by the database. YapDatabase issues notifications when data changes as the result of a commit, and these ripple out (in the form of once-more immutable objects, safe to hand off to background threads) from the Document to code that has registered an interest. Which can then apply the new state; for certain operations, diffing is applied, but in many cases, it can simply write new state on top.

We can also wrap up the “data push” in an animation block. This is particularly nice for undo/redo as you can see clips scoot around the timeline, watch faders update like a servo-automated mixing desk, and so on. It’s not just cute — it’s also really useful when you’re coming back to a document after some time has passed and/or going back into the history, because it helps you cue in on what is changing, and how, much more clearly than non-animated changes.

But the big advantage from the developer’s standpoint is that it cuts through the potential rats-nest of events flowing between components. The only state-changes are database writes, routed through the change proxy. Otherwise, all updates flow in a single direction, from the database out to subscribers which read immutable types and update themselves to match.
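
Here’s a small sketch of that one-way flow, under some loud assumptions: `VolumeStore` and `FaderBank` are invented names, a dictionary of track volumes stands in for the database, and plain closures stand in for YapDatabase’s notifications (whose real API looks nothing like this).

```swift
// Hypothetical sketch of one-way flow; not YapDatabase's notification API.
final class VolumeStore {
    typealias Observer = ([String: Double]) -> Void

    private var volumes: [String: Double] = [:]   // track ID -> volume
    private var observers: [Observer] = []

    /// New subscribers get the full current state through the same callback
    /// used for later edits, so "loading" and "updating" share one code path.
    func subscribe(_ observer: @escaping Observer) {
        observers.append(observer)
        observer(volumes)
    }

    /// The only way to change state: write, then push the new values out.
    func write(volume: Double, forTrack id: String) {
        volumes[id] = volume
        for observer in observers { observer([id: volume]) }
    }
}

/// Stand-in for a fader view. It never edits its own display in response to
/// a gesture; it only mirrors what the store pushes to it.
final class FaderBank {
    private(set) var displayed: [String: Double] = [:]

    func render(_ changed: [String: Double]) {
        displayed.merge(changed) { _, new in new }
    }

    func userDragged(track id: String, to volume: Double, store: VolumeStore) {
        store.write(volume: volume, forTrack: id)   // no local state change here
    }
}
```

Because the view’s only input is `render`, an undo, a redo and a fresh document load all reach it the same way a user edit does.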

And finally, the last big win is that having versioned projects has been invaluable throughout development. On a day-to-day basis, it helps with testing code against specific known states. But on at least a couple of occasions during beta testing, it was a complete life-saver when a tester had an issue with a project behaving strangely. I could simply rewind time until the point where the problem went away, and then step forward to see exactly what triggered the issue. Pure gold!

Mix and Match

The reason why this is orthogonal to MVC, MVVM, VIPER et al, is that none of this specifies the fine details of how your model, view and controller (or presenter, or interactor, or whatever) objects are linked. You can almost certainly hybridise this approach with whichever architecture-pattern you prefer. Something still needs to get those database updates to the views, for example — how you go about it, how you divide up the labour of the app in general, is up to you.

What’s important is that:

  • The model is stored in “a database”; whether that’s SQL or Git doesn’t matter, as long as it’s reliable, atomic, etc
  • Changes are funnelled through a single API that writes to the database
  • Outside of the database, the model is immutable
  • The flow of data is always one-way, from the database out to the views (or audio renderer)
  • That flow is through a single code path, regardless of whether you’re loading, making edits, or applying undo/redo
  • Commands to the database, and notifications from it, always convey information as the final desired state, not as a change (e.g. “the volume should be 90%”, not “decrease the volume by 10%”, or “the track with this UUID should no longer exist”, not “delete the track at index 3”).
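
The last point is easiest to see in code. In this sketch (the `Mixer` and `StateCommand` types are made up for illustration), every command names the state it wants, so applying it twice, or applying it to a slightly stale copy, ends up in the same place:

```swift
// Hypothetical types illustrating "desired state" commands.
struct Mixer: Equatable {
    var volumes: [String: Double] = [:]   // keyed by a track identifier
}

enum StateCommand {
    case setVolume(track: String, to: Double)   // "the volume should be 90%"
    case removeTrack(String)                    // "this track should not exist"
}

func apply(_ command: StateCommand, to mixer: inout Mixer) {
    switch command {
    case let .setVolume(track, value):
        mixer.volumes[track] = value
    case let .removeTrack(track):
        mixer.volumes[track] = nil    // removing an absent track is harmless
    }
}
```

Compare that with a delta such as “decrease the volume by 10%” or “delete the track at index 3”: replay either of those against the wrong starting state and you silently get a different result.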

The end result is something that’s by-and-large as clean and simple and easy to reason about as the stupid architecture from the top of the article, but fast and efficient, and also supporting cute animations and such besides. Lots of win!

Diffs vs Loading

Now, you might have read this and scratched your head and thought, “Wait a minute, aren’t you pulling a fast one? The whole point was to load simple, static state, but instead it’s full of diffs and deltas!”

Thing is:

  • They are localised to very specific parts of the app and always dealt with internally to a specific component, not as part of the API contract between components
  • They are optional performance optimisations or UI enhancements, not an inherent part of the design. You can (& I occasionally do, for debugging) turn them off.

YapDatabase sends out a notification for each commit, from which you can find out every row that’s changed. How your components respond is up to them. It’s easy to hook into that at the top level, go “Oh, something somewhere changed? Let’s just reload everything the naive way!” whether that’s because you’re just getting something up and running, or because you want to verify that it works.

Or you can care about which DB rows/keys changed, and reload the data only for those keys, which gets you 95% of the efficiency gains while still being a relatively naive approach.
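
A sketch of those first two strategies side by side, assuming a notification that simply carries the set of changed keys. `CommitNotification` and `TimelineCache` are hypothetical names, a plain dictionary plays the database, and `rowsRead` exists only so the two approaches can be compared.

```swift
// Hypothetical types: a dictionary plays the database, and the notification
// just lists the keys a commit touched.
struct CommitNotification {
    let changedKeys: Set<String>
}

final class TimelineCache {
    private(set) var loaded: [String: String] = [:]
    private(set) var rowsRead = 0      // counts reads, to compare strategies

    /// Naive: throw everything away and reload every row.
    func reloadEverything(from database: [String: String]) {
        loaded.removeAll()
        for (key, value) in database {
            loaded[key] = value
            rowsRead += 1
        }
    }

    /// Still naive, but only re-reads the rows named in the notification.
    func reloadChanged(_ note: CommitNotification, from database: [String: String]) {
        for key in note.changedKeys {
            loaded[key] = database[key]   // a nil result removes a deleted row
            rowsRead += 1
        }
    }
}
```

Both methods leave the cache in the same state; the second just gets there with fewer reads, which makes the first a handy reference to check it against.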

Or, you can run specific diffs on the changed data against the old. Swift helps here again: generics mean you can write diffing once and just keep reusing it, while immutable types and value semantics make it easy and safe to compare state. You don’t have to worry that someone forgot to copy a value somewhere, so that it accidentally got updated to reflect the new state and your diff fails to detect the change.
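
As an example of the write-once kind of diffing, here’s a minimal generic sketch that compares two keyed snapshots of any `Equatable` value type. The `Diff` struct and `diff(from:to:)` function are illustrative names, not taken from Ferrite.

```swift
// Illustrative generic diff over keyed snapshots; names are assumptions.
struct Diff<Key: Hashable> {
    var inserted: Set<Key> = []
    var removed: Set<Key> = []
    var updated: Set<Key> = []
}

func diff<Key: Hashable, Value: Equatable>(
    from old: [Key: Value],
    to new: [Key: Value]
) -> Diff<Key> {
    var result = Diff<Key>()
    for (key, newValue) in new {
        if let oldValue = old[key] {
            // Value semantics: the old snapshot cannot have been mutated
            // behind our back, so a plain equality check is trustworthy.
            if oldValue != newValue { result.updated.insert(key) }
        } else {
            result.inserted.insert(key)
        }
    }
    for key in old.keys where new[key] == nil {
        result.removed.insert(key)
    }
    return result
}
```

The same function works for clips, tracks or anything else that is `Equatable`, which is the “write it once and keep reusing it” part.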

Whether you conceptualise it as, “The mental-model is like we’re always reloading the document from scratch, although we’ve profiled, identified the hotspots and we’re actually applying deltas for efficiency in the places where they’re needed”, or whether you conceptualise it as, “we’re always sending changes from the database to the views; when we first load the document we’re just sending changes from an initial state of ‘completely empty’, and sometimes the ‘change’ is ‘replace all of that old state with all of this new state’”, doesn’t really matter. The end result is the same: clarity of code and a reliable app that’s still highly performant.


1. Of course, you need to decide for yourself what level of granularity to use for a given project.

2. It seems at first like it might help with sync, if you were going to put documents in iCloud or something. But in fact, it doesn’t really, because in most cases your document content isn’t amenable to merging that way; you probably need something much more like Operational Transform, which is a whole different ball game, and generally characterised by being a massive pain in the arse.

3. YapDatabase’s threading model fits my brain nicely