iOS Design Patterns: Part II

Here are three more iOS development patterns that fly somewhat in the face of answers you might see on Stack Overflow touted as “best practices.” Two of these are rock solid; the third is on probationary status, and I’m throwing it out there as a discussion point.

Use structs to define appropriate architectural boundaries

It’s really easy to blur architectural boundaries in an iOS app. That’s partly thanks to the Delegate pattern, which encourages concerns to spread across multiple classes with varying roles and responsibilities. When we lose sight of that boundary between view and controller we inevitably neglect the fact that appropriate and tightly defined boundaries are the backbone of a well-designed and maintainable architecture. An extremely common and simple example of where this can happen is configuring UITableViewCells in a UITableView’s data source.

Often we’ll end up with one of two extremes: either a controller that knows a lot about the internal view structure (directly setting UILabel strings and colors and other blatant Demeter violations), or a view object that takes a domain record (such as an NSManagedObject) and is itself responsible for translating that high-level object into specific pieces of display data. In either case, we have parts of our app that are tightly coupled to things from which they should be insulated. The contagion can easily spread, for instance by moving view-specific display logic onto that high-level domain record in order to “clean things up.”

Somewhere between tweaking UILabels in your view controller and passing NSManagedObjects all over the place is the sweet spot of just enough data for the view to render itself, with a minimum of logic required to do so. Minimizing logic is a key goal, here—the view system is one of the more complex and opaque parts of an iOS app, and one of the last places you want to have code dependent on high-level semantics, if you can avoid it. Any code in your view that takes a high-level concept and translates it into things like colors, text strings, and the like is code that is significantly harder to test than if it were elsewhere.

Immutable structs are fantastic for exactly this purpose. A struct provides a single value that can be passed across the boundary while encapsulating a potentially unlimited amount of complexity. Immutability simplifies our code by ensuring a single entry point for the configuration logic, and helps keep the logic for generating the struct’s member values in one place. They can be as high-level or low-level as is appropriate given the view and the data. For basic table views that are primarily text I might simply have a struct of String, Bool and UIColor values mapped to each visual element. On the other hand, I have a view for drawing graphs that takes a general description of the graph to be drawn, where there is less of a direct connection between the values I pass and what ultimately gets set as the final configuration.

(In the latter example, the view makes use of other classes to interpret the input and produce the final display values—to what extent you continue to rely on the view side of things being able to interpret data in complex ways will vary. In my case, the controller “collates” the data into a general form, and the view is responsible for turning that into a renderable form.)

In either case, there is one correct way to cross the boundary between controller and view, and provided you keep your view outlets private (as you should) you’ll have confidence that your controllers and views remain both loosely coupled and synchronized in their effects.
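To make the table-cell case concrete, here is a minimal sketch of that single way across the boundary. The names are hypothetical; the idea is that the controller builds the struct and the cell merely applies it.

import UIKit

// Hypothetical view-data struct: everything the cell needs, nothing it has to compute.
struct ItemCellViewData {
  let title: String
  let subtitle: String
  let showsSubtitle: Bool
  let accentColor: UIColor
}

final class ItemCell: UITableViewCell {
  // Outlets stay private; the struct is the only way across the boundary.
  @IBOutlet private var titleLabel: UILabel!
  @IBOutlet private var subtitleLabel: UILabel!

  func configure(viewData: ItemCellViewData) {
    titleLabel.text = viewData.title
    titleLabel.textColor = viewData.accentColor
    subtitleLabel.text = viewData.subtitle
    subtitleLabel.hidden = !viewData.showsSubtitle
  }
}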

Use awakeFromNib as a form of dependency injection

Google around for how to get at your NSManagedObjectContext from your view controllers, and you’ll get two answers:

Set it on your root view controller in your AppDelegate, then pass it to each view controller you present

One downside to this solution is that, at least as of the last project I began, the AppDelegate is no longer necessarily involved in bootstrapping the storyboard. You can get at your root view controller via the window, following a chain of optionals leading to your controller and then setting your managedObjectContext property, but that is exceedingly slapdash, at best. Another problem is all the laborious glue code involved in ensuring an unbroken chain of passing along the context, bucket-brigade style, between your root and any controller that might need it. All of this is in service of avoiding globals, which brings us to the second solution:

(UIApplication.sharedApplication().delegate as! AppDelegate).managedObjectContext

Anytime this solution is mentioned, comments about avoiding globals or Apple having rejected this approach surely follow. In general, yes, globals are bad, for varying reasons (some of which have less to do with “global” and more to do with general pitfalls of reference types.) In this case, the global is bad because it bakes an external dependency into our code. In my opinion, a global is bad in this situation for the exact same reason that using a class constructor can be bad—absolutely nothing would be improved here if AppDelegate were a constructor we could call, rather than a property.

What this all is crying out for is a form of dependency injection—which is why the first solution is often preferred, being a poor man’s dependency injection solution. Too poor, in my view, since it ties a class’s dependencies to the classes it might eventually be responsible for presenting. That’s craziness, and even worse than just using UIApplication.SharedApp... inline, if followed to its natural conclusion.

Thankfully, because we’re using storyboards, we can have the best of both worlds. First, yes: your methods should be dependent on a managedObjectContext property on your class, not directly referring to the global. Eliminate the global from inline code. Second, no: passing objects bucket-brigade style from controller to controller isn’t the only form of injection available to us. A storyboard can’t set arbitrary values on the classes it instantiates—unfortunately—but it does give us a hook to handle in code any setup that it can’t: awakeFromNib.

The fact that awakeFromNib is in our class and not somehow external to it is a complete technicality. To the extent that we’re being pushed into doing the least unreasonable thing we can, using global or top level methods in awakeFromNib is fair game—this code is only ever run by the storyboard, at instantiation time. To be fair, awakeFromNib is a blunt instrument, but we needn’t live with its dictates, as plenty of other hooks are called before the controller is actually put to use. Ultimately, I view using awakeFromNib in this way as no different than specifying a concrete class to instantiate in a storyboard and connect to a view controller via a protocol-typed outlet.

(In this specific case, one additional thing I would do is have my own global function to return the managed object context, and call that in awakeFromNib, as a single point of contact with the “real” global. I’ll also note that I avoid having my view controllers directly dependent on NSManagedObjectContext as much as possible, which is another pattern I’ll be discussing.)
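Put together, it looks roughly like the sketch below. The controller class and defaultManagedObjectContext() are hypothetical stand-ins, and the stock Core Data AppDelegate from the Xcode template is assumed.

import UIKit
import CoreData

// Hypothetical single point of contact with the "real" global
// (assumes the standard Core Data AppDelegate from the Xcode template).
func defaultManagedObjectContext() -> NSManagedObjectContext {
  return (UIApplication.sharedApplication().delegate as! AppDelegate).managedObjectContext
}

class NotesViewController: UIViewController {
  var managedObjectContext: NSManagedObjectContext!

  override func awakeFromNib() {
    super.awakeFromNib()
    // Runs once, when the storyboard instantiates the controller.
    managedObjectContext = defaultManagedObjectContext()
  }

  // Everything else in the class depends only on the property, never the global.
}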

One last thing: why awakeFromNib and not initWithCoder? First, awakeFromNib is called in any object instantiated by the storyboard, not just views and view controllers. Second, it reinforces the special-cased nature of the injection, over the more general case of object instantiation. Third, outlets are connected by the time awakeFromNib is called, in case that’s ever a concern. Fourth, initializers are very clearly a proper part of their class, but awakeFromNib is, arguably, properly part of the storyboard/nib system and only located on the class for convenience, giving our class-proper code design a bit of distance from what goes on therein.

Handle view controller setup in UIStoryboardSegue subclasses

This one might be a bit more controversial. I’m going to see how it shakes out, long-term, but from a coupling-and-responsibilities perspective it seems a no-brainer. In short: configuring a new view controller isn’t necessarily—or even usually—the responsibility of whatever view controller came before it. If only we had an object that was responsible for handling the transition from one view controller to another, where we could handle that responsibility. Wait, there is such an object—a segue. Of course, segues aren’t a perfect solution, since using them conflates animations with nuts-and-bolts setup. They are, however, a natural, lightweight mechanism for getting random crosstalk code out of our view controllers, and the field for setting a custom UIStoryboardSegue class is right below the field for setting the identifier.
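As a sketch of the idea, with hypothetical Project classes standing in for the real ones:

import UIKit

// Hypothetical stand-ins for the two ends of the segue.
class Project {}

class ProjectListViewController: UIViewController {
  var selectedProject: Project?
}

class ProjectDetailViewController: UIViewController {
  var project: Project?
}

// The setup lives in the segue, not in prepareForSegue on the presenting controller.
class ShowProjectDetailSegue: UIStoryboardSegue {
  override func perform() {
    if let list = sourceViewController as? ProjectListViewController,
           detail = destinationViewController as? ProjectDetailViewController {
      detail.project = list.selectedProject
    }
    super.perform()
  }
}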

If there’s one underlying theme throughout these patterns it’s “stop using view controllers as junk drawers for your code.”

iOS Design Patterns: Part I

I’m working on a brand-spanking-new iPhone app, for the first time in a while, and I’m trying to take a fundamentals-first, good-design approach to development, rather than simply regurgitate the patterns I’ve used/seen in the past. Here are three “new” approaches I’m taking this time around. Each of these patterns is broadly applicable regardless of your language or platform of choice, but with iOS development, and Xcode, they can take a form that, at first, might look odd to someone used to a particular style.

Clean Up View Controllers With Composition

Ever popped open a class and seen that it conforms to 10 protocols, with 20-30 mostly unrelated methods just piled on top of one another? This is a mess: it makes the class harder to read and debug, it makes individual lines of logic harder to test and refactor, it can mean an explosion of code or subclasses you don’t actually need, and it precludes sane code reuse.

By applying the single responsibility principle—and the principle of composition-over-inheritance—we can mitigate all of those problems, moving code out into individual classes for each protocol/responsibility. You’ll see gains pretty quickly when you realize, for instance, that a lot of your NSFetchedResultsController-based UITableViewDataSource code is nearly identical, and a single class can suffice for multiple view controllers.
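Here is a minimal sketch of that kind of composition, with hypothetical names and a plain array standing in for the NSFetchedResultsController case:

import UIKit

// Hypothetical reusable data source: the controller owns one instead of
// conforming to UITableViewDataSource itself.
class TextRowsDataSource: NSObject, UITableViewDataSource {
  var rows: [String] = []

  func tableView(tableView: UITableView, numberOfRowsInSection section: Int) -> Int {
    return rows.count
  }

  func tableView(tableView: UITableView, cellForRowAtIndexPath indexPath: NSIndexPath) -> UITableViewCell {
    // Assumes a prototype cell with the identifier "Cell" in the storyboard.
    let cell = tableView.dequeueReusableCellWithIdentifier("Cell", forIndexPath: indexPath)
    cell.textLabel?.text = rows[indexPath.row]
    return cell
  }
}

class RowsViewController: UIViewController {
  @IBOutlet weak var tableView: UITableView!
  let dataSource = TextRowsDataSource()

  override func viewDidLoad() {
    super.viewDidLoad()
    // The controller just wires the pieces together.
    tableView.dataSource = dataSource
  }
}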

That goes for view code, too: if you’re poking around in the view layer it’s probably a good idea to do it in a UIView subclass. The name of the game isn’t to minimize the number of classes in your project; separating code by function appropriately is the basis of good code design. For that matter, the name of the game isn’t merely “code reuse” either—whether or not you’ll ever take two classes and use them independently isn’t the mark of whether they should just be smooshed into one giant class.

Cut Your Managed Objects Down to Size

What’s the responsibility of your NSManagedObject subclasses? To coordinate the persistence of their attributes and relationships. That’s it. Taking that data and doing various useful things with it is not part of that responsibility. Not only do all those methods for interpreting and combining the attributes in various ways not belong in that specific class, but by being there they are manifestly more difficult to test and refactor as needed. If you’re looking at a bit of code to—I don’t know—collect and format the names and expiration dates of someone’s magazine subscriptions, why should that code be dragging all of Core Data behind it?

At a minimum, most of those second-order functions can be split out into a decorator class or struct. A decorator is a wrapper that depends simply on being able to read the attributes of its target object, and can then do the interesting things with reading and displaying that data—without involving Core Data at all. How do we eliminate Core Data entirely? By using a protocol to reflect the properties of the NSManagedObject subclass. Testing any complex code in your decorator is now a cinch—just create a test double conforming to that protocol with the input data you need.

This is a super simple example of a decorator I use to encapsulate a Law of Demeter violation. This illustrates the form, but the usefulness pays increasing dividends as the code gets more complex. Note, also, that you needn’t have a single decorator for a given model… different situations and domains might call for differentiated or completely orthogonal decorators. In that way, decorators also provide a way to segregate interfaces appropriately.

protocol ExerciseAttributes {
  var weightType: NSNumber? { get }
  var movement: Movement? { get }
}
extension Exercise: ExerciseAttributes {}

struct ExerciseDecorator {
  private let exercise: ExerciseAttributes?

  init(_ exercise: ExerciseAttributes?) {
    self.exercise = exercise
  }

  var name: String? {
    return exercise?.movement?.name
  }
}

Storyboards Can Help Manage Composition

I used to have a knee-jerk reaction to storyboards. They felt like magic and as if all they did was take nice, explicit, readable code and hide it behind a somewhat byzantine UI. Then I realized what they really do: they decouple our classes from each other. The storyboard is a lot like a container. It lets us write generic, lightweight classes and combine them together in complex ways without having to hardcode all those relationships, because there’s another part of our app directing traffic for us.

After you’ve moved all your protocol and ancillary methods from your view controllers, you’ll probably end up instead with a bunch of code to initialize and configure the various objects with which the view controller is now coordinating. This is an improvement for sure, but you still end up with classes that mostly exist just to strongly couple themselves to other classes. That glue code is also so much clutter, at best. At worst it has no business being in your view controller class at all, but for a lack of anywhere else to put it. Or is that so?

Amidst the Table View, Label, and Button components in the Interface Builder object library is the simply named “Object” element. The description reads:

Provides a template for objects and controllers not directly available in Interface Builder.

“Not directly available in Interface Builder?” Then what’s the point, if we can’t do anything with it? Ah, but we can do things with it: we can hook up outlets and actions, and configure the objects with user-defined runtime attributes. We can, simply put, eliminate large swaths of glue code by letting the storyboard instantiate our coordinating classes for us, configuring them with connections to each other and to our views, and even allow us to tweak each object on a case-by-case basis. All with barely any code cluttering up our classes.

You might have a strong intuition that a lot of that belongs in code, as part of “your app.” If so, ask yourself: if this belongs in code, does it belong in this class? Truly? Cramming bits of orphan code wherever we can just to have a place to put them is a strong code smell. Storyboards help eliminate that. Embrace them.

Huge drawback: The objects you instantiate this way have to be @objc, and as such you can’t have @IBOutlets for a protocol type that isn’t also @objc. This means you lose the ability to pass and return non-Objective-C types such as structs, enums, or tuples. This is really frustrating and a significant limitation on using the technique to clean up your view controllers more generally.

Xcode Playgrounds and Incorrect Image Dimensions

For a bit of fun I’m going through the DARPA Shredder Challenge puzzles. The challenge ended 5 years ago (and I’m not a computer scientist, besides), so I’m sticking with the tools and technologies I use professionally, despite their potential inefficiency or inappropriateness for the task.

My first problem, right off the bat—trying to load the puzzle image into a playground kept coming up with the wrong dimensions, by almost an order of magnitude. The full-size image is important since I’m basically working with the image on a pixel-by-pixel basis, and the details I needed for analysis were getting blown away. Pretty much nothing I did that involved NSImage in any way would work, and since I just needed to get at the raw pixel data, I skipped it entirely:

let bundle = NSBundle.mainBundle()
let url = bundle.URLForResource("puzzle1", withExtension: "png")!

// Load the PNG directly with Core Graphics, bypassing NSImage entirely
let dataProvider = CGDataProviderCreateWithURL(url)
let image = CGImageCreateWithPNGDataProvider(dataProvider,
  nil, false, .RenderingIntentDefault)!

// Wrap the CGImage in a bitmap rep to get at the raw pixel data
let bitmap = NSBitmapImageRep(CGImage: image)

(Some lines split out to reduce horizontal scrolling)

Note that using NSData(contentsOfURL:) in conjunction with NSBitmapImageRep(data:) did not work, having the same dimensions problem as the simpler solutions.

When Agile Isn't

About three years ago I left an NYC app startup—which I will not name here—after just over a year there. The immediate cause was personal: the emotional stress from an increasingly perilous interpersonal environment on top of an unsustainable and severe “crunch time” schedule brought me to my breaking point. Before too long the root causes that ultimately underlay my own departure brought the entire company to the breaking point, as well. For a while there was guilt—I should’ve done more, worked longer, had better ideas, been less of a perfectionist—and then there was a lot of anger. Lately, there’s something like acceptance of what happened and my role in it. I don’t know how closely the mental process I went through followed the stages of grief, but most of them were in there somewhere.

A lot of things were conceived of wrongly, planned wrongly, executed wrongly, and finally went wrongly. This isn’t a story of how a short-funded startup with a popular niche app (at one point featured in an early iPhone “there’s an app for that” TV advertisement) gets put into the ground. Instead, this is a cautionary tale of what can happen when Agile goes wrong. It’s easy to claim to be “Agile” when what you really mean is that you’re just too small to have built up a bureaucracy around software development. In my case, there were some big warning signs that I completely ignored or didn’t know to look for, until it was too late.

The Complete Rewrite

What really attracted me to the company at first was the somewhat unique combination of a popular, seemingly simple app with a loyal user base that also had an atrocious user experience. It was well into the second version, both written by a contract studio, and had accumulated a fair bit of cruft around an initially awkward navigation scheme. It was a fantastic opportunity—fix up the design, make a popular app even better, and put a nice feather in my cap while simultaneously ridding the company of a chain around its ankle. I know I went into the position knowing we were going for “complete redesign,” but I’m pretty sure we didn’t decide on the complete rewrite until later. I don’t think there was any serious discussion of NOT doing a complete rewrite. That was a mistake.

The rewrite is basically pressing a reset button on your app. Almost everything gets thrown out, code-wise. You usually also decide to take advantage of newer tools and technologies, so some knowledge, experience, and time gets thrown out as well, in effect. Any testing, code confidence, or support history is gone. You’re starting completely from scratch. It can be pretty appealing, especially if you have no sentimental attachment to the current code base. It can also be disastrous.

In our case, a rewrite meant cutting ourselves off from Our App 2.0, our users, and all of our success up to that point. Rather than iterating feature-by-feature, cleaning up and improving the app by degrees, and letting our users come along with us while we improved a stable code base into the app we wanted, we instead effectively stopped development—from the outside perspective—and got lost in an increasingly hellish year of trying to recreate what we already had. Finally, when the app was released, an explosion of user anger completely rejected the rewritten app, which plummeted to single-star territory in the App Store.

The Grand Vision

At the root of the guilt I felt after the company folded was my role at the very start of the process. I don’t remember if this happened after we decided on the rewrite or if it was part of the discussion. Either way, I went into a planning meeting for 3.0 to pitch my vision, and it was a doozy. In hindsight, it probably could have been a 3-5 year vision, or even just a concept around which to build reasonable, real-world plans. In actuality, everyone either loved it or accepted it without much comment and it became our 6-month blueprint. Words and phrases I used—often simply as concept or metaphor—were explicitly applied to features and concrete elements in the app. Some of them even ultimately appeared on the marketing website after release.

Even Agile, which eschews specifications and up-front planning, has an ultimate objective that, at its core, will remain more-or-less fixed, barring catastrophe. The Vision was at once too much and too little—too ambitious and over-specified to give us the flexibility to adapt in order to preserve our core objective, and too underspecified to let us anticipate, plan for, and handle the problems we were going to hit. It got to the point where I would cringe every time someone would use one of those words I threw out during the pitch, which were increasingly sacrosanct, even as development dragged on and things just weren’t coming together.

The Business Case

I have to first say that I have the business awareness of a fruit fly—I once accidentally got going on a rant about “vulture capitalists”—while talking to a venture capitalist. If you need me outside of the dev shop for anything it’d better involve coaching and clear expectations of what it is I’m needed to wear, do, and say. That said, I understand the point of a business is to make money, and ultimately the responsibility of the CEO is to see to that. I have tremendous sympathy for a guy trying to turn an early, surprising success in the app space into a going concern with just a few months of time and with less than a million dollars raised.

What that meant for the developers is that the concept that was pitched and embarked upon for 3.0 quickly became an iron-clad part of the business plan. I was in no way involved in or privy to the money-raising or deal-making, but I saw how quickly what should have been a “stretch goal” became a hail mary, a hill for the company to die on for lack of anything else to do. It’s hard to be “agile” when your “minimum viable product” is determined not by your users or the features you want and can implement quickly, but by the need to meet unyielding business case requirements. Our “minimum viable product” was the company-saving potential of a “Wow!” release that would knock everyone’s socks off.

That reality, spoken or unspoken, set the tone for the majority of the development process, but it also manifested in very concrete ways, particularly during the end. We were writing code, designing interfaces, and implementing features for at least one business development deal that could be described, at best, as “ancillary.” Another was in the talking-and-planning stages but, thankfully, didn’t progress to the point of actually bogging down the development work. Agile, like any methodology, must be put to use in service of the company’s broader business goals—but it is also easy for it to be sabotaged by specific business goals. With money running out and everyone on edge it can be hard to see the difference.

The Unending Marathon

For me the most important tenet of Agile is “release early, and release often.” This is the distillation and union of three simple ideas: the minimum viable product, the sprint, and producing “usable software”.

The “MVP” can mean a lot of things to different people—there’s a graphic out there somewhere ridiculing the idea that the MVP for an automobile is a push scooter, rather than, for instance, a bare chassis with 4 wheels and an engine. Whatever actual form it takes, in order to be meaningful it has to be two things: (1) Comprised mostly or entirely of code and features that will continue to be relevant for the entire remaining development process and (2) A fairly small fraction of the entire development process. Simply put: you need both something small and something you can build around.

A sprint is a well-defined period of time (1 or 2 weeks), at the beginning and end of which you should have a high quality piece of software. This is the “usable software” part—sprints are meant to be a sustainable, cyclical process of evolving a piece of software incrementally while maintaining high standards for the product at each juncture.

For various reasons—implicit cultural reasons and unspoken business reasons—we weren’t ever going to consider releasing ANYTHING until we had 3.0 wrapped up in a bow. Unfortunately, it’s really easy for “MVP” and “usable software” to become a joke when nothing actually has to get released to anyone. Sprints become simple deadlines, which are blown through or extended as convenience requires. Our idea of “usable software” was whether the damn thing compiled and passed tests, not whether we had a piece of high-quality, releasable software at the end of each sprint.

On the flip-side, we went through a lot of changes with each sprint. Sprints felt like an excuse or opportunity to make or propose changes in a pretty ad-hoc fashion. Every time a sprint ended and I looked at the awkward, half-finished app there was another big tweak I wanted to make to steer it back towards the Vision. The sprint imposed upon our rigid big-picture plan a lattice which brought the lack of small-picture plan into stark relief. It was easy to see where things on a high level weren’t working, and all I could do to fix that was muck with them on the immediate scale of the sprint and whatever features we were working on at the time.

The closer we got to the finish line, the further away it seemed to be. At the end it felt like we were standing still. It was very distressing to find myself, after 10 grueling months, staring across the canyon between where we were, and where we needed to be a month ago—with no path across. It was increasingly paralyzing, and the only plan anyone else had was just to churn through the remaining holes to get us up to the near edge of the cliff. Rather than let the MVP, each sprint, and our releases guide our development we had random-walked right into oblivion.

And So It Goes

Like I said at the beginning, a lot of things went wrong. I don’t know how things would have turned out if we had been more Agile, in a meaningful way. I’d like to think we’d at least have put out some pretty good software releases along the way. I’d also like to think that, regardless of the outcome, my confidence as a developer and designer would have held up much better. Whether we’d have kept the amity of our users, or satisfied our investors… who knows. Maybe there wasn’t any winning—we were short on time, money, and room to maneuver in our industry. Maybe, in the end, we were always fated to be one of the 90% of startups that fail. The only thing I really know for certain is that whatever we were, from start to finish, we weren’t Agile.

Classes, Methods & Functions—Oh My!

Lately I’ve been thinking a lot about responsibilities, and when a given responsibility should be a class, when it should be a method on a related class, and when it should be a function. Methods are almost always a convenient and straightforward option, but they are also inappropriate for a great many of the things you want to do with/to an object. Refactoring, after all, very often involves restructuring a warren of methods on a single class into a constellation of objects that work together via composition. There are obvious examples of things that are simple to bang out as methods but really shouldn’t be done—saving to particular file formats, generating reports, business transactions—but what big picture rules are there to guide us?

I’m currently writing a series of classes to take a set of data and ultimately render it as a graph in a UIView. The controller is responsible for collating the data into a Graph struct, which is passed to GraphView. We cross the controller-view barrier with a medium-level object that describes the output we want, but we leave the particulars to the view. The first step in getting something that can be displayed (specifically, a CGPath to render using CAShapeLayer) is to use a GraphVectorizer object to generate a description of the graph as a path. GraphVectorizer is a protocol—so that different styles can be implemented as separate classes—with the GraphView being agnostic as to which one is actually used.

GraphVectorizer does not return a CGPath. CGPath is an opaque data type, and while technically it can be introspected in a limited fashion it isn’t really amenable to being compared to other CGPath values all that easily. GraphVectorizer isn’t simply doing grunt conversion work, however—a lot of our important logic about how things get displayed lives in these classes, with the potential for edge and corner cases. In order to facilitate easy testing, we instead return our own transparent Path type, which is essentially an array of CGPathDoSuchAndSuch function calls stored as enums. For each style we can vectorize a Graph, compare the returned array, and be confident that we’re going to end up with the CGPath we want to display.
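As a rough sketch of the shape such a type could take (the names and cases here are illustrative, not the real ones):

import UIKit

// Illustrative sketch: a transparent list of drawing commands that tests can
// inspect and compare element by element.
enum PathElement {
  case Move(CGPoint)
  case AddLine(CGPoint)
  case Close
}

struct Path {
  let elements: [PathElement]
}

extension PathElement: Equatable {}
func ==(lhs: PathElement, rhs: PathElement) -> Bool {
  switch (lhs, rhs) {
  case let (.Move(a), .Move(b)):       return a == b
  case let (.AddLine(a), .AddLine(b)): return a == b
  case (.Close, .Close):               return true
  default:                             return false
  }
}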

The question now is what form does the code to turn my transparent Path type into a CGPath take? Pragmatics dictates that it simply be a method on my Path type—this will only ever conceivably be used as an intermediary for generating CGPaths, and we’ve already decided to couple to the CGPath interface fairly tightly. But step back for a second and consider that we might have other drawing system possibilities at play—perhaps something OpenGL based, or the slightly higher level UIGraphics. I often feel stuck seesawing between the unsatisfying options of a very simple—often single method—class, or a top-level function, floating off by itself. A third option—static methods on a bucket struct—is equally unsatisfying.

I’ve been ruminating on some rules to help guide myself in these situations, and others. These are just possibilities, and nothing I’ve set in stone:

  1. Instance methods can receive and return values of the same type, or a lower-level type. Equals should meet only in a neutral place. Thus, for instance, a PNG could take a UIColor and return a count of pixels close to that color, but it could not take a JPEG and return an estimate of how similar the images are, nor could it return a JPEG from a conversion method.
  2. Instance methods should never return a higher-level type.
  3. When two different types that are “equal enough” need to interact, the default should be a full class, for flexibility of implementation.
  4. If the implementation devolves into a single method, it should be removed to a free function.

Protocols and Extensions

As should be obvious from the above, I’m writing an iOS application. Not so obvious is that I’m using Swift, and not Objective-C. Swift allows the extension of types with locally visible additions. Random new methods could be added to a type, or protocol conformance could be added. It’s a very powerful feature, if a bit uncomfortably close to monkey-patching for my taste.

Is this a situation where a CGPathConvertible protocol could be declared, and an extension to my Path type provided to implement that conversion? It depends. My rule on extensions is that, if they’re not exceedingly low-level additions, then they should be exceedingly simple. An extension might be a good place for code that hits rule #3 to end up, provided it doesn’t violate rule #2. One can imagine a Rails-esque extension to Int along the lines of - number(int: Int, ofThing: Thing) -> [Thing], and weep.
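For illustration, building on the Path sketch above, the extension route might look something like this; the CGPathConvertible name is hypothetical, and whether it belongs in the codebase at all is exactly the judgment call described above.

import UIKit

// One possible shape for the extension approach: the protocol names the
// capability, and the extension keeps the CGPath coupling out of Path itself.
protocol CGPathConvertible {
  var cgPath: CGPath { get }
}

extension Path: CGPathConvertible {
  var cgPath: CGPath {
    let result = CGPathCreateMutable()
    for element in elements {
      switch element {
      case let .Move(point):    CGPathMoveToPoint(result, nil, point.x, point.y)
      case let .AddLine(point): CGPathAddLineToPoint(result, nil, point.x, point.y)
      case .Close:              CGPathCloseSubpath(result)
      }
    }
    return result
  }
}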

How Do I Unit Test This?

Hang out in IRC, Slack, or Gitter rooms for open-source projects for a few days and before too long you’ll see someone ask how to unit test some part of their app. It’s particularly common with large frameworks that encourage inheritance over composition, which usually results in a great deal of environmental setup standing in the way of efficient, automated testing on a unit basis. It sometimes makes me feel bad, but usually my answer is: you can’t.

If you’ve lashed your code so tightly to your framework that you need to jump through hoops to test it, then you’re almost certainly not unit testing it. Testing code that’s in a subclass of ActiveRecord::Base is an integration test. Testing how an Angular component renders using the framework’s templating system is an integration test. It’s hard to write a unit test when your app is forcing you to write an integration test.

Why do we even test at all?

When it comes to testing—any testing—one must always keep in mind that the actual point of the testing is to help us write better software, not to meet some quota for code coverage or tests written. So many devs are content to write bullshit, space-filling tests just to keep up appearances, or out of a sense of obligation. The emphasis in some communities (cough-ruby-cough) on “test-driven” design or development is particularly problematic here, since too often there’s an over-emphasis on writing the test first as the only hallmark of TDD, and a complete ignorance of how to let the test drive the code—the actually important part.

“Best practices” or “being idiomatic” aren’t magical outs here, either. Design patterns and best practices are great, insofar as they actually result in good code design. It is self-evident, however, that if the way a developer codes and tests is predetermined by cookie-cutter-style conventions then that developer is not letting the tests drive anything other than the clock. While this often puts the lie to claims of test-driven development, it isn’t just a concern for aspiring practitioners of TDD—awkward, jury-rigged, and brittle tests should be setting off alarms and clueing us into code smells and technical debt whether we write tests before the code or after.

Just calling it a unit test doesn’t make it a unit test

When it comes to unit testing in particular—where TDD is most natural and effective—there are two rules to follow in order for something to be a unit test, in a meaningful sense:

  1. You need to be able to mock any dependencies of the unit
  2. You need to own all the dependencies of the unit

These rules, just like the rule of writing tests at all or writing the test first, are in service of a higher goal: allowing the process of writing tests to make it clear to us where our design needs to change. This is the most basic way tests “drive” development—by encouraging design choices that make it possible to test in the first place.

Two approaches to mocking

One approach to the first rule is to figure out how to reduce and simplify your dependencies. Just by chopping up one class into several—each with one or two dependencies—the code almost magically becomes much more easily tested, refactored, and extended. This is a classic example of the test driving the improvement of your code by encouraging the separation of responsibilities.

The second approach is to look at your oodles of dependencies and piss and moan about all this mocking you have to do. Slog through it for a few hours. Pop into a chatroom. Let someone tell you that you can just test directly against the database. Write an integration test disguised as a unit test. Finally, call it a day for the rest of your career.

One reason so many developers insist on the tests adapting to fit their design, rather than the other way around, is because it isn’t actually their design at all. Frameworks that encourage code to be piled into a handful of classes that fit a set of roles determined by some development methodology do developers a disservice. Frameworks aren’t bad, necessarily, but when it’s considered “best practice” for the developer to forfeit all responsibility for their app’s architecture and their code’s design, it makes it impossible for the developer’s tests to inform the development process.

If you start off by subclassing someone else’s code you’ve almost certainly fallen afoul of the first rule right from the start. You’ve introduced a massive, irresolvable dependency into the very foundation of your code. Sometimes you’ll have little choice but to rely on scaffolding provided by the dependency in order to test your own code, integration style.

The two operative words in the first rule are “you” and “able.” The rule isn’t “It must be theoretically possible for someone, with unbounded knowledge of the dependencies, to mock the dependencies,” or even “You need to have mocks for the dependencies, from wherever you can get them.” If you can’t look at the class and immediately know what needs to be mocked and how to mock it, that should be a huge red flag.

Only mock what you own

The second rule is a consequence of a third rule: only mock what you own. You own your project’s pure classes, and to the extent that you subclass you own whatever logic you’ve added. You don’t own the base classes, despite their behaviour being incorporated via inheritance. This is another rule where the face value isn’t so much the point of it as the consequences: by only mocking your own classes, you’re pushed into building out facade and bridge classes to formalize the boundaries between your app and any external systems.
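As a sketch of that boundary-building in Swift (all names hypothetical): a tiny protocol you own, one boring class that touches the real dependency, and a trivial double for tests.

import Foundation

// The app depends on this tiny protocol it owns, never on NSUserDefaults directly.
protocol SettingsStore {
  func boolForKey(key: String) -> Bool
  func setBool(value: Bool, forKey key: String)
}

// The only class that touches the real dependency.
class UserDefaultsSettingsStore: SettingsStore {
  private let defaults = NSUserDefaults.standardUserDefaults()

  func boolForKey(key: String) -> Bool {
    return defaults.boolForKey(key)
  }

  func setBool(value: Bool, forKey key: String) {
    defaults.setBool(value, forKey: key)
  }
}

// A trivial, trustworthy double for tests: no stubbing of someone else's class.
class FakeSettingsStore: SettingsStore {
  var values: [String: Bool] = [:]

  func boolForKey(key: String) -> Bool {
    return values[key] ?? false
  }

  func setBool(value: Bool, forKey key: String) {
    values[key] = value
  }
}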

Tests are much more confidence-inspiring when the mocks they depend on are rock-solid doubles of tiny classes each with a single responsibility. Tests that instead stub one or two methods on a huge dependency are brittle, are prone to edge cases, increase coupling, and are more difficult to write and tweak with confidence. Tests of classes that themselves have to be stubbed are almost worthless.

Thinking outside of the class

Following these three rules can help put the focus back on writing well-structured, maintainable code. It’s not always obvious, however, what changes need to be made. If a developer is staring at a class that descends from ActiveRecord::Base, and which includes a couple of plugins, along with a raft of methods that all need to be tested it’s understandable to look askance at the notion that AR and those plugins should be expunged in order to test the class. After all, without AR they don’t even have a class to begin with, right? The path of least resistance all too often is just to write an integration test using the entire stack.

In these situations one must keep in mind that “unit” and “class” are not identical, and to ask not “how can I possibly remove these dependencies from my class” but “how do I remove my code from this class, which I don’t really own?” By moving those methods off to other classes as appropriate (formatting, serializing, and complex validations are things that might be on an AR class that can easily be broken out into their own plain-old-ruby classes) we’ve accomplished the same thing. So much ActiveRecord-dependent code can be refactored to depend only on a hash (or OpenStruct) of attributes.

It’s possible to use monolithic frameworks and still care about good design. Finding ways to take ownership of our code away from the framework is crucial. Your tests should be a searchlight, pointing out places where your code is unnecessarily tangled up in someone else’s class hierarchy.

Preventing bad testing habits

Developers often begin their professional life with a few high-level heuristics that are, unfortunately, continually reinforced. A few relevant ones:

  1. Minimize the number of classes to write and test
  2. “DRY” code up by relying on libraries as much as possible
  3. MVC means my app is made up of models, views, and controllers

It’s not difficult to see how these lead to large, fragmented classes tightly coupled to oodles of dependencies. The resulting code is going to be difficult to test well in any circumstance, and will bear little resemblance to anything that was “test-driven.” I’d like to suggest some replacements:

  1. Minimize the number of dependencies per class.
  2. Minimize the number of classes dependent on an external dependency.
  3. Write the code first, worry what “category” each class falls into later.

The first will result in more classes, but they’ll be more easily tested, refactored, and maintained. The second encourages dependencies to be isolated into bridge, adapter, or facade classes, keeping the dev’s code dependent on interfaces he or she owns. The third breaks the MVC (among others) intuition pump that says every class we write has to fit one of two or three possible roles. A dev utilizing these heuristics will find themselves asking “how do I unit test this?” far less frequently.

Now, “how do I integration test this” is a different question entirely… more on that later.

Monads and Ruby

Monads have a weird and varied reputation outside of the FP universe. For Rubyists, in particular, monads and functional programming can look alien and nearly unparseable. Ruby is aggressively object oriented—it doesn’t even have first-class functions, technically—and the foreign nature of a lot of the background necessary to grok monads leads to indifference at best and hostility at worst.

On that score, I once overheard (after mentioning monads at a Ruby meetup) someone define a monad as “something assholes talk about to seem smart.” There is way too much knee-jerk rejection by some in the Ruby community of things they don’t immediately understand or find comfortable, but that’s another post—or multi-year psychological survey—entirely. This isn’t an article about why monads are awesome and why Ruby devs should love them.

Other than the indifferent and the hostile, there’s also a weird middle group of Ruby developers who are enthusiastic about monads, but who drastically overthink their implementation. I recall coming across a project that was mostly just an ersatz implementation of algebraic data types and type checking in Ruby. I definitely appreciate the benefits of those things, but Ruby just does not have either, and besides that we don’t need them to use monads—in Ruby or any other language.

What is a monad? A refresher

At their core, monads are just another design pattern, like the command or visitor patterns. Here’s a simple definition of a monad (or at least I think it’s simple), courtesy of Jed Wesley-Smith and paraphrased by me:

A type T which encapsulates a value a (T a), and for which there exist
functions such that:

a → T a
T a → (a → T b) → T b

What’s interesting about this definition is that, in a philosophical sense, something is a monad regardless of whether you actually write down and implement those functions in code. Either the functions exist, and someone somewhere could write them and use them in their code, or they don’t.

This is the concept of mathematical realism, which underlies the notion that we “discover” mathematics as opposed to invent it. Max Tegmark, an MIT physicist, extends this into his hypothesis that the universe itself is essentially mathematical, and, as a consequence, all possible mathematical structures exist, in some meaningful sense. Here the idea is much simpler: if it is possible for an object to be a monad, then it is a monad, whether that was your intention, or not, and regardless of the extent to which it looks like a monad in another language.

As Rubyists, things shake out even more simply since we don’t have much in the way of typing to worry us. If we implement #bind (the second function) there’s no mechanism for defining or enforcing type signatures, so #bind and #map have the exact same signature in Ruby. As a result of duck typing the only real type signature is arity, but that being the case remember that a monad in the general case is a mathematical entity—not a type or a class—and as such it is and remains a monad only so long as we use it as one.

Ruby almost has a built-in monad already

We’ve already recognized the similar shape of #bind and #map, but what about that first function, usually called #return? #return, being a method that takes a value and returns an instance of a type, is, in Ruby, just a constructor. Actually, it isn’t strictly identical: with return, there’s a universal interface, while initializers have hard-coded and peculiar names. This is a direct consequence of dynamic typing and the differing natures of OO vs Functional Programming.

Array, of course, has both an initializer and #map. Can Array#map function as Array#bind? Unfortunately, not quite. Let’s look at that signature again.

T a → (a → T b) → T b

In terms of Array, this looks like

Array a → (a → Array b) → Array b

So, #bind takes a block that returns an Array of a given type, and then itself returns an Array of that given type. #map doesn’t work that way. If you tried to use #map like #bind you would get the following:

[1, 2, 3].map { |x| [x + 1] }
=> [[2], [3], [4]]

Clearly, not what we wanted. #map gave us an Array of Arrays, not an Array of Integers like we’d expect from #bind. Luckily for us, it looks like there’s a simple transform from one to the other. We’re just one #flatten call away from having an Array monad, in theory:

refine Array do
  def bind(&block)
    map(&block).flatten
  end
end

[1, 2, 3].bind { |x| [x + 1] }
=> [2, 3, 4]

With just five lines of code we seemingly now have a monad in Ruby, no complex type enforcement necessary. It remains, however, incumbent on us, the developer, to maintain fidelity to the monad requirements, as with all other informal contracts in our code. With our implementation above we could ignore the laws and use #bind exactly the same as we’d use #map, and it, surprisingly or not, would work just fine:

[1, 2, 3].bind { |x| x + 1 }
=> [2, 3, 4]

This works thanks to the specifics of our implementation, and we all know not to depend on knowledge of an implementation rather than interface when we rely on a library method, right? In fact, in this case it’d be an even bigger mistake, because the implementation is flawed. It works for Arrays of numbers, strings, your own classes, etc… but it doesn’t work for Arrays of Arrays.

[[1], [2], [3]].bind { |x| [x + [4]] }
=> [1, 4, 2, 4, 3, 4]

The monadic law has been broken: #bind has given us an Array of Integers, instead of an Array of Arrays of Integers. We can tweak our implementation to fix this, but in doing so we’d break (and have to fix) any uses of #bind that ignore the law and treat it like #map:

refine Array do
  def bind(&block)
    map(&block).reduce(:+)
  end
end

[[1], [2], [3]].bind { |x| [x + [4]] }
=> [[1, 4], [2, 4], [3, 4]] # Good!

[1, 2, 3].bind { |x| x + 1 }
=> 9 # Bad!

[1, 2, 3].bind { |x| [x + 1] }
=> [2, 3, 4] # Good!

So that leaves us with an Array monad, which is of limited usefulness without the other List goodness in Haskell. A far more universally useful monad is Maybe (or Optional for Swift devs).

Call me (maybe)

Maybe represents the possibility of there being a value, or there not being a value, without using nil. This means we can call methods on the result of an operation without worrying about which situation we’re dealing with. If we actually do have a value, calling #bind (or other related methods) operates on the value. If we don’t have a value, #bind short-circuits and simply returns the empty Maybe. It’s basically Rails’s #try on steroids.

# foo stands in for any operation that returns either Some or None
b = foo(5)           # => #<None>
b.map { |x| x * 2 }  # => #<None>

c = foo(2)           # => #<Some @value=2>
c.map { |x| x * 2 }  # => #<Some @value=4>

Maybe and Optional are the names of types for this monad in Haskell and Swift, respectively, but that doesn’t mean there has to be a corresponding class in Ruby. Haskell and Swift’s implementation uses algebraic data types, which are great, but they’re not objects and Ruby doesn’t have anything similar. So when we talk about Maybe in Ruby, we’re not actually talking about anything called Maybe in our code, but the coupling of two types that we can implement: Some and None. In a sense that’s all Maybe/Optional are, as well: a combination (called a tagged union) of two other types.

Here are the Haskell-ish type signatures for Maybe (Haskell uses Just and Nothing rather than Some and None):

Maybe a = Some a | None
bind :: Maybe a → (a → Maybe b) → Maybe b

#bind is the same as before: it takes a function that receives the value and returns another Maybe, and itself returns a Maybe. So, a block passed to #bind has to return either Some a with a new value or None without a new value. There are no other choices. Ruby obviously will let us return anything we feel like, or even however many different kinds of anything we feel like. We can’t rely on type checking to help us here anymore than we ever can, and trying to build some ersatz type enforcement just for this special case makes no more sense than it ever would. So, what do we do? Well, we return either Some a or None. It’s as simple as that.

Some = Struct.new(:value) do
  def bind(&block)
    block.call(value)
  end
end

None = Class.new do
  def bind
    self
  end
end

That’s it. That’s all we need in order to conform to the laws. There is, almost literally, nothing to it when written down in actual code. Concerned that #bind will let you return anything you damn well please? So will almost any other Ruby method, so don’t sweat it too much. I don’t mean “don’t test your code,” or anything so laissez-faire, but don’t get too caught up in the lack of type checking. That’s a red herring, and either way your monadic code isn’t any worse off than the rest of your Ruby code in that regard.

One very real downside to forcing your code to care about return types is that you lose the benefit of duck typing, and couple your use of a monad to your specific implementation. Theoretically, if you were to use a library or other shared code with methods that returned a Maybe, its return values should be interchangeable with your implementations. Some#bind will work as expected, None#bind will short-circuit as expected, and so on. Now, there might be other differences you care about (particularly around what utility methods are implemented/exposed), but when it comes to the monad type, the behaviour of #bind is the only thing that matters.

One more thing to be careful of: #bind has to return a monad of the same kind. Returning Some from an Array#bind call, or [] from a Some#bind are both monads, and will both respond to #bind in turn, but they aren’t valid invocations. You can nest #bind calls, of course, but when it comes time to return, make sure you’re returning the same kind of monad as you started with.

arr = [2, 5, 7, 9, 4]

greater_than_5 = ->(x) { x > 5 ? None.new : Some.new(x) }
even           = ->(x) { x.even? ? Some.new(x * 2) : None.new }
increment      = ->(x) { Some.new(x + 1) }

composed       = ->(x) { [Some.new(x).bind(&greater_than_5)
                                      .bind(&even)
                                      .bind(&increment)] }

arr.bind(&composed).reject { |x| x.is_a?(None) }
=> [#<Some @value=5>, #<Some @value=9>]

I have, in my own projects, a pair of module methods—Monad.bind and Monad.compose—for simplifying monad composition. compose actually is just a bit of sugar on top of reduce and bind, which does the heavy lifting. By using Procs and composition it’s trivial to build up a set of simple transforms into more complex operations. They’re very easily tested, as individually they’re just procs which respond to #call, same as always.

Either

The Either monad is similar to Maybe, except instead of just one value or nothing you get either a Left with a value or a Right with a value. Left often represents an error, and Right a successful result. The implementations are equally simple:

Right = Class.new(Some) # Right is identical to Some

Left = Struct.new(:error) do
  def bind
    self
  end
end

Conclusion

Does the rigid type enforcement of a language like Haskell or Swift help catch bugs in your use of monads that you might trip over in a permissive environment like Ruby? Absolutely. That doesn’t mean monads have no value to Rubyists, or that we have to turn the language on its head to mine that value. We can build some nice APIs on top of these basic implementations, of course, to add some safety or convenience, but at their core there’s nothing about monads that’s incompatible with Ruby or that even qualifies as nonidiomatic Ruby.

Using Github OAuth For Sign-in

This weekend I went to implement sign-in with Github (specifically, since my app is dev-focused). Having used it to sign into apps in the past, I had the vague idea that Github was an OpenID provider. It’s not, it’s just a plain-old OAuth2 server, so even more than usual it’s incumbent on the developer, i.e. me, to ensure it is used securely. There are a few major considerations.

It should be well-known—but probably isn’t—that OAuth 2, by itself, is an authorization protocol, not an authentication protocol. In authorizing a user, we want to gain access to some protected resource. In authenticating a user, we want to affirmatively establish their identity. OAuth specifies a rock-solid flow for the former, but mostly ignores the latter. This might seem counterintuitive if you’re used to thinking of OAuth as an authentication mechanism, but when you consider that the point of the protocol is to enable password-less access to data owned by a third-party service it actually makes a lot of sense that authentication for your app is out of scope.

It’s still possible to use OAuth2 as part of a secure authentication process. In fact, there are extensions that build on OAuth2 to create such an authentication protocol, such as OpenID Connect or Facebook Login. OAuth isn’t inherently insecure, it just doesn’t go that last mile to provide the functionality we want by default. Absent implementation of an additional authentication layer spec, there are steps that an application developer can take to use OAuth for authentication.

Properly configured SSL is absolutely essential

SSL can be an incredible pain to setup: acquiring the certificates, processing them in obscure ways, installing them, configuring it all, and then dealing with potential errors and issues in your code. OAuth 2 relies heavily on SSL to provide security, however, and as such eschews techniques such as token signing to provide fallback security. Improper use of SSL can expose client secrets in addition to communications between your server and the browser. When faced with certain errors it can be tempting to toggle obscurely named settings that seem to resolve the error. Oftentimes this can have the effect of severely crippling SSL, for instance by disabling certificate verification.

Only use the server-side authorization grant

Using the implicit grant is inherently insecure, as a “bad actor” app developer can obtain an access token for someone who has logged into their app and has also logged into your app. That token can then be injected into your app. Your app would have no way of knowing that the access token provided was issued to another app entirely. If a user is logged in based solely on verifying this access token, then your user’s account can be completely compromised. Github does not support the implicit grant.

Block cross-site request forgery attacks

OAuth2 provides a ‘state’ parameter that, when provided as part of the authorization request, will get returned as part of the callback. This parameter should be an unguessable string that can be verified, usually by tying it to the session. Cross-site attackers won’t be able to guess the ‘state’ and thus won’t be able to inject arbitrary access tokens into your app. This sort of attack is obviously significantly more dangerous when it comes to apps that export data.
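The mechanics are language-agnostic; here is a rough sketch in Swift (session storage omitted, helper names hypothetical) of generating and checking a state value.

import Foundation
import Security

// Generate an unguessable 'state' value to store in the session and send
// along with the authorization request.
func makeOAuthState() -> String {
  var bytes = [UInt8](count: 32, repeatedValue: 0)
  SecRandomCopyBytes(kSecRandomDefault, bytes.count, &bytes)
  return bytes.map { String(format: "%02x", $0) }.joinWithSeparator("")
}

// On the callback, reject the request unless the returned state matches the
// copy tied to the session.
func stateIsValid(returnedState: String?, sessionState: String?) -> Bool {
  guard let returned = returnedState, expected = sessionState else { return false }
  return returned == expected
}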