Distributed computing fallacies and REST

One thing to keep in mind when looking at consuming or exposing REST APIs is the list of fallacies of distributed computing:

  1. The network is reliable.
  2. Latency is zero.
  3. Bandwidth is infinite.
  4. The network is secure.
  5. Topology doesn’t change.
  6. There is one administrator.
  7. Transport cost is zero.
  8. The network is homogeneous.

If you’re consuming a REST API that you don’t own, or exposing a REST API to other systems that you don’t own, the distributed computing fallacies still apply. The one I see forgotten most often is #1: that the network is reliable.

If I’m relying on a 3rd party REST service to be “up”, and my business depends on that other side being up, I’d better have some sort of mitigation plan other than an error page for a user. You might be able to set a higher bar if you own both sides of the equation, but when you don’t?

Instead of looking at things in terms of “IF”, think instead in terms of “WHEN”. WHEN the other side is not available, how can I still provide a viable experience for the end user? Does their experience require the other side to be up, or can I look at concepts such as Eventual Consistency and Asynchronous Messaging to improve their experience (and improve scalability/reliability)? Have I looked at the coupling of my system to others, in terms of Temporal and Behavioral coupling?

If I’m building mashups of multiple systems to quickly build something of potential interest, perhaps these issues aren’t necessary to look at yet. But if I’m building business-critical applications and my ability to serve customers depends on 3rd party REST APIs, I should probably revisit the above problems.

In my systems that interact with 3rd party REST APIs, it’s pretty straightforward. Each read caches data locally, and each write is wrapped in an NServiceBus message. I can scale out very easily and am able to mitigate many of the distributed computing fallacies.
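A minimal sketch of that read/write split, in Python for brevity. It is not the code from those systems: the in-memory deque only stands in for a durable queue such as NServiceBus, and `CachedReader`, `QueuedWriter`, and `RemoteUnavailable` are hypothetical names chosen for illustration.

```python
from collections import deque
from typing import Callable, Optional


class RemoteUnavailable(Exception):
    """Raised by the (hypothetical) remote client when the 3rd party is down."""


class CachedReader:
    """Serves reads from the remote API, falling back to the last good copy."""

    def __init__(self, fetch: Callable[[str], str]):
        self._fetch = fetch
        self._cache = {}  # key -> last value successfully read

    def read(self, key: str) -> Optional[str]:
        try:
            value = self._fetch(key)
        except RemoteUnavailable:
            # WHEN the other side is down: stale data (or None) instead of an error page.
            return self._cache.get(key)
        self._cache[key] = value
        return value


class QueuedWriter:
    """Accepts writes immediately and delivers them once the remote is reachable."""

    def __init__(self, send: Callable[[str, str], None]):
        self._send = send
        self._pending = deque()  # assumption: a real system would use a durable queue

    def write(self, key: str, value: str) -> None:
        self._pending.append((key, value))

    def flush(self) -> int:
        """Try to deliver queued writes in order; return how many were delivered."""
        delivered = 0
        while self._pending:
            key, value = self._pending[0]
            try:
                self._send(key, value)
            except RemoteUnavailable:
                break  # leave the message queued and retry on a later flush
            self._pending.popleft()
            delivered += 1
        return delivered

    @property
    def pending(self) -> int:
        return len(self._pending)
```

The caller never waits on the remote side for a write, and a read degrades to stale data rather than failing, which is what removes the temporal coupling.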

If I’m going to expose a REST API for a business-critical system to a 3rd party, and they’re not able to mitigate these issues, it’s almost better NOT to expose an API directly and instead go through other implicitly asynchronous models, such as straight-up FTP file drops. It’s not ideal, but if the 3rd party doesn’t know how to integrate with a REST API without complaining when I take the server down for maintenance, then I go the file drop route.

It’s better at that point to pick an integration strategy that accidentally addresses the distributed computing fallacies than one that 3rd parties don’t know how to address.

About Jimmy Bogard

I'm a technical architect with Headspring in Austin, TX. I focus on DDD, distributed systems, and any other acronym-centric design/architecture/methodology. I created AutoMapper and am a co-author of the ASP.NET MVC in Action books.
This entry was posted in Architecture, DistributedSystems.
  • +1 I think that this is the biggest inhibitor to the integration of cloud technologies (AWS, Azure, maybe even including outsourced email) for business-critical operations.

  • I totally agree. The fallacy is not only that the service is always there; the other bullet points matter just as much.

    (I know you could look at all the return status codes, but IMO that is a lot of defensive coding that creates a mess on the client)

  • These are great principles, but I don’t see why you think that moving away from HTTP is necessary because of them. Don’t ignore the robustness built into HTTP through proper use of caching and polling/retries. When a client needs a higher reliability SLA than a server is currently built to provide, the solution is to implement a cache. This can be either client side or server side (or potentially in an intermediary, e.g. a CDN). This can be combined with eager fetching to seed the cache. For example, support a collection resource that provides a feed of recent changes and poll the hell out of it. Leverage ETags so that this polling is extremely efficient.

    I don’t buy your FTP file drop point at all. Why exactly can you not use HTTP instead of FTP to implement the file drop? It almost seems like you’ve forgotten that it’s OK to serve static content via HTTP. When you do this properly, your client can leverage HTTP’s caching and change detection capabilities (ETags, conditional GET). Even if your content is dynamic, if you front your app with a cache, you can force a cache refresh just before your maintenance. One strategy for this is to use archived Atom feeds for your business events. Because an archive never changes, it is extremely cacheable static content. Only the most recent page is actually volatile, and so you can persistently cache each archive for a very long duration. To seed the cache before maintenance, GET the volatile page and roll forward until you hit a page that was already cached. This is basically adding your file drop right in front of your dynamic app. If you notice, having clients poll an archived Atom feed is essentially the eventually consistent, asynchronous messaging solution you mention, but I haven’t moved away from the standard HTTP stack.
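
    The conditional GET mentioned in this comment could be sketched roughly as follows, using only Python’s standard library. The function name `conditional_get` and any feed URL passed to it are assumptions for illustration; the commenter did not supply code.

```python
import urllib.error
import urllib.request
from typing import Optional, Tuple


def conditional_get(
    url: str, etag: Optional[str] = None, timeout: float = 10.0
) -> Tuple[Optional[bytes], Optional[str]]:
    """Fetch `url`, sending If-None-Match when a previous ETag is known.

    Returns (body, etag). The body is None when the server answers
    304 Not Modified, meaning the locally cached copy is still current.
    """
    request = urllib.request.Request(url)
    if etag:
        request.add_header("If-None-Match", etag)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read(), response.headers.get("ETag")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # urllib reports 304 as an HTTPError; here it just means "unchanged".
            return None, etag
        raise
```

    A polling client would keep the returned ETag next to its cached copy and pass it back on the next call, so an unchanged feed costs only a small 304 response.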

  • I do CSV via SFTP unless someone gives me a very compelling reason why it needs to be any more complicated. So far, after about 5 different jobs and about 30 or so different integration scenarios, nothing’s beat CSV via SFTP.  Developers wanted to do fancy SOAP or REST or whatever APIs and go “real time” or integrating through sprocs or log shipping or some other complicated, over-engineered error-prone solution, but CSV via SFTP takes a few minutes to set up and works every time.  I know I sound like a curmudgeon, but I’ve tried it the other ways and they have tons of problems and cost a lot more. 

  • Chris Brandsma

    Then if you go mobile you get  a new #1: The network is everywhere.
    It is common for people to think that data connections can be obtained anywhere, and they can’t. Also, once you go to a 3G/4G network, the latency and cost of sending data double (at best). Latency alone is often at least half a second, and that is on a good day.

    It should go without saying, but this is true for all phones.
