I remember when this sort of thing used to be called SOA. Before that there was n-tier, before that client-server, before that people used to talk about terminals.
What has been will be again,
what has been done will be done again;
there is nothing new under the sun.
The best place to park different parts of the logic varies according to the algo-economics of the day. Right now people have fast client environments, so the balance is tipping back to heavy clients.
SOA (service-oriented architecture) is a pattern for splitting an application into discrete pieces of software that provide application functionality as services to other applications.
Fat vs Thin is more of a debate about where to place logic. Thin implies controllers are small and models have a wider responsibility. Fat is the inverse, where controllers do the majority of the work and models are a small abstraction.
My point is this: They are not the same thing. SOA helps in designing scalable, robust systems. Fat vs Thin helps you decide where logic should go.
Well, unless you design them for high performance from the very beginning, SOA also helps in building slow systems that your customers will hate and that will fail in the marketplace if the people who buy the software actually use it.
(I was at Atlassian's San Francisco office a few months ago and I said half-jokingly that I quit my last job because I couldn't stand waiting for Confluence, and all the other off-brand webapps my employer used, to load.)
Now I use GitHub, which offers true interactive performance even over crappy DSL connections. I'm not going to work for anybody who makes me use some off-brand webapp that makes me wait 30 seconds to put time on a ticket or to edit a wiki page.
SOA can be part of the solution rather than the problem, but it takes a clear architectural view from day one and a commitment to avoid stupid and obvious mistakes, as the article points out.
Not sure why you're blaming SOA there. Atlassian apps are standalone instances so, assuming your former employer self-hosts, it has more to do with your own server and network performance.
In fact GitHub is most likely closer to SOA architecture than Confluence is.
And whether it is SOA or not, the moral is that many many systems are designed and sold as if performance didn't matter.
I had another job where the team toiled for years to develop an SOA system that, when we first stacked it all up, took 45 minutes to boot up an RIA because it did 50,000 requests to initialize its state.
Sure, this is stupid design, but people do this all the time.
I was able to cut the boot time to 20 seconds by packing much of the data with the application file and creating one "initialize" call that downloaded all of the data necessary to initialize a particular instance of the app in one shot.
Had performance been built into the app from day one, it might have seen the light of day.
Fat vs. Thin does not apply to a single topic. It can apply to controllers and models, but that's an exhausted argument: code in models is more reusable, and fat controllers are, in my experience, the most obvious symptom of not understanding DRY principles and of lacking good conventions and architecture. In the client-server realm, fat vs. thin is about how much code runs on the client side, i.e. how much JavaScript to deliver and how much of the UI the client handles before the server is bothered.
You can have a thin server with fat controller methods, or vice-versa. The point is the fat vs. thin battle happens in a lot of different areas, not just controllers.
I came in here to find exactly this position, and I agree with you 100%.
What I think, as an aged software developer who has watched it all evolve, multiple times over, is that there really and truly is a software generation gap, whereby students of newly minted Comp-Sci'ish "Status-"Education Curricula are suddenly thrust upon the world, to do things new.
The problem is that too much studying is going on, and thus ignorance of actual industry predisposes re-invention. If Comp-Sci students started at 18 as junior programmers (like the good ol' days, before there were such things as Computer sections in bookstores..) and just got immediate practical experience across a broad swath of industrial computing applications, there'd be a lot more comprehension, on the part of the student, of just how much has been implemented in the Big World.
It's not a bad thing that there is all this sudden New School re-invention of long-abandoned tools and widgets; it's just that there's an awful lot of grommets involved in the cultivation of it all, and sometimes .. yes kids, things do get 're-discovered' and pitched as 'newly invented and described here for the first time, ever, on this blog!' when perhaps a little actual investigation would lead the adventurer to realize they were in fact already in well-charted territory.
It's amusing, though, to see "API-first is what we call it" generate yet another word in the big, tempestuous, incestuous cloud that is Computing ..
The truism is this: those who ignore history are liable to repeat it. This works in code thus: if you haven't got an implementation, implement. Dodgy stuff, that 'got'.
One massive advantage of the API first architecture that wasn't mentioned in this article is that it makes proper unit testing much easier because of the clear distinction between the server side API code and the client side JavaScript/Mobile app.
Of course it is possible to test a "fat" server that does view and presentation logic, but it is much easier and more efficient to test a thin server and have a separate suite of tests for the client side app.
I've always been curious how people do integration tests on things with multiple servers (like SOA stuff), it seems like it'd be dreadful, unless you use Vagrant to (quite slowly) spawn a new VM every time you want to test.
For the most part we only spin up multiple JVMs on the same machine but it's not a huge leap to use multiple machines. We don't do so regularly because, as you said, Vagrant takes forever to spin up a new VM. Our Jenkins server does spin up multiple JVMs for our integration tests though.
I've used mocking for this purpose. Not in the same way you would for unit testing, of course. But if you want to integration test the (entire) workings of a consumer of a service, then your test starts with its entry point (whether a form submission, an API request, or whatever). Then if part of your consumer's business logic relies on a call to an external SOA service, you mock that service. In other words you say, "Given this particular form of request that would be sent to the SOA service, respond with this particular service response". Then your consumer will continue on with its business logic, all the way up until it returns its response to your original request, and you can apply your assertions then.
It's not a true end-to-end test, which would require all your services to be up, but it is an integration test since it tests the entire workings of the consumer. The assumption is that other test suites would test each of the services, and maybe a ping test to test the availability of the services themselves.
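A minimal, self-contained sketch of that style of test in Node/TypeScript. The `signup` consumer, the port, and the canned response are all invented for illustration; in a real suite the consumer would be your application's entry point, reading the service URL from configuration:

```typescript
import http from "node:http";
import assert from "node:assert";
import { test } from "node:test";

// Stand-in consumer: calls the "SOA service" and acts on its answer.
async function signup(email: string, pricingUrl: string): Promise<number> {
  const res = await fetch(`${pricingUrl}/quote?email=${encodeURIComponent(email)}`);
  const quote = (await res.json()) as { approved: boolean };
  return quote.approved ? 201 : 402;
}

test("signup succeeds when the pricing service approves", async () => {
  // "Given this request to the SOA service, respond with this canned body."
  const mock = http.createServer((_req, res) => {
    res.setHeader("Content-Type", "application/json");
    res.end(JSON.stringify({ approved: true }));
  });
  await new Promise<void>((resolve) => mock.listen(4010, () => resolve()));

  try {
    // Exercise the consumer from its entry point all the way to its response.
    assert.strictEqual(await signup("user@example.com", "http://localhost:4010"), 201);
  } finally {
    mock.close();
  }
});
```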
The most common objection I've heard to mocking is along the lines of, "If you're mocking, what is the point of the test?" The answer lies in having a clear understanding of what you are testing.
For instance, method A might call another class's method B. If you're writing a unit test for method A, you don't also want to test method B. (Other unit tests would test B in isolation, instead.) So in A's test, you make A call a mock B rather than the real B, and you ensure B returns a particular kind of response. In other words, you want to make sure method A behaves correctly when it gets certain responses from B.
So in English, a test is basically saying, "When B gives this type of response (which I'm forcing it to respond with), make sure A behaves as I expect", and a second test says, "When B gives this other type of response (which I'm forcing it to respond with), make sure A behaves as I expect."
So when you write a unit test for A, you mock B by making it send back a canned response and make A use it. Then you call A with some parameters, and then you make sure that A's return type is as you expect.
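A small sketch of the A/B example in TypeScript. `OrderService` plays the role of A and `PaymentGateway` the role of B; both names are made up, and the hand-rolled mocks stand in for whatever mocking library you prefer:

```typescript
// "B": the collaborator whose responses we control in A's tests.
interface PaymentGateway {
  charge(amountCents: number): Promise<{ ok: boolean }>;
}

// "A": the unit under test; its behaviour depends on what B answers.
class OrderService {
  constructor(private gateway: PaymentGateway) {}

  async placeOrder(amountCents: number): Promise<"confirmed" | "rejected"> {
    const result = await this.gateway.charge(amountCents);
    return result.ok ? "confirmed" : "rejected";
  }
}

// Test 1: when B reports success, A should confirm the order.
const happyGateway: PaymentGateway = { charge: async () => ({ ok: true }) };
new OrderService(happyGateway).placeOrder(500).then((outcome) => {
  console.assert(outcome === "confirmed", "expected confirmation on success");
});

// Test 2: when B reports failure, A should reject the order.
const sadGateway: PaymentGateway = { charge: async () => ({ ok: false }) };
new OrderService(sadGateway).placeOrder(500).then((outcome) => {
  console.assert(outcome === "rejected", "expected rejection on failure");
});
```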
One advantage of this is that if a future code enhancement breaks one of the methods deep in the code, then the failing unit test will tell you exactly where the problem is. Without mocking, a failed unit test will make you have to examine several layers of stack trace to find the bug.
Also getting the right kind of response out of a back end server may be hard, eg getting certain kinds of error or unusual responses, so mocking is the only way to get them reliably.
They are not a substitute for end to end tests though, as you cannot tell if some other assumption is broken.
Mock = Responding to requests with no (or minimal) handling of logic. In SOA it can be as simple as putting a set of files containing the expected responses (in XML or JSON form) on a webserver and using mod_rewrite to make sure it responds with the correct file to a given request.
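For illustration, the same canned-response idea sketched with Express instead of mod_rewrite (the paths and port are invented; any static file server works just as well):

```typescript
import express from "express";
import path from "node:path";

const app = express();

// Map every GET request to a canned JSON file on disk,
// e.g. GET /users/42 -> ./mocks/users/42.json
app.get("*", (req, res) => {
  res.sendFile(path.join(__dirname, "mocks", `${req.path}.json`));
});

app.listen(4020, () => console.log("mock SOA service listening on :4020"));
```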
I work on a system split into 4-6 applications, depending on what you consider to be part of the system. We have a tier of integration tests that run across several of them. I can confirm that it is indeed often dreadful.
At the moment, we don't use VMs; we just fire up multiple servers on the same box. They use different ports, so it's no big deal. The tests are framed as being for one particular application in the system (eg a test that asserts that the front-end server displays the right profit and loss is a test for the front-end server, even though it uses the price server and the calculation server). Tests for a given application will use the current checkout of that application's code, and obtain packaged versions of the other applications from our internal artifact store. Tests always use the latest versions of those available; the assumption is that the latest version will have been deployed by the time the code under test is deployed.
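A rough sketch of what that kind of harness can look like, assuming a Node-based test runner launching packaged JVM apps; the jar names, ports, and PORT environment variable are all hypothetical:

```typescript
import { spawn, type ChildProcess } from "node:child_process";

// Each packaged application gets its own port on the local box.
const apps = [
  { cmd: "java", args: ["-jar", "price-server.jar"], port: 5001 },
  { cmd: "java", args: ["-jar", "calculation-server.jar"], port: 5002 },
  { cmd: "java", args: ["-jar", "front-end.jar"], port: 5003 },
];

const procs: ChildProcess[] = apps.map(({ cmd, args, port }) =>
  spawn(cmd, args, { env: { ...process.env, PORT: String(port) }, stdio: "inherit" }),
);

// ...the integration suite then runs against http://localhost:5003...

// Tear everything down when the test process exits.
process.on("exit", () => procs.forEach((p) => p.kill()));
```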
There are plenty of things that the multiple-apps-on-one-box approach doesn't test, mostly around networking, so we are trying to get our VM tooling up to speed to run multiple VMs instead. We already use the tooling to deploy VMs for apps in production, but it needs some tweaking to work well on desktops and in CI. It's not clear that this approach will completely replace the multiple-apps-on-one-box approach, because of the overhead in starting the VMs. We'll play it by ear.
Some of the dreadfulness of this approach comes from mismatches between test and production around versioning and networking. Most of it comes from the fact that in practice, it is much harder to track down the source of an error than when testing a single monolithic application.
Now, it could be the case that this pain, and the other pain resulting from splitting a system into multiple small applications, is worth it, because that splitting brings other advantages. I don't have a parallel version of our system implemented as a monolith, so I can't tell you if that's true. I have serious doubts about it, though.
I don't have such a problem, but in my current project, I am integrating with various APIs such as Twitter + Facebook, and unit testing my code can be a nightmare at times since I'm dependent on them being stable.
I guess I need to add an explanation or something?
Docker allows very quick startup of containers (and for testing purposes the distinction between containers and VMs isn't important). Additionally, the easy command line and app ecosystem Docker gives you over plain Linux containers is a big advantage here.
I prefer the Mock method for two reasons:
1) It's extremely fast - much faster than even Docker would be.
2) The data you use for mocks can form part of your documentation (if done properly).
One of the biggest benefits I see to API-first is that it encourages you to document everything well, and that process usually leads to better organization and greater consistency.
As far as fat vs. thin, I worry that this article could be a bit misleading. It's great if you have an API that responds in 20ms, though very few people achieve that. But it doesn't matter much in a world where mobile network latency can easily be 200-500ms. If you're worried about app performance, I'd focus first and foremost on reducing the number of requests you are making and making sure none of them block each other. The network is the slow part. I know a lot of companies with blazing fast APIs and sluggish apps, because they're optimizing the wrong thing.
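To make the request-count point concrete, here's a small sketch: three sequential round trips over a ~300ms mobile link cost roughly 900ms no matter how fast the API is, while firing them concurrently costs about one round trip. The URLs are placeholders:

```typescript
// Sequential: each await adds a full network round trip.
async function loadDashboardSequential() {
  const user = await fetch("/api/user").then((r) => r.json());
  const feed = await fetch("/api/feed").then((r) => r.json());
  const alerts = await fetch("/api/alerts").then((r) => r.json());
  return { user, feed, alerts };
}

// Concurrent: all three requests go out at once, so total wall time is
// roughly the slowest single round trip rather than the sum of all three.
async function loadDashboardConcurrent() {
  const [user, feed, alerts] = await Promise.all(
    ["/api/user", "/api/feed", "/api/alerts"].map((url) =>
      fetch(url).then((r) => r.json()),
    ),
  );
  return { user, feed, alerts };
}
```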
When developing an API, I like to practice "documentation-oriented design." The first thing I do is write documentation of the API, and then I write APIs to fit that documentation (iterating both in parallel once I start writing the API).
I figure that the documentation is the first thing a developer faces when using an API, so it should also be the first thing I write.
A related approach is client-driven development (just made it up). Write the client code (or controller) in the thinnest, most elegant, brain-dead simple way, then build an API (or model) to service it.
It's okay, I totally just made up "documentation-driven development" too. And yeah, that's the best way I find. Always build a consumer and then build the supplier, you'll usually end up with a great experience.
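A sketch of that consumer-first flow, with invented endpoint and field names: write the brain-dead simple client call you want to exist, then shape the server (here sketched with Express) to serve exactly that call:

```typescript
import express from "express";

// Step 1: the brain-dead simple client call you want to exist.
async function showInvoice(id: string): Promise<void> {
  const invoice = await fetch(`/api/invoices/${id}`).then((r) => r.json());
  console.log(invoice); // stand-in for whatever the view layer does with it
}

// Step 2: shape the server endpoint to serve exactly that call.
const app = express();
app.get("/api/invoices/:id", (req, res) => {
  // Hypothetical data; a real app would call into its model layer here.
  res.json({ id: req.params.id, total: 4200, currency: "USD" });
});
app.listen(3000);
```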
The article implies that MVC is inherently slow. It's not; in fact, it can be quite fast if you use the right framework.
I'm assuming they only know that Rails is slow (which it can be, especially in a case where caching can't help much, like an API) and that they assume it's slow because of MVC and not because Rails is monolithic and Ruby-based.
Before I used Rails, I used an MVC consisting of Mongoose + Handlebars + Express in Node.js, and I regularly had pages render in under 10ms. The biggest bottleneck was consistently the communication with MongoDB, something I had very little control over.
So to conclude, stop blaming the architecture, start blaming the frameworks.
"Still when one took the MVC approach and is now lacking an API, the easiest thing to do is to add a few views that output JSON and call it a “restful API”. [...] it does not scale and is horribly slow"
Why would rendering a view be any slower than serialising some objects to JSON or whatever? The only reason I can think of is that you've incompetently chosen to use some very slow rendering framework. So don't do that.
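To illustrate the comparison, a sketch in Express (names are illustrative, and the template route assumes a view engine is configured): the handler does the same work either way; only the final rendering step differs:

```typescript
import express from "express";

const app = express();

// Either route does the same query; only the last step differs.
const loadWidget = async (id: string) => ({ id, name: "sprocket", price: 9.99 });

// HTML route: hand the object to a template (assumes a view engine is configured).
app.get("/widgets/:id", async (req, res) => {
  res.render("widget", { widget: await loadWidget(req.params.id) });
});

// "API" route: serialise the same object straight to JSON.
app.get("/api/widgets/:id", async (req, res) => {
  res.json(await loadWidget(req.params.id));
});

app.listen(3000);
```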
>The biggest bottleneck was consistently the communication with MongoDB, something I had very little control over.
Interesting you say that, because one of Mongo's few redeeming features is that it's supposed to be pretty fast for both reading and writing. Exactly which part was the bottleneck? Was your app more read-heavy or write-heavy?
The problem I ran into was that I needed to perform roughly 20 interdependent queries in a short period of time. At most, I could do 5 queries in parallel due to the interdependent nature, but I still ran into trouble with the database being slow.
In the end, I added Redis as a caching layer, which cut time-per-query down a few ms per request, which was significant in this case.
I can only imagine how bad it could've been if I had blocked on each request instead of using the asynchronicity of Node.
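A rough sketch of that kind of cache-aside layer, assuming ioredis and a placeholder `fetchFromMongo()` standing in for the slow query:

```typescript
import Redis from "ioredis";

const redis = new Redis(); // defaults to localhost:6379

// Placeholder for the real (slow) MongoDB query.
async function fetchFromMongo(key: string): Promise<object> {
  return { key, fetchedAt: Date.now() };
}

async function cachedQuery(key: string, ttlSeconds = 60): Promise<object> {
  const hit = await redis.get(key);
  if (hit !== null) return JSON.parse(hit); // cache hit: skip MongoDB entirely

  const value = await fetchFromMongo(key); // cache miss: do the slow query once
  await redis.set(key, JSON.stringify(value), "EX", ttlSeconds);
  return value;
}

// Queries that don't depend on each other can still run concurrently:
// await Promise.all(independentKeys.map((k) => cachedQuery(k)));
```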
Right, caching is usually the easiest way to solve problems in those situations.
>I can only imagine how bad it could've been if I had blocked on each request instead of using the asynchronicity of Node.
All modern web servers/frameworks are going to be asynchronous in one way or another, whether they spawn a new thread or process for each request (and for each database query), or whether they use polling and an event loop like Node. Node and HTTP servers like it are only a boon when you expect to have so many concurrent requests that threads/processes will begin to hog too many resources.
So unless your server was really being hammered with queries from hundreds or thousands of concurrent users every second, I imagine using Node was neither a significant advantage nor a disadvantage. And if it's the database or database driver that can't handle concurrency, then Node would do absolutely nothing to help there.
Please note that the thin server should never be too thin. Logic should always happen in the server domain; treat the client as merely a presentation unit.
Not all logic. You can do simple input verification on the client side. You can do in-place filtering and yielding on the client side. But anything that requires reading from the db or saving to the db must go through the server side to ensure security is met.
Input verification and in-place filtering must be duplicated on the server side. You can do them on the client side as well, but only to aid good user interaction, not to actually verify input.
Which is the point of pearjuice's post about the logic not being too thin. Also, duplicating the same verification logic on both client and server does not make the server thinner; it only makes the client fatter.
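A small sketch of that duplication in a universal JS/TS codebase (names invented): the same rule runs on the client purely for feedback and again on the server as the authoritative check; sharing the function avoids writing it twice but doesn't make the server any thinner:

```typescript
// The shared rule, importable by both the browser bundle and the server.
export function isValidEmail(value: string): boolean {
  return /^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(value);
}

// Client side: immediate feedback, a UX aid only.
export function onEmailInput(value: string): string | null {
  return isValidEmail(value) ? null : "Please enter a valid email address";
}

// Server side: never trust the client's result; reject invalid input here too.
export function handleSignup(body: { email?: string }): { status: number } {
  if (!body.email || !isValidEmail(body.email)) return { status: 400 };
  // ...persist the account...
  return { status: 201 };
}
```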
>Logic should always happen in the server domain, handle the client as merely a presentation unit.
Angular/Ember users would disagree with you.
There is a fair bit of logic that can be done on the clientside, sometimes even raw computational logic. It really just depends on the kind of application, and also what the developer thinks is best.
This article suggests that MVC implies a fat server, and that using newer patterns with thin servers breaks away from MVC.
Using a thin server with a Javascript heavy app basically means keeping the Model/Controller on the server, then having Views/Controllers on the client.
So in the end, you're still with a MVC pattern, you just spread your Controller over two machines with RPC in between.
To me a controller is just another thin client, and models provide an (internal) API. What breaks this and leads to fat, clunky controllers is ActiveRecord and the 'model as table row' mindset (particularly when a model doesn't handle collections).
The missing part here is turning that API into an external one, which requires a layer in any case to cover things like authorisation, caching, queuing, reporting, QoS, orchestration.
In my last three projects we've aimed at an API first approach, and the flexibility of having the same API serve content for web or native mobile apps is the biggest win.
We used Play on two of these projects and Jersey for another. The initial load is slow since the browser has to fetch a bunch of JS (AngularJS in this case), but after that the whole user experience is seamless.
Our biggest problems with this approach were localization and authentication/session handling. Tackling these problems was hard at first, and I've never felt confident about the solutions we came up with, especially for authentication.
This is a good description of the difference between the available architectures, but more detail is probably required to back up some of the discussion points, beyond the fact that reducing presentation work is beneficial for response times. An API server is still required to render JSON responses?
A problem in general is that all these terms are highly overloaded.
Models: For some, models are backend structures that are closely related to database entities. For others, models are data structures that are passed around closer to the front, or structures that controllers pass to views. Some also equate models with domains that have behavior as well as state.
Servers/services: These are often used interchangeably when they are really not. For instance, your API call has to go to a web accessible server, which might in turn rely on a separate (SOA) service that is not web accessible.
Client: This can mean the browser or mobile device (client/server), but it can also mean the consumer of a service (client/service). This consumer might also be sending serialized data structures or page responses to a browser or mobile device.
Controller: This can refer to the logic in a mobile device that is making requests and unpacking json responses. Similarly, it can refer to a part of the javascript in a dynamic frontend that is doing the same. Or, it can also refer to the application that is receiving the network request (and in turn invoking business logic in other layers or making service calls to remote/private services) and then responding with json.
So, imagine a browser or mobile device that makes a request to a remote "server" and receives a javascript application. The javascript has its own MVC architecture. Ensuing requests to the remote server will receive json responses. The js app might unpack the json in the js app's controller, and repack it into different models (data structures) that get passed to the js app's views. In turn, the remote "server" could have a controller that receives the initial request and returns a "view" that is the javascript app, but also another controller (more accurately a "resource") that would receive the RESTful requests and respond with json. These controllers/resources could in turn have a bit of business logic to help them make calls - via REST or SOAP - to internal/private "services". These services (on different servers) could have additional business logic to interact with private databases through DAOs that act on models (which somewhat-but-not-exactly represent the database schema because of ORM).
I guess my point is that past a certain point, "fat" versus "thin" doesn't really capture the essence of the tradeoffs. It's more a series of questions of what is consuming what, what the consumer shouldn't have, what the consumer needs, and what the consumer is going to do with it. The big reason "controllers" (?) are getting "thicker" (?) is because there is a push to make the consumer-facing front-end more interactive and dynamic. I think that what the original article is saying is that if you want that, you should do what you can to return your API data structures (whether json or whatever) in as direct a fashion as possible, rather than manually constructing them through a templating technology and a bunch of views that would normally be used to return html.
But more generally, I'd say that unless you're talking with a team that knows exactly what you're referring to (from previous discussions), it's best to just completely define what you mean by client/service/server/controller/model/etc in each case, because these days, it seems like everyone is using them differently.