CRUD apps (which I'll argue are the vast majority of applications) are easier to write, easier to understand, and easier to make behave the way users expect when there's only one datastore and you never have to deal with eventual consistency or with distributed or half-applied migrations.
As an anecdote, at a previous employer, we provided user accounts and OAuth for connecting these accounts to our API. We made separate user and OAuth services, with separate databases.
What resulted was an unnecessary amount of complexity in coordinating user deactivation, listing OAuth tokens for users, and delegation, and doing all of that while authenticating requests between microservices. Our API could not safely expose a method to deactivate a user and all of their OAuth tokens in a single DELETE. A single service that handled both would have been easier to build, and would have wasted less time up front on complexity that we didn't need, and couldn't make good use of, at our scale at the time.
To solve this, we eventually merged all the data back into a single database, so we could expose sane invariants at the API level without needing to build an eventually-consistent message queue.
Perhaps, but even a single database alone doesn't ensure atomicity.
For instance: suppose you're deactivating a user. You've got a user table, with an id column and a "deactivated" column. Then you've got an OAuth tokens table, with a foreign key column to userid.
If you had two microservices hitting the same database, then you make a DELETE to the user service, which now needs to send a DELETE to the OAuth service. Now, regardless of which transaction you logically have go first, you have a race: two services with separate transactions are modifying the same DB. Whichever transaction commits first, the other could still fail, leaving your external view inconsistent.
One way to solve this is to have the OAuth service check the user table to see if the user is disabled, and treat all of the tokens for that user as deactivated, but then your OAuth service is tightly coupled to your user service's schema, which means you can no longer modify the two services separately. My impression is that this isn't really what people mean when they say "microservices".
Another option is to have the OAuth service ask the user service if the user is still live, but now you have a circular dependency between services, and either one failing can effectively put the other out of commission.
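For contrast, when one service owns both tables, the whole deactivation can be a single transaction and the race disappears. A minimal sketch using Python's sqlite3 (the schema and names are invented to match the example in this thread):

```python
import sqlite3

# Hypothetical schema from the example: a users table with a
# "deactivated" flag, and an oauth_tokens table keyed by user id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, deactivated INTEGER DEFAULT 0)")
conn.execute("CREATE TABLE oauth_tokens (token TEXT, user_id INTEGER REFERENCES users(id))")
conn.execute("INSERT INTO users (id) VALUES (1)")
conn.execute("INSERT INTO oauth_tokens VALUES ('abc', 1), ('def', 1)")
conn.commit()

def deactivate_user(conn, user_id):
    # One transaction: either both statements commit or neither does,
    # so no reader can ever observe a deactivated user with live tokens.
    with conn:  # sqlite3 connections commit on clean exit, roll back on error
        conn.execute("UPDATE users SET deactivated = 1 WHERE id = ?", (user_id,))
        conn.execute("DELETE FROM oauth_tokens WHERE user_id = ?", (user_id,))

deactivate_user(conn, 1)
```

With two services issuing those two statements in separate transactions, there is no single `with conn:` scope to put them in, which is exactly the race described above.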
OAuth is better considered a microservice that grew a little too big. Strong consistency in identity management is a well studied problem. This is why people pay money for Active Directory consultants.
edit: The trick to your particular dilemma is to design your operations more carefully. Don't allow sensitive operations for any service via long-lasting authentication tokens.
I will be the guy bringing Erlang (or more precisely BEAM -the Erlang VM- and the various languages that run on top of it) into this thread... But it's amazing how much Erlang got right so long ago. Erlang applications are effectively fine-grained, Service Oriented, with a very low overhead.
There is, of course, one drawback: you lose the low-overhead advantage as soon as you step outside of the Erlang world. That's a major drawback, given that a non-negligible part of the appeal of Service Oriented Architectures is the possibility of using the best possible tool for a given job, no matter what language/VM/hardware that tool is built with...
There is also a funny tangent on it. Garrett Smith (the person responsible for such movies as "MongoDB is Webscale" and "Erlang II: The Movie" http://www.gar1t.com/blog/tags/videos.html) made a comment that Erlang has already been doing nanoservices before microservices were cool.
In a certain way he is right. Each gen_server lightweight process is a small independent service that starts up with only a few KB of memory, has an isolated heap, and can be called from another Erlang node halfway across the world just like you'd call a local process.
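The gen_server idea can be loosely approximated in most languages. A toy Python sketch of the pattern (with a real OS thread standing in for Erlang's far cheaper lightweight process, and none of OTP's supervision or distribution):

```python
import queue
import threading

# Illustrative sketch only: each "nanoservice" is a loop with isolated
# state and a mailbox, and callers interact with it purely via messages,
# roughly like gen_server's call mechanism.
class TinyServer:
    def __init__(self, initial_state):
        self._mailbox = queue.Queue()
        self._state = initial_state  # isolated: nothing else touches it
        threading.Thread(target=self._loop, daemon=True).start()

    def _loop(self):
        while True:
            request, reply_box = self._mailbox.get()
            # handle_call equivalent: update state, send a reply
            self._state += request
            reply_box.put(self._state)

    def call(self, request):
        reply_box = queue.Queue()
        self._mailbox.put((request, reply_box))
        return reply_box.get()

counter = TinyServer(0)
counter.call(1)
print(counter.call(2))  # prints 3
```

What Erlang adds on top of this shape is what makes it viable at scale: processes that cost kilobytes instead of megabytes, crash isolation, and location-transparent calls across nodes.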
Let me take your point even further: it is some principles of functional programming that make these languages a great fit for "refactoring" and "breaking apart the monolith".
Myself, I'm more into Haskell. I see similar benefits in both Haskell and Erlang. A nice thing about Erlang is that you have to structure your application as a bunch of services from the start.
Ah, it seems we have moved to the deprecation phase of the buzzword lifecycle. Consultants and pundits would be well-advised to begin seeking the next thing.
Meanwhile, those of us who actually build systems will probably continue to make all our services as small as possible, and to compose systems from discrete processes that may or may not require more than one computer.
Keep it simple and YAGNI aggressively - the easiest code to refactor is the code you didn't write yet. Follow ordinary good design practice (single responsibility in particular). Beyond that, don't worry about it yet - one hour's refactoring in the future when you know exactly what you're doing with the services is worth ten now while things are uncertain, so better to save the time.
I've seen a couple of codebases now that have coded themselves into corners and made it nearly impossible to go microservices without a major rewrite.
- avoid code sharing wherever possible. It creates an implicit dependency. (If you have a common library or util file, you've done it wrong. Try again.)
- don't merge your trees (i.e. each component in the code should have its own readme and tests folder).
- write far more unit tests than functional/integration tests. (If you separate out components later, the functional tests can't move with them.)
- if you use an ORM or similar, you are probably going to have a very bad time.
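As a hypothetical sketch of the kind of layout these guidelines point at (directory and component names invented), each component is self-contained with its own readme and tests, and there is no shared util directory:

```
app/
  accounts/
    README.md
    src/
    tests/
  billing/
    README.md
    src/
    tests/
  main.py   (thin entry point that wires the components together)
```

If `accounts/` later becomes its own service, its whole directory, tests included, can move out without dragging anything else along.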
> avoid code sharing wherever possible. It creates an implicit dependency. (If you have a common library or util file, you've done it wrong. Try again.)
Can you clarify on this a bit? Isn't part of the point of classes and libraries to encapsulate common functionality so that it doesn't need to be separately maintained in multiple places?
You listed an ORM as an example, what is a better solution to managing database interaction across an application?
Code sharing is not the same as using the same library in different places. That is managed by a proper dependency manager and is not an issue. The problem is when people take code from the codebase and spread it all over. In one pretty awful scenario, I saw two programmers create a wrapper around the ORM and then try to use that all over the place. That made testing awfully difficult, because the wrapper was full of errors itself.
> Code sharing is not the same as using the same library in different places.
Sounds like you're using a different definition than AdrianRossouw.
> In one pretty awful scenario, I saw two programmers create a wrapper around the ORM then try and use that all over the place. That made testing awfully difficult because the wrapper was full of errors itself.
That sounds like an issue with the wrapper, not so much the code sharing per se.
That said, I've seen issues with... "overly shared" code. The example that comes to mind was some UI code on a game - we had various menus, and they all tried to share a huge chunk of programmatic UI layout & control flow logic.
However, as so many menus were unique and special snowflakes with their own unique designs and layouts (even justifiably!), this meant the shared code ended up with a lot of special cases. Worse, trying to "fix" the layout to work right in one place usually broke it in another. QA eventually caught almost everything, but it was hell to fix and debug.
I would've very gladly eaten a 5x total size increase to the relevant code, full of copy & paste duplication, rife with bugs that were fixed in 7 places but missed in 3 others, just to detangle those menus from each other.
I've had to deal with this exact same problem of unrelated UIs sharing UI layout & control flow logic, messily packaged deep in 8 layers of inheritance... A nightmare. It was a GWT + GXT codebase. The worst part is that I took part in creating the mess in the first place.
Now when it comes to UIs I favor decoupling at the expense of a little code duplication. Because two otherwise unrelated menus happen to share the same layout/structure/flow at a given time, doesn't mean this structure needs to be abstracted away: the code is likely to diverge later anyway as the unrelated views evolve.
Abstracting away composable patterns for GUIs is hard. Especially for me given that I mostly write backend code...
I've seen what you described too. It's crazy. The wrapper I mentioned was about making code sharing easier, never taking into account that Django already provided a fairly clean interface to work with. These guys were wrapping Django ORM query code with functions that took an endless number of parameters through Python's kwargs mechanism. Nuts.
Err, wut? It sounds like you are saying that pretty much the one reliable guideline in the history of software has become a bad idea.
(I have come across places in my career where code sharing was not worthwhile, but they have been rare. And I frequently regretted the decision later.)
If you are going to try to separate them out into services later, you need to keep the code as compartmentalized as possible.
Otherwise when you eventually split them out, you will end up being in a situation where each service needs copies of all the files it required before.
You will then have to extract that code into a library and make it general enough to be used and versioned that way.
People have been building multiple applications from a single codebase for a very long time. Put all the targets on a continuous build, add tests, and you're set.
The only sense in which versioning matters is that the protocol the applications use to talk to each other has to be versioned, and an application must understand the oldest version of the protocol it could potentially be spoken to over.
But things like Thrift and Protocol Buffers have this feature built in. You shouldn't need to version the actual code...
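Protocol Buffers, for example, identifies fields by number rather than by position or name, so an old reader simply skips fields it doesn't know about. A hypothetical message for the deactivation example (the message and field names are invented):

```proto
syntax = "proto3";

message DeactivateUser {
  int64 user_id = 1;    // understood by every version of every service
  // Added in a later release. Services built against the old schema
  // ignore the unknown field 2 and keep working, so the two sides can
  // be deployed independently.
  bool revoke_tokens = 2;
}
```

The versioning discipline then lives in the schema (never reuse or renumber a field), not in lockstep deploys of the code on both sides.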
If you have multiple components depending on the same code and you try to split them into independently deployed services, you need to keep the code they depend on in sync.
So making a change to the code shared by multiple services means you have to deploy multiple services for a single change.
I completely agree with you about the protobuf/thrift angle though. If you're in that situation, you are already doing it right.
Once again, the question is about how to write a system that can be more easily split into microservices later.
I would separate business logic into a single service and have all other modules call that using a standard protocol like REST or thrift or something.
The issue is that if that logic is already being executed all over the system, it's going to be difficult to break it out into a separate service.
While the overhead of SOA is pretty well documented and understood, I feel like this is the crux of it: "Adopting a microservice architecture does not automatically buy you anti-fragility."
Not only do you have new, hard problems -- but you didn't just innately get a solution to your previous problems. I've seen this in practice, and it is pain, all around. There's immense organizational overhead to having your core product distributed across 4x as many projects/repos/components as you have developers. And if you haven't actually solved all the fragility and failover problems, you are very fucked.
A common pattern I've seen with companies sprinting head-first into a microservices architecture is something I call "femtoservices" -- when your system is new and doesn't do much, it's easy to split things up into components that are way too small. I've seen systems that essentially have a microservice per database table, effectively destroying any reasonable way to perform non-trivial queries. Fixing this is far harder than splitting out a monolith that has been designed well.
It seems like an endless cycle of some new trend coming out (microservices), then the reactionary comments which create even more new trendy buzzwords. Application design and development is not this black and white. SOA is not GOOD or BAD. It is a design structure that some people like, and some people seem to hate. There are good ways of doing things and bad ways of doing things. In my experience, having small manageable code bases that are responsible for one domain makes things so much easier to develop and maintain. But, that is my opinion not something I think is a universal truth. Architecture/Design/Implementation of any complex system is not an exact science with perfect formulas for creation. It would be nice to have more articles about new and exciting ways of doing things, instead of the same old tearing down of others for being stupid/wrong/etc because they think functional/OO/SOA/microservices/containers/etc is the wrong way to do things.
> In my experience, having small manageable code bases that are responsible for one domain makes things so much easier to develop and maintain.
I think everybody agrees with that.
The issue is that architecting your system like that on a code / repo level means that you gain the complexity of having to manage dependencies between your components, and it complicates the deployment story.
Microservices then further complicate things by requiring you to be able to independently deploy each service. This is a lot of complexity to manage for something that doesn't directly solve any problems in your business domain.
This reaction you are seeing is because people have now had enough time to actually work on and hear about projects that were built using microservices. Many of them have found that they don't deliver on their promises, because of the reasons mentioned in the article.
You just rephrased the OP in a confrontational tone. It didn't say that SOA was bad, just overused in places where it's not appropriate. It looks like the closest thing you have to a disagreement is that you can probably still get some of the small-code-base benefits without getting into the distributed-systems stuff the article talked about (making sure your processes are all on the same machine, or something; I don't know).
Can you explain that a bit more? What added security benefits do micro services provide that couldn't be accomplished using role- or claim-based authorisation?
Machine separation for one (OS level exploits, escalation of privileges in one component compromising another). Network partitioning with strong protocol filters between security tiers can be invaluable.
I'd guess that what he means is that an attacker who compromises one micro service may not end up compromising all of them, whereas if they compromise a monolith, they have access to everything.
It seems to be a common fault of youth to assume that you can solve every problem with just another level of abstraction. I count myself lucky that I used a lot of Java in my university days. The nice thing about Java is that you have so many tools and frameworks that do a lot of great things but always try to be as flexible as possible, so you end up with the power to do the same stuff in the library that you could have done with pure Java: everything. I'm lucky because I never had to spend years writing such a tool that can do everything but nothing really well. Thanks to the many tools that already made that mistake, I could learn the lesson just by using them for a few weeks.
> The better strategy is a bottom-up approach. Start with
> a monolith or small set of coarse-grained services and
> work your way up. Make sure you have the data model
> right.
the part about "or a small set of coarse-grained services"
I like this advice, but this is effectively starting with SOA! Just coarse-grained. So the thrust of this post should be to start with a simpler SOA and not pre-optimize, etc., not to build a monolith.
While you're technically correct, bear in mind that the central difficulty with microservices is managing dependencies.
The worst case for direct dependencies across service boundaries in a system with n services is (n - 1)*n. If n is 3, you have a theoretical maximum of 6 direct dependencies. If n is 10, that number is 90.
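The quadratic growth is easy to check: each of the n services can depend on each of the other n - 1, giving n*(n - 1) possible directed dependencies. A one-liner:

```python
def max_dependencies(n):
    # Each of n services can directly depend on each of the other n - 1,
    # and A-depends-on-B is distinct from B-depends-on-A.
    return n * (n - 1)

print(max_dependencies(3))   # 6
print(max_dependencies(10))  # 90
```

Tripling the service count from 3 to 10 multiplies the worst-case dependency graph by fifteen, which is the sense in which dependency management dominates the cost.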
I think the biggest issue is to realize whether we are designing for humans or machines.
In all my work building big systems, microservices have never really made sense. The vast amount of developer time is spent in code where things are already nicely separated with solutions, projects, namespaces, IoC, etc.
When it's all deployed and running, who cares if it all goes out as one big monolith? The machine doesn't, it's all just code. It's way easier to deploy everything with a master version on the package, and it guarantees everything works together. Easier to monitor, configure and scale. It'll probably run faster too. You can still use queues/message bus/whatever to talk to other systems, but outside of that, I feel a lot of the reasons for SOA are overstated and often unnecessary. As the overall business gets bigger, maybe certain large systems can get isolated, but a lot of this microservice/tiered stuff is way too granular and ultimately unproductive.
One area where microservices are very useful is in constrained environments, where being able to swap out idle services at runtime makes a lot of sense. This was actually the original driver for OSGi.
If you read original Service Oriented Architecture papers from IBM or the original Microservices bliki that Martin Fowler and his team wrote, you'll see a lot of really smart ideas that sound really groundbreaking and feasible.
If I recall correctly, the original SOA vision focused a lot on creating a services catalog and ecosystem, with service location transparency, and loose-coupling between services, yellow pages for advertising and discovering services, and the calling interface formed a contract between caller and provider.
It seemed like a new combination of ideas that could change how software was delivered, but the vision was really tough to realize... like, how do you bind to a service when its location can change? There were a lot of blind alleys to run down trying to solve that, and ultimately the vendors trundled out their Enterprise Service Bus products. These provided mediation, transparent failover, and quality of service. But ESB servers were a new single point of failure, and deploying to ESBs was incredibly painful since you couldn't easily commit the configurations to source control, though you could get that transparency as originally advertised.
And it seemed like everything else in SOA was like that. A lot of solutions begging for problems, or incredible complication that inevitably required massive vendor products to implement - which also led to atrocious runtime performance/instability due to shovelware vendor code.
The Microservices bliki read to me like a reboot of SOA - but a really good reboot. Gone were the service catalogs. And in was discussions of how to organize teams around code/features. There was the whole novel versioning scheme, where multiple versions of a microservice could be deployed simultaneously. And deployed code was immutable - you would never redeploy or modify the code of a version, you'd just deprecate and release another version.
All this stuff screamed "PRODUCTIVE" to me, and was more a return to patterns and not leaving gaping voids in the description for vendors to fill with more shovelware.
It just turns out that software components are tougher to work with and maintain the further away they are from each other.
You can have two components A and B, where A depends on B. A and B are in the same code repository, just kept in separate folders. Doing this is handy because you can do refactorings against A and B and it's easy to keep them in sync. Perhaps A and B are built separately with different build scripts, but they get deployed together and they are versioned together.
Now let's move A and B into their own code repos. Refactoring is a bit more difficult, because now you have to make sure to keep A and B in sync. Why did we put A and B into their own repos? Because the boss is convinced that this separation is critical, because he worked at Oracle a while back (huh?). So versioning is now an issue too: A has to refer to a version/range of B that it wants. Needless busy work, but alright!
Now that we have our sea legs, we can go ahead and build and deploy A and B as independent microservices: A expects particular versions of B to be deployed, and A calls B via HTTP RESTful service calls. Refactoring is now officially painful: it requires a lot of negotiating between teams. B is required for A to function, so the A team has to produce a client library for calling B that issues proper maintenance alerts when B is unavailable. Does B really need to be a microservice, and are HTTP calls to it justified? PAIN PAIN PAIN! Hey, couldn't B just be a library that gets called by A and not get its own microservice? (NO! Because Oracle!) That is where the whole "Monolith" discussion starts.
I fault Fowler for painting a rosy SOA'esque picture initially, but I can't fault him too much. The original Microservices discussion he helped launch was really refreshing. It felt doable, it felt scalable, and it felt patterny instead of tooly.
Maintaining service contracts, and the inter-team friction that comes with them, is going to hurt agility, and it's a miss that he didn't call that out originally. So I'm calling you out, Martin Fowler: your good idea wasn't a panacea!