Slicing Monoliths into Microservices
If you've researched microservices, you'll know that very few authors and commentators manage to explain how real-world applications can and should be divided into separate services.
Having just completed a 3-year migration, I'd like to share how we did it and which rules we followed for slicing.
Basic Principles
The number one principle of microservices is that they must be independent. In practice, this means that if Service A is down for whatever reason, the rest of the solution (i.e. all other services) must still "work": services depending on A must degrade as gracefully as possible. It is unacceptable for any other service to crash because of A. We'll see how this is achieved in a second...
Sharpen your Knife
We decided to cut services primarily along business entities (nouns) in the business domain. The idea was that service names should be easily understood: if the "Email" service is down, we have no email communications. The anti-pattern to avoid here: a service named Wally (bad naming, function not obvious) goes down, the product catalog crashes (so that's what Wally does!) and our scheduler (relation not obvious) crashes too. This choice resulted in the following services:
Images - Anything to do with visual assets, pictures, videos, scaling, CDN etc.
Sales - A history of purchases, offers, returns etc.
Documents - Downloadable material, media, brochures, product info, manuals etc. (Media Layer)
Content - Our abstraction (CMS layer) for text/markdown content and some visual assets.
Products - Our catalog of products (PIM layer)
Events - Date-related information, typically presented in a single calendar widget
Customers - Customer and contact person data and functionality (CRM layer)
Users - User data and functionality (CIAM Layer)
Maintenance - Specific application for our industry
Commerce - Cart, checkout and shopping functionality (ERP Layer)
Our secondary consideration was related functionality. All services need email functionality, and all services need scheduler/cron functionality. Instead of being dogmatic, we pragmatically agreed to centralize some functionality into the following services:
Email - Handle email templating, queuing, subscribers and sending
Scheduler - Centralized CRON Layer for all scheduled events
One final consideration should be that architecture follows organization (whether we like it or not) and it makes sense to slice functionality along existing team lines if those boundaries already exist and make sense.
Macroservices
As is evident, we don't have hundreds of microservices. We actually only have a dozen (though we have close to 100 instances over 3 environments) and we're quite happy. Managing lots of services is a real challenge and we had to build custom tools to help us. Yes, Kubernetes doesn't care how many you have - but your brain does, and the next step of slicing services into functionality blocks (ProductList, ProductSearch, ProductDetail...) should not be taken unless strictly necessary.
Regarding tools, we built a "Deployment Overview" dashboard which shows us all services in a grid across all environments and lets us drill down to compare status (up/down), version, uptime, configuration values, certificate expiry and a great deal of other information. We also used PowerShell to automate changes across microservices to save our keyboards from burning out.
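To make the grid idea concrete, here is a rough Python sketch of how such an overview could be assembled. It is not the actual dashboard: `InstanceStatus`, `build_grid`, `version_drift` and `render`, as well as the environment names `dev`, `test` and `prod`, are hypothetical and chosen only for illustration. How the status records are collected (health endpoints, Kubernetes API, etc.) is deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class InstanceStatus:
    """One reported status record (hypothetical shape)."""
    service: str
    environment: str
    up: bool
    version: str


Grid = Dict[str, Dict[str, Optional[InstanceStatus]]]


def build_grid(statuses: Iterable[InstanceStatus], environments: List[str]) -> Grid:
    """Pivot flat status records into service rows with one cell per environment."""
    grid: Grid = {}
    for status in statuses:
        # Every row starts with an empty cell for each known environment.
        row = grid.setdefault(status.service, {env: None for env in environments})
        row[status.environment] = status
    return grid


def version_drift(grid: Grid) -> List[str]:
    """Return the services that report different versions across environments."""
    drifted = []
    for service, row in sorted(grid.items()):
        versions = {cell.version for cell in row.values() if cell is not None}
        if len(versions) > 1:
            drifted.append(service)
    return drifted


def render(grid: Grid, environments: List[str]) -> str:
    """Render the grid as fixed-width text, one line per service."""
    lines = ["service".ljust(12) + "".join(env.ljust(16) for env in environments)]
    for service, row in sorted(grid.items()):
        cells = []
        for env in environments:
            cell = row.get(env)
            if cell is None:
                text = "-"  # nothing reported for this environment
            else:
                text = f"{'up' if cell.up else 'DOWN'} {cell.version}"
            cells.append(text.ljust(16))
        lines.append(service.ljust(12) + "".join(cells))
    return "\n".join(lines)


if __name__ == "__main__":
    envs = ["dev", "test", "prod"]  # assumed environment names
    reports = [
        InstanceStatus("Products", "dev", True, "2.1.0"),
        InstanceStatus("Products", "prod", True, "2.0.3"),
        InstanceStatus("Email", "prod", False, "1.4.0"),
    ]
    overview = build_grid(reports, envs)
    print(render(overview, envs))
    print("version drift:", version_drift(overview))
```

With only a dozen services, a flat grid like this stays readable at a glance, which is part of the argument for keeping the service count low.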
Robustness
Our services are robust because they cannot "kill" each other. Yes, we have some dependencies, but they are never existential. Service A can always do "most of" its job, even if it requires Service B for some functionality. If Service B dies, Service A falls back to read-only access to B's data via a cache. That is: each service MUST cache ALL read access to other services and use this cache as a fallback in case of failure. So if the Customers service reads user data, it must cache it for the case that Users is unavailable. Obviously, time-critical information may be unsuitable for such a cache, so there may be exceptions where certain Customers functionality is unavailable while Users is down.
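The caching rule can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the actual implementation: `FallbackCache`, `ServiceUnavailable` and the `max_stale_seconds` parameter are hypothetical names, and the cache is an in-memory dictionary where a real service would probably want something persistent. The optional staleness limit stands in for the "time-critical information" exception mentioned above.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ServiceUnavailable(Exception):
    """Raised by the fetch function when the remote service cannot be reached."""


class FallbackCache:
    """Caches every successful remote read and serves it when the remote call fails."""

    def __init__(self, fetch: Callable[[str], dict],
                 max_stale_seconds: Optional[float] = None):
        self._fetch = fetch                  # the real call to the other service
        self._max_stale = max_stale_seconds  # None means any age is acceptable
        self._store: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> Tuple[dict, bool]:
        """Return (value, from_fallback). from_fallback is True if the remote was down."""
        try:
            value = self._fetch(key)
        except ServiceUnavailable:
            cached = self._store.get(key)
            if cached is None:
                raise  # never read before: nothing to fall back to
            stored_at, value = cached
            too_old = (self._max_stale is not None
                       and time.monotonic() - stored_at > self._max_stale)
            if too_old:
                raise  # time-critical data: refuse to serve a stale copy
            return value, True
        # Remote call succeeded: remember the answer for the next outage.
        self._store[key] = (time.monotonic(), value)
        return value, False


if __name__ == "__main__":
    users_is_up = True

    def fetch_user(user_id: str) -> dict:
        # Stand-in for an HTTP call from Customers to the Users service.
        if not users_is_up:
            raise ServiceUnavailable(user_id)
        return {"id": user_id, "name": "Ada"}

    users = FallbackCache(fetch_user)
    print(users.get("u1"))  # live read, now cached
    users_is_up = False
    print(users.get("u1"))  # Users is down, served from the fallback
```

Returning the `from_fallback` flag lets the calling service decide how to degrade, for example by switching the affected screens to read-only.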
Another aspect of being robust is avoiding a single point of failure. Each microservice must have its own database! This may involve replicating data from other services to stay robust. You have to break DRY in microservices - deal with it. It's tempting to normalize and centralize stuff like this, but you will cry when the shared DB goes down and all hell breaks loose - distributed monoliths are not fun to fix.
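As an illustration of "own database plus replicated data", the following Python sketch gives a Customers service its own SQLite database with a local copy of the few user fields it needs. Everything here is an assumption made for the example: the table and column names, the `apply_user_changed` function, the shape of the change event and the use of a version number to discard out-of-order updates. How changes travel from Users to Customers (message queue, polling, webhooks) is not covered.

```python
import sqlite3
from typing import List


def open_customers_db() -> sqlite3.Connection:
    """Create the Customers service's own database (in-memory for this sketch)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, company TEXT NOT NULL)")
    # Deliberate duplication: a local copy of the user fields Customers needs.
    db.execute(
        "CREATE TABLE user_replica ("
        " user_id TEXT PRIMARY KEY,"
        " customer_id TEXT NOT NULL,"
        " display_name TEXT NOT NULL,"
        " version INTEGER NOT NULL)"
    )
    return db


def apply_user_changed(db: sqlite3.Connection, event: dict) -> bool:
    """Apply a change event from Users; return False if it is older than our copy."""
    row = db.execute(
        "SELECT version FROM user_replica WHERE user_id = ?", (event["user_id"],)
    ).fetchone()
    if row is not None and row[0] >= event["version"]:
        return False  # stale or duplicate event, keep what we have
    db.execute(
        "INSERT OR REPLACE INTO user_replica"
        " (user_id, customer_id, display_name, version) VALUES (?, ?, ?, ?)",
        (event["user_id"], event["customer_id"],
         event["display_name"], event["version"]),
    )
    db.commit()
    return True


def contacts_for_customer(db: sqlite3.Connection, customer_id: str) -> List[str]:
    """Answer from local data only, so this works while Users is down."""
    rows = db.execute(
        "SELECT r.display_name FROM user_replica r"
        " JOIN customers c ON c.id = r.customer_id"
        " WHERE c.id = ? ORDER BY r.display_name",
        (customer_id,),
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    db = open_customers_db()
    db.execute("INSERT INTO customers (id, company) VALUES ('c1', 'Acme')")
    apply_user_changed(db, {"user_id": "u1", "customer_id": "c1",
                            "display_name": "Ada", "version": 1})
    print(contacts_for_customer(db, "c1"))
```

The join never leaves the Customers database, which is the point: the duplicated rows cost some storage and synchronization effort, but no query depends on another service being reachable.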
Conclusion
Make your service names simple, short and clear "nouns" in the business domain.
Compromise on non-critical "verb" services for processes that are good candidates for centralizing in a functional service.
Above all, keep those services as independent of each other as possible!
