Microservices Architecture: When to Split and How to Communicate

Microservices are sold as the silver bullet for scalability, but the real benefit is organizational: independent deployability, team autonomy, and bounded failure domains. The cost is operational complexity, network latency, and data consistency headaches. The decision to split a service should never be taken lightly.
Finding the Right Service Boundaries
Domain-driven design's bounded contexts are the most reliable guide to service boundaries. A service should own a complete business capability and its data. If two features share a database table or require synchronous transactions to maintain consistency, they probably belong in the same service.
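One rough way to apply the shared-table rule is to group features that touch a common table and treat each group as a candidate service. The sketch below assumes a hand-written map of features to tables; the feature and table names are hypothetical, and a real analysis would also weigh transactions and team ownership.

```python
from collections import defaultdict


def group_by_shared_tables(feature_tables):
    """Group features that share at least one table, directly or transitively."""
    parent = {feature: feature for feature in feature_tables}

    def find(feature):
        # Walk up to the group's representative, shortening the path as we go.
        while parent[feature] != feature:
            parent[feature] = parent[parent[feature]]
            feature = parent[feature]
        return feature

    # Merge each feature with the first feature seen for every table it uses.
    first_user = {}
    for feature, tables in feature_tables.items():
        for table in tables:
            if table in first_user:
                parent[find(feature)] = find(first_user[table])
            else:
                first_user[table] = feature

    groups = defaultdict(set)
    for feature in feature_tables:
        groups[find(feature)].add(feature)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical feature-to-table map, for illustration only.
FEATURES = {
    "checkout": {"orders", "order_items"},
    "order_history": {"orders"},
    "profile": {"users"},
    "login": {"users", "sessions"},
    "image_resize": {"images"},
}

candidate_services = group_by_shared_tables(FEATURES)
```

Here checkout and order_history land in one group because both touch the orders table, while image_resize stands alone and is a cheaper candidate for extraction.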
Signs it's time to split:
- Deployment coupling: Changing the user profile page requires deploying the order service
- Team contention: Two teams queue up to modify the same codebase
- Scaling asymmetry: The user service needs 3 replicas but the image-processing service needs 30
- Data contention: Different parts of the application have conflicting performance requirements on the same database
When in doubt, start monolithic. Monolith-first is a valid architectural strategy. Extract services only when the monolith concretely hurts development velocity or operational reliability.
Synchronous Communication Patterns
HTTP/REST and gRPC dominate synchronous service-to-service communication. REST is simpler for public APIs: it is debuggable with curl, has broad tooling support, and passes through standard HTTP proxies and load balancers. gRPC offers strict contract definitions with Protocol Buffers and typically lower per-call overhead for internal service calls, thanks to binary serialization and HTTP/2 multiplexing:
syntax = "proto3";

import "google/protobuf/timestamp.proto";

service UserService {
  rpc GetUser(GetUserRequest) returns (User);
  rpc ListUsers(ListUsersRequest) returns (ListUsersResponse);
}

// ListUsersRequest and ListUsersResponse are omitted here for brevity.

message GetUserRequest {
  string user_id = 1;
}

message User {
  string id = 1;
  string name = 2;
  string email = 3;
  google.protobuf.Timestamp created_at = 4;
}
gRPC generates client stubs in all major languages, eliminating the guesswork from service contracts. The trade-off: debugging gRPC traffic requires additional tooling (grpcurl, gRPC reflection).
Asynchronous Communication with Message Queues
For operations that don't require immediate response, async communication decouples services and improves resilience. When the order service publishes an "order created" event, it doesn't wait for the inventory, notification, and analytics services to process it:
import json
from datetime import datetime, timezone

# Order, kafka_producer, user_service, and email_client are assumed to be
# defined elsewhere in each service.

# Publisher (order service)
def create_order(order_data):
    order = Order.create(order_data)
    event = {
        "event_type": "order.created",
        "data": {
            "order_id": str(order.id),
            "user_id": str(order.user_id),
            "total": order.total,
            "items": [
                {"product_id": i.product_id, "quantity": i.quantity}
                for i in order.items
            ],
        },
        "metadata": {
            # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated.
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "version": 1,
        },
    }
    # Keying by order ID keeps all events for one order on the same partition.
    # Assumes the producer is configured with string key/value serializers.
    kafka_producer.send("orders", key=str(order.id), value=json.dumps(event))
    return order


# Consumer (notification service)
def handle_order_created(event):
    order_data = json.loads(event.value)
    user_id = order_data["data"]["user_id"]
    user = user_service.get_user(user_id)
    email_client.send(
        to=user.email,
        subject=f"Order {order_data['data']['order_id']} confirmed",
        template="order_confirmation",
        context=order_data["data"],
    )
The event schema should include a version field. As services evolve, different consumers may need different event shapes. Publishing a new event version, rather than changing the existing schema in a breaking way, lets consumers migrate independently.
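On the consumer side, one common way to use the version field is to "upcast" older events to the newest shape before handling them. The sketch below is illustrative only: the version 2 schema (a `total_cents` integer replacing `total`) and the names `UPCASTERS`, `CURRENT_VERSION`, and `normalize` are assumptions, not part of the events shown earlier.

```python
import json


def upcast_v1_to_v2(event):
    """Hypothetical v1 -> v2 change: 'total' becomes integer 'total_cents'."""
    data = dict(event["data"])
    data["total_cents"] = round(data.pop("total") * 100)
    metadata = {**event["metadata"], "version": 2}
    return {**event, "data": data, "metadata": metadata}


# Each entry lifts an event from that version to the next one.
UPCASTERS = {1: upcast_v1_to_v2}
CURRENT_VERSION = 2


def normalize(raw):
    """Parse a raw event and upcast it step by step to CURRENT_VERSION."""
    event = json.loads(raw)
    version = event["metadata"]["version"]
    while version < CURRENT_VERSION:
        event = UPCASTERS[version](event)
        version = event["metadata"]["version"]
    if version != CURRENT_VERSION:
        # A newer version than this consumer understands.
        raise ValueError(f"unsupported event version: {version}")
    return event
```

With this in place the handler only deals with one shape, and a consumer can adopt version 2 on its own schedule while the producer still emits version 1 events.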
API Gateway as the Front Door
An API gateway routes external requests to internal services, handling cross-cutting concerns in one place:
# Kong/NGINX-style gateway config
services:
  - name: user-service
    url: http://user-svc:3001
    routes:
      - paths: ["/api/users", "/api/auth"]
  - name: order-service
    url: http://order-svc:3002
    routes:
      - paths: ["/api/orders"]
  - name: product-service
    url: http://product-svc:3003
    routes:
      - paths: ["/api/products"]
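To make the routing behavior concrete, the config above can be read as a path-prefix lookup. The sketch below assumes plain longest-prefix matching on whole path segments; real gateways apply their own matching and priority rules, and `ROUTES` and `resolve_upstream` are illustrative names.

```python
# Path prefixes and upstream URLs copied from the gateway config.
ROUTES = {
    "/api/users": "http://user-svc:3001",
    "/api/auth": "http://user-svc:3001",
    "/api/orders": "http://order-svc:3002",
    "/api/products": "http://product-svc:3003",
}


def resolve_upstream(path):
    """Return the upstream for the longest matching path prefix, or None."""
    matches = [
        prefix
        for prefix in ROUTES
        # Match the prefix itself or anything below it, but not "/api/ordersX".
        if path == prefix or path.startswith(prefix + "/")
    ]
    if not matches:
        return None
    return ROUTES[max(matches, key=len)]
```

A request for /api/orders/42 resolves to the order service, while an unknown path returns None, which a gateway would typically answer with a 404.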
Common gateway responsibilities: rate limiting (for example, 100 req/s per API key), authentication (verify the JWT before forwarding), request/response transformation, caching of read-only responses, and aggregated logging.
Don't put business logic in the gateway. It should remain a routing and enforcement layer, not a home for business rules.
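Per-key rate limiting is a good example of an enforcement concern that stays free of business logic. Below is a minimal in-process token bucket sketch, assuming a limit of 100 requests per second per API key. The class names are illustrative, and a production gateway would normally rely on its built-in rate-limiting feature with state shared across gateway instances.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Credit tokens for the elapsed time, never exceeding capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RateLimiter:
    """Keep one token bucket per API key."""

    def __init__(self, rate=100, capacity=100, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.buckets = {}

    def allow(self, api_key):
        bucket = self.buckets.get(api_key)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, self.clock)
            self.buckets[api_key] = bucket
        return bucket.allow()
```

The gateway would call `allow(api_key)` before forwarding a request and return HTTP 429 when it yields False. The injectable `clock` exists only to make the limiter easy to test.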
Data Consistency Across Services
Distributed transactions (two-phase commit) are slow, fragile, and best avoided. The Saga pattern breaks a distributed operation into a sequence of local transactions with compensating actions:
Order Saga:
1. Order Service: Create order (PENDING)
2. Payment Service: Reserve payment
3. Inventory Service: Reserve items
4. Order Service: Mark order CONFIRMED
Compensating transactions:
- If payment fails: Cancel order (no charge)
- If inventory fails: Release payment → Cancel order
- If confirmation fails: Release inventory → Release payment → Cancel order
Implement sagas with orchestration (a dedicated coordinator service) or choreography (each service listens for events and decides locally). Choreography is simpler for small service graphs; orchestration scales better as the service graph grows.
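The orchestration variant can be sketched as a coordinator that runs each local transaction in order and, on failure, runs the compensations for the steps that already completed in reverse. This is a simplified in-memory illustration: the lambdas stand in for calls to the order, payment, and inventory services, and a real orchestrator would also persist saga state and retry compensations that fail.

```python
class SagaFailed(Exception):
    """Raised after a step fails and completed steps have been compensated."""


def run_saga(steps):
    """Run (name, action, compensate) steps; undo completed ones on failure."""
    completed = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception as exc:
            # Compensate in reverse order of completion.
            for _, undo in reversed(completed):
                undo()
            raise SagaFailed(f"step '{name}' failed: {exc}") from exc
        completed.append((name, compensate))


log = []


def fail_inventory():
    # Simulated failure in the inventory service.
    raise RuntimeError("out of stock")


steps = [
    ("create_order", lambda: log.append("order PENDING"),
     lambda: log.append("cancel order")),
    ("reserve_payment", lambda: log.append("payment reserved"),
     lambda: log.append("release payment")),
    ("reserve_inventory", fail_inventory,
     lambda: log.append("release inventory")),
    ("confirm_order", lambda: log.append("order CONFIRMED"),
     lambda: None),
]

try:
    run_saga(steps)
except SagaFailed:
    pass
```

With the simulated inventory failure, `log` ends up recording the payment release followed by the order cancellation, matching the "If inventory fails" path in the saga above.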
Monitor Microservices with SoniNow
Microservices unlock independent scaling and team autonomy, but they demand disciplined communication patterns and robust observability. SoniNow architects and deploys microservice systems that your team can operate confidently.