How services talk to each other — interaction styles, REST, gRPC, messaging, and strategies for availability and reliability.
In a monolith, modules call each other through language-level method calls. In a microservice architecture, services run in separate processes or on separate machines. They must use an interprocess communication (IPC) mechanism to work together. Choosing the right IPC mechanism is one of the most important architectural decisions you will make.
Interaction styles are classified along two dimensions. The first dimension is how many services receive each request: one-to-one or one-to-many. The second dimension is timing: synchronous (the client waits for a response) or asynchronous (the client does not block).
| | One-to-One | One-to-Many |
|---|---|---|
| Synchronous | Request / Response | — |
| Asynchronous | Asynchronous Request / Response, One-way Notification | Publish / Subscribe, Publish / Async Responses |
This taxonomy guides the IPC choice. The two main pattern categories are the Remote Procedure Invocation (RPI) pattern for synchronous communication and the Messaging pattern for asynchronous communication.
Author's recommendation: Prefer asynchronous messaging for communication between services. Use REST for external-facing APIs.
Key insight: The choice between synchronous and asynchronous IPC directly impacts availability. Synchronous calls reduce it; asynchronous messaging improves it.
An API is a contract between a service and its clients. Getting this contract right is important because changes later can break other services.
Use an API-first design approach: define the API before you write any implementation code. Share the draft with client teams, get their feedback, and iterate. This avoids costly rework after coding starts.
The interface definition language (IDL) you use depends on the IPC mechanism:
- REST APIs: Open API (formerly known as Swagger)
- gRPC: Protocol Buffers
- Messaging: typically informal documentation, since there is no widely adopted standard
APIs change over time. In a microservice architecture, you cannot force all clients to update at once because each service is deployed independently. You need a strategy to evolve APIs without breaking existing clients.
Use Semantic Versioning (MAJOR.MINOR.PATCH) to communicate the nature of changes:
- MAJOR: incompatible, breaking changes
- MINOR: backward-compatible enhancements
- PATCH: backward-compatible bug fixes
The Robustness Principle says: "Be conservative in what you send, be liberal in what you accept." Services should ignore unknown fields in requests and responses. This makes backward-compatible evolution much easier.
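A minimal sketch of a robust message handler, assuming a JSON payload (the field names `orderId`, `total`, and `couponCode` are illustrative). The handler reads only the fields it understands and silently ignores the rest, so a newer client that sends extra fields does not break it:

```python
import json

def parse_order(payload: str) -> dict:
    """Extract only the fields this service understands; ignore the rest."""
    data = json.loads(payload)
    return {
        "order_id": data["orderId"],        # required field
        "total": data.get("total", 0.0),    # optional, with a default
    }

# A newer client sends an extra field this service has never seen.
msg = '{"orderId": "42", "total": 99.5, "couponCode": "SAVE10"}'
order = parse_order(msg)  # unknown "couponCode" is silently ignored
```

Because the handler never iterates over the whole payload, adding new fields to the message format is a backward-compatible change.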
When a breaking change is unavoidable, you must support multiple API versions simultaneously for a transition period. Two common approaches:
- Put the version in the URL path (e.g., /v1/orders, /v2/orders)
- Use HTTP content negotiation: embed the version in the MIME type the client sends in the Accept header

The format you choose for encoding messages affects efficiency, readability, and compatibility. There are two broad categories.
| Aspect | Text (JSON / XML) | Binary (Protocol Buffers / Avro) |
|---|---|---|
| Readability | Human-readable, self-describing | Not human-readable |
| Size | Verbose | Compact |
| Parsing speed | Slower | Faster |
| API-first | Optional | Forced (schema required) |
| Evolution | Easy with Robustness Principle | Easy (Protocol Buffers); Avro needs schema to read |
Protocol Buffers uses tagged fields, which makes API evolution easier because the reader can skip unknown tags. Avro requires the writer's schema to read data, making evolution slightly harder.
Principle: Always use a cross-language format (like JSON or Protocol Buffers) even if your project currently uses only one language. This keeps you flexible for the future.
In the Remote Procedure Invocation (RPI) pattern, a client sends a request to a service and waits for a response. This is the most familiar IPC style. Two popular implementations are REST and gRPC.
REST is an IPC mechanism that uses HTTP. It models the domain as resources (e.g., /orders, /customers/123) and uses standard HTTP verbs (GET, POST, PUT, DELETE) to manipulate them.
Leonard Richardson defined four levels of REST maturity:
- Level 0: clients invoke a single URI with POST; the request body names the operation
- Level 1: the API introduces the notion of resources
- Level 2: clients use HTTP verbs to perform operations and rely on HTTP status codes for results
- Level 3: the API is hypertext-driven (HATEOAS); responses contain links to the allowed follow-up actions

Among REST's benefits are its simplicity and familiarity, and that you can exercise an HTTP API from a browser or with tools such as curl or Postman.

Alternative: Technologies like GraphQL and Falcor let clients fetch exactly the data they need in a single request, which is more efficient than REST for complex data fetching.
gRPC is a framework for making synchronous and streaming remote calls. It uses Protocol Buffers as its IDL and binary message format, and runs over HTTP/2.
| Aspect | REST | gRPC |
|---|---|---|
| Protocol | HTTP/1.1 (typically) | HTTP/2 |
| Format | JSON (text) | Protocol Buffers (binary) |
| Operations | Limited to HTTP verbs | Any operation defined in .proto file |
| Streaming | Not native | Bidirectional streaming |
| Browser support | Excellent | Limited |
| Firewall compatibility | High | Can be problematic |
In a distributed system, a service you depend on may be slow or unavailable. This is called a partial failure. If you do nothing, the problem cascades: your service blocks waiting for the broken service, your threads get exhausted, and you become unavailable too.
Danger: Cascading failure is one of the biggest risks in synchronous microservice communication. A single slow service can bring down an entire chain of dependent services.
Use all three mechanisms together to handle partial failure:
- Network timeouts: never block indefinitely; always give up on a call after a deadline
- Limit outstanding requests: cap the number of in-flight requests to any one dependency and fail fast once the limit is reached
- Circuit breaker: track the error rate, and when it exceeds a threshold, reject further calls immediately for a cooldown period
Analogy: A circuit breaker in software works like one in your home's electrical panel. When there is too much current (too many errors), the breaker trips and cuts the circuit. After a cooldown, you can try turning it back on.
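The analogy above can be sketched as a small class. This is a minimal illustration, not a production implementation (libraries such as Netflix Hystrix and its successors handle half-open probing, metrics windows, and concurrency); the parameter names are my own:

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: trips open after `max_failures` consecutive
    errors, fails fast while open, and retries after `reset_timeout` seconds."""

    def __init__(self, max_failures=3, reset_timeout=30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # cooldown elapsed: allow one retry
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # a success resets the failure count
        return result
```

A service would wrap each remote call in `breaker.call(...)` so that a persistently failing dependency costs a fast local exception instead of a blocked thread.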
When a remote call fails or the circuit breaker is open, the service can:
- Return an error to its own client
- Return a fallback value, such as a sensible default or a cached copy of an earlier response
- Omit the unavailable data from the response, if it is non-essential
In a cloud environment, service instances are created and destroyed dynamically. Their network addresses change constantly. Service discovery is the mechanism that lets a client find the current network location of a service instance.
At the center is a Service Registry: a database of the network locations of all service instances.
The services themselves handle registration and discovery:
- Self Registration: each instance registers its own network location with the service registry on startup and renews it with periodic heartbeats
- Client-side Discovery: the client queries the registry, picks an instance (typically with a load-balancing policy such as round-robin), and sends the request directly to it
Technologies: Netflix Eureka (registry) + Ribbon (client-side load balancing) + Spring Cloud.
Works across platforms, but requires a discovery library for each programming language you use.
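The two application-level patterns can be sketched as follows. This is a toy in-memory stand-in for a registry like Eureka and a Ribbon-style client-side load balancer; all class and method names here are illustrative:

```python
import itertools

class ServiceRegistry:
    """Toy in-memory service registry (stand-in for something like Eureka)."""
    def __init__(self):
        self._instances = {}  # service name -> list of "host:port" strings

    def register(self, service, address):       # Self Registration
        self._instances.setdefault(service, []).append(address)

    def deregister(self, service, address):
        self._instances[service].remove(address)

    def lookup(self, service):
        return list(self._instances.get(service, []))

class ClientSideLoadBalancer:
    """Client-side Discovery: query the registry, round-robin the instances."""
    def __init__(self, registry, service):
        self.registry = registry
        self.service = service
        self._counter = itertools.count()

    def choose(self):
        instances = self.registry.lookup(self.service)
        if not instances:
            raise LookupError(f"no instances of {self.service}")
        return instances[next(self._counter) % len(instances)]

registry = ServiceRegistry()
registry.register("orders", "10.0.0.1:8080")   # each instance self-registers
registry.register("orders", "10.0.0.2:8080")
lb = ClientSideLoadBalancer(registry, "orders")
target = lb.choose()  # the client picks an instance and calls it directly
```

The "library per language" drawback is visible here: every client, in every language, must carry equivalents of both classes.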
The deployment platform handles registration and discovery:
- 3rd-party Registration: the platform, not the service, registers each instance with its built-in registry
- Server-side Discovery: the client simply calls a well-known DNS name or virtual IP; a platform-provided router load-balances requests across the live instances
No discovery code needed in your services, but it ties you to a specific platform.
| Aspect | Application-Level | Platform-Provided |
|---|---|---|
| Registration | Self Registration | 3rd-party Registration |
| Discovery | Client-side | Server-side |
| Code required | Yes (per language) | None |
| Platform dependency | Cross-platform | Single platform |
| Examples | Eureka + Ribbon + Spring Cloud | Kubernetes, Docker |
Recommendation: Use platform-provided service discovery when possible. It requires no code changes and works automatically with your deployment infrastructure.
In the Messaging pattern, services communicate by exchanging messages asynchronously. The sender does not wait for a response. Messages flow through channels, which may be managed by a message broker or handled directly between services (brokerless).
A message has two parts:
- Header: metadata, such as a unique message ID and, for request/response interactions, a return address
- Body: the payload being sent, in text or binary format

There are three types of messages:
- Document: generic data, such as the reply to a command; the receiver decides how to interpret it
- Command: the messaging equivalent of an RPC request, specifying the operation to invoke and its parameters
- Event: a notification that something significant happened in the sender, often a state change of a domain object
Messaging is flexible enough to implement all the interaction styles from the taxonomy:
- Request/response and asynchronous request/response: the requester puts a reply channel and a correlation ID in the message header; the replier sends the response, tagged with the same correlation ID, to the reply channel
- One-way notification: the sender writes to a point-to-point channel and expects no reply
- Publish/subscribe: the sender publishes a message to a channel read by multiple interested consumers
- Publish/async responses: publish/subscribe, plus reply messages correlated back to the publisher
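As a sketch of asynchronous request/response over channels, here is a minimal in-process version using `queue.Queue` as a stand-in for broker channels (the header field names are illustrative):

```python
import queue
import uuid

request_channel = queue.Queue()
reply_channel = queue.Queue()

def send_request(body):
    """Requester: the header carries a correlation id and a reply channel."""
    correlation_id = str(uuid.uuid4())
    request_channel.put({"header": {"id": correlation_id,
                                    "reply_to": reply_channel},
                         "body": body})
    return correlation_id

def service_loop_once():
    """Replier: handle one request and echo its id back as correlation_id."""
    msg = request_channel.get()
    reply = {"header": {"correlation_id": msg["header"]["id"]},
             "body": msg["body"].upper()}  # stand-in "business logic"
    msg["header"]["reply_to"].put(reply)

cid = send_request("hello")
service_loop_once()
response = reply_channel.get()
# The requester matches the reply to its request via the correlation id.
assert response["header"]["correlation_id"] == cid
```

With a real broker the reply channel is named in the header rather than passed by reference, but the correlation mechanism is the same.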
Unlike REST (Open API) or gRPC (Protocol Buffers), there is no widely adopted standard for documenting messaging-based APIs. You typically create informal documentation that describes:
- The message channels the service uses
- The message types it sends and receives
- The format of each message type
A message broker is an intermediary through which messages flow. Services send messages to the broker, and the broker delivers them to the intended recipients.
Most production systems use a broker. Popular brokers include ActiveMQ, RabbitMQ, Apache Kafka, Amazon Kinesis, and Amazon SQS.
In brokerless messaging (e.g., ZeroMQ), services communicate directly without an intermediary.
| Aspect | Broker-Based | Brokerless |
|---|---|---|
| Coupling | Loose (sender and receiver decoupled) | Tighter (need direct addresses) |
| Buffering | Yes (receiver can be offline) | No (both sides must be up) |
| Latency | Higher (extra hop through broker) | Lower (direct connection) |
| Guaranteed delivery | Built-in | Hard to achieve |
| Operations | Broker must be managed | Simpler infrastructure |
| Examples | ActiveMQ, RabbitMQ, Kafka, Kinesis, SQS | ZeroMQ |
To scale message processing, you run multiple instances of a service that all consume from the same channel. These are called competing receivers. The problem is that concurrent consumers may process messages out of order.
Problem: If two messages for the same entity (e.g., "create order" then "update order") are processed by different instances concurrently, the update may be processed before the create, causing errors.
The solution is sharded channels (also called partitioned channels). Each message has a shard key (e.g., the order ID). The broker hashes the key to assign the message to a specific shard. Each shard is consumed by exactly one instance.
In Kafka terminology, this is called Consumer Groups: the broker assigns each partition to exactly one consumer in the group.
This guarantees per-shard-key ordering while still enabling horizontal scaling.
Analogy: Think of a post office with multiple counters. Instead of random assignment, each counter handles a specific ZIP code range. All letters for the same ZIP always go to the same counter, so they are processed in order.
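The shard-assignment step can be sketched in a few lines. A deterministic hash of the shard key, modulo the shard count, guarantees that all messages for the same key land on the same shard (using `zlib.crc32` because Python's built-in `hash()` is not stable across processes):

```python
import zlib

NUM_SHARDS = 4

def shard_for(shard_key: str) -> int:
    """Deterministically map a shard key (e.g., an order ID) to a shard."""
    return zlib.crc32(shard_key.encode()) % NUM_SHARDS

# Both messages for order-1001 land on the same shard, so the single
# consumer of that shard sees "create order" before "update order".
create = ("order-1001", "create order")
update = ("order-1001", "update order")
assert shard_for(create[0]) == shard_for(update[0])
```

This is exactly what a Kafka producer does with a message key when assigning it to a partition.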
Message brokers typically guarantee at-least-once delivery: a message will be delivered, but it might be delivered more than once if a failure occurs. Your service must handle duplicates.
If your message handler is naturally idempotent (processing the same message twice has the same effect as processing it once), duplicates are harmless. This only works if the broker also preserves ordering on redelivery.
Record each processed message ID in a PROCESSED_MESSAGES table as part of the same database transaction that performs the business logic. If a duplicate arrives, the INSERT into the tracking table fails (duplicate key), and the handler skips it.
Best practice: When your logic is not naturally idempotent, use message tracking. It guarantees exactly-once processing as long as the tracking and business logic share the same transaction.
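The message-tracking scheme can be sketched with SQLite. The table and column names here are illustrative; the essential point is that the tracking INSERT and the business INSERT share one transaction, so a redelivered message fails on the primary key and changes nothing:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE processed_messages (message_id TEXT PRIMARY KEY)")
db.execute("CREATE TABLE tickets (id TEXT PRIMARY KEY, state TEXT)")

def handle_create_ticket(message_id, ticket_id):
    """Record the message id and perform the business logic in ONE
    transaction. A duplicate message id violates the primary key, the
    whole transaction rolls back, and the duplicate is skipped."""
    try:
        with db:  # sqlite3 connection as context manager = one transaction
            db.execute("INSERT INTO processed_messages VALUES (?)",
                       (message_id,))
            db.execute("INSERT INTO tickets VALUES (?, 'CREATED')",
                       (ticket_id,))
        return "processed"
    except sqlite3.IntegrityError:
        return "duplicate skipped"

assert handle_create_ticket("msg-1", "ticket-9") == "processed"
# The broker redelivers the same message: no second ticket is created.
assert handle_create_ticket("msg-1", "ticket-9") == "duplicate skipped"
```

Note that acknowledging the message to the broker is a separate step; if the service crashes between committing and acknowledging, the redelivery is harmlessly skipped by the same mechanism.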
A common need is to atomically update a database and publish a message. For example, when creating an order, you want to insert the order into the database and publish an "OrderCreated" event. If either operation fails alone, the system becomes inconsistent.
Problem: Distributed transactions (two-phase commit) across the database and message broker are not a viable solution for modern microservices. They are complex, slow, and many brokers do not support them.
The solution is the Transactional Outbox pattern: as part of the same local database transaction that updates business data, the service inserts the message into an OUTBOX table. A separate Message Relay component then reads messages from the OUTBOX table and publishes them to the broker. Two approaches:
Polling Publisher: The relay periodically queries the OUTBOX table for unpublished messages. Simple to implement, but the polling adds load to the database. It may not work well with NoSQL databases that lack efficient querying.
Transaction Log Tailing: The relay reads the database's transaction log (also called commit log or WAL). When an INSERT into the OUTBOX is found in the log, the relay publishes the corresponding message. This is more sophisticated but very performant because it uses the database's built-in change capture.
Technologies for log tailing: Debezium, Eventuate Tram, DynamoDB Streams.
| Aspect | Polling Publisher | Transaction Log Tailing |
|---|---|---|
| Complexity | Simple | More complex |
| Performance | DB polling is expensive at scale | High performance |
| NoSQL support | May not work well | Works (via native streams) |
| Technologies | Custom implementation | Debezium, Eventuate Tram, DynamoDB Streams |
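A Polling Publisher can be sketched end-to-end with SQLite; the table layout and the in-memory `broker` list are illustrative stand-ins:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE outbox (
    id INTEGER PRIMARY KEY, destination TEXT, payload TEXT,
    published INTEGER DEFAULT 0)""")

def create_order(order_id):
    """Business update + OUTBOX insert would share one local transaction;
    only the outbox half is shown here."""
    with db:
        db.execute("INSERT INTO outbox (destination, payload) VALUES (?, ?)",
                   ("order-events", '{"orderId": "%s"}' % order_id))

broker = []  # stand-in for a real message broker client

def poll_and_publish():
    """Polling Publisher: query unpublished rows, publish, mark published."""
    rows = db.execute(
        "SELECT id, destination, payload FROM outbox WHERE published = 0"
    ).fetchall()
    for row_id, destination, payload in rows:
        broker.append((destination, payload))  # publish to the broker
        with db:
            db.execute("UPDATE outbox SET published = 1 WHERE id = ?",
                       (row_id,))
    return len(rows)

create_order("42")
assert poll_and_publish() == 1   # the relay picks up the pending message
assert poll_and_publish() == 0   # nothing left to publish
```

The repeated `SELECT ... WHERE published = 0` is the polling load the comparison table refers to; transaction log tailing avoids it by reading the WAL instead.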
Rather than using broker client libraries directly, use higher-level frameworks that abstract away the details of messaging infrastructure. These frameworks handle message serialization, channel management, and transactional outbox concerns so you can focus on business logic.
The IPC mechanism you choose has a direct effect on your system's availability. This section explains why synchronous communication hurts availability and how to fix it.
When service A calls service B synchronously, A is available only if B is also available. If B also calls service C, then A is available only if both B and C are available. The overall availability is the product of all dependent service availabilities.
Example: If each service is 99.5% available and you have a chain of three synchronous calls, the combined availability drops to 0.995 × 0.995 × 0.995 ≈ 98.5%. With more services, it gets worse quickly.
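The arithmetic behind the example is just a product over the chain:

```python
def chain_availability(*availabilities):
    """Availability of a chain of synchronous calls is the product of the
    individual service availabilities."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result

# Three services at 99.5% each:
print(round(chain_availability(0.995, 0.995, 0.995), 4))  # 0.9851
# Ten services at 99.5% each: combined availability falls below 96%.
print(round(chain_availability(*[0.995] * 10), 4))
```

Each additional synchronous hop multiplies in another factor below 1, which is why long call chains degrade quickly.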
There are three strategies to eliminate synchronous dependencies and improve availability:
Replace synchronous request/response with asynchronous messaging throughout the system. The service that receives the initial request sends it as a message and responds immediately. Downstream processing happens asynchronously.
This is not always possible, especially for external REST APIs that clients expect to return a result immediately.
Each service maintains a local copy of the data it needs from other services. It subscribes to events from those services and keeps its copy up to date. When it needs the data, it reads from its local copy instead of making a synchronous call.
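A minimal sketch of the replication strategy, assuming an Order Service that caches customer credit data; the event type and field names are hypothetical:

```python
# Local replica of customer data, kept fresh by subscribing to events
# published by the owning Customer Service.
customer_cache = {}

def on_customer_event(event):
    """Event handler: apply customer events to the local copy."""
    if event["type"] in ("CustomerCreated", "CreditUpdated"):
        customer_cache[event["customerId"]] = event["creditLimit"]

def credit_limit(customer_id):
    """Read from the local replica: no synchronous call to Customer Service."""
    return customer_cache[customer_id]

on_customer_event({"type": "CustomerCreated", "customerId": "c1",
                   "creditLimit": 500})
on_customer_event({"type": "CreditUpdated", "customerId": "c1",
                   "creditLimit": 800})
assert credit_limit("c1") == 800  # answered locally, even if the
                                  # Customer Service is down right now
```

The trade-offs in the summary table apply: the replica may be large, and it only helps reads, not operations that must update the other service's data.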
The service creates the entity in a PENDING state and returns a response immediately to the client. It then validates and completes the operation asynchronously using a saga. When the saga finishes, the entity moves to a final state (e.g., APPROVED or REJECTED).
Analogy: Think of placing an order at a busy restaurant. The waiter does not go to the kitchen and wait for your food before confirming your order. Instead, the waiter writes down your order (PENDING), confirms it immediately, and brings the food later when it is ready.
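The state transitions can be sketched as follows; the function names and the `credit_ok` flag are illustrative, and a real saga would be driven by messages rather than a direct call:

```python
orders = {}

def create_order(order_id):
    """Respond immediately: the order exists, but only in PENDING state."""
    orders[order_id] = "PENDING"
    return orders[order_id]

def run_validation_saga(order_id, credit_ok):
    """Asynchronous follow-up (saga): move the order to a final state."""
    orders[order_id] = "APPROVED" if credit_ok else "REJECTED"

state = create_order("o-1")                 # client gets an answer right away
assert state == "PENDING"
run_validation_saga("o-1", credit_ok=True)  # runs later, asynchronously
assert orders["o-1"] == "APPROVED"
```

The limitation in the table is visible here: the client that received `PENDING` must either poll for the final state or subscribe to a notification.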
| Strategy | How It Works | Limitation |
|---|---|---|
| Async end-to-end | Replace all sync calls with messaging | Not always possible for external APIs |
| Replicate data locally | Subscribe to events, keep local copy | Large data storage; does not solve writes |
| Respond immediately + saga | Create in PENDING state, process via saga | Client must handle PENDING state |