How services talk to each other — interaction styles, REST, gRPC, messaging, and strategies for availability and reliability.
In a monolith, modules call each other through language-level method calls. In a microservice architecture, services run in separate processes or on separate machines. They must use an interprocess communication (IPC) mechanism to work together. Choosing the right IPC mechanism is one of the most important architectural decisions you will make.
Interaction styles are classified along two dimensions. The first dimension is how many services receive each request: one-to-one or one-to-many. The second dimension is timing: synchronous (the client waits for a response) or asynchronous (the client does not block).
| | One-to-One | One-to-Many |
|---|---|---|
| Synchronous | Request / Response | — |
| Asynchronous | Asynchronous Request / Response, One-way Notification | Publish / Subscribe, Publish / Async Responses |
This taxonomy guides the IPC choice. The two main pattern categories are the Remote Procedure Invocation (RPI) pattern for synchronous communication and the Messaging pattern for asynchronous communication.
Author's recommendation: Prefer asynchronous messaging for communication between services. Use REST for external-facing APIs.
Key insight: The choice between synchronous and asynchronous IPC directly impacts availability. Synchronous calls reduce it; asynchronous messaging improves it.
An API is a contract between a service and its clients. Getting this contract right is important because changes later can break other services.
Use an API-first design approach: define the API before you write any implementation code. Share the draft with client teams, get their feedback, and iterate. This avoids costly rework after coding starts.
The interface definition language (IDL) you use depends on the IPC mechanism:
- REST APIs: Open API (formerly known as Swagger)
- gRPC: Protocol Buffers
- Messaging: typically informal documentation, since there is no widely adopted standard
APIs change over time. In a microservice architecture, you cannot force all clients to update at once because each service is deployed independently. You need a strategy to evolve APIs without breaking existing clients.
Use Semantic Versioning (MAJOR.MINOR.PATCH) to communicate the nature of changes:
- MAJOR: incompatible, breaking changes
- MINOR: backward-compatible enhancements
- PATCH: backward-compatible bug fixes
The Robustness Principle says: "Be conservative in what you send, be liberal in what you accept." Services should ignore unknown fields in requests and responses. This makes backward-compatible evolution much easier.
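A minimal sketch of a robust message handler, assuming a JSON payload (the field names `orderId`, `total`, and `couponCode` are illustrative). The handler reads only the fields it understands and silently ignores the rest, so a newer client that sends extra fields does not break it:

```python
import json

def parse_order(payload: str) -> dict:
    """Extract only the fields this service understands; ignore the rest."""
    data = json.loads(payload)
    return {
        "order_id": data["orderId"],        # required field
        "total": data.get("total", 0.0),    # optional, with a default
    }

# A newer client sends an extra field this service has never seen.
msg = '{"orderId": "42", "total": 99.5, "couponCode": "SAVE10"}'
order = parse_order(msg)  # unknown "couponCode" is silently ignored
```

Because the handler never iterates over the whole payload, adding new fields to the message format is a backward-compatible change.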
When a breaking change is unavoidable, you must support multiple API versions simultaneously for a transition period. Two common approaches:
- Put the version in the URL path (e.g., /v1/orders, /v2/orders)
- Use HTTP content negotiation: embed the version in the MIME type the client sends in the Accept header

The format you choose for encoding messages affects efficiency, readability, and compatibility. There are two broad categories.
| Aspect | Text (JSON / XML) | Binary (Protocol Buffers / Avro) |
|---|---|---|
| Readability | Human-readable, self-describing | Not human-readable |
| Size | Verbose | Compact |
| Parsing speed | Slower | Faster |
| API-first | Optional | Forced (schema required) |
| Evolution | Easy with Robustness Principle | Easy (Protocol Buffers); Avro needs schema to read |
Protocol Buffers uses tagged fields, which makes API evolution easier because the reader can skip unknown tags. Avro requires the writer's schema to read data, making evolution slightly harder.
Principle: Always use a cross-language format (like JSON or Protocol Buffers) even if your project currently uses only one language. This keeps you flexible for the future.
In the Remote Procedure Invocation (RPI) pattern, a client sends a request to a service and waits for a response. This is the most familiar IPC style. Two popular implementations are REST and gRPC.
REST is an IPC mechanism that uses HTTP. It models the domain as resources (e.g., /orders, /customers/123) and uses standard HTTP verbs (GET, POST, PUT, DELETE) to manipulate them.
Leonard Richardson defined four levels of REST maturity:
- Level 0: clients invoke a single URI with POST; the request body names the operation
- Level 1: the API introduces the notion of resources
- Level 2: clients use HTTP verbs to perform operations and rely on HTTP status codes for results
- Level 3: the API is hypertext-driven (HATEOAS); responses contain links to the allowed follow-up actions

Among REST's benefits are its simplicity and familiarity, and that you can exercise an HTTP API from a browser or with tools such as curl or Postman.

Alternative: Technologies like GraphQL and Falcor let clients fetch exactly the data they need in a single request, which is more efficient than REST for complex data fetching.
gRPC is a framework for making synchronous and streaming remote calls. It uses Protocol Buffers as its IDL and binary message format, and runs over HTTP/2.
| Aspect | REST | gRPC |
|---|---|---|
| Protocol | HTTP/1.1 (typically) | HTTP/2 |
| Format | JSON (text) | Protocol Buffers (binary) |
| Operations | Limited to HTTP verbs | Any operation defined in .proto file |
| Streaming | Not native | Bidirectional streaming |
| Browser support | Excellent | Limited |
| Firewall compatibility | High | Can be problematic |
In a distributed system, a service you depend on may be slow or unavailable. This is called a partial failure. If you do nothing, the problem cascades: your service blocks waiting for the broken service, your threads get exhausted, and you become unavailable too.
Danger: Cascading failure is one of the biggest risks in synchronous microservice communication. A single slow service can bring down an entire chain of dependent services.
Use all three mechanisms together to handle partial failure:
- Network timeouts: never block indefinitely; always give up on a call after a deadline
- Limit outstanding requests: cap the number of in-flight requests to any one dependency and fail fast once the limit is reached
- Circuit breaker: track the error rate, and when it exceeds a threshold, reject further calls immediately for a cooldown period
Analogy: A circuit breaker in software works like one in your home's electrical panel. When there is too much current (too many errors), the breaker trips and cuts the circuit. After a cooldown, you can try turning it back on.
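The analogy above can be sketched as a small class. This is a minimal illustration, not a production implementation (libraries such as Netflix Hystrix and its successors handle half-open probing, metrics windows, and concurrency); the parameter names are my own:

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: trips open after `max_failures` consecutive
    errors, fails fast while open, and retries after `reset_timeout` seconds."""

    def __init__(self, max_failures=3, reset_timeout=30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # cooldown elapsed: allow one retry
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # a success resets the failure count
        return result
```

A service would wrap each remote call in `breaker.call(...)` so that a persistently failing dependency costs a fast local exception instead of a blocked thread.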
When a remote call fails or the circuit breaker is open, the service can:
- Return an error to its own client
- Return a fallback value, such as a sensible default or a cached copy of an earlier response
- Omit the unavailable data from the response, if it is non-essential
In a cloud environment, service instances are created and destroyed dynamically. Their network addresses change constantly. Service discovery is the mechanism that lets a client find the current network location of a service instance.
At the center is a Service Registry: a database of the network locations of all service instances.
The services themselves handle registration and discovery:
- Self Registration: each instance registers its own network location with the service registry on startup and renews it with periodic heartbeats
- Client-side Discovery: the client queries the registry, picks an instance (typically with a load-balancing policy such as round-robin), and sends the request directly to it
Technologies: Netflix Eureka (registry) + Ribbon (client-side load balancing) + Spring Cloud.
Works across platforms, but requires a discovery library for each programming language you use.
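The two application-level patterns can be sketched as follows. This is a toy in-memory stand-in for a registry like Eureka and a Ribbon-style client-side load balancer; all class and method names here are illustrative:

```python
import itertools

class ServiceRegistry:
    """Toy in-memory service registry (stand-in for something like Eureka)."""
    def __init__(self):
        self._instances = {}  # service name -> list of "host:port" strings

    def register(self, service, address):       # Self Registration
        self._instances.setdefault(service, []).append(address)

    def deregister(self, service, address):
        self._instances[service].remove(address)

    def lookup(self, service):
        return list(self._instances.get(service, []))

class ClientSideLoadBalancer:
    """Client-side Discovery: query the registry, round-robin the instances."""
    def __init__(self, registry, service):
        self.registry = registry
        self.service = service
        self._counter = itertools.count()

    def choose(self):
        instances = self.registry.lookup(self.service)
        if not instances:
            raise LookupError(f"no instances of {self.service}")
        return instances[next(self._counter) % len(instances)]

registry = ServiceRegistry()
registry.register("orders", "10.0.0.1:8080")   # each instance self-registers
registry.register("orders", "10.0.0.2:8080")
lb = ClientSideLoadBalancer(registry, "orders")
target = lb.choose()  # the client picks an instance and calls it directly
```

The "library per language" drawback is visible here: every client, in every language, must carry equivalents of both classes.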
The deployment platform handles registration and discovery:
- 3rd-party Registration: the platform, not the service, registers each instance with its built-in registry
- Server-side Discovery: the client simply calls a well-known DNS name or virtual IP; a platform-provided router load-balances requests across the live instances
No discovery code needed in your services, but it ties you to a specific platform.
| Aspect | Application-Level | Platform-Provided |
|---|---|---|
| Registration | Self Registration | 3rd-party Registration |
| Discovery | Client-side | Server-side |
| Code required | Yes (per language) | None |
| Platform dependency | Cross-platform | Single platform |
| Examples | Eureka + Ribbon + Spring Cloud | Kubernetes, Docker |
Recommendation: Use platform-provided service discovery when possible. It requires no code changes and works automatically with your deployment infrastructure.
In the Messaging pattern, services communicate by exchanging messages asynchronously. The sender does not wait for a response. Messages flow through channels, which may be managed by a message broker or handled directly between services (brokerless).
A message has two parts:
- Header: metadata, such as a unique message ID and, for request/response interactions, a return address
- Body: the payload being sent, in text or binary format

There are three types of messages:
- Document: generic data, such as the reply to a command; the receiver decides how to interpret it
- Command: the messaging equivalent of an RPC request, specifying the operation to invoke and its parameters
- Event: a notification that something significant happened in the sender, often a state change of a domain object
Messaging is flexible enough to implement all the interaction styles from the taxonomy:
- Request/response and asynchronous request/response: the requester puts a reply channel and a correlation ID in the message header; the replier sends the response, tagged with the same correlation ID, to the reply channel
- One-way notification: the sender writes to a point-to-point channel and expects no reply
- Publish/subscribe: the sender publishes a message to a channel read by multiple interested consumers
- Publish/async responses: publish/subscribe, plus reply messages correlated back to the publisher
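As a sketch of asynchronous request/response over channels, here is a minimal in-process version using `queue.Queue` as a stand-in for broker channels (the header field names are illustrative):

```python
import queue
import uuid

request_channel = queue.Queue()
reply_channel = queue.Queue()

def send_request(body):
    """Requester: the header carries a correlation id and a reply channel."""
    correlation_id = str(uuid.uuid4())
    request_channel.put({"header": {"id": correlation_id,
                                    "reply_to": reply_channel},
                         "body": body})
    return correlation_id

def service_loop_once():
    """Replier: handle one request and echo its id back as correlation_id."""
    msg = request_channel.get()
    reply = {"header": {"correlation_id": msg["header"]["id"]},
             "body": msg["body"].upper()}  # stand-in "business logic"
    msg["header"]["reply_to"].put(reply)

cid = send_request("hello")
service_loop_once()
response = reply_channel.get()
# The requester matches the reply to its request via the correlation id.
assert response["header"]["correlation_id"] == cid
```

With a real broker the reply channel is named in the header rather than passed by reference, but the correlation mechanism is the same.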
Unlike REST (Open API) or gRPC (Protocol Buffers), there is no widely adopted standard for documenting messaging-based APIs. You typically create informal documentation that describes:
- The message channels the service uses
- The message types it sends and receives
- The format of each message type
A message broker is an intermediary through which messages flow. Services send messages to the broker, and the broker delivers them to the intended recipients.
Most production systems use a broker. Popular brokers include ActiveMQ, RabbitMQ, Apache Kafka, Amazon Kinesis, and Amazon SQS.
In brokerless messaging (e.g., ZeroMQ), services communicate directly without an intermediary.
| Aspect | Broker-Based | Brokerless |
|---|---|---|
| Coupling | Loose (sender and receiver decoupled) | Tighter (need direct addresses) |
| Buffering | Yes (receiver can be offline) | No (both sides must be up) |
| Latency | Higher (extra hop through broker) | Lower (direct connection) |
| Guaranteed delivery | Built-in | Hard to achieve |
| Operations | Broker must be managed | Simpler infrastructure |
| Examples | ActiveMQ, RabbitMQ, Kafka, Kinesis, SQS | ZeroMQ |
To scale message processing, you run multiple instances of a service that all consume from the same channel. These are called competing receivers. The problem is that concurrent consumers may process messages out of order.
Problem: If two messages for the same entity (e.g., "create order" then "update order") are processed by different instances concurrently, the update may be processed before the create, causing errors.
The solution is sharded channels (also called partitioned channels). Each message has a shard key (e.g., the order ID). The broker hashes the key to assign the message to a specific shard. Each shard is consumed by exactly one instance.
In Kafka terminology, this is called Consumer Groups: the broker assigns each partition to exactly one consumer in the group.
This guarantees per-shard-key ordering while still enabling horizontal scaling.
Analogy: Think of a post office with multiple counters. Instead of random assignment, each counter handles a specific ZIP code range. All letters for the same ZIP always go to the same counter, so they are processed in order.
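The shard-assignment step can be sketched in a few lines. A deterministic hash of the shard key, modulo the shard count, guarantees that all messages for the same key land on the same shard (using `zlib.crc32` because Python's built-in `hash()` is not stable across processes):

```python
import zlib

NUM_SHARDS = 4

def shard_for(shard_key: str) -> int:
    """Deterministically map a shard key (e.g., an order ID) to a shard."""
    return zlib.crc32(shard_key.encode()) % NUM_SHARDS

# Both messages for order-1001 land on the same shard, so the single
# consumer of that shard sees "create order" before "update order".
create = ("order-1001", "create order")
update = ("order-1001", "update order")
assert shard_for(create[0]) == shard_for(update[0])
```

This is exactly what a Kafka producer does with a message key when assigning it to a partition.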
Message brokers typically guarantee at-least-once delivery: a message will be delivered, but it might be delivered more than once if a failure occurs. Your service must handle duplicates.
If your message handler is naturally idempotent (processing the same message twice has the same effect as processing it once), duplicates are harmless. This only works if the broker also preserves ordering on redelivery.
Record each processed message ID in a PROCESSED_MESSAGES table as part of the same database transaction that performs the business logic. If a duplicate arrives, the INSERT into the tracking table fails (duplicate key), and the handler skips it.
Best practice: When your logic is not naturally idempotent, use message tracking. It guarantees exactly-once processing as long as the tracking and business logic share the same transaction.
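The message-tracking scheme can be sketched with SQLite. The table and column names here are illustrative; the essential point is that the tracking INSERT and the business INSERT share one transaction, so a redelivered message fails on the primary key and changes nothing:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE processed_messages (message_id TEXT PRIMARY KEY)")
db.execute("CREATE TABLE tickets (id TEXT PRIMARY KEY, state TEXT)")

def handle_create_ticket(message_id, ticket_id):
    """Record the message id and perform the business logic in ONE
    transaction. A duplicate message id violates the primary key, the
    whole transaction rolls back, and the duplicate is skipped."""
    try:
        with db:  # sqlite3 connection as context manager = one transaction
            db.execute("INSERT INTO processed_messages VALUES (?)",
                       (message_id,))
            db.execute("INSERT INTO tickets VALUES (?, 'CREATED')",
                       (ticket_id,))
        return "processed"
    except sqlite3.IntegrityError:
        return "duplicate skipped"

assert handle_create_ticket("msg-1", "ticket-9") == "processed"
# The broker redelivers the same message: no second ticket is created.
assert handle_create_ticket("msg-1", "ticket-9") == "duplicate skipped"
```

Note that acknowledging the message to the broker is a separate step; if the service crashes between committing and acknowledging, the redelivery is harmlessly skipped by the same mechanism.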
A common need is to atomically update a database and publish a message. For example, when creating an order, you want to insert the order into the database and publish an "OrderCreated" event. If either operation fails alone, the system becomes inconsistent.
Problem: Distributed transactions (two-phase commit) across the database and message broker are not a viable solution for modern microservices. They are complex, slow, and many brokers do not support them.
The solution is the Transactional Outbox pattern: as part of the same local database transaction that updates business data, the service inserts the message into an OUTBOX table. A separate Message Relay component then reads messages from the OUTBOX table and publishes them to the broker. Two approaches:
Polling Publisher: The relay periodically queries the OUTBOX table for unpublished messages. Simple to implement, but the polling adds load to the database. It may not work well with NoSQL databases that lack efficient querying.
Transaction Log Tailing: The relay reads the database's transaction log (also called commit log or WAL). When an INSERT into the OUTBOX is found in the log, the relay publishes the corresponding message. This is more sophisticated but very performant because it uses the database's built-in change capture.
Technologies for log tailing: Debezium, Eventuate Tram, DynamoDB Streams.
| Aspect | Polling Publisher | Transaction Log Tailing |
|---|---|---|
| Complexity | Simple | More complex |
| Performance | DB polling is expensive at scale | High performance |
| NoSQL support | May not work well | Works (via native streams) |
| Technologies | Custom implementation | Debezium, Eventuate Tram, DynamoDB Streams |
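A Polling Publisher can be sketched end-to-end with SQLite; the table layout and the in-memory `broker` list are illustrative stand-ins:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE outbox (
    id INTEGER PRIMARY KEY, destination TEXT, payload TEXT,
    published INTEGER DEFAULT 0)""")

def create_order(order_id):
    """Business update + OUTBOX insert would share one local transaction;
    only the outbox half is shown here."""
    with db:
        db.execute("INSERT INTO outbox (destination, payload) VALUES (?, ?)",
                   ("order-events", '{"orderId": "%s"}' % order_id))

broker = []  # stand-in for a real message broker client

def poll_and_publish():
    """Polling Publisher: query unpublished rows, publish, mark published."""
    rows = db.execute(
        "SELECT id, destination, payload FROM outbox WHERE published = 0"
    ).fetchall()
    for row_id, destination, payload in rows:
        broker.append((destination, payload))  # publish to the broker
        with db:
            db.execute("UPDATE outbox SET published = 1 WHERE id = ?",
                       (row_id,))
    return len(rows)

create_order("42")
assert poll_and_publish() == 1   # the relay picks up the pending message
assert poll_and_publish() == 0   # nothing left to publish
```

The repeated `SELECT ... WHERE published = 0` is the polling load the comparison table refers to; transaction log tailing avoids it by reading the WAL instead.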
Rather than using broker client libraries directly, use higher-level frameworks that abstract away the details of messaging infrastructure. These frameworks handle message serialization, channel management, and transactional outbox concerns so you can focus on business logic.
The IPC mechanism you choose has a direct effect on your system's availability. This section explains why synchronous communication hurts availability and how to fix it.
When service A calls service B synchronously, A is available only if B is also available. If B also calls service C, then A is available only if both B and C are available. The overall availability is the product of all dependent service availabilities.
Example: If each service is 99.5% available and you have a chain of three synchronous calls, the combined availability drops to 0.995 × 0.995 × 0.995 ≈ 98.5%. With more services, it gets worse quickly.
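The arithmetic behind the example is just a product over the chain:

```python
def chain_availability(*availabilities):
    """Availability of a chain of synchronous calls is the product of the
    individual service availabilities."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result

# Three services at 99.5% each:
print(round(chain_availability(0.995, 0.995, 0.995), 4))  # 0.9851
# Ten services at 99.5% each: combined availability falls below 96%.
print(round(chain_availability(*[0.995] * 10), 4))
```

Each additional synchronous hop multiplies in another factor below 1, which is why long call chains degrade quickly.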
There are three strategies to eliminate synchronous dependencies and improve availability:
Replace synchronous request/response with asynchronous messaging throughout the system. The service that receives the initial request sends it as a message and responds immediately. Downstream processing happens asynchronously.
This is not always possible, especially for external REST APIs that clients expect to return a result immediately.
Each service maintains a local copy of the data it needs from other services. It subscribes to events from those services and keeps its copy up to date. When it needs the data, it reads from its local copy instead of making a synchronous call.
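A minimal sketch of the replication strategy, assuming an Order Service that caches customer credit data; the event type and field names are hypothetical:

```python
# Local replica of customer data, kept fresh by subscribing to events
# published by the owning Customer Service.
customer_cache = {}

def on_customer_event(event):
    """Event handler: apply customer events to the local copy."""
    if event["type"] in ("CustomerCreated", "CreditUpdated"):
        customer_cache[event["customerId"]] = event["creditLimit"]

def credit_limit(customer_id):
    """Read from the local replica: no synchronous call to Customer Service."""
    return customer_cache[customer_id]

on_customer_event({"type": "CustomerCreated", "customerId": "c1",
                   "creditLimit": 500})
on_customer_event({"type": "CreditUpdated", "customerId": "c1",
                   "creditLimit": 800})
assert credit_limit("c1") == 800  # answered locally, even if the
                                  # Customer Service is down right now
```

The trade-offs in the summary table apply: the replica may be large, and it only helps reads, not operations that must update the other service's data.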
The service creates the entity in a PENDING state and returns a response immediately to the client. It then validates and completes the operation asynchronously using a saga. When the saga finishes, the entity moves to a final state (e.g., APPROVED or REJECTED).
Analogy: Think of placing an order at a busy restaurant. The waiter does not go to the kitchen and wait for your food before confirming your order. Instead, the waiter writes down your order (PENDING), confirms it immediately, and brings the food later when it is ready.
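The state transitions can be sketched as follows; the function names and the `credit_ok` flag are illustrative, and a real saga would be driven by messages rather than a direct call:

```python
orders = {}

def create_order(order_id):
    """Respond immediately: the order exists, but only in PENDING state."""
    orders[order_id] = "PENDING"
    return orders[order_id]

def run_validation_saga(order_id, credit_ok):
    """Asynchronous follow-up (saga): move the order to a final state."""
    orders[order_id] = "APPROVED" if credit_ok else "REJECTED"

state = create_order("o-1")                 # client gets an answer right away
assert state == "PENDING"
run_validation_saga("o-1", credit_ok=True)  # runs later, asynchronously
assert orders["o-1"] == "APPROVED"
```

The limitation in the table is visible here: the client that received `PENDING` must either poll for the final state or subscribe to a notification.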
| Strategy | How It Works | Limitation |
|---|---|---|
| Async end-to-end | Replace all sync calls with messaging | Not always possible for external APIs |
| Replicate data locally | Subscribe to events, keep local copy | Large data storage; does not solve writes |
| Respond immediately + saga | Create in PENDING state, process via saga | Client must handle PENDING state |