Why gRPC Matters
The Problem: REST + JSON is wonderful for humans — you can curl it, paste it into Postman, and read it. But for service-to-service traffic at scale, JSON is 3–10x larger on the wire than it needs to be, parsing burns CPU, every team invents its own field naming, and there is no machine-checkable contract. The cost of “readable” shows up in your latency budget and your AWS bill.
The Solution: gRPC is a schema-first RPC framework: you define services and messages in a .proto file, code-generate clients and servers in 11+ languages, and ship binary Protocol Buffers over HTTP/2. The contract is the source of truth, the wire is small, and breaking changes are caught at compile time instead of 3 AM.
Real Impact: Stubby, Google’s internal predecessor to gRPC, has been described by Google as handling on the order of tens of billions of RPCs per second. Netflix, Square, Dropbox, and Cloudflare have all adopted gRPC for core internal traffic, citing smaller payloads, lower CPU use, and stronger contracts.
Real-World Analogy
Imagine two warehouses that need to exchange inventory updates. There are two ways:
- JSON style: Every clerk writes a free-form note — “the blue widget”, “widget #4 (blue)”, “BLU-WID”. The other end has to guess what each clerk meant. Sometimes the guess is wrong.
- gRPC style: Both warehouses agree on a shared dictionary — product 4271 always means “widget, blue, large”. Every message uses the dictionary. There is no “what does this field mean” conversation, ever.
That dictionary is the .proto file. Once you have a shared schema, the wire format gets smaller, parsers get faster, and ambiguity disappears.
Microservices multiply the cost of every protocol decision. A request that crosses 8 services on its way to a database pays the JSON tax 8 times. Multiply that by your QPS and the difference between “fine” and “painful” is measured in racks of servers. gRPC was built at Google specifically because at their scale the savings were unignorable, and the same math applies to anyone whose internal east-west traffic is in the millions of RPS.
What gRPC actually gives you
- A contract. The .proto file is checked into git, reviewed in PRs, versioned, and used to generate clients and servers in every language you support.
- A small wire format. Protocol Buffers are typically 3–10x smaller than equivalent JSON.
- Streaming, first-class. Unary, server-streaming, client-streaming, and bidirectional — all on the same connection.
- Deadlines, cancellation, metadata, and back-pressure baked into the protocol, not bolted on per-service.
- Polyglot support. Go, Java, C++, Python, Ruby, Node, C#, Kotlin, Swift, and Dart have official implementations, and Rust has a mature community one (tonic), all generated from the same source.
The Building Blocks
Why Three Layers
The Problem: RPC frameworks of the past (CORBA, SOAP, Thrift) shipped their own transport, their own format, and their own service IDL. Every layer reinvented something the network already had.
The Solution: gRPC reuses standards. HTTP/2 carries the bytes. Protocol Buffers describes the messages. The service block describes the methods. Each layer is independently swappable and well-understood.
gRPC is a stack of three things that fit together cleanly. Knowing what each layer does makes debugging dramatically easier — you can say “the framing is fine but the proto schema drifted” instead of waving at the whole stack.
| Layer | What It Does | What You Get |
|---|---|---|
| HTTP/2 | Multiplexed binary transport | Many concurrent RPCs on one TCP connection, header compression (HPACK), per-stream flow control, trailers that carry the RPC status |
| Protocol Buffers | Schema and wire format for messages | Compact varint encoding, codegen in 11+ languages, forward and backward compatible if you follow the rules |
| gRPC service IDL | service and rpc declarations | Strongly typed methods, four streaming patterns, generated stubs and skeletons |
HTTP/2 in one paragraph
HTTP/1.1 handles one request at a time per TCP connection (pipelining exists but is rarely usable in practice), so concurrency means many connections. HTTP/2 opens one TCP connection and multiplexes many independent streams over it, up to a limit that each peer advertises. Headers are compressed with HPACK so you don’t pay for repeating Authorization on every call. Streams have flow control, so a fast producer can’t drown a slow consumer. For an internal mesh making millions of calls between the same two pods, this is the difference between “TCP setup is half my latency” and “TCP setup is invisible.”
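The flow-control idea can be sketched as a credit window. This is a toy model rather than the HTTP/2 state machine: the class and method names are invented for illustration, and the only real constant is HTTP/2's default initial window of 65,535 bytes.

```python
class StreamWindow:
    """Tracks how many bytes the sender may still put on one stream."""

    def __init__(self, initial: int = 65_535):  # HTTP/2's default initial window
        self.available = initial

    def try_send(self, nbytes: int) -> int:
        """Send as much as the window allows; return the bytes actually sent."""
        sent = min(nbytes, self.available)
        self.available -= sent
        return sent

    def window_update(self, increment: int) -> None:
        """The receiver has consumed data and grants the sender more credit."""
        self.available += increment


window = StreamWindow(initial=10)
first = window.try_send(8)    # 8 bytes go out, 2 bytes of credit remain
second = window.try_send(8)   # only 2 bytes fit; the sender must pause
window.window_update(8)       # the slow receiver finally drains 8 bytes
third = window.try_send(6)    # the remaining 6 bytes of the second message
print(first, second, third)   # 8 2 6
```

The sender never holds more unacknowledged data than the receiver agreed to buffer, which is what keeps one slow stream from exhausting memory on a shared connection.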
Protocol Buffers in one paragraph
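Protobuf encodes each field as a small integer tag (the field number plus a 3-bit wire type) followed by the value: a varint for integers, bools, and enums; a length prefix plus raw bytes for strings, bytes, and nested messages; fixed 4 or 8 bytes for the fixed-width and floating-point types. There are no field names on the wire, no quotes, no whitespace. A 1 KB JSON object often shrinks to a few hundred bytes of protobuf. The schema is mandatory at compile time but invisible at runtime: both sides need the .proto to make sense of the bytes. That mandatory schema is the source of half the value.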
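To make the encoding concrete, here is a minimal hand-rolled sketch of the tag-plus-value layout, assuming a message with a string field 1 (name) and an int32 field 3 (enthusiasm). The helper names are made up for this example; real code would use the generated classes instead.

```python
import json


def encode_varint(value: int) -> bytes:
    """Base-128 varint: 7 payload bits per byte, high bit set on all but the last."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


VARINT, LENGTH_DELIMITED = 0, 2  # two of the protobuf wire types


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """A field's key is (field_number << 3) | wire_type, itself a varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_greeting(name: str, enthusiasm: int) -> bytes:
    """Hand-encode field 1 (string name) and field 3 (int32 enthusiasm)."""
    raw = name.encode("utf-8")
    return (
        encode_tag(1, LENGTH_DELIMITED) + encode_varint(len(raw)) + raw
        + encode_tag(3, VARINT) + encode_varint(enthusiasm)
    )


payload = encode_greeting("Robbie", 7)
as_json = json.dumps({"name": "Robbie", "enthusiasm": 7}, separators=(",", ":"))
print(payload.hex())               # 0a06526f626269651807
print(len(payload), len(as_json))  # 10 32
```

The field names never appear in the payload; only the numbers 1 and 3 do, folded into the tag bytes 0x0a and 0x18.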
Protocol Buffers Crash Course
Why Field Numbers Matter
The Problem: Without a stable identifier per field, you can’t evolve a schema without breaking old clients.
The Solution: Each field gets a permanent integer tag (the field number). The name can change, the type can change in narrow ways, but the tag is sacred — that is what protobuf actually serializes.
This is the minimum proto3 you need to read and write production schemas. Save it as greeter/v1/greeter.proto:
syntax = "proto3";

package greeter.v1;

option go_package = "github.com/example/greeter/v1;greeterv1";

message Greeting {
  // Reserve numbers and names from removed fields so they can never be reused.
  reserved 4, 7;
  reserved "old_field";

  string name = 1;
  string language = 2;       // e.g. "en", "ja"
  int32 enthusiasm = 3;      // 1..10
  repeated string tags = 5;  // arbitrary labels
  Mood mood = 6;

  oneof contact {
    string email = 8;
    string phone = 9;
  }

  optional string nickname = 10;  // proto3 explicit presence
}

enum Mood {
  MOOD_UNSPECIFIED = 0;  // always reserve 0 for the default
  MOOD_HAPPY = 1;
  MOOD_NEUTRAL = 2;
  MOOD_GRUMPY = 3;
}

message SayHelloRequest { Greeting greeting = 1; }
message SayHelloResponse { string message = 1; }

service Greeter {
  // Unary
  rpc SayHello(SayHelloRequest) returns (SayHelloResponse);
  // Server streaming
  rpc SayHelloRepeatedly(SayHelloRequest) returns (stream SayHelloResponse);
  // Client streaming
  rpc SayHelloToCrowd(stream SayHelloRequest) returns (SayHelloResponse);
  // Bidirectional
  rpc ChatGreetings(stream SayHelloRequest) returns (stream SayHelloResponse);
}
Things this snippet shows you
- Field numbers (1, 2, 3…) are the only stable identifier. A varint field numbered 1 is written with the tag byte 0x08 (the field number shifted left by 3, OR-ed with wire type 0); that is what actually goes on the wire.
- repeated is protobuf’s name for “list of”. Lists are zero-or-more.
- oneof means “at most one of these fields is set”: a tagged union. Setting one member clears the others.
- optional in proto3 brings back “was this field explicitly set or not”, which is useful for partial updates.
- enum values are integers. The 0 value is the implicit default, so it should mean “unspecified”.
- reserved blocks future authors from re-using a removed field number or name. Always reserve when deleting.
Scalar types worth knowing
| Proto Type | Wire Encoding | When to Use |
|---|---|---|
| int32 / int64 | Varint | Default integer; cheap for small non-negative values, but any negative value takes 10 bytes |
| sint32 / sint64 | Zigzag varint | Integers that are often negative |
| fixed32 / fixed64 | Fixed 4 / 8 bytes | IDs, hashes: values that are usually large |
| bool | Varint (1 byte) | Booleans |
| string | UTF-8 length-prefixed | Text. Always UTF-8; binary goes in bytes |
| bytes | Length-prefixed | Binary blobs, embedded images, opaque tokens |
| google.protobuf.Timestamp | Well-known type | Time. Don’t roll your own; libraries already know how to convert |
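The difference between the plain and zigzag integer rows is easiest to see by counting bytes. The following is a small sketch with hand-written helpers (the function names are illustrative, not a library API) that sizes the same values under both encodings.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def encode_varint(value: int) -> bytes:
    """Base-128 varint of a non-negative integer."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def int64_varint(value: int) -> bytes:
    """int32/int64: negatives are sign-extended to 64-bit two's complement."""
    return encode_varint(value & MASK64)


def zigzag64(value: int) -> int:
    """sint64 mapping: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ..."""
    return ((value << 1) ^ (value >> 63)) & MASK64


def sint64_varint(value: int) -> bytes:
    """sint32/sint64: zigzag first, then varint."""
    return encode_varint(zigzag64(value))


for v in (1, -1, -300):
    print(v, len(int64_varint(v)), len(sint64_varint(v)))
# 1 1 1
# -1 10 1
# -300 10 2
```

A field that regularly holds small negative numbers costs ten bytes per value as int32 or int64 and one or two bytes as sint32 or sint64.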
Never reuse a field number
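Once a .proto with field 5 meaning repeated string tags has shipped to a single client, even an old mobile app you forgot about, that tag is forever. If you reuse field 5 for something else, old bytes meet a new meaning. When the wire type happens to match (say, a new string or nested message in place of the old strings), the old payload decodes into the new field as garbage with no error raised. When it does not match, the data is typically dropped as an unknown field or parsing fails, depending on the runtime. In the worst case that is a silent corruption bug, not an error. The only safe move is reserved 5; and pick a new number.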
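A minimal sketch of the silent-corruption case, assuming a hypothetical old schema with int32 quantity = 5 and a hypothetical new schema that reuses the number as sint32 delta = 5. Both are varints on the wire, so nothing fails; the value is simply wrong.

```python
def encode_varint(value: int) -> bytes:
    """Base-128 varint of a non-negative integer."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Return (value, next_position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def zigzag_decode(n: int) -> int:
    """Inverse of the sint32/sint64 zigzag mapping."""
    return (n >> 1) ^ -(n & 1)


# Old writer, old schema (hypothetical): int32 quantity = 5, value 150.
old_bytes = encode_varint((5 << 3) | 0) + encode_varint(150)

# New reader, new schema (hypothetical): sint32 delta = 5.
tag, pos = decode_varint(old_bytes)
raw, _ = decode_varint(old_bytes, pos)
field_number, wire_type = tag >> 3, tag & 0x07
delta = zigzag_decode(raw)
print(field_number, wire_type, delta)  # 5 0 75
```

The reader sees a well-formed field 5 and reports 75 where the writer meant 150, with no exception anywhere to alert anyone.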
The Four RPC Patterns
Why More Than Just Request/Response
The Problem: Plenty of real workflows aren’t one-shot. Tailing logs, uploading a file in chunks, a multiplayer chat — each wants something different from “send a request, get a response”.
The Solution: HTTP/2 streams give you bidirectional flow for free, so gRPC exposes four patterns instead of one. Each pattern uses the same underlying machinery.
| Pattern | Shape | Use Case |
|---|---|---|
| Unary | 1 req → 1 resp | Most CRUD calls. The default. |
| Server streaming | 1 req → N resp | Tail logs, server-sent events, pagination as a stream, push updates |
| Client streaming | N req → 1 resp | Chunked uploads, telemetry batching, file ingest |
| Bidirectional | N req ↔ N resp | Chat, collaborative editing, control planes, full-duplex pipelines |
Go server implementing all four
package main

import (
	"context"
	"io"
	"log"
	"net"
	"strings"

	"google.golang.org/grpc"

	greeterv1 "github.com/example/greeter/v1"
)

type server struct {
	greeterv1.UnimplementedGreeterServer
}

// 1. Unary
func (s *server) SayHello(ctx context.Context, req *greeterv1.SayHelloRequest) (*greeterv1.SayHelloResponse, error) {
	return &greeterv1.SayHelloResponse{Message: "hello " + req.GetGreeting().GetName()}, nil
}

// 2. Server streaming
func (s *server) SayHelloRepeatedly(req *greeterv1.SayHelloRequest, stream greeterv1.Greeter_SayHelloRepeatedlyServer) error {
	for i := 0; i < 5; i++ {
		if err := stream.Send(&greeterv1.SayHelloResponse{Message: "hello again"}); err != nil {
			return err
		}
	}
	return nil
}

// 3. Client streaming
func (s *server) SayHelloToCrowd(stream greeterv1.Greeter_SayHelloToCrowdServer) error {
	var names []string
	for {
		in, err := stream.Recv()
		if err == io.EOF {
			// The client has finished sending; reply once and close the stream.
			return stream.SendAndClose(&greeterv1.SayHelloResponse{
				Message: "hello to " + strings.Join(names, ", "),
			})
		}
		if err != nil {
			return err
		}
		names = append(names, in.GetGreeting().GetName())
	}
}

// 4. Bidirectional
func (s *server) ChatGreetings(stream greeterv1.Greeter_ChatGreetingsServer) error {
	for {
		in, err := stream.Recv()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		if err := stream.Send(&greeterv1.SayHelloResponse{
			Message: "echo: " + in.GetGreeting().GetName(),
		}); err != nil {
			return err
		}
	}
}

func main() {
	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatalf("listen: %v", err)
	}
	s := grpc.NewServer()
	greeterv1.RegisterGreeterServer(s, &server{})
	log.Fatal(s.Serve(lis))
}
Python client
import grpc
from greeter.v1 import greeter_pb2, greeter_pb2_grpc


def main():
    # Channels are long-lived. One per (host, port) per process.
    with grpc.insecure_channel("localhost:50051") as channel:
        stub = greeter_pb2_grpc.GreeterStub(channel)

        # Unary
        resp = stub.SayHello(
            greeter_pb2.SayHelloRequest(
                greeting=greeter_pb2.Greeting(name="Robbie", language="en"),
            ),
            timeout=2.0,
            metadata=[("x-request-id", "abc-123")],
        )
        print(resp.message)

        # Server streaming
        for r in stub.SayHelloRepeatedly(
            greeter_pb2.SayHelloRequest(greeting=greeter_pb2.Greeting(name="Robbie"))
        ):
            print(r.message)

        # Client streaming
        def gen():
            for name in ["Ada", "Linus", "Grace"]:
                yield greeter_pb2.SayHelloRequest(
                    greeting=greeter_pb2.Greeting(name=name),
                )

        print(stub.SayHelloToCrowd(gen()).message)

        # Bidirectional: send the same request stream, iterate over the replies
        for r in stub.ChatGreetings(gen()):
            print(r.message)


if __name__ == "__main__":
    main()
Schema Evolution
Why Schema Evolution Is the Hard Part
The Problem: Real systems can’t deploy clients and servers atomically. There is always a window where v1 callers talk to v2 servers and v2 callers talk to v1 servers. Get the rules wrong and that window is silent data corruption.
The Solution: A small set of rules that protobuf was specifically designed to support — if you follow them, both sides remain compatible across many releases.
Protobuf’s wire format was designed for the “tolerant reader” principle: unknown fields are silently kept and re-emitted, missing fields take the default value, and field numbers are the only identity. That gives you a long list of safe changes:
| Change | Safe? | Notes |
|---|---|---|
| Add a new field with a new number | Yes | Old clients ignore it; new clients see default for old payloads |
| Rename a field (keep number) | Yes on the wire | Source-incompatible — consumers using the old name in code will break |
| Delete a field | Yes if you reserved it | Reserve both number AND name to prevent reuse |
| Change a field’s type | Almost never | A few narrow conversions are wire-compatible (e.g. int32↔uint32); most are not |
| Reuse a deleted field number | Never | Silent corruption with old clients |
| Add a value to an enum | Yes | Old clients see an unrecognized value (kept as its number, or surfaced as UNRECOGNIZED in some runtimes); design code to handle unknown values |
| Move a field into a oneof | No | Wire-compatible but presence semantics change |
| Bump package greeter.v1 → v2 | Yes (full break) | Run both side-by-side; migrate callers; retire v1 |
The Tolerant Reader, in practice
Servers and clients alike should:
- Treat unknown fields as “don’t panic, keep the bytes, re-emit on serialization”. The library does this for you.
- Treat missing fields as the documented default. Don’t encode “the field is missing” as a magic value.
- Treat unknown enum values the way you treat FOO_UNSPECIFIED: take a sensible fallback and never crash.
- Don’t rely on required. Proto3 has no required keyword on purpose. If your code requires a field, validate at the application layer with a clear error.
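The first two rules can be sketched with a toy wire parser. This is an illustration of the behaviour, not how the protobuf runtime is implemented: the parse function, the KNOWN_V1 table, and field 11 are all assumptions made for the example.

```python
def decode_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Return (value, next_position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


# What an old (v1) reader knows about the Greeting message.
KNOWN_V1 = {1: "name", 3: "enthusiasm", 6: "mood"}


def parse(buf: bytes, known: dict[int, str]) -> tuple[dict, bytes]:
    """Split a message into known fields and the raw bytes of unknown ones."""
    fields, unknown, pos = {}, bytearray(), 0
    while pos < len(buf):
        start = pos
        tag, pos = decode_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:      # varint
            value, pos = decode_varint(buf, pos)
        elif wire_type == 2:    # length-delimited
            length, pos = decode_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if number in known:
            fields[known[number]] = value
        else:
            unknown += buf[start:pos]  # keep the bytes verbatim for re-emission
    return fields, bytes(unknown)


# A newer writer sent name (1), enthusiasm (3), and a field 11 this reader has never seen.
message = bytes.fromhex("0a06526f62626965" "1807" "5a026869")
fields, unknown = parse(message, KNOWN_V1)
mood = fields.get("mood", 0)  # missing field reads as the default, MOOD_UNSPECIFIED

print(fields)         # {'name': b'Robbie', 'enthusiasm': 7}
print(unknown.hex())  # 5a026869
print(mood)           # 0
```

The unknown bytes survive untouched, so a service that reads, modifies, and forwards the message does not strip data added by newer writers.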
Buf for schema CI
Buf is the tool you almost certainly want for managing .proto at scale. The relevant pieces:
- buf lint: consistent style (package naming, file layout, field-number gaps).
- buf breaking: compares HEAD to main and refuses PRs that break wire compatibility (renumbering a field, removing one without reserving, etc.).
- buf generate: replaces hand-rolled protoc invocations.
- Buf Schema Registry (BSR): npm/pypi for protos, with versioned modules other repos can depend on instead of copying .proto files around.
Interceptors and Cross-Cutting Concerns
Why Interceptors Exist
The Problem: Auth, logging, metrics, retries, deadline propagation, request IDs — you don’t want any of this in your business handlers, and you certainly don’t want to copy-paste it into every method.
The Solution: Interceptors wrap every RPC at a single point. There’s a server-side flavor and a client-side flavor; you stack them like middleware.
An interceptor sees every request and response on its way through. The four canonical uses:
- Auth. Validate a JWT in metadata; attach the caller identity to context.
- Observability. OpenTelemetry trace span per RPC, latency histograms, error counters by status code.
- Deadlines. Refuse already-expired requests; propagate the remaining budget to downstream calls.
- Retries / hedging. Built into the gRPC client config — declarative, not handwritten.
Client interceptor adding auth metadata (Go)
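func AuthInterceptor(token string) grpc.UnaryClientInterceptor {
	return func(
		ctx context.Context,
		method string,
		req, reply interface{},
		cc *grpc.ClientConn,
		invoker grpc.UnaryInvoker,
		opts ...grpc.CallOption,
	) error {
		// Attach credentials and client version to every outgoing call.
		ctx = metadata.AppendToOutgoingContext(ctx,
			"authorization", "Bearer "+token,
			"x-client-version", build.Version,
		)
		return invoker(ctx, method, req, reply, cc, opts...)
	}
}

// Transport credentials are mandatory, and bearer tokens should only travel over TLS.
// An empty tls.Config uses the system root CAs. On older grpc-go, use grpc.Dial instead.
conn, err := grpc.NewClient("orders:50051",
	grpc.WithTransportCredentials(credentials.NewTLS(&tls.Config{})),
	grpc.WithUnaryInterceptor(AuthInterceptor(token)),
	grpc.WithStatsHandler(otelgrpc.NewClientHandler()), // OTel built-in
)
if err != nil {
	log.Fatalf("connect to orders: %v", err)
}
defer conn.Close()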
Server interceptor enforcing deadlines and tracing
func DeadlineInterceptor(minBudget time.Duration) grpc.UnaryServerInterceptor {
	return func(
		ctx context.Context,
		req interface{},
		info *grpc.UnaryServerInfo,
		handler grpc.UnaryHandler,
	) (interface{}, error) {
		// Reject calls that arrive with no deadline at all.
		deadline, ok := ctx.Deadline()
		if !ok {
			return nil, status.Error(codes.InvalidArgument, "deadline required")
		}
		// Reject calls that cannot possibly finish in the time left.
		if time.Until(deadline) < minBudget {
			return nil, status.Error(codes.DeadlineExceeded, "insufficient time budget")
		}
		return handler(ctx, req)
	}
}

s := grpc.NewServer(
	// Tracing and metrics via the stats handler, matching the client side above.
	grpc.StatsHandler(otelgrpc.NewServerHandler()),
	grpc.ChainUnaryInterceptor(
		DeadlineInterceptor(50*time.Millisecond), // budget guard
		AuthServerInterceptor(jwtVerifier),       // authn
		LoggingInterceptor(logger),               // structured logs
	),
)
Deadline propagation is the unsung hero
If service A is given 200 ms to respond, and it calls B which calls C, the deadline must shrink along the chain — B should not be allowed to spend more than the time A has left. gRPC does this automatically when you pass ctx through. Most distributed timeout incidents come from someone using context.Background() mid-chain and resetting the budget to infinity.
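The shrinking budget can be sketched without any gRPC machinery. The Deadline class, the FakeClock, and the call helper below are hypothetical names for illustration; the point is only that every hop reads the same absolute expiry instead of starting a fresh timer.

```python
import time


class Deadline:
    """An absolute expiry shared by every hop of one request (milliseconds)."""

    def __init__(self, timeout_ms, now=lambda: time.monotonic() * 1000):
        self._now = now
        self._expires_at = now() + timeout_ms

    def remaining_ms(self):
        return max(0, self._expires_at - self._now())


class FakeClock:
    """Deterministic clock so the example does not depend on real time."""

    def __init__(self):
        self.ms = 0

    def __call__(self):
        return self.ms

    def advance(self, ms):
        self.ms += ms


def call(service, deadline, work_ms, clock):
    """Hypothetical downstream hop: it inherits the remaining budget, never a fresh one."""
    budget = deadline.remaining_ms()
    if budget <= 0:
        raise TimeoutError(f"{service}: deadline already exceeded")
    clock.advance(min(work_ms, budget))  # the callee cannot run past the deadline
    return budget


clock = FakeClock()
deadline = Deadline(200, now=clock)         # A was given 200 ms
clock.advance(50)                           # A burns 50 ms on its own work
budget_b = call("B", deadline, 100, clock)  # B is told it has 150 ms, then works for 100 ms
budget_c = call("C", deadline, 100, clock)  # B calls C, which is told it has 50 ms
print(budget_b, budget_c, deadline.remaining_ms())  # 150 50 0
```

Creating a new Deadline in the middle of the chain is the equivalent of the context.Background() mistake: C would be told it has a full budget while A has already given up.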
gRPC vs REST
Neither is universally better. The honest answer is “different tools, different boundaries.”
| Dimension | gRPC | REST + JSON |
|---|---|---|
| Wire size | Small (binary varint) | 3–10x larger (text) |
| CPU to encode/decode | Low | Higher (string parsing, allocation) |
| Schema | Mandatory .proto | Optional (OpenAPI, JSON Schema) |
| Language support | Excellent in 11+ languages | Universal |
| Browser support | Needs gRPC-Web or Connect | Native |
| Streaming | First-class (4 patterns) | SSE / WebSocket bolted on |
| Debuggability | Needs grpcurl, Wireshark proto plugin | curl + your eyes |
| Caching at HTTP layer | Hard (POST-shaped) | Easy (GET + ETag) |
| Public API ergonomics | Steep onboarding | Familiar to every developer |
| Internal mesh fit | Excellent | Good but pricier |
The boundary heuristic
- Internal east-west traffic (service ↔ service): gRPC. The performance and contract benefits compound.
- Public APIs for arbitrary developers: REST. Lower onboarding cost; integrates with everything.
- Browser-facing APIs you own end-to-end: Connect or gRPC-Web is now realistic. JSON if you want zero ceremony.
- Mobile clients: gRPC pays off — battery, bandwidth, and latency all improve.
Browser and Edge
Why the Browser Was the Holdout
The Problem: Browsers don’t expose raw HTTP/2 frames or trailers to JavaScript. Plain gRPC is unreachable from fetch().
The Solution: Two protocols that adapt gRPC to what browsers can actually do — gRPC-Web (the original, needs an Envoy proxy) and Connect (the newer, wire-compatible alternative from Buf that runs natively on standard HTTP).
Three options for hitting gRPC from a browser
- gRPC-Web. A subset of gRPC translated by a proxy (typically Envoy or a sidecar). Streaming is server-side only. Mature and widely deployed.
- Connect. Buf’s protocol: same .proto, but supports three wire formats: gRPC, gRPC-Web, and Connect’s own JSON/protobuf-over-HTTP variant. Servers built with connect-go or connect-es speak all three on the same port. No proxy required.
- gRPC-gateway. Generates a REST/JSON facade in front of a gRPC service from annotations in the .proto. Great when you want a public REST API and an internal gRPC API from one source of truth.
TypeScript Connect client
import { createPromiseClient } from "@connectrpc/connect";
import { createConnectTransport } from "@connectrpc/connect-web";
import { Greeter } from "./gen/greeter/v1/greeter_connect";
import { Greeting } from "./gen/greeter/v1/greeter_pb";

const transport = createConnectTransport({
  baseUrl: "https://api.example.com",
  interceptors: [
    (next) => async (req) => {
      req.header.set("authorization", `Bearer ${getToken()}`);
      return await next(req);
    },
  ],
});

const client = createPromiseClient(Greeter, transport);

// Unary: works in any modern browser via fetch
const res = await client.sayHello({
  greeting: new Greeting({ name: "Robbie", language: "en" }),
});
console.log(res.message);

// Server streaming: iterate over async results
for await (const reply of client.sayHelloRepeatedly({
  greeting: new Greeting({ name: "Robbie" }),
})) {
  console.log(reply.message);
}
Connect vs gRPC-Web in one line
If you’re starting a new browser-facing API today, use Connect — it works without Envoy, debugs as plain HTTP in the browser’s network tab, and stays wire-compatible with gRPC for your internal mesh.
Real-World Examples
Google built Stubby in the early 2000s as the internal RPC framework that glued together every service in the company. gRPC, open-sourced in 2015, is the next-generation, open version of that design, rebuilt on public standards such as HTTP/2. Google has described Stubby as handling on the order of tens of billions of RPCs per second.
Netflix migrated significant chunks of its internal communication to gRPC for the same reason it built Hystrix and Eureka before it: at Netflix scale, percent-level CPU savings translate to actual money and headroom. They contribute to the gRPC ecosystem and use it heavily for service mesh data planes.
Square rolled gRPC out across its mobile and backend teams, and built Wire, an alternative protobuf runtime optimized for Android. The driving force was mobile bandwidth: smaller payloads make the app faster on bad networks.
Dropbox wrote about their move from a homegrown RPC system to gRPC, citing the language polyglot support (they use Python, Go, Rust) and the existing tooling for retries, deadlines, and observability as the killer features — not raw performance.
Buf publishes Connect, the modern alternative protocol, and runs the Buf Schema Registry. Their bet is that the future of cross-org RPC isn’t plain gRPC but a wire-compatible superset that includes the browser path natively.
Cloudflare, Lyft, Uber, and Slack all use gRPC heavily for service-to-service communication. The thread is consistent: high-QPS internal traffic where the constants matter.
Best Practices
The short list
- Version your packages. Always foo.v1, never bare foo. Run v2 alongside v1 when you need to break things.
- Reserve deleted fields. Both number and name. Add buf breaking to CI so you can’t forget.
- Always have an UNSPECIFIED = 0 for enums. The zero value is the on-the-wire default, and “unspecified” is the only safe meaning.
- Use deadlines on every call. A gRPC call with no deadline is a thread leak waiting to happen. Refuse them at the server with an interceptor.
- One channel per upstream, not per call. Channels are expensive to set up and cheap to share. Multiplexing is the whole point of HTTP/2.
- Don’t put business outcomes in gRPC error codes. Use OK with a domain status field, not FAILED_PRECONDITION, for “cart already checked out”. Reserve UNAVAILABLE and friends for infra.
- Record the proto, not the JSON, in your service mesh. Linkerd, Istio, and Envoy all understand gRPC framing, so let them see codes and methods, not opaque POSTs.
- Use Buf for lint, breaking-change checks, and codegen. Hand-rolled protoc invocations rot fast.
- Pair gRPC with the same resilience patterns as REST. Circuit breakers, retries with jitter, and bulkheads all still apply, and retries and hedging can be configured declaratively in the gRPC client config.
- Choose the boundary deliberately. Internal: gRPC. Public: REST or Connect. Browser: Connect.
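As an illustration of the declarative retry configuration mentioned in the list, here is a sketch of a gRPC service config built in Python. The service name matches the Greeter example; the timeout and backoff numbers are placeholder assumptions, not recommendations.

```python
import json

# Service config in the JSON form that gRPC clients accept.
service_config = {
    "methodConfig": [
        {
            # Applies to every method of this service.
            "name": [{"service": "greeter.v1.Greeter"}],
            # Default per-call deadline when the caller sets none.
            "timeout": "2s",
            "retryPolicy": {
                "maxAttempts": 4,
                "initialBackoff": "0.1s",
                "maxBackoff": "1s",
                "backoffMultiplier": 2,
                # Only retry infrastructure failures, never business outcomes.
                "retryableStatusCodes": ["UNAVAILABLE"],
            },
        }
    ]
}
service_config_json = json.dumps(service_config)

# With grpcio installed, the string is passed as a channel option:
#   grpc.insecure_channel(
#       "localhost:50051",
#       options=[("grpc.service_config", service_config_json)],
#   )
print(service_config_json)
```

Keeping the policy in configuration rather than in handwritten loops means every method of the service retries the same way, and the retryable codes stay limited to infrastructure errors.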
The single most useful sentence about gRPC
If you remember one thing
The value of gRPC isn’t the binary wire format — it’s that the schema is mandatory, versioned, and shared. Once your services agree on a contract that compiles, half the production bugs that used to come from “the field changed shape” just stop happening.