Protobuf Decoding Guide: Wire Format, Varint, and gRPC Framing
Protocol Buffers (protobuf) are compact and fast, but they produce binary output that is completely opaque to human eyes. When a gRPC call fails, a .pb cache file looks wrong, or you are inspecting network traffic in a browser DevTools panel, you need to decode those bytes. Use the Protobuf Decoder to follow along with the examples in this guide.
Why You Need to Decode Protobuf
JSON is self-describing. Protobuf is not. A protobuf payload is a sequence of binary-encoded fields with no field names, no type labels beyond the wire type, and no human-readable separators. Three situations force developers to decode it manually:
- gRPC payloads in browser DevTools or proxy logs: Chrome shows the raw bytes of HTTP/2 DATA frames. curl and mitmproxy intercept the stream but cannot render it without the schema.
- .pb files from caches or queues: Serialized proto objects written to disk by server processes, message-queue bodies, or build artifact stores are just binary blobs without the accompanying .proto file.
- No schema available: Third-party APIs, compiled mobile apps, and legacy services sometimes expose protobuf endpoints without publishing their .proto definitions. You can still extract field numbers and raw values.
Wire Format: Tag Byte
Every field in a protobuf message is encoded as a tag followed by a value. The tag is a varint that packs two pieces of information:
tag = (field_number << 3) | wire_type
The lower 3 bits are the wire type. The upper bits are the field number from the .proto definition. To decode a tag, read the varint, mask the low 3 bits for the wire type, and shift right by 3 for the field number.
Example: a tag byte of 0x0a (decimal 10) decodes as field number 1, wire type 2 (length-delimited).
0x0a = 0b00001010
wire_type = 0b010 = 2 (length-delimited)
field_number = 0b00001 = 1
Wire Types
| Wire Type | Value | Encoding | Used For |
|---|---|---|---|
| Varint | 0 | Variable-length integer | int32, int64, uint32, uint64, sint32, sint64, bool, enum |
| 64-bit | 1 | 8 bytes, little-endian | fixed64, sfixed64, double |
| Length-delimited | 2 | Varint length, then raw bytes | string, bytes, embedded messages, packed repeated fields |
| 32-bit | 5 | 4 bytes, little-endian | fixed32, sfixed32, float |
Wire types 3 (start group) and 4 (end group) belong to the deprecated group construct from proto2 and do not appear in proto3 schemas. If you encounter them, you are most likely reading data produced from an old proto2 schema.
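The tag arithmetic and the wire type table can be combined in a few lines. This is a minimal sketch, assuming the tag varint has already been decoded to an integer; `split_tag` and `WIRE_TYPE_NAMES` are illustrative names, not part of any protobuf library.

```python
# Wire type numbers as listed in the table above (3 and 4 are the deprecated group markers).
WIRE_TYPE_NAMES = {
    0: "varint",
    1: "64-bit",
    2: "length-delimited",
    3: "start group (deprecated)",
    4: "end group (deprecated)",
    5: "32-bit",
}

def split_tag(tag):
    """Split an already-decoded tag varint into (field_number, wire type name)."""
    wire_type = tag & 0x7      # low 3 bits
    field_number = tag >> 3    # remaining upper bits
    return field_number, WIRE_TYPE_NAMES.get(wire_type, f"unknown ({wire_type})")

print(split_tag(0x0A))  # (1, 'length-delimited')
print(split_tag(0x08))  # (1, 'varint')
```

Wire type values 6 and 7 are not assigned, so the sketch reports them as unknown rather than guessing.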
Varint Encoding
Varints are the key to protobuf's compactness. Instead of always using 4 or 8 bytes for an integer, protobuf uses as few bytes as necessary. The encoding rule is:
- Each byte contributes 7 bits of the integer value (low 7 bits).
- The most-significant bit (MSB) of each byte is a continuation flag: 1 means more bytes follow, 0 means this is the last byte.
- Bytes are in little-endian order (least significant group first).
Decoding the two-byte varint 0x96 0x01 by hand:
Byte 1: 0x96 = 0b10010110 → MSB=1 (more follows), payload bits = 0010110
Byte 2: 0x01 = 0b00000001 → MSB=0 (last byte), payload bits = 0000001
Concatenate in reverse order (little-endian):
0000001 | 0010110 = 0b00000010010110 = 150
Small values (0–127) always fit in a single byte. The largest 64-bit values (up to 2^64 − 1) take 10 bytes.
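The hand decoding above maps directly onto a short loop. The following is a sketch rather than a reference implementation; `decode_varint` is a hypothetical helper name, and the 10-byte limit reflects the 64-bit maximum described above.

```python
def decode_varint(data, pos=0):
    """Return (value, next_pos) for the varint that starts at data[pos]."""
    value = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated varint")
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if not byte & 0x80:              # MSB clear: this was the last byte
            return value, pos
        shift += 7                       # next group is 7 bits more significant
        if shift >= 70:
            raise ValueError("varint longer than 10 bytes")

print(decode_varint(bytes([0x96, 0x01])))  # (150, 2)
```

Returning the next position alongside the value lets a caller keep walking the buffer, which is how a field-by-field decoder would typically use it.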
Length-Delimited Fields
Wire type 2 is the most versatile. After the tag, a varint encodes the byte length of the field, followed immediately by that many raw bytes. The same wire type encodes four different logical types:
- Strings: UTF-8 encoded text. You can usually detect this because the bytes are valid UTF-8.
- Bytes fields: Arbitrary binary data with no encoding guarantee.
- Embedded messages: Another protobuf message serialized inline. The inner bytes follow the same tag-value format recursively.
- Packed repeated fields: Proto3 uses packed encoding by default for repeated scalar numeric fields: all values are concatenated without individual tags, preceded by the total byte count.
Without the .proto schema you cannot distinguish between these four subtypes from the wire format alone. A schemaless decoder will label them all as bytes or attempt a heuristic UTF-8 decode.
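A sketch of how a schemaless decoder might read one wire-type-2 value and apply the UTF-8 heuristic. The function names `read_length_delimited` and `guess_text` are assumptions made for illustration. Note that a successful UTF-8 decode is only a hint: a short embedded message can also happen to be valid UTF-8.

```python
def read_length_delimited(data, pos):
    """Read a wire-type-2 value whose length prefix starts at data[pos].

    Returns (payload, next_pos).
    """
    length = 0
    shift = 0
    while True:                       # inline varint decode of the length prefix
        byte = data[pos]
        pos += 1
        length |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    end = pos + length
    if end > len(data):
        raise ValueError("length prefix runs past the end of the buffer")
    return data[pos:end], end

def guess_text(payload):
    """Heuristic: return the payload as str if it is valid UTF-8, else None."""
    try:
        return payload.decode("utf-8")
    except UnicodeDecodeError:
        return None

# Tag 0x0a (field 1, wire type 2), length 7, then the bytes of "testing".
message = bytes.fromhex("0a0774657374696e67")
payload, next_pos = read_length_delimited(message, 1)  # skip the 1-byte tag
print(payload, guess_text(payload))  # b'testing' testing
```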
Schemaless Decoding
You do not need a .proto file to partially decode a protobuf payload. The wire format gives you:
- Field numbers for every field present in the message.
- Wire types, which constrain the possible proto types per field.
- Raw values: integers as numbers, wire-type-2 fields as hex or attempted UTF-8.
What you lose without a schema: field names, the distinction between string/bytes/message for wire type 2, signed vs unsigned integer interpretation, and enum labels.
The Protobuf Decoder operates in schemaless mode by default. Paste raw hex or base64 and it extracts all field numbers with their wire types and decoded values. Supply a .proto definition to get named fields and proper type resolution.
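Because the wire format does not record whether a varint field is unsigned, signed, or zigzag-encoded (the scheme used by sint32 and sint64), a schemaless tool can only offer candidate readings. The sketch below lists three of them for one raw value; `interpretations` is a hypothetical helper, not the behaviour of any particular decoder.

```python
def interpretations(raw):
    """Candidate readings of one raw varint value (unsigned, at most 64 bits)."""
    # int64: reinterpret the 64-bit pattern as two's complement.
    as_int64 = raw - (1 << 64) if raw >= (1 << 63) else raw
    # sint64: undo the zigzag mapping (0, -1, 1, -2, ... are stored as 0, 1, 2, 3, ...).
    as_sint64 = (raw >> 1) ^ -(raw & 1)
    return {"uint64": raw, "int64": as_int64, "sint64": as_sint64}

print(interpretations(150))
# {'uint64': 150, 'int64': 150, 'sint64': 75}
print(interpretations(0xFFFFFFFFFFFFFFFF))
# {'uint64': 18446744073709551615, 'int64': -1, 'sint64': -9223372036854775808}
```

Which reading is correct can only be settled by the schema or by knowledge of the data.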
Schema-Aware CLI Tools
When you do have the schema, these tools provide full, named decoding:
# protoc: decode a binary .pb file using a known message type
protoc --decode=mypackage.MyMessage my_service.proto < payload.pb
# grpcurl: make a gRPC call and decode the response
grpcurl -plaintext -proto my_service.proto \
-d '{"id": 1}' \
localhost:50051 mypackage.MyService/GetItem
# buf curl: same idea via the Buf CLI
buf curl --schema my_service.proto \
--data '{"id": 1}' \
http://localhost:50051/mypackage.MyService/GetItem
As written, all three commands read the .proto source. grpcurl can also use gRPC server reflection when the server exposes it: if you omit -proto (and -protoset), it queries the server for the schema, so you do not need a local copy.
gRPC Framing Header
A raw protobuf message and a gRPC message over HTTP/2 are not the same thing. gRPC wraps every protobuf payload in a 5-byte framing header:
Byte 0: Compressed flag (0 = not compressed, 1 = compressed)
Bytes 1-4: Message length as a 4-byte big-endian unsigned integer
Bytes 5+: The serialized protobuf message
If you capture a gRPC DATA frame from a browser or proxy and try to decode it directly as protobuf, the first 5 bytes will confuse the decoder. Strip the framing header first:
# Strip the 5-byte gRPC frame header and decode the rest
dd if=grpc_payload.bin bs=1 skip=5 | \
protoc --decode_raw
grpc-web uses the same framing format but carries it over HTTP/1.1 or HTTP/2 with a Content-Type: application/grpc-web+proto header. The trailer frame (flag byte 0x80) carries gRPC status metadata rather than a protobuf message body.
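Skipping a fixed 5 bytes only works when the capture holds a single uncompressed message. A capture from a streaming call can contain several frames back to back, so it can help to walk the headers explicitly. This is a minimal sketch using the header layout described above; `split_grpc_frames` is an illustrative name, and the sample bytes are constructed by hand rather than taken from a real capture.

```python
import struct

def split_grpc_frames(data):
    """Yield (flags, payload) for each length-prefixed gRPC message in data."""
    pos = 0
    while pos < len(data):
        if len(data) - pos < 5:
            raise ValueError("truncated gRPC frame header")
        # 1 flag byte followed by a 4-byte big-endian unsigned length.
        flags, length = struct.unpack_from(">BI", data, pos)
        pos += 5
        payload = data[pos:pos + length]
        if len(payload) != length:
            raise ValueError("truncated gRPC message body")
        pos += length
        yield flags, payload

# One uncompressed frame wrapping the 3-byte protobuf message 08 96 01 (field 1 = 150).
captured = bytes.fromhex("00" "00000003" "089601")
for flags, payload in split_grpc_frames(captured):
    print(flags, payload.hex())  # 0 089601
```

A payload whose flag byte is 1 still has to be decompressed before it can be decoded as protobuf, and in grpc-web a flag byte of 0x80 marks the trailer frame rather than a message.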
Decoding in Code
For production use cases — deserializing cache entries, inspecting queue messages, writing test fixtures — here are minimal decode patterns in three languages:
// Node.js: protobufjs dynamic decode (no generated code needed)
import protobuf from 'protobufjs';
const root = await protobuf.load('my_service.proto');
const MyMessage = root.lookupType('mypackage.MyMessage');
// hexString is the hex-encoded payload, assumed to be supplied by the caller
const buffer = Buffer.from(hexString, 'hex');
const message = MyMessage.decode(buffer);
console.log(MyMessage.toObject(message, { longs: String, enums: String }));
# Python: low-level decoder without a .proto file
# _DecodeVarint is a private helper of the protobuf package and may change between releases
from google.protobuf.internal.decoder import _DecodeVarint

def decode_raw(data: bytes):
    pos = 0
    while pos < len(data):
        # Each field starts with a tag varint: (field_number << 3) | wire_type
        tag, pos = _DecodeVarint(data, pos)
        field_number = tag >> 3
        wire_type = tag & 0x7
        if wire_type == 0:  # varint
            value, pos = _DecodeVarint(data, pos)
            yield field_number, wire_type, value
        elif wire_type == 2:  # length-delimited
            length, pos = _DecodeVarint(data, pos)
            yield field_number, wire_type, data[pos:pos + length]
            pos += length
        elif wire_type in (1, 5):  # fixed 8 / 4 bytes, little-endian, yielded as raw bytes
            size = 8 if wire_type == 1 else 4
            yield field_number, wire_type, data[pos:pos + size]
            pos += size
        else:  # groups (3, 4) and unassigned wire types are not handled here
            raise ValueError(f"unsupported wire type {wire_type}")
// Go: proto.Unmarshal with a generated type
import (
    "google.golang.org/protobuf/proto"
    mypb "example.com/mypackage"
)
func decode(b []byte) (*mypb.MyMessage, error) {
    msg := &mypb.MyMessage{}
    if err := proto.Unmarshal(b, msg); err != nil {
        return nil, err
    }
    return msg, nil
}
Common Pitfalls
- Signed varints use zigzag encoding: sint32 and sint64 map signed integers to unsigned varints via zigzag (n * 2 for non-negative, -n * 2 - 1 for negative). A raw varint decoder shows the zigzag-encoded value (1 for -1, 2 for 1, 3 for -2) unless you apply zigzag decoding.
- int32 vs sint32: Plain int32 encodes negative numbers as 10-byte varints (sign extension to 64 bits), so a raw varint decoder shows -1 as 18446744073709551615. This is a known inefficiency in the format. Use sint32 when negative values are common.
- Proto2 vs proto3 defaults: Proto3 does not serialize fields set to their default value (0 for numbers, empty string, false for bool) unless the field is declared optional. A missing field in a decoded payload therefore usually means the value was the default, not that the field was never set.
- Field number reuse: Removing a field from a .proto and reusing its field number for a different type is a compatibility break. Old serialized objects will decode the bytes with the wrong type. Always use reserved statements.
- Deprecated groups (wire types 3 and 4): If you find these in a payload, the schema most likely uses the proto2 group construct. Many lightweight decoders do not handle them gracefully.
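The size difference between int32 and sint32 for negative values is easy to check with a small encoder. This is a sketch under the encoding rules described in this guide; `encode_varint`, `encode_int32`, and `encode_sint32` are hypothetical helper names, not library functions.

```python
def encode_varint(value):
    """Encode a non-negative integer (below 2**64) as a protobuf varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more groups follow: set the continuation bit
        else:
            out.append(byte)         # last group: continuation bit clear
            return bytes(out)

def encode_int32(n):
    """int32 style: negative values are sign-extended to 64 bits before encoding."""
    return encode_varint(n & 0xFFFFFFFFFFFFFFFF)

def encode_sint32(n):
    """sint32 style: zigzag-map the value first, then varint-encode it."""
    return encode_varint(((n << 1) ^ (n >> 31)) & 0xFFFFFFFF)

print(len(encode_int32(-1)))   # 10
print(len(encode_sint32(-1)))  # 1
print(encode_sint32(-1).hex(), encode_sint32(1).hex(), encode_sint32(-2).hex())  # 01 02 03
```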
Decode protobuf payloads directly in your browser — no server, no data upload — with the Protobuf Decoder. Paste hex, base64, or raw bytes and get field numbers, wire types, and values immediately. For related data conversion tasks, see the Data Converters Guide and the JSON to TypeScript Types article.