Protobuf Decoding Guide: Wire Format, Varint, and gRPC Framing

Protocol Buffers (protobuf) are compact and fast, but they produce binary output that is completely opaque to human eyes. When a gRPC call fails, a .pb cache file looks wrong, or you are inspecting network traffic in a browser DevTools panel, you need to decode those bytes. Use the Protobuf Decoder to follow along with the examples in this guide.

Why You Need to Decode Protobuf

JSON is self-describing. Protobuf is not. A protobuf payload is a sequence of binary-encoded fields with no field names, no type labels beyond the wire type, and no human-readable separators. Three situations force developers to decode it manually:

  • gRPC payloads in browser DevTools or proxy logs: Chrome DevTools shows the message body as raw bytes. Tools such as curl and mitmproxy can capture the stream, but they cannot render named fields without the schema.
  • .pb files from caches or queues: Serialized proto objects written to disk by server processes, message-queue bodies, or build artifact stores are just binary blobs without the accompanying .proto file.
  • No schema available: Third-party APIs, compiled mobile apps, and legacy services sometimes expose protobuf endpoints without publishing their .proto definitions. You can still extract field numbers and raw values.

Wire Format: Tag Byte

Every field in a protobuf message is encoded as a tag followed by a value. The tag is a varint that packs two pieces of information:

tag = (field_number << 3) | wire_type

The lower 3 bits are the wire type. The upper bits are the field number from the .proto definition. To decode a tag, read the varint, mask the last 3 bits for the wire type, and shift right by 3 for the field number.

Example: a tag byte of 0x0a (decimal 10) decodes as field number 1, wire type 2 (length-delimited).

0x0a = 0b00001010
wire_type  = 0b010 = 2  (length-delimited)
field_number = 0b00001 = 1
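
The same bit arithmetic can be sketched in a few lines of Python. The function names split_tag and make_tag are illustrative choices, not part of any protobuf library, and the input is assumed to be a tag that has already been read as a varint.

```python
def split_tag(tag: int) -> tuple[int, int]:
    """Split a decoded tag varint into (field_number, wire_type)."""
    wire_type = tag & 0x07     # low 3 bits
    field_number = tag >> 3    # remaining upper bits
    return field_number, wire_type


def make_tag(field_number: int, wire_type: int) -> int:
    """Inverse of split_tag: pack a field number and a wire type into a tag."""
    return (field_number << 3) | wire_type


print(split_tag(0x0A))  # (1, 2)
print(make_tag(1, 2))   # 10
```

Field numbers 1 through 15 fit in a single tag byte; anything larger pushes the tag varint to two or more bytes.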

Wire Types

Wire Type         Value  Encoding                       Used For
Varint            0      Variable-length integer        int32, int64, uint32, uint64, sint32, sint64, bool, enum
64-bit            1      8 bytes, little-endian         fixed64, sfixed64, double
Length-delimited  2      Varint length, then raw bytes  string, bytes, embedded messages, packed repeated fields
32-bit            5      4 bytes, little-endian         fixed32, sfixed32, float

Wire types 3 and 4 (start group and end group) belong to the deprecated proto2 group construct and rarely appear in modern schemas. If you encounter them, you are most likely reading data produced from an old proto2 definition.
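
For the two fixed-width wire types (1 and 5), the wire format does not say whether the bytes are an unsigned integer, a signed integer, or a float. A minimal sketch using Python's struct module shows all three readings side by side; interpret_fixed is a hypothetical helper name, and the input is assumed to be exactly the 4 or 8 value bytes that follow the tag.

```python
import struct


def interpret_fixed(raw: bytes) -> dict:
    """Return the unsigned, signed, and floating-point readings of a
    little-endian fixed-width value (4 bytes for wire type 5, 8 for wire type 1)."""
    if len(raw) == 8:
        unsigned_fmt, signed_fmt, float_fmt = "<Q", "<q", "<d"
    elif len(raw) == 4:
        unsigned_fmt, signed_fmt, float_fmt = "<I", "<i", "<f"
    else:
        raise ValueError("fixed-width fields are 4 or 8 bytes long")
    return {
        "unsigned": struct.unpack(unsigned_fmt, raw)[0],  # fixed32 / fixed64
        "signed": struct.unpack(signed_fmt, raw)[0],      # sfixed32 / sfixed64
        "float": struct.unpack(float_fmt, raw)[0],        # float / double
    }


# The same 8 bytes read as a double give 1.0, and as fixed64 a large integer.
print(interpret_fixed(bytes.fromhex("000000000000f03f")))
```

Without a schema, a decoder can only show these candidate readings and let you pick the plausible one.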

Varint Encoding

Varints are the key to protobuf's compactness. Instead of always using 4 or 8 bytes for an integer, protobuf uses as few bytes as necessary. The encoding rule is:

  • Each byte contributes 7 bits of the integer value (low 7 bits).
  • The most-significant bit (MSB) of each byte is a continuation flag: 1 means more bytes follow, 0 means this is the last byte.
  • Bytes are in little-endian order (least significant group first).

Decoding the two-byte varint 0x96 0x01 by hand:

Byte 1: 0x96 = 0b10010110 MSB=1 (more follows), payload bits = 0010110
Byte 2: 0x01 = 0b00000001 MSB=0 (last byte),  payload bits = 0000001

Concatenate in reverse order (little-endian):
  0000001 | 0010110 = 0b00000010010110 = 150

Small values (0–127) always fit in a single byte. The largest 64-bit value, 2^64 − 1, takes 10 bytes, which is the maximum varint length.
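
The hand decoding above translates directly into a loop. The following is a minimal sketch in Python with no dependencies; decode_varint is an illustrative name, and the 10-byte cap reflects the 64-bit limit described above.

```python
def decode_varint(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one varint starting at pos. Returns (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated varint")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if not byte & 0x80:               # MSB clear: this was the last byte
            return result, pos
        shift += 7                        # next group is 7 bits more significant
        if shift >= 70:
            raise ValueError("varint longer than 10 bytes")


print(decode_varint(bytes([0x96, 0x01])))  # (150, 2)
```

Returning the next position alongside the value makes it easy to chain calls while walking through a message.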

Length-Delimited Fields

Wire type 2 is the most versatile. After the tag, a varint encodes the byte length of the field, followed immediately by that many raw bytes. The same wire type encodes four different logical types:

  • Strings: UTF-8 encoded text. You can usually detect this because the bytes are valid UTF-8.
  • Bytes fields: Arbitrary binary data with no encoding guarantee.
  • Embedded messages: Another protobuf message serialized inline. The inner bytes follow the same tag-value format recursively.
  • Packed repeated fields: Proto3 uses packed encoding by default for repeated scalar fields — all values concatenated without individual tags, preceded by the total byte count.

Without the .proto schema you cannot distinguish between these four subtypes from the wire format alone. A schemaless decoder will label them all as bytes or attempt a heuristic UTF-8 decode.
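
One possible heuristic, sketched below in Python, is to try the interpretations in order: printable UTF-8 text first, then a well-formed nested message, then raw bytes. This is an assumption about how a schemaless decoder might guess, not a description of any specific tool, and all function names are illustrative. It can misclassify short payloads, and it cannot recognise packed repeated fields.

```python
def read_varint(data: bytes, pos: int) -> tuple[int, int]:
    """Decode one varint at pos; raise ValueError if truncated or too long."""
    result = shift = 0
    while pos < len(data) and shift < 70:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:
            return result, pos
        shift += 7
    raise ValueError("bad varint")


def looks_like_message(data: bytes) -> bool:
    """True if data parses as a sequence of tag/value pairs that ends exactly at its end."""
    pos = 0
    try:
        while pos < len(data):
            tag, pos = read_varint(data, pos)
            field_number, wire_type = tag >> 3, tag & 0x07
            if field_number == 0:
                return False              # field number 0 is never valid
            if wire_type == 0:
                _, pos = read_varint(data, pos)
            elif wire_type == 1:
                pos += 8
            elif wire_type == 2:
                length, pos = read_varint(data, pos)
                pos += length
            elif wire_type == 5:
                pos += 4
            else:
                return False              # groups and unknown wire types
    except ValueError:
        return False
    return pos == len(data)


def guess_kind(data: bytes) -> str:
    """Guess how to display the payload of a wire type 2 field."""
    if not data:
        return "empty"
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        text = None
    if text is not None and text.isprintable():
        return "string"
    if looks_like_message(data):
        return "message"
    return "bytes"


print(guess_kind(b"hello"))                      # string
print(guess_kind(bytes.fromhex("0a03666f6f")))   # message
```

Checking for printable text before trying a message parse is a design choice: short ASCII strings occasionally happen to parse as valid messages, so the order of the checks changes the result for ambiguous inputs.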

Schemaless Decoding

You do not need a .proto file to partially decode a protobuf payload. The wire format gives you:

  • Field numbers for every field present in the message.
  • Wire types, which constrain the possible proto types per field.
  • Raw values: integers as numbers, wire-type-2 fields as hex or attempted UTF-8.

What you lose without a schema: field names, the distinction between string/bytes/message for wire type 2, signed vs unsigned integer interpretation, and enum labels.

The Protobuf Decoder operates in schemaless mode by default. Paste raw hex or base64 and it extracts all field numbers with their wire types and decoded values. Supply a .proto definition to get named fields and proper type resolution.

Schema-Aware CLI Tools

When you do have the schema, these tools provide full, named decoding:

# protoc: decode a binary .pb file using a known message type
protoc --decode=mypackage.MyMessage my_service.proto < payload.pb

# grpcurl: make a gRPC call and decode the response
grpcurl -plaintext -proto my_service.proto \
  -d '{"id": 1}' \
  localhost:50051 mypackage.MyService/GetItem

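# buf curl: same idea via the Buf CLI. It defaults to the Connect protocol,
# so select gRPC and plaintext HTTP/2 explicitly for a local gRPC server.
buf curl --schema my_service.proto \
  --protocol grpc --http2-prior-knowledge \
  --data '{"id": 1}' \
  http://localhost:50051/mypackage.MyService/GetItem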

As invoked above, all three tools read the .proto source. grpcurl falls back to gRPC server reflection when you omit -proto (and -protoset), provided the server exposes it, which eliminates the need to supply the schema locally.

gRPC Framing Header

A raw protobuf message and a gRPC message over HTTP/2 are not the same thing. gRPC wraps every protobuf payload in a 5-byte framing header:

Byte 0:    Compressed flag (0 = not compressed, 1 = compressed)
Bytes 1-4: Message length as a 4-byte big-endian unsigned integer
Bytes 5+:  The serialized protobuf message

If you capture a gRPC DATA frame from a browser or proxy and try to decode it directly as protobuf, the first 5 bytes will confuse the decoder. Strip the framing header first:

# Strip the 5-byte gRPC frame header and decode the rest
dd if=grpc_payload.bin bs=1 skip=5 | \
  protoc --decode_raw

grpc-web uses the same framing format but carries it over HTTP/1.1 or HTTP/2 with a Content-Type: application/grpc-web+proto header. The trailer frame (flag byte 0x80) carries gRPC status metadata rather than a protobuf message body.
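
When a captured body contains several messages, or a grpc-web trailer frame, a byte-offset skip is not enough. The sketch below walks the length-prefixed frames in Python using only the standard library. The name split_grpc_frames is hypothetical, and the code assumes the input is the concatenated body bytes with HTTP/2 framing already removed.

```python
import struct


def split_grpc_frames(body: bytes):
    """Yield (flags, payload) for each 5-byte-prefixed frame in body."""
    pos = 0
    while pos < len(body):
        if len(body) - pos < 5:
            raise ValueError("truncated gRPC frame header")
        # 1 flag byte, then a 4-byte big-endian unsigned length
        flags, length = struct.unpack_from(">BI", body, pos)
        pos += 5
        if len(body) - pos < length:
            raise ValueError("truncated gRPC message")
        yield flags, body[pos:pos + length]
        pos += length


message = bytes.fromhex("0a03666f6f")
body = b"\x00" + len(message).to_bytes(4, "big") + message
for flags, payload in split_grpc_frames(body):
    compressed = bool(flags & 0x01)   # payload must be decompressed before decoding
    is_trailer = bool(flags & 0x80)   # grpc-web trailer frame, not a protobuf message
    print(compressed, is_trailer, payload.hex())
```

Frames with the compressed flag set must be decompressed with the algorithm named in the grpc-encoding header before the payload can be decoded as protobuf.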

Decoding in Code

For production use cases — deserializing cache entries, inspecting queue messages, writing test fixtures — here are minimal decode patterns in three languages:

// Node.js (ES module): protobufjs dynamic decode, no generated code needed
import protobuf from 'protobufjs';

const root = await protobuf.load('my_service.proto');
const MyMessage = root.lookupType('mypackage.MyMessage');

// hexString is assumed to hold the hex-encoded payload, e.g. copied from a log
const buffer = Buffer.from(hexString, 'hex');
const message = MyMessage.decode(buffer);
console.log(MyMessage.toObject(message, { longs: String, enums: String }));

# Python: low-level decoder without a .proto file
# _DecodeVarint is a private helper of the protobuf package and may change between releases.
import struct
from google.protobuf.internal.decoder import _DecodeVarint

def decode_raw(data: bytes):
    pos = 0
    while pos < len(data):
        tag, pos = _DecodeVarint(data, pos)
        field_number = tag >> 3
        wire_type = tag & 0x7
        if wire_type == 0:  # varint
            value, pos = _DecodeVarint(data, pos)
        elif wire_type == 1:  # fixed 64-bit, little-endian
            value = struct.unpack_from('<Q', data, pos)[0]
            pos += 8
        elif wire_type == 2:  # length-delimited
            length, pos = _DecodeVarint(data, pos)
            value = data[pos:pos + length]
            pos += length
        elif wire_type == 5:  # fixed 32-bit, little-endian
            value = struct.unpack_from('<I', data, pos)[0]
            pos += 4
        else:  # groups (3, 4) and invalid wire types
            raise ValueError(f'unsupported wire type {wire_type}')
        yield field_number, wire_type, value

// Go: proto.Unmarshal with a generated type
import (
    "google.golang.org/protobuf/proto"
    mypb "example.com/mypackage" // placeholder path for your generated package
)

func decode(b []byte) (*mypb.MyMessage, error) {
    msg := &mypb.MyMessage{}
    if err := proto.Unmarshal(b, msg); err != nil {
        return nil, err
    }
    return msg, nil
}

Common Pitfalls

  • sint32 and sint64 use zigzag encoding: These types map signed integers to unsigned varints via zigzag (n * 2 for non-negative, -n * 2 - 1 for negative), so -1 is stored as 1 and -2 as 3. A raw varint decoder will show those small positive numbers instead of the real values unless you apply zigzag decoding.
  • int32 vs sint32: Plain int32 encodes negative numbers as 10-byte varints (sign extension to 64 bits). This is a known inefficiency in the format. Use sint32 when negative values are common.
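  • Proto2 vs proto3 defaults: Proto3 does not serialize a singular scalar field set to its default value (0 for numbers, empty string, false for bool) unless the field is declared optional. For such a field, absence from the decoded payload means the value was the default; the wire format cannot tell you whether it was ever set explicitly. Proto2 fields and proto3 optional fields track presence, so for them absence means the field was not set.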
  • Field number reuse: Removing a field from a .proto and reusing its field number for a different type is a compatibility break. Old serialized objects will decode the bytes with the wrong type. Always use reserved statements.
  • Deprecated groups (wire types 3 and 4): If you find these in a payload, either the schema uses proto2 groups or the bytes are not a bare protobuf message (for example, a gRPC framing header that was not stripped). Some schemaless decoders do not handle groups gracefully.
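
The first two pitfalls come down to how a raw unsigned varint is reinterpreted. The Python sketch below shows both conversions; the function names are illustrative, and the input is assumed to be the unsigned integer a schemaless decoder reports for a wire type 0 field.

```python
def zigzag_decode(n: int) -> int:
    """Undo zigzag encoding (sint32 / sint64): 0, 1, 2, 3 -> 0, -1, 1, -2."""
    return (n >> 1) ^ -(n & 1)


def zigzag_encode(n: int, bits: int = 64) -> int:
    """Zigzag-encode a signed integer into an unsigned one of the given width."""
    return ((n << 1) ^ (n >> (bits - 1))) & ((1 << bits) - 1)


def as_signed(n: int, bits: int = 64) -> int:
    """Reinterpret an unsigned varint as two's complement (int32 / int64)."""
    if n >= 1 << (bits - 1):
        n -= 1 << bits
    return n


print(zigzag_decode(1))                 # -1  (sint32 / sint64 field)
print(as_signed(0xFFFFFFFFFFFFFFFF))    # -1  (int32 / int64 field, 10-byte varint)
```

Because negative int32 values are sign-extended to 64 bits on the wire, as_signed with the default width of 64 gives the right answer for both int32 and int64 fields.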

Decode protobuf payloads directly in your browser — no server, no data upload — with the Protobuf Decoder. Paste hex, base64, or raw bytes and get field numbers, wire types, and values immediately. For related data conversion tasks, see the Data Converters Guide and the JSON to TypeScript Types article.