VarInt Encoding Guide: LEB128, Protobuf Varint, and ZigZag Explained

A fixed-width integer always occupies the same number of bytes regardless of its value — 4 bytes for a 32-bit integer even if the value is 1. Variable-length integers (VarInts) fix this: the number 1 takes a single byte, while larger numbers grow as needed. Use the VarInt encoder/decoder to follow the examples in this guide interactively.

The Core Idea: Base-128 Groups

A VarInt splits a number into 7-bit groups, then stores each group in one byte. The remaining bit — the most-significant bit (MSB) — is used as a continuation flag:

  • MSB = 1: more bytes follow; keep reading.
  • MSB = 0: this is the last byte.

Because each byte carries 7 bits of payload, the scheme is called base-128 encoding. Numbers from 0 to 127 fit in a single byte. Numbers from 128 to 16383 take two bytes, and so on. A 64-bit integer never needs more than 10 bytes.
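The size thresholds above follow directly from the 7-bits-per-byte rule. As a rough sketch (the helper name `varintSize` is made up for this illustration), the byte count can be found by counting how many 7-bit groups a value needs:

```javascript
// Hypothetical helper: how many bytes an unsigned varint needs for `value`.
// BigInt is used so the full unsigned 64-bit range works.
function varintSize(value) {
  let n = BigInt(value);
  if (n < 0n) throw new RangeError('expected a non-negative integer');
  let size = 1;
  while (n >= 0x80n) { // more than 7 bits remain, so another byte is needed
    n >>= 7n;
    size++;
  }
  return size;
}

console.log(varintSize(127));            // 1
console.log(varintSize(128));            // 2
console.log(varintSize(16383));          // 2
console.log(varintSize(16384));          // 3
console.log(varintSize(2n ** 64n - 1n)); // 10
```

The last line shows the 10-byte ceiling: 64 bits split into 7-bit groups need nine full groups plus one more byte for the remaining bit.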

Byte Order: Least-Significant Group First

Protobuf and DWARF/LEB128 both store the groups in little-endian order: the byte carrying the lowest 7 bits comes first, the byte with the next 7 bits comes second, and so on. This is the opposite of how humans read numbers but matches the little-endian byte order used by x86 CPUs for multi-byte integers.

Worked Example: Encoding 300

Walk through encoding the value 300 (0x12C) step by step. First, write 300 in binary: 100101100. Split into 7-bit groups from the right:

300 = 0b_0000010_0101100
        ‾‾‾‾‾‾‾  ‾‾‾‾‾‾‾
Group 1 (low):  0101100  = 0x2C  (44 decimal)
Group 2 (high): 0000010  = 0x02  (2 decimal)

Now set the continuation bit. Group 1 is not the last group, so set its MSB to 1. Group 2 is the last group, so its MSB stays 0:

Byte                | Raw 7-bit group | MSB set?          | Final byte | Hex
1 (first, low bits) | 0101100         | Yes, more follows | 1010 1100  | 0xAC
2 (last, high bits) | 0000010         | No, final byte    | 0000 0010  | 0x02

The encoded byte sequence is 0xAC 0x02. To decode, strip the MSB from each byte, concatenate the 7-bit groups in reverse order (high group first), and read the resulting binary number.

Decode 0xAC 0x02:

Byte 1: 0xAC = 1010_1100 MSB=1 (more), payload = 010_1100
Byte 2: 0x02 = 0000_0010 MSB=0 (last), payload = 000_0010

Reassemble (high group first): 000_0010 | 010_1100
                              = 00_0001_0010_1100  (14 bits, 0x12C)
                              = (2 × 128) + 44 = 256 + 44 = 300

JavaScript Implementation

Here is a minimal unsigned VarInt encoder and decoder in JavaScript. Both use bitwise operations to work with 7-bit groups. JavaScript's bitwise operators on Number values truncate their operands to 32 bits, so the code does all of its arithmetic in BigInt, which keeps values of 2^31 and above (up to the full 64-bit range) intact.

// Encode a non-negative integer as an unsigned LEB128 / Protobuf varint
function encodeVarint(value) {
  const bytes = [];
  let n = BigInt(value);
  do {
    let byte = Number(n & 0x7Fn); // low 7 bits
    n >>= 7n;
    if (n > 0n) byte |= 0x80;    // set continuation bit
    bytes.push(byte);
  } while (n > 0n);
  return new Uint8Array(bytes);
}

// Decode an unsigned LEB128 / Protobuf varint from a byte array at offset
function decodeVarint(bytes, offset = 0) {
  let result = 0n;
  let shift = 0n;
  let pos = offset;
  while (pos < bytes.length) {
    const byte = bytes[pos++];
    result |= BigInt(byte & 0x7F) << shift;
    shift += 7n;
    if ((byte & 0x80) === 0) break; // MSB=0, done
  }
  return { value: result, bytesRead: pos - offset };
}

// Example
const encoded = encodeVarint(300);
console.log([...encoded].map(b => '0x' + b.toString(16))); // ['0xac', '0x2']
console.log(decodeVarint(encoded).value); // 300n

Unsigned vs Signed LEB128

Unsigned LEB128 (ULEB128) treats the integer as a non-negative value — exactly what Protobuf uses for uint32, uint64, field tags, and wire-type-2 lengths. Signed LEB128 (SLEB128) is used in DWARF debug information and WebAssembly for integers that can be negative.

In SLEB128, the second-most-significant bit (0x40) of the final byte acts as the sign bit: if it is 1, the decoder must sign-extend the result. A negative number therefore ends in a byte between 0x40 and 0x7F (continuation bit clear, sign bit set). For example, -1 encodes as the single byte 0x7F and -2 as 0x7E.
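The sign-bit rule is easier to see in code. The following is a minimal sketch, not a reference implementation; the names `encodeSleb128` and `decodeSleb128` are invented for this example, and BigInt is assumed so that the arithmetic shift and masking behave like two's complement at any width.

```javascript
// Encode a signed integer as SLEB128 (hypothetical helper).
function encodeSleb128(value) {
  const bytes = [];
  let n = BigInt(value);
  while (true) {
    const byte = Number(n & 0x7Fn); // low 7 bits in two's complement
    n >>= 7n;                        // arithmetic shift, keeps the sign
    const signBit = (byte & 0x40) !== 0;
    // Stop once the rest of the value is just sign extension of this byte.
    if ((n === 0n && !signBit) || (n === -1n && signBit)) {
      bytes.push(byte);
      return new Uint8Array(bytes);
    }
    bytes.push(byte | 0x80); // more bytes follow
  }
}

// Decode SLEB128 starting at `offset` (hypothetical helper).
function decodeSleb128(bytes, offset = 0) {
  let result = 0n;
  let shift = 0n;
  let pos = offset;
  let byte;
  do {
    if (pos >= bytes.length) throw new RangeError('truncated SLEB128');
    byte = bytes[pos++];
    result |= BigInt(byte & 0x7F) << shift;
    shift += 7n;
  } while (byte & 0x80);
  // Sign-extend when bit 6 of the final byte is set.
  if (byte & 0x40) result -= 1n << shift;
  return { value: result, bytesRead: pos - offset };
}

const hex = (bytes) => [...bytes].map(b => b.toString(16).padStart(2, '0')).join(' ');
console.log(hex(encodeSleb128(-1)));      // 7f
console.log(hex(encodeSleb128(64)));      // c0 00
console.log(hex(encodeSleb128(-123456))); // c0 bb 78
console.log(decodeSleb128(Uint8Array.from([0xC0, 0xBB, 0x78])).value); // -123456n
```

Note that 64 needs two bytes in SLEB128 even though it fits in one ULEB128 byte: a lone 0x40 would have its sign bit set and decode as -64.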

Protobuf does not use SLEB128. Instead, it offers two options for signed integers:

  • int32 / int64: Negative values are sign-extended to 64 bits and encoded as a 10-byte varint. This is wasteful for negative numbers: a negative int32 always costs 10 bytes, even for -1.
  • sint32 / sint64: Uses ZigZag encoding so small negative numbers stay small.
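To make the 10-byte cost concrete, here is a small sketch. `encodeVarint` repeats the unsigned encoder from the JavaScript Implementation section so the block runs on its own; `encodeInt64Field` is a hypothetical name for the "reinterpret as unsigned 64-bit, then varint-encode" step that the int32 / int64 bullet describes.

```javascript
// Unsigned varint encoder (same logic as the encoder shown earlier).
function encodeVarint(value) {
  const bytes = [];
  let n = BigInt(value);
  do {
    let byte = Number(n & 0x7Fn);
    n >>= 7n;
    if (n > 0n) byte |= 0x80;
    bytes.push(byte);
  } while (n > 0n);
  return new Uint8Array(bytes);
}

// Hypothetical helper: encode a value the way an int32/int64 field would be,
// by viewing its two's-complement bits as an unsigned 64-bit integer first.
function encodeInt64Field(value) {
  const asUnsigned = BigInt.asUintN(64, BigInt(value));
  return encodeVarint(asUnsigned);
}

const toHex = (bytes) => [...bytes].map(b => b.toString(16).padStart(2, '0')).join(' ');

console.log(toHex(encodeInt64Field(1)));  // 01
console.log(toHex(encodeInt64Field(-1))); // ff ff ff ff ff ff ff ff ff 01
console.log(encodeInt64Field(-1).length); // 10
```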

ZigZag Encoding

ZigZag maps signed integers to unsigned integers by interleaving positive and negative values: 0 maps to 0, -1 maps to 1, 1 maps to 2, -2 maps to 3, 2 maps to 4, and so on. The formula for a 32-bit signed integer is:

// ZigZag encode a signed 32-bit integer
function zigzagEncode32(n) {
  // >>> 0 reinterprets the signed 32-bit result of ^ as an unsigned value
  return ((n << 1) ^ (n >> 31)) >>> 0;
}

// ZigZag decode back to signed
function zigzagDecode32(n) {
  return (n >>> 1) ^ -(n & 1);
}

// Examples
zigzagEncode32(0);   // 0
zigzagEncode32(-1);  // 1
zigzagEncode32(1);   // 2
zigzagEncode32(-2);  // 3
zigzagEncode32(2147483647);  // 4294967294
zigzagEncode32(-2147483648); // 4294967295

The arithmetic right shift (n >> 31) produces all zeros for non-negative numbers and all ones (0xFFFFFFFF) for negative numbers. XOR-ing this mask with the left-shifted value produces the interleaving: for non-negative n the result is simply 2n, and for negative n every bit of the shifted value is inverted, giving -2n - 1. After ZigZag encoding, the result is a non-negative integer that can be stored as a compact unsigned varint.
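The same idea extends to 64-bit values (as used by sint64) by shifting 63 bits instead of 31. The sketch below assumes BigInt inputs; the function names `zigzagEncode64` and `zigzagDecode64` are illustrative only.

```javascript
// ZigZag encode a signed 64-bit integer (hypothetical helper, BigInt-based).
function zigzagEncode64(n) {
  const v = BigInt.asIntN(64, BigInt(n));            // clamp to the int64 range
  return BigInt.asUintN(64, (v << 1n) ^ (v >> 63n)); // view result as unsigned
}

// ZigZag decode back to a signed 64-bit integer.
function zigzagDecode64(z) {
  const u = BigInt.asUintN(64, BigInt(z));
  return (u >> 1n) ^ -(u & 1n);
}

console.log(zigzagEncode64(-1n));           // 1n
console.log(zigzagEncode64(2n));            // 4n
console.log(zigzagEncode64(-(2n ** 63n)));  // 18446744073709551615n
console.log(zigzagDecode64(3n));            // -2n
```

The encoded value can then be passed to any unsigned varint encoder, so -1 costs one byte instead of ten.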

Where VarInts Appear

Format / Protocol   | Variant                  | Notes
Protocol Buffers    | Unsigned LEB128          | Field tags, int32/64, uint32/64, bool, enum; sint32/64 uses ZigZag first
DWARF debug info    | Both ULEB128 and SLEB128 | Abbreviation codes, attribute forms, and DW_FORM_udata values use ULEB128; signed attribute values (DW_FORM_sdata) use SLEB128
WebAssembly         | Both ULEB128 and SLEB128 | Section sizes, type indices, and instruction immediates use ULEB128; i32.const and i64.const operands use SLEB128
Bitcoin CompactSize | Similar but different    | Uses a prefix byte (0xFD, 0xFE, 0xFF) to signal 2-, 4-, or 8-byte lengths; not base-128 continuation bits
SQLite              | Custom varint            | 9-byte maximum; the 9th byte uses all 8 bits instead of 7; used in B-tree page format for row IDs and record lengths
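To show how the Bitcoin CompactSize row differs from continuation-bit varints, here is a rough decoder sketch. The function name `decodeCompactSize` is made up for this example, and the input is assumed to be a Uint8Array; the multi-byte values after the prefix are read as little-endian.

```javascript
// Hypothetical CompactSize decoder: the first byte is either the value itself
// (below 0xFD) or a prefix announcing a 2-, 4-, or 8-byte little-endian integer.
function decodeCompactSize(bytes, offset = 0) {
  if (offset >= bytes.length) throw new RangeError('no data at offset');
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const first = bytes[offset];
  if (first < 0xFD) {
    return { value: BigInt(first), bytesRead: 1 };
  }
  if (first === 0xFD) {
    return { value: BigInt(view.getUint16(offset + 1, true)), bytesRead: 3 };
  }
  if (first === 0xFE) {
    return { value: BigInt(view.getUint32(offset + 1, true)), bytesRead: 5 };
  }
  // first === 0xFF
  return { value: view.getBigUint64(offset + 1, true), bytesRead: 9 };
}

console.log(decodeCompactSize(Uint8Array.from([0x2C])).value);             // 44n
console.log(decodeCompactSize(Uint8Array.from([0xFD, 0x2C, 0x01])).value); // 300n
```

Compare the second example with LEB128, where 300 is 0xAC 0x02: the two formats produce different bytes for the same value.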

Protobuf field tags themselves are varints: the lower 3 bits encode the wire type and the upper bits encode the field number. A field number of 1 with wire type 0 (varint) produces the tag 0x08. The Protobuf decoder handles tag parsing automatically, but understanding varints lets you decode payloads by hand when needed. For a deeper look at the full protobuf wire format, see the Protobuf Decoding guide.
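The tag layout can be sketched in a few lines. `makeTag` and `parseTag` are illustrative names, not part of any Protobuf library, and the sketch assumes the tag value has already been decoded from its varint bytes.

```javascript
// Build a tag value from a field number and wire type (hypothetical helper).
// >>> 0 keeps the result unsigned for large field numbers.
function makeTag(fieldNumber, wireType) {
  return ((fieldNumber << 3) | wireType) >>> 0;
}

// Split a decoded tag value back into its parts.
function parseTag(tag) {
  return { fieldNumber: tag >>> 3, wireType: tag & 0x07 };
}

console.log(makeTag(1, 0).toString(16)); // 8
console.log(makeTag(2, 2).toString(16)); // 12
console.log(parseTag(0x12));             // { fieldNumber: 2, wireType: 2 }
console.log(makeTag(16, 0));             // 128
```

The last value, 128, no longer fits in 7 bits, so field numbers of 16 and above need a tag that is at least two bytes long once varint-encoded.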

Decoding by Hand vs Using a Tool

Decoding a single varint by hand is straightforward with the byte table approach shown above. Decoding an entire Protobuf message by hand is tedious: you must track the current byte offset, parse each tag to find the field number and wire type, read the appropriate number of bytes for each wire type, and recurse into nested messages.
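The bookkeeping described above can be sketched as a small loop. This is a simplified illustration under stated assumptions: `readVarint` and `listFields` are hypothetical names, only wire types 0, 1, 2, and 5 are handled, and length-delimited payloads are returned as raw bytes rather than recursed into, since the wire format alone does not say whether they hold a nested message, a string, or packed values.

```javascript
// Read one unsigned varint starting at `pos`; returns the value and new position.
function readVarint(bytes, pos) {
  let result = 0n;
  let shift = 0n;
  while (true) {
    if (pos >= bytes.length) throw new RangeError('truncated varint');
    const byte = bytes[pos++];
    result |= BigInt(byte & 0x7F) << shift;
    shift += 7n;
    if ((byte & 0x80) === 0) return { value: result, pos };
  }
}

// List the top-level fields of a Protobuf message without a schema.
function listFields(bytes) {
  const fields = [];
  let pos = 0;
  while (pos < bytes.length) {
    const tag = readVarint(bytes, pos);
    pos = tag.pos;
    const fieldNumber = Number(tag.value >> 3n);
    const wireType = Number(tag.value & 7n);
    let value;
    if (wireType === 0) {          // varint
      const v = readVarint(bytes, pos);
      value = v.value;
      pos = v.pos;
    } else if (wireType === 1) {   // fixed 64-bit
      value = bytes.slice(pos, pos + 8);
      pos += 8;
    } else if (wireType === 2) {   // length-delimited
      const len = readVarint(bytes, pos);
      pos = len.pos;
      const end = pos + Number(len.value);
      value = bytes.slice(pos, end);
      pos = end;
    } else if (wireType === 5) {   // fixed 32-bit
      value = bytes.slice(pos, pos + 4);
      pos += 4;
    } else {
      throw new Error('unsupported wire type ' + wireType);
    }
    if (pos > bytes.length) throw new RangeError('truncated field');
    fields.push({ fieldNumber, wireType, value });
  }
  return fields;
}

// Field 1 = varint 150, field 2 = the two bytes of the string "hi"
const sample = Uint8Array.from([0x08, 0x96, 0x01, 0x12, 0x02, 0x68, 0x69]);
console.log(listFields(sample));
```

For the sample bytes, the loop reports field 1 with wire type 0 and value 150n, then field 2 with wire type 2 and the payload bytes 0x68 0x69.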

A dedicated tool eliminates the error-prone bookkeeping. Paste a hex string or base64 blob into the VarInt encoder/decoder to instantly see the decoded value, byte count, and binary representation with the continuation bits highlighted. No server upload required — all processing runs in your browser.

Common Mistakes

  • Confusing signed and unsigned decoding: If you decode a sint32 Protobuf field as a plain varint without applying ZigZag decode, negative values appear as large positive numbers.
  • Forgetting the little-endian group order: The first byte carries the lowest 7 bits. Reversing the concatenation order is the most common manual decoding mistake.
  • Using 32-bit bitwise operators in JavaScript: On Number values, JavaScript's |, &, and << operators coerce operands to signed 32-bit integers, so results of 2^31 or more wrap around or turn negative. Use BigInt for varints that may reach 2^31 or beyond.
  • Mixing up ULEB128 and SLEB128: Both formats look identical for non-negative values. The difference only appears for values where the sign bit matters. Always check whether the spec calls for signed or unsigned interpretation.
  • Bitcoin CompactSize is not LEB128: Despite serving the same purpose (variable-length size encoding), CompactSize uses a different scheme with prefix bytes rather than continuation bits. Do not use a LEB128 decoder on Bitcoin data.
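The 32-bit operator pitfall from the list above is easy to reproduce. The sketch below decodes the same five bytes (the varint for 2^32) twice: once with Number bitwise operators and once with BigInt. Both function names are invented for the demonstration.

```javascript
// Varint encoding of 4294967296 (2^32): four empty groups, then 0x10.
const fiveBytes = [0x80, 0x80, 0x80, 0x80, 0x10];

// Buggy on purpose: Number bitwise operators work on 32-bit integers.
function decodeWith32BitOps(bytes) {
  let result = 0;
  let shift = 0;
  for (const byte of bytes) {
    result |= (byte & 0x7F) << shift; // bits shifted past position 31 are lost
    shift += 7;
    if ((byte & 0x80) === 0) break;
  }
  return result;
}

// Same loop with BigInt arithmetic, which has no 32-bit limit.
function decodeWithBigInt(bytes) {
  let result = 0n;
  let shift = 0n;
  for (const byte of bytes) {
    result |= BigInt(byte & 0x7F) << shift;
    shift += 7n;
    if ((byte & 0x80) === 0) break;
  }
  return result;
}

console.log(decodeWith32BitOps(fiveBytes)); // 0
console.log(decodeWithBigInt(fiveBytes));   // 4294967296n
console.log(1 << 31);                       // -2147483648
```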

Encode and decode variable-length integers instantly with the VarInt encoder/decoder — runs entirely in your browser with no data sent to any server. Enter a decimal value to see the LEB128 byte sequence, or paste hex bytes to decode them back to a number.