
Base32 Encoding Guide: RFC 4648, Crockford, TOTP Secrets, and Padding Rules


Base32 is a binary-to-text encoding that maps arbitrary bytes into a 32-character alphabet. You encounter it every time you set up two-factor authentication — that string of uppercase letters and digits your authenticator app scans is a Base32-encoded secret. Use the Base32 Encoder / Decoder to follow along with the examples below.

Why Base32 Exists

Base32 encodes 5 bits per output character. Because 5 does not divide evenly into 8 (one byte), groups of 5 bytes (40 bits) produce exactly 8 Base32 characters — and that eight-character block is the fundamental unit of the encoding.
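To make the 5-bit grouping concrete, here is a minimal Python sketch that splits one 40-bit block into eight 5-bit values and looks each one up in the RFC 4648 alphabet. The helper name encode_block is made up for illustration and only handles full 5-byte blocks; the result is cross-checked against the standard library.

```python
import base64

# RFC 4648 Base32 alphabet: A-Z followed by 2-7
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def encode_block(block: bytes) -> str:
    """Encode exactly 5 bytes (40 bits) into 8 Base32 characters.

    Hypothetical helper for illustration; it does not handle padding.
    """
    if len(block) != 5:
        raise ValueError("a full Base32 block is exactly 5 bytes")
    value = int.from_bytes(block, "big")  # one 40-bit integer
    # Take eight 5-bit slices, most significant first
    return "".join(ALPHABET[(value >> shift) & 0x1F] for shift in range(35, -1, -5))


print(encode_block(b"Hello"))  # JBSWY3DP
assert encode_block(b"Hello") == base64.b32encode(b"Hello").decode("ascii")
```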

The design goals that distinguish Base32 from Base64 are:

  • Case-insensitivity. The alphabet uses only letters and digits, so JBSWY3DP and jbswy3dp decode to the same bytes. This matters for voice dictation, OCR, and systems that fold case.
  • No special characters. Base64 uses +, /, and sometimes = — characters that require URL-encoding or quoting in shells. Base32 output is safe for filenames, DNS labels, and most identifiers without escaping.
  • Human typeability. Some variants deliberately exclude visually ambiguous glyphs (0 vs O, 1 vs I vs l) so that humans can transcribe codes reliably.

RFC 4648 Base32 Alphabet

The standard defined in RFC 4648 uses the 26 uppercase letters A–Z plus the digits 2–7. The digits 0 and 1 are omitted because they are easily confused with the letters O and I (or lowercase l); 8 and 9 are simply not needed to reach 32 symbols:

Value  Char  |  Value  Char  |  Value  Char  |  Value  Char
  0    A     |    8    I     |   16    Q     |   24    Y
  1    B     |    9    J     |   17    R     |   25    Z
  2    C     |   10    K     |   18    S     |   26    2
  3    D     |   11    L     |   19    T     |   27    3
  4    E     |   12    M     |   20    U     |   28    4
  5    F     |   13    N     |   21    V     |   29    5
  6    G     |   14    O     |   22    W     |   30    6
  7    H     |   15    P     |   23    X     |   31    7
Padding: =

When the input byte count is not a multiple of 5, the encoder pads the output to the next multiple of 8 characters using = signs. One byte produces 2 characters + 6 padding characters; two bytes produce 4 + 4; three bytes produce 5 + 3; four bytes produce 7 + 1.
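The padding counts above follow from simple arithmetic, which the sketch below works through in Python. The function name b32_lengths is an illustrative assumption, not part of any library; the loop compares its results with what base64.b32encode actually produces.

```python
import base64


def b32_lengths(n_bytes: int) -> tuple:
    """Return (data_chars, padding_chars) for n_bytes of input under RFC 4648 Base32."""
    data_chars = (n_bytes * 8 + 4) // 5   # ceil(bits / 5)
    padding_chars = -data_chars % 8       # fill up to the next multiple of 8
    return data_chars, padding_chars


for n in range(1, 6):
    data_chars, padding_chars = b32_lengths(n)
    encoded = base64.b32encode(bytes(n)).decode("ascii")
    # Cross-check the formula against the standard library
    assert encoded.count("=") == padding_chars
    assert len(encoded) == data_chars + padding_chars
    print(f"{n} byte(s): {data_chars} chars + {padding_chars} padding")
```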

Crockford Base32

Douglas Crockford defined a variant optimized for human input. It removes four characters that cause transcription errors:

  • I — looks like 1 and l
  • L — looks like 1 and I
  • O — looks like 0
  • U — looks like V and avoids accidental obscenities

The Crockford alphabet is 0123456789ABCDEFGHJKMNPQRSTVWXYZ. It also allows an optional check digit (one of 37 symbols, including *~$=U) appended after the encoded value to detect single-character transcription errors — useful for serial numbers and coupon codes typed by humans.
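As a rough illustration of the check symbol, the following Python sketch encodes a non-negative integer in the Crockford alphabet and appends the value modulo 37 as a check symbol. The function names are hypothetical, and the lenient decoding rules (lowercase accepted, hyphens ignored, O read as 0, I and L read as 1) follow my reading of Crockford's description rather than any particular library.

```python
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
CHECK_SYMBOLS = CROCKFORD + "*~$=U"  # 37 symbols for the mod-37 check

# Lenient decoding: O reads as 0, I and L read as 1
_NORMALIZE = str.maketrans("OIL", "011")


def crockford_encode(number: int, check: bool = False) -> str:
    """Encode a non-negative integer, optionally appending a check symbol."""
    if number < 0:
        raise ValueError("only non-negative integers can be encoded")
    digits = []
    n = number
    while True:
        n, remainder = divmod(n, 32)
        digits.append(CROCKFORD[remainder])
        if n == 0:
            break
    encoded = "".join(reversed(digits))
    if check:
        encoded += CHECK_SYMBOLS[number % 37]
    return encoded


def crockford_decode(text: str, check: bool = False) -> int:
    """Decode text, verifying the trailing check symbol when check=True."""
    cleaned = text.upper().replace("-", "").translate(_NORMALIZE)
    body, check_symbol = (cleaned[:-1], cleaned[-1]) if check else (cleaned, "")
    number = 0
    for char in body:
        number = number * 32 + CROCKFORD.index(char)  # ValueError on unknown symbols
    if check and CHECK_SYMBOLS.index(check_symbol) != number % 37:
        raise ValueError("check symbol mismatch")
    return number


print(crockford_encode(1234))              # 16J
print(crockford_encode(1234, check=True))  # 16JD
```

A single mistyped character changes the value modulo 37, so crockford_decode("16KD", check=True) raises ValueError instead of returning a wrong number.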

Crockford Base32 is used by ULID (Universally Unique Lexicographically Sortable Identifier), where the 128-bit value is split into a 48-bit timestamp and an 80-bit random component. The timestamp is encoded as 10 Crockford Base32 characters and the random component as 16, producing a 26-character string that sorts by creation time. ULID is one of the most visible modern uses of the Crockford alphabet.
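The ULID layout can be sketched in a few lines of Python. This is an illustration of the 48-bit plus 80-bit split only, not a complete ULID implementation (it omits the monotonicity rules, for example); make_ulid and encode_fixed are hypothetical names.

```python
import os
import time
from typing import Optional

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode_fixed(value: int, length: int) -> str:
    """Encode value as exactly `length` Crockford characters, zero-padded on the left."""
    chars = []
    for _ in range(length):
        value, remainder = divmod(value, 32)
        chars.append(CROCKFORD[remainder])
    if value:
        raise ValueError("value does not fit in the requested length")
    return "".join(reversed(chars))


def make_ulid(timestamp_ms: Optional[int] = None, randomness: Optional[bytes] = None) -> str:
    """Build a 26-character ULID-style string: 10 timestamp chars + 16 random chars."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    if randomness is None:
        randomness = os.urandom(10)  # 80 bits
    if not 0 <= timestamp_ms < 2**48:
        raise ValueError("timestamp must fit in 48 bits")
    if len(randomness) != 10:
        raise ValueError("randomness must be exactly 10 bytes")
    return encode_fixed(timestamp_ms, 10) + encode_fixed(int.from_bytes(randomness, "big"), 16)


print(make_ulid())
```

Because the Crockford alphabet is already in ASCII order, plain string comparison of two such identifiers agrees with comparison of their timestamps.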

z-base-32

z-base-32 is a human-oriented variant designed by Zooko Wilcox-O'Hearn. Rather than keeping the alphabet in alphabetical order, it permutes the characters so that the ones judged easier to read, write, and say occur more often in typical output. It also uses lowercase letters and omits padding. The alphabet is ybndrfg8ejkmcpqxot1uwisza345h769.

Because the character ordering is optimized for human perception rather than standard value mapping, z-base-32 output is not compatible with RFC 4648 decoders. It appears in some peer-to-peer and cryptographic protocols (including older versions of Tahoe-LAFS) where human-readable capability strings matter.
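For whole-byte input, z-base-32 can be treated as unpadded RFC 4648 output with each character swapped for the one at the same position in the z-base-32 alphabet. The Python sketch below relies on that assumption (it ignores the bit-length-aware encoding the z-base-32 design also describes), and the function names are hypothetical.

```python
import base64

RFC4648 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
ZBASE32 = "ybndrfg8ejkmcpqxot1uwisza345h769"

_TO_Z = str.maketrans(RFC4648, ZBASE32)
_FROM_Z = str.maketrans(ZBASE32, RFC4648)


def zbase32_encode(data: bytes) -> str:
    """Encode bytes as unpadded z-base-32 by translating RFC 4648 output."""
    return base64.b32encode(data).decode("ascii").rstrip("=").translate(_TO_Z)


def zbase32_decode(text: str) -> bytes:
    """Decode unpadded z-base-32 text back to bytes."""
    unknown = set(text) - set(ZBASE32)
    if unknown:
        raise ValueError(f"not z-base-32 characters: {sorted(unknown)}")
    rfc = text.translate(_FROM_Z)
    rfc += "=" * (-len(rfc) % 8)  # restore the padding b32decode expects
    return base64.b32decode(rfc)


print(zbase32_encode(b"Hello"))  # jb1sa5dx
assert zbase32_decode(zbase32_encode(b"Hello")) == b"Hello"
```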

Size Overhead: Base32 vs Base64

Base32 is less space-efficient than Base64 because it packs fewer bits per character:

Encoding           | Bits per char | Overhead                 | Case-insensitive | Alphabet-safe
Base64             | 6             | ~33% larger than binary  | No               | No (+, /, = require escaping)
Base32 (RFC 4648)  | 5             | ~60% larger than binary  | Yes              | Yes (letters + digits only)
Hex                | 4             | 100% larger than binary  | Yes              | Yes

A 20-byte SHA-1 hash encodes to 32 Base32 characters (no padding is needed, because 20 is a multiple of 5) but only 28 Base64 characters (27 data characters plus one = of padding). The ~60% overhead is the price of case-insensitivity and alphabet safety. For long binary blobs (file attachments, image data), Base64 is the practical choice. Base32 makes sense when humans need to read, speak, or type the encoded value.
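The sizes for a 20-byte digest are easy to check with the Python standard library. In this sketch the helper encoded_sizes and the input b"example" are arbitrary choices for illustration.

```python
import base64
import hashlib


def encoded_sizes(data: bytes) -> dict:
    """Return the hex, Base32, and Base64 text forms of data, keyed by encoding name."""
    return {
        "hex": data.hex(),
        "base32": base64.b32encode(data).decode("ascii"),
        "base64": base64.b64encode(data).decode("ascii"),
    }


digest = hashlib.sha1(b"example").digest()  # SHA-1 digests are always 20 bytes
for name, text in encoded_sizes(digest).items():
    overhead = (len(text) - len(digest)) / len(digest) * 100
    print(f"{name:7} {len(text):3} chars  {overhead:.0f}% overhead  padding={text.count('=')}")
```

At exactly 20 bytes, Base64 comes out at 40% overhead rather than 33%, because the final group is padded; the ~33% figure is what longer inputs approach.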

TOTP Base32 Secrets

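Every TOTP setup secret (the value behind your authenticator app's QR code) is conventionally a Base32 string. RFC 6238 (TOTP) and its predecessor RFC 4226 (HOTP) treat the shared secret as raw bytes and do not themselves mandate a text encoding; the Base32 convention comes from the otpauth:// Key URI format popularized by Google Authenticator, which authenticator apps generally follow. Base32 suits this job because:

  • Authenticator apps display secrets for manual entry, and a case-insensitive alphabet with no special characters makes manual transcription far less error-prone.
  • The uppercase alphabet (A–Z plus digits 2–7) falls within the character set of QR code alphanumeric mode, which is denser than byte mode. In practice the gain is limited, because the surrounding otpauth:// URI usually contains lowercase letters that need byte mode.
  • RFC 4648 Base32 is the variant authenticator apps expect, not Crockford and not z-base-32.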

TOTP secrets are conventionally transmitted without padding — the trailing = characters are stripped. Most authenticator libraries accept both padded and unpadded forms, but when generating secrets always strip padding before storing or transmitting. See the TOTP / OTP guide for the full implementation walkthrough.
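A minimal Python sketch of generating an unpadded secret and wrapping it in a provisioning URI might look like the following. It assumes the otpauth:// Key URI layout used by Google Authenticator and compatible apps; the function names, the issuer "ExampleCo", and the account "alice@example.com" are placeholders.

```python
import base64
import secrets
from urllib.parse import quote, urlencode


def new_totp_secret(n_bytes: int = 20) -> str:
    """Return a random secret as unpadded RFC 4648 Base32."""
    return base64.b32encode(secrets.token_bytes(n_bytes)).decode("ascii").rstrip("=")


def provisioning_uri(secret: str, account: str, issuer: str) -> str:
    """Build an otpauth:// URI (assumed Key URI format) for a QR code."""
    label = f"{quote(issuer)}:{quote(account)}"
    query = urlencode({"secret": secret, "issuer": issuer})
    return f"otpauth://totp/{label}?{query}"


secret = new_totp_secret()
print(secret)  # 32 characters for the default 20 bytes
print(provisioning_uri(secret, "alice@example.com", "ExampleCo"))
```

With the default 20 bytes the rstrip call has nothing to remove, since 20 is a multiple of 5; it matters for other lengths such as 16 bytes, which would otherwise end in six = characters.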

Code Examples

Node.js

// Install first: npm install base32-encode base32-decode
import crypto from 'node:crypto';
import base32Encode from 'base32-encode';
import base32Decode from 'base32-decode';

// Encode a Buffer to RFC 4648 Base32 (no padding)
const secret = crypto.randomBytes(20);
const encoded = base32Encode(secret, 'RFC4648', { padding: false });
console.log(encoded); // 32 characters: 20 bytes = 160 bits = 32 x 5 bits

// Decode back to a Buffer (base32Decode returns an ArrayBuffer)
const decoded = Buffer.from(base32Decode(encoded, 'RFC4648'));

// Crockford variant
const crockford = base32Encode(secret, 'Crockford');
console.log(crockford);

Python

import base64
import os

# Generate and encode a 20-byte TOTP secret (RFC 4648)
secret_bytes = os.urandom(20)
encoded = base64.b32encode(secret_bytes).decode('ascii')
print(encoded)          # 32 characters; 20 is a multiple of 5, so no padding appears

# Strip padding for TOTP storage/transmission (a no-op here, needed for other lengths)
encoded_nopad = encoded.rstrip('=')
print(encoded_nopad)

# Decode — add padding back if needed
padded = encoded_nopad + '=' * (-len(encoded_nopad) % 8)
decoded = base64.b32decode(padded)

Rust

use data_encoding::{BASE32, BASE32_NOPAD};
// Cargo.toml: data-encoding = "2"

fn main() {
    let data = b"Hello, Base32!";

    // Encode (includes padding): 14 bytes give 23 characters plus one '='
    let encoded = BASE32.encode(data);
    println!("{}", encoded); // JBSWY3DPFQQEEYLTMUZTEII=

    // Decode
    let decoded = BASE32.decode(encoded.as_bytes()).unwrap();
    assert_eq!(decoded, data);

    // No-padding encode (for TOTP secrets)
    let nopad = BASE32_NOPAD.encode(data);
    println!("{}", nopad); // JBSWY3DPFQQEEYLTMUZTEII
}

Padding Pitfalls

The most common Base32 bug is a mismatch between encoder and decoder padding expectations:

  • RFC 4648 with padding: The spec requires = padding. Most library decoders expect it by default. If you strip padding before storage and then pass the raw string to a strict decoder, you will get an error.
  • TOTP secrets without padding: Authenticator apps and OTP libraries typically accept both. If you write your own decoder, add padding back before decoding: s += "=" * (-len(s) % 8) in Python.
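  • Crockford vs RFC 4648: The two alphabets share most symbols but assign them different values (A is 0 in RFC 4648 and 10 in Crockford), and each contains symbols the other lacks. Passing Crockford-encoded data to an RFC 4648 decoder (or vice versa) either fails on an unknown symbol such as 0, 1, 8, or 9, or, when every symbol happens to be valid in both alphabets, silently decodes to the wrong bytes.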
  • Lowercase input: RFC 4648 decoders may reject lowercase. Uppercase before decoding when input comes from user input or config files.
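Several of these pitfalls can be handled in one place by normalizing input before it reaches a strict decoder. The following Python sketch shows one possible approach; decode_base32_lenient is a hypothetical helper, and stripping spaces and hyphens is an assumption about how users tend to paste secrets.

```python
import base64
import binascii


def decode_base32_lenient(text: str) -> bytes:
    """Decode RFC 4648 Base32, tolerating lowercase, spaces, hyphens, and missing padding."""
    cleaned = "".join(text.split()).replace("-", "").upper().rstrip("=")
    padded = cleaned + "=" * (-len(cleaned) % 8)  # restore padding for the strict decoder
    try:
        return base64.b32decode(padded)
    except binascii.Error as exc:
        # Also raised for symbols outside the alphabet, such as 0, 1, 8, or 9
        raise ValueError(f"not valid RFC 4648 Base32: {exc}") from exc


print(decode_base32_lenient("jbsw y3dp"))           # b'Hello'
print(decode_base32_lenient("JBSWY3DPEB3W64TMMQ"))  # b'Hello world'
```

Normalization cannot detect Crockford input whose symbols all happen to be valid RFC 4648 characters, so the variant still has to be known from context.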

When to Pick Base32 Over Alternatives

Scenario                         | Best Encoding      | Reason
TOTP / HOTP shared secrets       | Base32 (RFC 4648)  | Expected by authenticator apps (otpauth:// key URIs); human-typeable
Voice-dictated activation codes  | Crockford Base32   | Excludes I/L/O/U ambiguity; optional check digit
Sortable unique IDs (ULID)       | Crockford Base32   | Lexicographic sort preserves time order
File or email attachments        | Base64             | 33% overhead vs 60%; size matters for large payloads
OCR-scanned serial numbers       | Crockford Base32   | No ambiguous glyphs; survives OCR substitution errors
URL-safe tokens (short)          | Base32 or Base58   | Base32 is case-insensitive; Base58 is more compact
Filesystem-safe checksums        | Base32             | No / or + characters; safe in filenames

For a broader comparison of all binary-to-text encodings, see the Encoders and Decoders guide.


Encode and decode Base32 strings (RFC 4648, Crockford, and z-base-32) directly in your browser with the Base32 Encoder / Decoder — all processing is local, nothing leaves your machine.