
Mock Data Testing Guide: Fake Data, Seeded Randomness, and Locale-Aware Generation

9 min read

Fake data is one of those things developers reach for constantly — filling a database for a demo, generating fixtures for integration tests, populating a UI prototype, stress-testing a form validator. The Fake Data Generator and Lorem Ipsum Generator let you do this directly in the browser. But the way you generate fake data matters as much as the fact that you use it at all.

Why You Need Fake Data

There are four distinct reasons developers reach for fake data, and each one has different requirements:

  • GDPR and data privacy: You must never copy production data into a development or staging database without scrubbing PII. Real names, email addresses, and phone numbers belong to real people. Generating fake data from scratch sidesteps the problem entirely — there is no risk of accidental exposure because the data was never real.
  • UI and visual testing: Screenshot comparison tests need the same data every run. If a randomly generated name changes between runs, the diff will flag it as a visual regression even when nothing in the UI changed.
  • Load and performance testing: You need a large volume of varied, realistic records. A thousand rows of test@test.com will not expose the same bugs as a thousand rows of varied email formats across different domains and TLDs.
  • Demos and prototypes: A demo populated with "Lorem Ipsum" and "test@example.com" looks unfinished. Realistic-looking names, addresses, and product descriptions sell the concept.

Realistic vs. Predictable: The Core Tradeoff

Random fake data and reproducible fake data pull in opposite directions, and conflating them causes subtle bugs in test suites.

Purely random data is fine for Playwright end-to-end tests where you want to exercise the full input space — different name lengths, accented characters, numbers in usernames. The test asserts behavior, not exact values.

Screenshot tests and snapshot tests are the opposite case. They compare pixel-by-pixel or DOM-by-DOM output against a stored baseline. If the generated name is "Alexandra Kowalski" on one run and "Ben Park" on the next, the snapshot will never match and the test is useless. These tests require the exact same data on every run, on every machine, including CI.

The failure mode is easy to miss: tests pass locally because the developer regenerates the snapshot on every run, but the committed snapshots contain random data and fail as soon as another developer or CI runs the suite.

Seeded Randomness

The solution is seeded randomness. A seed is an integer you supply to the random number generator. Given the same seed, the generator always produces the same sequence of values — deterministic, but still varied enough to exercise realistic code paths.

import { faker } from '@faker-js/faker';

// Without seed: different output every run
console.log(faker.person.fullName()); // e.g. "Marcus Hill"
console.log(faker.person.fullName()); // e.g. "Priya Nair"

// With seed: the same two names on every run, on every machine
// (the exact values depend on the installed Faker version)
faker.seed(42);
console.log(faker.person.fullName());
console.log(faker.person.fullName());

// Reset seed for each test to keep tests independent
beforeEach(() => {
  faker.seed(42);
});

Pick a seed value and commit it alongside your test fixtures. CI will then produce the same data as your local machine, provided both run the same Faker version; seeded output can change between releases, so pin the version in your lockfile. When you deliberately want to refresh the test data, change the seed, regenerate, and commit the updated snapshots in a single, reviewable commit.
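To make the idea of a seed concrete, here is a minimal sketch of a seeded generator written without Faker. It uses the small mulberry32 algorithm purely for illustration (it is not the generator Faker uses), and the name list and the helper names createRng and pickNames are hypothetical.

```typescript
// Minimal seeded PRNG (mulberry32). Illustrative only: not Faker's algorithm.
function createRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    // Scale the 32-bit result into the range [0, 1)
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical name list standing in for a locale's data set.
const FIRST_NAMES: string[] = ['Ada', 'Bruno', 'Chen', 'Dalia', 'Emeka', 'Freya'];

// Draws `count` names from a generator freshly created with `seed`.
function pickNames(seed: number, count: number): string[] {
  const rng = createRng(seed);
  return Array.from(
    { length: count },
    () => FIRST_NAMES[Math.floor(rng() * FIRST_NAMES.length)],
  );
}

const firstRun = pickNames(42, 5);
const secondRun = pickNames(42, 5);
console.log(firstRun.join(',') === secondRun.join(',')); // true: same seed, same sequence
```

The output depends only on the seed and on how many values have been drawn so far, which is why resetting the seed before each test keeps tests independent of one another.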

Locale Awareness

Faker.js ships with locale support covering names, addresses, phone number formats, and more. This matters more than most developers expect.

import { fakerEN_US, fakerDE, fakerJA } from '@faker-js/faker';

// Prebuilt per-locale instances; the sample outputs below are illustrative

// English (US) names and addresses
console.log(fakerEN_US.person.fullName());  // e.g. "Michael Johnson"
console.log(fakerEN_US.location.city());    // e.g. "Springfield"
console.log(fakerEN_US.phone.number());     // e.g. "(555) 867-5309"

// German locale
console.log(fakerDE.person.fullName());     // e.g. "Karl-Heinz Bauer"
console.log(fakerDE.location.city());       // e.g. "Düsseldorf"
console.log(fakerDE.phone.number());        // e.g. "030 12345678"

// Japanese locale
console.log(fakerJA.person.fullName());     // e.g. "田中 太郎"
console.log(fakerJA.location.city());       // e.g. "大阪市"

Testing your application exclusively with en_US fake data hides real bugs:

  • German names frequently contain umlauts (ä, ö, ü) and double-barrelled surnames — does your sort order handle them correctly?
  • Japanese addresses are written largest-to-smallest (country, prefecture, city, street) — the reverse of English convention. Does your address display component render either order?
  • Phone number formats vary dramatically between locales — some include country codes, some use dots as separators. Does your phone validator accept international formats?

Adding a second locale to your fixture set — even just de or fr — catches an entire class of internationalization bugs that en_US-only data masks.
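The sort-order question raised above can be checked in a few lines. This is a minimal sketch using the standard Intl.Collator API; the surname list is made up for illustration, and the umlauts are written as escape sequences so the example does not depend on how the file is encoded.

```typescript
// Hypothetical fixture: German surnames, two of them with umlauts.
const surnames: string[] = [
  'Zimmermann',
  '\u00c4rger',  // "Ärger"
  'M\u00fcller', // "Müller"
  'Adler',
];

// Default sort compares UTF-16 code units, so "Ä" (U+00C4) lands after "Z".
const naiveOrder = [...surnames].sort();

// Intl.Collator applies locale rules; German collation sorts "Ä" next to "A".
const germanOrder = [...surnames].sort(new Intl.Collator('de').compare);

console.log(naiveOrder);  // [ 'Adler', 'Müller', 'Zimmermann', 'Ärger' ]
console.log(germanOrder); // [ 'Adler', 'Ärger', 'Müller', 'Zimmermann' ]
```

A fixture set that contains only ASCII names would pass with either sort, which is exactly how this class of bug stays hidden.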

Lorem Ipsum vs. Real Sentences

Lorem ipsum has been the placeholder text of choice for decades, but it has a significant drawback: it is immediately recognizable as placeholder text. Developers see "Lorem ipsum dolor sit amet" and mentally skip over it. That means bugs in text rendering go unnoticed.

Consider the difference between these two cases:

  • A CMS preview populated with lorem ipsum looks fine to a developer — the layout renders, the font loads. The same preview populated with a customer-style sentence like "Your order #A-10042 has been shipped to 123 Main St." reveals that the order number truncates at 12 characters and the address wraps awkwardly on mobile.
  • A product description field showing "lorem ipsum" in a staging environment does not tell you whether your Markdown renderer handles backticks inside bold text. A realistic description with inline code does.

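The Lorem Ipsum Generator is useful for visual layout work where you need neutral filler. For functional testing, use generators that produce domain-shaped text, such as faker.commerce.productDescription(), or write hand-crafted fixture strings that exercise specific edge cases. Note that Faker's lorem.sentence() returns lorem-ipsum-style filler, so it has the same blind spot as any other placeholder text.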

Common Data Categories

Category | Faker.js method | Gotchas
-------- | --------------- | -------
Name | faker.person.fullName() | Some locales produce names with apostrophes (O'Brien) or hyphens; test your sort and search
Email | faker.internet.email() | Generated emails are syntactically valid but not deliverable; never use for email deliverability tests
Phone | faker.phone.number() | Format varies by locale; may not pass strict E.164 validation without normalization
Address | faker.location.streetAddress() | Street number ranges and formats differ by locale; postal codes may not match city
Date | faker.date.between() | Returns a JS Date object; timezone handling depends on your serialization layer
UUID | faker.string.uuid() | Random per call; use seeded faker or a fixed list for reproducible IDs in relation tests
Number | faker.number.int() | Specify min and max to avoid out-of-range database constraint failures

Structural Generation

Generating a single user is straightforward. Generating 1,000 users with related orders, each with line items pointing to products, requires a structured approach to avoid foreign key violations and maintain referential integrity.

import { faker } from '@faker-js/faker';

faker.seed(42);

// Fix the reference date so date helpers do not depend on the current time;
// without this, seeded dates still shift from one day to the next
const REF_DATE = new Date('2025-01-01T00:00:00.000Z');
faker.setDefaultRefDate(REF_DATE);

// Generate base entities first
const users = Array.from({ length: 100 }, () => ({
  id: faker.string.uuid(),
  name: faker.person.fullName(),
  email: faker.internet.email(),
  createdAt: faker.date.past({ years: 2 }), // relative to REF_DATE
}));

const products = Array.from({ length: 20 }, () => ({
  id: faker.string.uuid(),
  name: faker.commerce.productName(),
  price: faker.number.float({ min: 5, max: 500, fractionDigits: 2 }),
}));

// Generate dependent entities referencing parent IDs
const orders = Array.from({ length: 500 }, () => {
  const user = faker.helpers.arrayElement(users);
  const product = faker.helpers.arrayElement(products);
  return {
    id: faker.string.uuid(),
    userId: user.id,
    productId: product.id,
    quantity: faker.number.int({ min: 1, max: 10 }),
    orderedAt: faker.date.between({ from: user.createdAt, to: REF_DATE }),
  };
});

The key pattern: generate parent entities first, collect their IDs, then reference those IDs when generating child entities. This guarantees referential integrity without a live database and produces data you can serialize to JSON or SQL INSERT statements for a seed file.
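A cheap safeguard for the parent-first pattern is to verify the generated set before writing it to a seed file. The following is a minimal sketch; the interfaces, the findOrphanOrders helper, and the sample rows are hypothetical and reduced to the fields the check needs.

```typescript
interface UserRow {
  id: string;
}

interface OrderRow {
  id: string;
  userId: string;
}

// Returns every order whose userId does not match a known user.
function findOrphanOrders(userRows: UserRow[], orderRows: OrderRow[]): OrderRow[] {
  const knownUserIds = new Set(userRows.map((user) => user.id));
  return orderRows.filter((order) => !knownUserIds.has(order.userId));
}

const sampleUsers: UserRow[] = [{ id: 'u1' }, { id: 'u2' }];
const sampleOrders: OrderRow[] = [
  { id: 'o1', userId: 'u1' },
  { id: 'o2', userId: 'u3' }, // deliberately broken reference
];

console.log(findOrphanOrders(sampleUsers, sampleOrders).map((order) => order.id)); // [ 'o2' ]
```

Running a check like this in the fixture script turns a foreign key violation at import time into an immediate, readable failure at generation time.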

Library Comparison

Library | Language | Notes
------- | -------- | -----
@faker-js/faker | JavaScript / TypeScript | Community fork of the original faker.js; actively maintained; full locale support; tree-shakeable
Faker (Python) | Python | Mature library; used widely in Django test fixtures; good locale coverage
Mimesis | Python | Faster than Python Faker for bulk generation; strict typing; fewer locales
Mockaroo API | REST API | Web UI and API for generating CSV / JSON / SQL; good for one-off bulk exports
SQL Insert Generator | Browser tool | Generates INSERT statements directly; useful when you need seed SQL for migrations

Integration Tests: Faker + MSW + Playwright

The most robust integration test setup combines seeded Faker with a request interceptor (Mock Service Worker) and a test runner like Playwright or Vitest. This approach keeps your tests independent of the network and makes fixture data explicit.

// fixtures/users.ts — generated once, committed to repo
import { faker } from '@faker-js/faker';

faker.seed(42);

export const testUser = {
  id: faker.string.uuid(),        // deterministic UUID
  name: faker.person.fullName(),  // deterministic name (value depends on Faker version)
  email: faker.internet.email(),  // deterministic email
};

// msw/handlers.ts
import { http, HttpResponse } from 'msw';
import { testUser } from '../fixtures/users';

export const handlers = [
  http.get('/api/users/:id', ({ params }) => {
    return HttpResponse.json(testUser);
  }),
];

// user.test.ts (Playwright)
import { test, expect } from '@playwright/test';
import { testUser } from '../fixtures/users';

test('shows user profile', async ({ page }) => {
  await page.goto('/users/' + testUser.id);
  await expect(page.getByText(testUser.name)).toBeVisible();
});

Because faker.seed(42) is called before generating testUser, the UUID, name, and email are the same every time the fixture module is loaded: in local runs, in CI, and on colleagues' machines. The MSW handler returns the same object the test expects to see on screen.

Common Pitfalls

  • Name collisions: Faker generates names from a finite list. With large datasets you will get duplicate full names. If your application requires unique names (user display names, for example), append a UUID suffix or build the list with faker.helpers.uniqueArray(). The older faker.helpers.unique() was deprecated in Faker v8 and removed in v9.
  • Timezone and daylight saving bugs: faker.date.between() returns JavaScript Date objects, which represent an instant in time and carry no time zone of their own. If your application serializes dates as ISO strings and displays them in the user's local time zone, a timestamp like 2024-03-10T01:30:00Z (the date of a US daylight saving transition) will display differently depending on the test machine's time zone setting. Pin your test timezone with TZ=UTC in CI.
  • Emoji and invisible characters in names: Faker does not generate emoji by default, but if you build a fixture that allows arbitrary Unicode and a user supplies a name like "Alex Parker" with an emoji or a zero-width joiner somewhere in it, downstream systems (database columns with limited character sets, PDF renderers, email headers) can break in unexpected ways. Test with at least one fixture containing non-ASCII characters.
  • Email deliverability confusion: Faker emails look real. If your application sends transactional email, make absolutely sure your test environment's mail transport is mocked or pointed at a mail trap. Sending to fake addresses generated by Faker will result in bounces and can damage your sending reputation.
  • Seeding across modules: If multiple test files call faker.seed(42) and then call Faker methods in different orders, they will produce different values because the sequence depends on how many values have been drawn. Reset the seed inside a beforeEach block within each test file, or generate all fixtures in a single factory module that is imported rather than re-executed.
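The timezone pitfall above can be reproduced without changing the machine's settings. This is a minimal sketch using the standard Intl.DateTimeFormat API; the localDay helper is hypothetical and only formats the calendar day of an instant in a given IANA time zone.

```typescript
// Formats an instant as YYYY-MM-DD as seen in the given IANA time zone.
function localDay(instant: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
  }).formatToParts(instant);
  const get = (type: string): string =>
    parts.find((part) => part.type === type)?.value ?? '';
  return `${get('year')}-${get('month')}-${get('day')}`;
}

// The same instant, viewed from three time zones.
const sampleInstant = new Date('2024-03-10T01:30:00Z');

console.log(localDay(sampleInstant, 'UTC'));              // 2024-03-10
console.log(localDay(sampleInstant, 'America/New_York')); // 2024-03-09
console.log(localDay(sampleInstant, 'Asia/Tokyo'));       // 2024-03-10
```

A snapshot that renders this date would differ between a developer laptop in New York and a CI runner in UTC, even though the fixture data is identical.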

Generate fake data and lorem ipsum placeholder text in your browser using Fake Data Generator and Lorem Ipsum Generator — no account required, no data sent to a server. For broader context on the generators category, see the Generators Guide and the UUID Guide for reproducible identifier patterns.