Generating Zod schemas from Contentful: why your project needs this

Fabio Fognani May 29, 2026

I have been working with Contentful for about three years now.

On projects using the GraphQL API, I used graphql-codegen and moved on. When I started working more heavily with the Contentful REST APIs, I wrote a custom codegen: nothing exotic, just enough to generate types from the content model and avoid the most tedious boilerplate.

It worked, but it had gaps. After running into the same issues a few times I realized: none of this is specific to my project. And I’m going to need this again. And again.

So I rebuilt it properly as @xndrjs/contentful-to-zod — the way I’d want any infrastructure tool in the xndrjs ecosystem to work.

Here’s what the gaps were, and what this tool does differently.

What my custom codegen was missing

The `required` assumption

My codegen initially treated required fields as non-optional in the generated types. That makes sense, until you hit the Preview API.

In Contentful, required means required to publish, not always present in the API response. A draft entry with slug filled in but title empty is a perfectly valid response:

{
  "sys": { "id": "abc123" },
  "fields": {
    "slug": "my-draft-post"
  }
}

title is just absent. Not null. The generated type said title: string. That was wrong.
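To make the failure concrete, here is a minimal sketch in plain TypeScript (no Zod yet). The `StrictFields` type and the `headline` function are made up for illustration; the payload is the draft response shown above.

```typescript
// Hypothetical type a naive codegen might emit for the content type:
// every "required" field is declared as always present.
type StrictFields = { slug: string; title: string };

// The draft payload from above, as it could arrive from the Preview API.
const raw = '{"sys":{"id":"abc123"},"fields":{"slug":"my-draft-post"}}';
const entry = JSON.parse(raw) as { fields: StrictFields };

// The compiler is satisfied, but the key is simply not there.
const hasTitleKey = "title" in entry.fields;

// Illustrative consumer that trusts the generated type.
function headline(fields: StrictFields): string {
  return fields.title.toUpperCase();
}

let crashed = false;
try {
  headline(entry.fields);
} catch {
  crashed = true; // title is undefined at runtime, so the call throws
}
```

The type check passes and the code still throws, which is exactly the gap a compile-time type cannot close.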

Localization was an afterthought

Fetching with ?locale=* changes the shape of every localized field:

{ "title": { "en-US": "Hello", "it-IT": "Ciao" } }

Single-locale response:

{ "title": "Hello" }

Same field, two completely different shapes. My codegen generated one shape and assumed I’d sort out the other. The transforms that went from transport payload to flat, locale-specific object were hand-written glue that lived outside the codegen, inconsistently.
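One way to stop guessing is to give each shape its own type and make the conversion an explicit, typed step. A minimal sketch, where `MultiLocale` and `pickLocale` are illustrative names rather than anything a tool exports:

```typescript
type LocaleCode = string;

// Shape of a localized field when fetched with ?locale=*
type MultiLocale<T> = Record<LocaleCode, T>;

// The caller states which shape it holds, so no runtime guessing is needed.
function pickLocale<T>(
  value: MultiLocale<T> | null | undefined,
  locale: LocaleCode,
): T | null {
  if (value == null) return null; // field absent from the payload
  const hit = value[locale];
  return hit === undefined ? null : hit; // locale missing for this field
}

const title: MultiLocale<string> = { "en-US": "Hello", "it-IT": "Ciao" };
const italian = pickLocale(title, "it-IT");
const german = pickLocale(title, "de-DE");
```

Here `italian` is the Italian string and `german` is `null`, because the field has no value for that locale.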

`Object` fields were untyped

The Content Management API (CMA) declares Object fields without any inner schema. My codegen, just like graphql-codegen, initially rendered them as Record<string, unknown>. Technically correct, but practically useless or brittle for anything downstream.
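To see why `Record<string, unknown>` is brittle downstream, consider a hypothetical `metadata` Object field. Every consumer has to narrow each property by hand before using it. The `readSeoTitle` helper below is illustrative only, not part of any tool.

```typescript
// What an untyped Object field looks like to the compiler.
type ObjectField = Record<string, unknown>;

// Hypothetical content of a "metadata" Object field.
const metadata: ObjectField = { seoTitle: "Hello world", noIndex: false };

// field["seoTitle"] is `unknown`, so each read needs its own check.
function readSeoTitle(field: ObjectField): string | null {
  const value = field["seoTitle"];
  return typeof value === "string" ? value : null;
}

const ok = readSeoTitle(metadata);

// A misspelled key type-checks fine and fails silently at runtime.
const drifted = readSeoTitle({ seotitle: "Hello world" });
```

The second call returns `null` without any compile-time or runtime error, so a renamed key can go unnoticed.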

The actual gap: types vs. runtime

The tools I had — both graphql-codegen and my custom tool — were good at one thing: compile-time types. They told the compiler what shape to expect. What they couldn’t do:

Validate what actually arrived at runtime
Normalize omitted keys, undefined, and explicit null into something consistent
Flatten a multi-locale field into a single locale on read
Narrow Object fields to a real inner shape

That’s not a flaw in those tools — it’s just what they’re for. But the boundary between Contentful and my app was mine to own, and types alone weren’t enough there.

What `contentful-to-zod` generates instead

Rather than TypeScript interfaces, the tool generates Zod 4 schemas — artifacts that run at the boundary, not just at compile time.

For a blogPost content type with a localized title and an optional author link, it looks something like this:

// Transport shape — what comes over the wire
export const BlogPostDeliveryFieldsSchema = z.object({
  title: transportField(z.record(ContentfulLocaleCodeSchema, z.string().max(256))),
  slug: transportField(z.string()),
  author: transportField(ContentfulEntryLinkSchema),
});

// Flat shape — after locale flattening
export const BlogPostFieldSchema = z.object({
  title: flatField(z.string().max(256)),
  slug: flatField(z.string()),
  author: flatField(ContentfulEntryLinkSchema),
});

Two schemas — one for the transport layer, one for the flat locale-specific shape. CMA validations like .max(256) flow into the Zod chains. And z.infer gives accurate types: where graphql-codegen gives you string for a Symbol field with allowed values, this gives you the actual union.

transportField and flatField normalize absent values to null:

export function transportField<T extends z.ZodType>(schema: T) {
  return schema
    .nullable()
    .optional()
    .transform((v) => v ?? null);
}

They look identical today, but the semantic distinction matters: one marks a Contentful wire payload, the other a locale-flattened shape. It keeps the door open for diverging behavior in future codegen without touching consuming code.
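The effect of the `v ?? null` step can be illustrated without Zod. The sketch below is plain TypeScript with a hypothetical `normalizeAbsent` helper; it mirrors only the normalization, not the validation Zod performs.

```typescript
// Mirrors the `v ?? null` step: undefined and null both become null.
function normalizeAbsent<T>(value: T | null | undefined): T | null {
  return value ?? null;
}

type RawFields = { title?: string | null | undefined };

const omitted: RawFields = {}; // key missing entirely
const explicitUndefined: RawFields = { title: undefined };
const explicitNull: RawFields = { title: null };
const present: RawFields = { title: "Hello" };

// Three different spellings of "missing" collapse into one.
const results = [omitted, explicitUndefined, explicitNull, present].map(
  (fields) => normalizeAbsent(fields.title),
);
```

Downstream code then only has to handle `string | null`, whichever way the value was missing in the payload.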

At the boundary

const entry = BlogPostEntrySchema.parse(rawFromContentful);
const flat = flattenBlogPostEntryFields(entry.fields, "it-IT");
const post = BlogPostFieldSchema.parse(flat);

The flatten* helper is also generated, one per content type, when using locale.mode: "both" (the default). No more hand-written glue per content type. No more isRecord utils to guess whether a value is localized or not (this was actually the solution AI kept suggesting to me before I decided to design something more structured, and it gave me the shivers).

Domain rules stay separate:

const PublishedPost = BlogPostFieldSchema.extend({
  title: z.string().min(1),
});
const trusted = PublishedPost.parse(flat);

The multi-locale caching pattern

Having explicit transport and flat schemas also made another pattern cleaner: fetch once with ?locale=*, cache the full multi-locale payload, flatten on read.

?locale=* → cache raw multi-locale entry
→ flattenBlogPostEntryFields(fields, "it-IT") for /it/...
→ flattenBlogPostEntryFields(fields, "en-US") for /en/...

Same cached entry, different flat object per locale. Note that payload size grows with locale count, so this makes more sense for subsets of content than for everything in your space.
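A rough sketch of this pattern in plain TypeScript. The in-memory `Map`, the `RawBlogPostFields` type, and `flattenTitleAndSlug` are stand-ins invented for illustration. In a real project the generated `flattenBlogPostEntryFields` helper would do the flattening, and the cache would likely live elsewhere.

```typescript
type LocaleCode = string;

// Simplified stand-in for a multi-locale blogPost payload.
interface RawBlogPostFields {
  title: Record<LocaleCode, string> | null;
  slug: string | null;
}

interface FlatBlogPostFields {
  title: string | null;
  slug: string | null;
}

// Stand-in for the generated flatten helper.
function flattenTitleAndSlug(
  fields: RawBlogPostFields,
  locale: LocaleCode,
): FlatBlogPostFields {
  const localizedTitle = fields.title ? fields.title[locale] : undefined;
  return { title: localizedTitle ?? null, slug: fields.slug };
}

// Cache keyed by entry id; the value is the raw multi-locale payload.
const cache = new Map<string, RawBlogPostFields>();
let fetchCount = 0;

function getRaw(id: string): RawBlogPostFields {
  let raw = cache.get(id);
  if (raw === undefined) {
    fetchCount += 1; // a real implementation would call Contentful with ?locale=* here
    raw = { title: { "en-US": "Hello", "it-IT": "Ciao" }, slug: "my-post" };
    cache.set(id, raw);
  }
  return raw;
}

// Two locales served from one cached payload.
const itPost = flattenTitleAndSlug(getRaw("abc123"), "it-IT");
const enPost = flattenTitleAndSlug(getRaw("abc123"), "en-US");
```

Both reads hit the same cached entry, so the simulated fetch runs only once.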

`Object` field overrides

By default, Object fields generate z.record(z.string(), z.unknown()). For fields where the inner shape is known, you can narrow them in config:

export default defineConfig({
  objects: {
    "blogPost.metadata": z.object({
      seoTitle: z.string(),
      noIndex: z.boolean().optional(),
    }),
  },
});

Inlined at codegen time — no runtime dependency on the config.

Setup

pnpm add zod@^4
pnpm add -D @xndrjs/contentful-to-zod @dotenvx/dotenvx

Put your Contentful credentials in .env:

CONTENTFUL_SPACE_ID=your_space_id
CONTENTFUL_MANAGEMENT_TOKEN=your_management_token
CONTENTFUL_ENVIRONMENT=master
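If you want to fail fast on a missing credential, a small check like the one below can help. It is a sketch only: `missingContentfulEnv` is a hypothetical helper, and treating all three variables as required is an assumption, not documented tool behavior.

```typescript
const REQUIRED = [
  "CONTENTFUL_SPACE_ID",
  "CONTENTFUL_MANAGEMENT_TOKEN",
  "CONTENTFUL_ENVIRONMENT",
] as const;

// Returns the names of required variables that are unset or empty.
function missingContentfulEnv(
  env: Record<string, string | undefined>,
): string[] {
  return REQUIRED.filter((name) => !env[name]);
}

// Example input with the management token left out.
const missing = missingContentfulEnv({
  CONTENTFUL_SPACE_ID: "your_space_id",
  CONTENTFUL_ENVIRONMENT: "master",
});
```

In a Node script you could pass `process.env` instead of the literal object and exit early when the returned list is not empty.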

Then add codegen scripts to package.json so your package manager resolves the local CLI:

{
  "scripts": {
    "contentful:schema": "dotenvx run -- contentful-to-zod --out ./src/generated/contentful.schemas.ts --snapshot ./content-types.json --snapshot-locales ./locales.json"
  }
}

Fetch from your space and generate:

pnpm run contentful:schema

For a one-off run, you can also use npx:

npx @dotenvx/dotenvx run -- npx @xndrjs/contentful-to-zod \
  --out ./src/generated/contentful.schemas.ts \
  --snapshot ./content-types.json \
  --snapshot-locales ./locales.json

No runtime dependency on @xndrjs/contentful-to-zod in production — only the generated file and zod.
