Specification

L10N.md

A convention for repository-resident translation context

Status: Stable
Version: 1
Updated: 2026-05-25
Source: github.com/glossia/l10n
License: MIT

Abstract

L10N.md is a convention by which translators provide the context that AI-based translation systems need before they translate strings. It keeps the machine-readable surface small in YAML frontmatter and pushes the subtle guidance into ordinary Markdown prose. Documents live in the repository, are reviewed in pull requests, and are validated against versioned JSON Schemas. This document specifies the document model, the version 1 schema, and conformance requirements.

Status of this document

This is version 1 of the L10N.md standard and is considered stable. Sections are normative unless explicitly marked Informative. A future version may add fields or tighten rules; such changes require a version increment and a new schema directory and will not silently change the editing model defined here. A future version is also expected to introduce extension capabilities that let repository-scoped context be extended with remotely-provided context. Copyright (c) 2026 Glossia; released under the MIT license recorded in the source repository.


1 Introduction

AI-based translation systems produce better results when they are given context before they begin: the source language, the intended tone, brand terminology, and any constraints that do not fit neatly into key and value pairs. L10N.md is the convention by which translators record that context as files that live next to the code they describe, so the systems that translate the strings can read it.

The convention works because it keeps the machine-readable surface tiny and pushes the judgment into prose. Frontmatter carries structure. Markdown carries nuance. Additional files appear only when the translation surface actually splits. The same shape holds whether a project has one document or many.

2 Conventions and Terminology

2.1 Requirements Notation

The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, NOT RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2.2 Terminology

Repository
The version-controlled project in which L10N documents reside.
Document
A Markdown file that participates in this standard, classified by its repository path.
Frontmatter
An optional leading YAML mapping delimited by lines containing only ---.
Body
The Markdown content following the frontmatter; the prose that carries guidance.
Root document
The repository-wide L10N.md, also called the global document.
Scoped document
A non-root L10N.md that adds workflow settings to a subtree.
Locale overlay
A per-language file stored under an L10N/ directory.
Source language
The language strings are authored in, declared by the root document.
Locale identifier
A language tag as defined by [BCP47], with an underscore (_) permitted in place of a hyphen as the subtag separator. See Section 4.5. Examples: en, es, zh-Hant, zh_Hant.
Validation command
An array of non-empty strings carried by the validation field. The translation tool MUST execute it as a command and arguments against translated output, treating exit code 0 as pass and any non-zero exit as failure. Before execution the tool MUST set these environment variables:
  • L10N_SOURCE_PATH — absolute path to the source file
  • L10N_TARGET_PATH — absolute path to the translated output file
  • L10N_LOCALE — the target locale identifier (e.g. es, ja)
  • L10N_DOC_PATH — absolute path to the L10N.md document controlling the scope
Implementations MAY provide additional environment variables. Validation commands SHOULD write diagnostics to stderr; stdout MAY be captured as structured output by the tool.

3 Document Model

There are three document shapes. They share one mental model: state structure in frontmatter, explain judgment in prose, and add files only when scope demands it. The normative contract for each shape is the corresponding JSON Schema in Section 4.

3.1 General Requirements

An L10N document MUST be a Markdown file. It MAY begin with a frontmatter block; the content after the frontmatter is the body. Either side MAY be empty: the body MAY be empty when the frontmatter is sufficient on its own, and the frontmatter MAY be absent when the document only adds body context. Each role's schema specifies which keys, if any, remain required.
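The split between frontmatter and body can be sketched as follows. This is a minimal illustration, not part of the standard: the function name `split_document` is hypothetical, the frontmatter is returned as unparsed YAML text, and treating an unterminated block as plain body is an assumption this sketch makes.

```python
from typing import Optional, Tuple


def split_document(text: str) -> Tuple[Optional[str], str]:
    """Split an L10N document into (frontmatter_text, body).

    frontmatter_text is None when the document has no frontmatter block.
    """
    lines = text.splitlines(keepends=True)
    # Frontmatter, when present, starts on the first line with a lone '---'.
    if not lines or lines[0].rstrip("\r\n") != "---":
        return None, text
    for index in range(1, len(lines)):
        if lines[index].rstrip("\r\n") == "---":
            frontmatter = "".join(lines[1:index])
            body = "".join(lines[index + 1:])
            return frontmatter, body
    # Assumption: with no closing delimiter, the whole text is body.
    return None, text
```

Either half may come back empty, which matches the rule that the body MAY be empty and the frontmatter MAY be absent.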

A document's role MUST be determined by its repository path, and the document MUST validate against the version 1 entry schema (Section 4.1), which requires it to satisfy exactly one of the three document schemas.
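As an illustration of path-based classification, the sketch below applies the path rules from the schema tables in Section 4. The function names are hypothetical, and the sketch assumes paths are relative to the repository root and use forward slashes.

```python
import re
from typing import List, Optional

# Path rules as given in the Section 4 schema tables.
ROOT_PATH = "L10N.md"
SCOPED_PATTERN = re.compile(r"^(?!L10N\.md$).+/L10N\.md$")
OVERLAY_PATTERN = re.compile(r"^(?:.+/)?L10N/[A-Za-z0-9_-]+\.md$")


def matching_roles(path: str) -> List[str]:
    """Return every document role whose path rule accepts the path."""
    roles = []
    if path == ROOT_PATH:
        roles.append("root")
    if SCOPED_PATTERN.match(path):
        roles.append("scoped")
    if OVERLAY_PATTERN.match(path):
        roles.append("overlay")
    return roles


def classify(path: str) -> Optional[str]:
    """Return the single role for a path, or None when zero or several match."""
    roles = matching_roles(path)
    return roles[0] if len(roles) == 1 else None
```

For example, `classify("examples/app/L10N/es.md")` returns `"overlay"` and `classify("README.md")` returns `None`. A path such as `L10N/L10N.md` is accepted by both the scoped and the overlay pattern, so this sketch reports it as unclassified rather than picking one.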

Implementations MUST ignore frontmatter keys they do not understand and MUST NOT reject a document solely because such keys are present.

3.2 Root Document

A repository MAY contain a root document. When present, its path MUST be exactly L10N.md at the repository root. It MUST declare source_language as a locale identifier, and MAY provide a body giving repository-wide guidance such as tone, brand terminology, and formatting rules. A root document MAY also declare the same workflow frontmatter as a scoped document (validation, sources, targets); this lets a small repository describe both its language and its translation workflow in a single file without introducing nested scopes. A conforming root document satisfies the schema in Section 4.2; a minimal example appears in Appendix A.1 and a single-file repository example in Appendix A.2.

3.3 Scoped Document

A scoped document is an L10N.md located in a directory other than the repository root. Its path MUST match ^(?!L10N\.md$).+/L10N\.md$. A scoped document MAY declare workflow frontmatter: validation (an array of non-empty strings), sources (a mapping of one or more source patterns to target path templates), and targets (an array of one or more locale identifiers). When any of these are declared, they configure how a translation tool processes the scope. When none are declared, the document acts purely as a context override for its subtree. The body MAY be empty when the workflow frontmatter alone captures the scope's intent. The validation array, when present, is a command and arguments that the translation tool executes against translated output (see Section 2.2). A conforming scoped document satisfies the schema in Section 4.3; a complete example appears in Appendix A.3.
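To make the relationship between sources and targets concrete, here is a small sketch that fills the {locale} placeholder of each target path template for every declared target. The function name `plan_targets` is hypothetical. The sketch deliberately leaves wildcards such as * untouched, because how a tool expands them is not defined in this section.

```python
from typing import Dict, List


def plan_targets(sources: Dict[str, str], targets: List[str]) -> Dict[str, Dict[str, str]]:
    """Map each source pattern to its per-locale target path templates.

    Only the {locale} placeholder is substituted; glob characters in the
    pattern and template are passed through unchanged.
    """
    return {
        pattern: {locale: template.replace("{locale}", locale) for locale in targets}
        for pattern, template in sources.items()
    }


if __name__ == "__main__":
    plan = plan_targets(
        {"priv/gettext/*.pot": "priv/gettext/{locale}/LC_MESSAGES"},
        ["es", "ja", "zh_Hant"],
    )
    for pattern, per_locale in plan.items():
        for locale, target in per_locale.items():
            print(pattern, locale, target)
```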

3.4 Locale Overlay

A locale overlay is a Markdown file whose path MUST match ^(?:.+/)?L10N/[A-Za-z0-9_-]+\.md$. It MAY provide a body and MAY declare locale as a locale identifier; when locale is omitted, the locale SHOULD be inferred from the file name. An overlay refines guidance for a single language. It MAY also declare validation; no other frontmatter keys are permitted. When a locale overlay declares validation, that command replaces the scope-level or root-level validation for that language only. A conforming overlay satisfies the schema in Section 4.4; complete examples appear in Appendix A.4 and Appendix A.5.
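Inferring an overlay's locale from its file name might look like the sketch below. It assumes that "file name" means the name without its .md extension, and it treats a declared locale that differs from the file name as an error, following the rule in the Section 4.4 table. The function name `overlay_locale` is hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional


def overlay_locale(path: str, declared: Optional[str] = None) -> str:
    """Return the locale for an overlay at the given repository path.

    Assumption: the locale is the file name with the .md extension removed.
    """
    inferred = PurePosixPath(path).stem
    if declared is None:
        return inferred
    if declared != inferred:
        raise ValueError(
            f"locale {declared!r} does not match file name {inferred!r} in {path}"
        )
    return declared
```

With the examples from Appendix A, `overlay_locale("examples/app/L10N/ja.md")` yields `"ja"` even though that overlay has no frontmatter.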

4 Document Schema, Version 1

The version 1 contract is defined by the JSON Schema files in schemas/v1/. Those files are the machine-readable, authoritative form; the tables below state the same rules so they can be read without parsing JSON Schema. Where a table and its linked file ever disagree, the file wins.

In every table, path is the document's location in the repository (supplied by the validator, not written in the file) and body is the Markdown content after the frontmatter; the remaining properties are the document's YAML frontmatter keys. Frontmatter keys not listed here are ignored by conforming implementations.

4.1 Entry Point

A document conforms to this standard when it satisfies exactly one of the three document schemas that follow: the global document (Section 4.2), the scoped document (Section 4.3), or the locale overlay (Section 4.4). The entry-point schema expresses this as a oneOf over those three.

Canonical schema: schemas/v1/l10n-document.schema.json

4.2 Global Document Schema

The repository root L10N.md. Workflow fields are optional; including them lets a single-file repository describe its language and its translation workflow in one place.

| Property | Required | Type | Rule |
| --- | --- | --- | --- |
| path | yes | string | exactly L10N.md |
| source_language | yes | string | a locale identifier (Section 4.5) |
| validation | no | array of string | the command and its arguments; each element non-empty. See Section 2.2. |
| sources | no | object | one or more entries; keys are non-empty source path patterns, values are non-empty target path templates (typically containing {locale}) |
| targets | no | array of string | one or more locale identifiers; duplicates are not permitted |
| body | yes | string | Markdown prose; may be empty |

Canonical schema: schemas/v1/global-document.schema.json
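The rules in the table above can be approximated by hand, as in the sketch below. It is only an illustration: the JSON Schema file remains authoritative, the function name `check_global_frontmatter` is hypothetical, and the input is assumed to be the frontmatter already parsed into a Python mapping. Keys not listed in the table are ignored.

```python
import re
from typing import Any, List, Mapping

# Locale identifier approximation from Section 4.5.
LOCALE = re.compile(r"^[A-Za-z]{2,3}(?:[-_][A-Za-z0-9]{2,8})*$")


def _is_non_empty_string(value: Any) -> bool:
    return isinstance(value, str) and len(value) > 0


def _is_locale(value: Any) -> bool:
    return isinstance(value, str) and LOCALE.fullmatch(value) is not None


def check_global_frontmatter(fm: Mapping[str, Any]) -> List[str]:
    """Return a list of rule violations; an empty list means no violations found."""
    errors = []
    if not _is_locale(fm.get("source_language")):
        errors.append("source_language: required, must be a locale identifier")
    if "validation" in fm:
        value = fm["validation"]
        if not (isinstance(value, list) and value
                and all(_is_non_empty_string(item) for item in value)):
            errors.append("validation: must be a non-empty array of non-empty strings")
    if "sources" in fm:
        value = fm["sources"]
        if not (isinstance(value, dict) and value
                and all(_is_non_empty_string(k) and _is_non_empty_string(v)
                        for k, v in value.items())):
            errors.append("sources: must map non-empty patterns to non-empty templates")
    if "targets" in fm:
        value = fm["targets"]
        if not (isinstance(value, list) and value
                and all(_is_locale(item) for item in value)
                and len(set(value)) == len(value)):
            errors.append("targets: must be unique locale identifiers, at least one")
    return errors
```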

4.3 Scoped Document Schema

A non-root L10N.md. Workflow fields are optional; when omitted, the document acts as a context override for its subtree.

| Property | Required | Type | Rule |
| --- | --- | --- | --- |
| path | yes | string | matches ^(?!L10N\.md$).+/L10N\.md$ |
| validation | no | array of string | the command and its arguments; each element non-empty. See Section 2.2. |
| sources | no | object | one or more entries; keys are non-empty source path patterns, values are non-empty target path templates (typically containing {locale}) |
| targets | no | array of string | one or more locale identifiers; duplicates are not permitted |
| body | yes | string | Markdown prose; may be empty |

Canonical schema: schemas/v1/scoped-document.schema.json

4.4 Locale Overlay Schema

A per-language file under an L10N/ directory.

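| Property | Required | Type | Rule |
| --- | --- | --- | --- |
| path | yes | string | matches ^(?:.+/)?L10N/[A-Za-z0-9_-]+\.md$ |
| locale | no | string | a locale identifier; when present, equals the file name |
| validation | no | array of string | the command and its arguments; each element non-empty. See Section 2.2. |
| body | yes | string | Markdown prose; may be empty |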

Canonical schema: schemas/v1/locale-overlay.schema.json

4.5 Shared Definitions

Reusable definitions referenced by the schemas above. A locale identifier in this standard is a language tag as defined by [BCP47], with one accommodation: an underscore (_) MAY be used in place of a hyphen as the subtag separator, so that identifiers can serve as filenames and directory names on systems and tools that disallow hyphens. The localeIdentifier pattern below is a syntactic approximation that accepts well-formed [BCP47] tags and their underscore-separated equivalents; consuming implementations SHOULD additionally validate against [BCP47] when stricter checking is required.

| Definition | Type | Rule |
| --- | --- | --- |
| localeIdentifier | string | a language tag per [BCP47], with _ permitted as a subtag separator; approximated by ^[A-Za-z]{2,3}(?:[-_][A-Za-z0-9]{2,8})*$. Examples: en, es, ja, zh-Hant, zh_Hant. |
| nonEmptyString | string | at least one character |
| nonEmptyStringArray | array | each element is a nonEmptyString; minimum one element |
| markdownBody | string | any string, including empty |

Canonical schema: schemas/v1/shared.schema.json
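The localeIdentifier approximation can be exercised with a few lines of Python. This is a sketch of the syntactic check only; the helper names `is_locale_identifier` and `to_hyphenated` are hypothetical, and stricter [BCP47] validation is out of scope here.

```python
import re

# Pattern copied from the localeIdentifier definition above.
LOCALE_IDENTIFIER = re.compile(r"^[A-Za-z]{2,3}(?:[-_][A-Za-z0-9]{2,8})*$")


def is_locale_identifier(value: str) -> bool:
    """Check a string against the syntactic approximation."""
    return LOCALE_IDENTIFIER.fullmatch(value) is not None


def to_hyphenated(value: str) -> str:
    """Convert the underscore form to the hyphen form used by BCP 47."""
    return value.replace("_", "-")


if __name__ == "__main__":
    for candidate in ["en", "es", "ja", "zh-Hant", "zh_Hant", "english", "zh-"]:
        print(candidate, is_locale_identifier(candidate))
```

Because every subtag after the first must be two to eight characters long, tags that contain a single-character subtag, such as the extension singleton in en-u-ca-gregory, fall outside this approximation.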

5 Validation Workflow

This section is informative.

An L10N document MAY declare validation in its frontmatter to specify a command the translation tool runs after translation is complete. The typical use is for developers to provide a script that checks the LLM-generated translation against project-specific rules (length limits, glossary compliance, balance constraints, formatting invariants) before the output is accepted.

The validation value MUST be an array of non-empty strings. The first element is the command; subsequent elements are its arguments. The shell is not involved: the array elements map directly to OS-level exec arguments. Unless it documents a different default, the translation tool MUST execute the command with its working directory set to the directory containing the L10N.md document that declared the validation.

Before execution, the tool MUST set the following environment variables:

  • L10N_SOURCE_PATH — absolute path to the source file
  • L10N_TARGET_PATH — absolute path to the translated output file
  • L10N_LOCALE — the target locale identifier (e.g. es, ja)
  • L10N_DOC_PATH — absolute path to the L10N.md document whose validation triggered the command

Implementations MAY provide additional environment variables. The command MUST exit with code 0 to signal that the translation is acceptable; any non-zero exit MUST be treated as a failure. The tool SHOULD pass the command's stderr back to the translation agent so it can incorporate the diagnostics into a retry, and MAY capture stdout for structured reporting.
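One way a tool might run a validation command is sketched below, using Python's standard subprocess module. The function name `run_validation` and its return shape are hypothetical; the environment variable names, the working directory, and the exit-code convention come from the text above.

```python
import os
import subprocess
from typing import List, Tuple


def run_validation(
    command: List[str],
    source_path: str,
    target_path: str,
    locale: str,
    doc_path: str,
) -> Tuple[bool, str, str]:
    """Run a validation command and return (passed, stdout, stderr)."""
    doc_abs = os.path.abspath(doc_path)
    env = dict(os.environ)
    env.update(
        {
            "L10N_SOURCE_PATH": os.path.abspath(source_path),
            "L10N_TARGET_PATH": os.path.abspath(target_path),
            "L10N_LOCALE": locale,
            "L10N_DOC_PATH": doc_abs,
        }
    )
    result = subprocess.run(
        command,  # a list, so no shell is involved
        cwd=os.path.dirname(doc_abs),  # directory of the declaring document
        env=env,
        capture_output=True,
        text=True,
    )
    # Exit code 0 means pass; any non-zero exit is a failure.
    return result.returncode == 0, result.stdout, result.stderr
```

A caller could hand the returned stderr to the translation agent for a retry and keep stdout for structured reporting.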

Validation commands compose through replacement: a scoped document's validation replaces the root-level command for that scope's files, and a locale overlay's validation replaces the scoped (or root) command for that language only. When no validation is declared at any level, the translation tool MAY apply built-in checks inferred from file type (e.g. Markdown parse validity, gettext .mo compilation).
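Composition through replacement amounts to picking the most specific declared command. A minimal sketch, assuming each level's validation has already been read into a list or is None when undeclared (the function name `effective_validation` is hypothetical):

```python
from typing import List, Optional


def effective_validation(
    root: Optional[List[str]],
    scoped: Optional[List[str]],
    overlay: Optional[List[str]],
) -> Optional[List[str]]:
    """Return the command that applies: overlay, else scoped, else root."""
    for candidate in (overlay, scoped, root):
        if candidate is not None:
            return candidate
    # Nothing declared at any level; a tool may fall back to built-in checks.
    return None
```

With the Appendix A examples, Spanish files under examples/app would be checked by `./scripts/check-es.sh --max-length 120`, while Japanese files would fall back to the scoped `./scripts/validate-gettext.sh --strict`, since the Japanese overlay declares no validation.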

6 Conformance

A conforming document is a Markdown file whose path matches one of the document patterns in Section 3 and that validates against the version 1 entry schema (Section 4.1).

A conforming repository MUST contain at most one root document, located at L10N.md, and every file whose path matches a document pattern MUST be a conforming document. A validating tool MUST treat a schema validation failure as an error and SHOULD report the offending path and the failing constraint.

7 References

7.1 Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, May 2017.
[BCP47]
Phillips, A., and M. Davis, "Tags for Identifying Languages", BCP 47, RFC 5646, September 2009.
[JSON-SCHEMA]
Wright, A., et al., "JSON Schema: A Media Type for Describing JSON Documents", Draft 2020-12.
[YAML]
Ben-Kiki, O., et al., "YAML Ain't Markup Language (YAML) Version 1.2".
[COMMONMARK]
MacFarlane, J., "CommonMark Spec".

7.2 Informative References

[MISE]
"mise-en-place", https://mise.jdx.dev/.
[AUBE]
"aube, a fast Node.js package manager", https://aube.en.dev/.
[ELEVENTY]
"Eleventy", https://www.11ty.dev/.

A Examples

This appendix is informative.

A.1 Minimal Root Document

Listing 2. Canonical root L10N.md
---
source_language: "en"
sources:
  "site/_content/v1/en/*.md": "site/_content/v1/{locale}/*.md"
targets:
  - es
  - ja
---

# Global Translation Context

## Purpose
- This repository defines a lightweight `L10N.md` convention by which translators provide the context that AI-based translation systems need before they translate strings.
- The format is intentionally simple: YAML frontmatter for machine-readable settings, followed by Markdown prose for nuance that does not fit neatly into key-value pairs.
- This repository dog-foods the convention: the L10N.md specification text itself is split into per-section Markdown files under `site/_content/v1/en/`, and is translated into Spanish and Japanese using the same workflow.

## Structure
- Use a root `L10N.md` for repository-wide guidance.
- Add scoped `L10N.md` files inside product or package directories when a subset of strings needs extra instructions.
- Add locale overlays in `L10N/<locale>.md` when one language needs special terminology, grammar, or cultural notes.

## Brand & Terminology
- "Glossia" is a proper noun and should never be translated.
- Keep product names, command names, file names, and API identifiers exactly as written unless a scoped document says otherwise.
- Treat `L10N.md` as the canonical name of the convention in prose, code comments, and documentation.
- Keep all-caps BCP 14 keywords (`MUST`, `MUST NOT`, `SHOULD`, `MAY`, etc.) in English. They are normative tokens and must remain recognizable across locales.
- Keep field names (`source_language`, `sources`, `targets`, `validation`, `locale`, `path`, `body`), file names (`L10N.md`), and JSON Schema identifiers untranslated.

## Formatting Rules
- Preserve Markdown structure unless the target format requires a documented transformation.
- Preserve placeholders, interpolation markers, code fences, URLs, email addresses, and file paths exactly as they appear in the source.
- Preserve inline HTML in spec content (`<span class="kw">`, `<code>`, `<a href="...">`, table markup). Translate only the visible text.
- Preserve heading attribute syntax `{:#id:}` exactly. The identifier is the anchor used by cross-references and must not change across locales.
- Do not invent punctuation, headings, or examples that were not present in the source string or in the surrounding context.
- Do not use em dashes. Prefer a hyphen or rewrite the sentence.

## Tone
- Documentation should be precise, direct, and easy to scan.
- Spec prose should read as a formal technical specification, not as marketing copy.
- Interface copy should be concise and action-oriented.
- Error messages should be calm, concrete, and non-blaming.

## Authoring Notes
- Keep frontmatter small and stable.
- Put project-specific nuance in headings and bullet lists where humans can maintain it comfortably.
- Reach for scoped files before the root file becomes a dumping ground for unrelated product details.
- When translating a section file, keep the frontmatter `id`, `kind`, and `number` fields exactly as the source uses them; translate only `title` and `label`.

A.2 Single-File Repository Root

A root document that also carries workflow frontmatter, so the repository needs no nested scopes.

Listing 3. examples/single-repo/L10N.md
---
source_language: "en"
validation:
  - "./scripts/validate-gettext.sh"
  - "--strict"
sources:
  "priv/gettext/*.pot": "priv/gettext/{locale}/LC_MESSAGES"
targets:
  - es
  - ja
  - zh_Hant
---

# Project Translation Context

This repository holds a single small product. There is no need for nested
scopes; the root document defines both the repository-wide voice and the
translation workflow.

## Voice
- Friendly and direct. Prefer plain words over jargon.
- Use the second person ("you") for user-facing copy.
- Avoid exclamation marks outside of celebratory states.

## Product Language
- "Workspace" refers to the team's shared area.
- "Run" is the unit of execution; do not translate to a localized verb form.

A.3 Scoped Document

Listing 4. examples/app/L10N.md
---
validation:
  - "./scripts/validate-gettext.sh"
  - "--strict"
sources:
  "priv/gettext/*.pot": "priv/gettext/{locale}/LC_MESSAGES"
targets:
  - es
  - ja
  - zh_Hant
---

# App Translation Context

The app includes marketing copy, onboarding, and a signed-in dashboard for localization managers.

## Translation Domains
- `marketing`: Public-facing copy. Tone should feel clear, credible, and inviting.
- `onboarding`: New user setup. Keep copy short and confidence-building.
- `dashboard`: Logged-in interface. Prefer compact labels and direct verbs.
- `errors`: Recovery-oriented error messages. Explain what happened and what to do next.

## Product Language
- "Workspace" refers to the translator's team space and should remain a singular concept.
- "String set" refers to a grouped collection of related translations.
- "Ship review" means the final approval pass before a locale is published.

A.4 Spanish Locale Overlay

Listing 5. examples/app/L10N/es.md
---
locale: es
validation:
  - "./scripts/check-es.sh"
  - "--max-length"
  - "120"
---

## Spanish-specific translation rules

- Translate "workspace" as "espacio de trabajo".
- Translate "string set" as "conjunto de cadenas".
- Keep "ship review" in English when it appears as a feature label.

A.5 Japanese Locale Overlay

Listing 6. examples/app/L10N/ja.md
## Japanese-specific translation rules

- Prefer a neutral and professional register over playful copy.
- Keep button labels compact so they fit narrow mobile layouts.
- Leave "ship review" untranslated when it is used as the name of a release workflow.