Specification
L10N.md
A convention for repository-resident translation context
Abstract
L10N.md is a convention by which translators provide the context that AI-based translation systems need before they translate strings. It keeps the machine-readable surface small in YAML frontmatter and pushes the subtle guidance into ordinary Markdown prose. Documents live in the repository, are reviewed in pull requests, and are validated against versioned JSON Schemas. This document specifies the document model, the version 1 schema, and conformance requirements.
Status of this document
This is version 1 of the L10N.md standard and is considered stable. Sections are normative unless explicitly marked Informative. A future version may add fields or tighten rules; such changes require a version increment and a new schema directory and will not silently change the editing model defined here. A future version is also expected to introduce extension capabilities that let repository-scoped context be extended with remotely-provided context. Copyright (c) 2026 Glossia; released under the MIT license recorded in the source repository.
1 Introduction
AI-based translation systems produce better results when they are given context before they begin: the source language, the intended tone, brand terminology, and any constraints that do not fit neatly into key and value pairs. L10N.md is the convention by which translators record that context as files that live next to the code they describe, so the systems that translate the strings can read it.
The convention works because it keeps the machine-readable surface tiny and pushes the judgment into prose. Frontmatter carries structure. Markdown carries nuance. Additional files appear only when the translation surface actually splits. The same shape holds whether a project has one document or many.
2 Conventions and Terminology
2.1 Requirements Notation
The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, NOT RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
2.2 Terminology
- Repository
- The version-controlled project in which L10N documents reside.
- Document
- A Markdown file that participates in this standard, classified by its repository path.
- Frontmatter
- An optional leading YAML mapping delimited by lines containing only ---.
- Body
- The Markdown content following the frontmatter; the prose that carries guidance.
- Root document
- The repository-wide L10N.md, also called the global document.
- Scoped document
- A non-root L10N.md that adds workflow settings to a subtree.
- Locale overlay
- A per-language file stored under an L10N/ directory.
- Source language
- The language strings are authored in, declared by the root document.
- Locale identifier
- A language tag as defined by [BCP47], with an underscore (_) permitted in place of a hyphen as the subtag separator. See Section 4.5. Examples: en, es, zh-Hant, zh_Hant.
- Validation command
- An array of non-empty strings carried by the validation field. The translation tool MUST execute it as a command and arguments against translated output, treating exit code 0 as pass and any non-zero exit as failure. Before execution the tool MUST set these environment variables:
  - L10N_SOURCE_PATH: absolute path to the source file
  - L10N_TARGET_PATH: absolute path to the translated output file
  - L10N_LOCALE: the target locale identifier (e.g. es, ja)
  - L10N_DOC_PATH: absolute path to the L10N.md document controlling the scope
3 Document Model
There are three document shapes. They share one mental model: state structure in frontmatter, explain judgment in prose, and add files only when scope demands it. The normative contract for each shape is the corresponding JSON Schema in Section 4.
3.1 General Requirements
An L10N document MUST be a Markdown file. It MAY begin with a frontmatter block; the content after the frontmatter is the body. Either side MAY be empty: the body MAY be empty when the frontmatter is sufficient on its own, and the frontmatter MAY be absent when the document only adds body context. Each role's schema specifies which keys, if any, remain required.
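The split between frontmatter and body described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `split_frontmatter` is hypothetical, the sketch assumes the opening `---` delimiter is the first line of the file, and it returns the raw YAML text unparsed (parsing is left to a YAML library).

```python
from typing import Optional, Tuple


def split_frontmatter(text: str) -> Tuple[Optional[str], str]:
    """Return (frontmatter_yaml, body); frontmatter is None when absent."""
    lines = text.splitlines(keepends=True)
    # Frontmatter is present only when the very first line is exactly '---'.
    if not lines or lines[0].strip("\r\n") != "---":
        return None, text
    for i in range(1, len(lines)):
        if lines[i].strip("\r\n") == "---":
            # Everything between the delimiters is YAML; the rest is the body.
            return "".join(lines[1:i]), "".join(lines[i + 1:])
    # Assumption of this sketch: an unterminated block is treated as body text.
    return None, text
```

Either side may come back empty: a document with only frontmatter yields an empty body, and a document without a leading delimiter yields `None` for the frontmatter.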
A document's role MUST be determined by its repository path, and the document MUST validate against the version 1 entry schema (Section 4.1), which requires it to satisfy exactly one of the three document schemas.
Implementations MUST ignore frontmatter keys they do not understand and MUST NOT reject a document solely because such keys are present.
3.2 Root Document
A repository MAY contain a root document. When present, its path
MUST be exactly L10N.md at the repository root. It
MUST declare source_language as a locale identifier,
and MAY provide a body giving repository-wide guidance such as
tone, brand terminology, and formatting rules. A root document
MAY also declare the same workflow frontmatter as a scoped
document (validation, sources, targets); this
lets a small repository describe both its language and its translation workflow in a
single file without introducing nested scopes. A conforming
root document satisfies the schema in Section 4.2; a minimal example
appears in Appendix A.1 and a single-file repository example in
Appendix A.2.
3.3 Scoped Document
A scoped document is an L10N.md located in a directory other than the
repository root. Its path MUST match
^(?!L10N\.md$).+/L10N\.md$. A scoped document MAY
declare workflow frontmatter: validation (an array of non-empty strings),
sources (a mapping of one or more source patterns to target path
templates), and targets (an array of one or more locale identifiers).
When any of these are declared, they configure how a
translation tool processes the scope. When none are declared, the document acts purely as
a context override for its subtree. The body MAY be empty when
the workflow frontmatter alone captures the scope's intent. The validation
array, when present, is a command and arguments that the translation tool executes
against translated output (see Section 2.2). A conforming
scoped document satisfies the schema in Section 4.3; a complete example
appears in Appendix A.3.
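As an illustration of how sources and targets combine, the sketch below pairs every source pattern with every target locale and substitutes the {locale} placeholder in the target path template. The function name `expand_targets` is hypothetical, and the sketch deliberately leaves wildcards such as `*` untouched, since how a tool maps matched source files onto target paths is not described here.

```python
from typing import Dict, List, Tuple


def expand_targets(
    sources: Dict[str, str], targets: List[str]
) -> List[Tuple[str, str, str]]:
    """Return (source_pattern, locale, target_path) for every pattern/locale pair."""
    plan = []
    for pattern, template in sources.items():
        for locale in targets:
            # Only the {locale} placeholder is substituted in this sketch.
            plan.append((pattern, locale, template.replace("{locale}", locale)))
    return plan


plan = expand_targets(
    {"priv/gettext/*.pot": "priv/gettext/{locale}/LC_MESSAGES"},
    ["es", "ja", "zh_Hant"],
)
for pattern, locale, target in plan:
    print(f"{pattern} -> [{locale}] {target}")
```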
3.4 Locale Overlay
A locale overlay is a Markdown file whose path MUST match
^(?:.+/)?L10N/[A-Za-z0-9_-]+\.md$. It MAY provide
a body and MAY declare locale as a locale
identifier; when locale is omitted, the locale
SHOULD be inferred from the file name. An overlay refines
guidance for a single language. It MAY also declare
validation; no other frontmatter keys are permitted. When a locale overlay
declares validation, that command replaces the scope-level or root-level
validation for that language only. A conforming overlay satisfies the schema in
Section 4.4; complete examples appear in Appendix A.4
and Appendix A.5.
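Inferring an overlay's locale from its file name might look like the sketch below. The function name `overlay_locale` is hypothetical, and the sketch assumes "the file name" means the name without its .md extension. It also applies the Section 4.4 rule that a declared locale equals the file name.

```python
from pathlib import PurePosixPath
from typing import Optional


def overlay_locale(path: str, declared: Optional[str] = None) -> str:
    """Return the overlay's locale, inferring it from the file name when not declared."""
    inferred = PurePosixPath(path).stem  # 'examples/app/L10N/es.md' -> 'es'
    if declared is None:
        return inferred
    # A declared locale must equal the file name (Section 4.4).
    if declared != inferred:
        raise ValueError(f"locale {declared!r} does not match file name {inferred!r}")
    return declared
```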
4 Document Schema, Version 1
The version 1 contract is defined by the JSON Schema files in schemas/v1/.
Those files are the machine-readable, authoritative form; the tables below state the same
rules so they can be read without parsing JSON Schema. Where a table and its linked file
ever disagree, the file wins.
In every table, path is the document's location in the repository (supplied
by the validator, not written in the file) and body is the Markdown content
after the frontmatter; the remaining properties are the document's YAML frontmatter keys.
Frontmatter keys not listed here are ignored by conforming implementations.
4.1 Entry Point
A document conforms to this standard when it satisfies exactly one of the three document
schemas that follow: the global document (Section 4.2), the scoped
document (Section 4.3), or the locale overlay
(Section 4.4). The entry-point schema expresses this as a
oneOf over those three.
Canonical schema: schemas/v1/l10n-document.schema.json
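The path half of the "exactly one" rule can be sketched as below, using the path patterns from Sections 4.2 to 4.4. This is only an approximation of the entry-point schema: the real oneOf also checks frontmatter properties, and the names `PATH_PATTERNS` and `classify` are hypothetical.

```python
import re
from typing import Optional

# Path rules copied from the tables in Sections 4.2, 4.3, and 4.4.
PATH_PATTERNS = {
    "global": re.compile(r"^L10N\.md$"),
    "scoped": re.compile(r"^(?!L10N\.md$).+/L10N\.md$"),
    "overlay": re.compile(r"^(?:.+/)?L10N/[A-Za-z0-9_-]+\.md$"),
}


def classify(path: str) -> Optional[str]:
    """Return the document role for a repository path, or None if not exactly one matches."""
    roles = [name for name, pattern in PATH_PATTERNS.items() if pattern.match(path)]
    return roles[0] if len(roles) == 1 else None


for candidate in ("L10N.md", "examples/app/L10N.md", "examples/app/L10N/es.md", "README.md"):
    print(candidate, "->", classify(candidate))
```

A path such as docs/L10N/L10N.md matches both the scoped and the overlay pattern, so this sketch reports no role for it rather than picking one.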
4.2 Global Document Schema
The repository root L10N.md. Workflow fields are optional; including them lets a single-file repository describe its language and its translation workflow in one place.
| Property | Required | Type | Rule |
|---|---|---|---|
| path | yes | string | exactly L10N.md |
| source_language | yes | string | a locale identifier (Section 4.5) |
| validation | no | array of string | the command and its arguments; each element non-empty. See Section 2.2. |
| sources | no | object | one or more entries; keys are non-empty source path patterns, values are non-empty target path templates (typically containing {locale}) |
| targets | no | array of string | one or more locale identifiers; duplicates are not permitted |
| body | yes | string | Markdown prose; may be empty |
Canonical schema: schemas/v1/global-document.schema.json
4.3 Scoped Document Schema
A non-root L10N.md. Workflow fields are optional; when omitted, the document acts as a context override for its subtree.
| Property | Required | Type | Rule |
|---|---|---|---|
| path | yes | string | matches ^(?!L10N\.md$).+/L10N\.md$ |
| validation | no | array of string | the command and its arguments; each element non-empty. See Section 2.2. |
| sources | no | object | one or more entries; keys are non-empty source path patterns, values are non-empty target path templates (typically containing {locale}) |
| targets | no | array of string | one or more locale identifiers; duplicates are not permitted |
| body | yes | string | Markdown prose; may be empty |
Canonical schema: schemas/v1/scoped-document.schema.json
4.4 Locale Overlay Schema
A per-language file under an L10N/ directory.
| Property | Required | Type | Rule |
|---|---|---|---|
| path | yes | string | matches ^(?:.+/)?L10N/[A-Za-z0-9_-]+\.md$ |
| locale | no | string | a locale identifier; when present, equals the file name |
| validation | no | array of string | the command and its arguments; each element non-empty. See Section 2.2. |
| body | yes | string | Markdown prose; may be empty |
Canonical schema: schemas/v1/locale-overlay.schema.json
4.5 Shared Definitions
Reusable definitions referenced by the schemas above. A locale identifier in this
standard is a language tag as defined by [BCP47], with one
accommodation: an underscore (_) MAY be used in
place of a hyphen as the subtag separator, so that identifiers can serve as filenames
and directory names on systems and tools that disallow hyphens. The
localeIdentifier pattern below is a syntactic approximation that accepts
well-formed [BCP47] tags and their underscore-separated equivalents; consuming
implementations SHOULD additionally validate against [BCP47]
when stricter checking is required.
| Definition | Type | Rule |
|---|---|---|
| localeIdentifier | string | a language tag per [BCP47], with _ permitted as a subtag separator; approximated by ^[A-Za-z]{2,3}(?:[-_][A-Za-z0-9]{2,8})*$. Examples: en, es, ja, zh-Hant, zh_Hant. |
| nonEmptyString | string | at least one character |
| nonEmptyStringArray | array | each element is a nonEmptyString; minimum one element |
| markdownBody | string | any string, including empty |
Canonical schema: schemas/v1/shared.schema.json
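The localeIdentifier approximation can be exercised directly. The sketch below copies the pattern from the table above; the names `LOCALE_IDENTIFIER` and `is_locale_identifier` are hypothetical.

```python
import re

# Pattern copied from the localeIdentifier definition in Section 4.5.
LOCALE_IDENTIFIER = re.compile(r"^[A-Za-z]{2,3}(?:[-_][A-Za-z0-9]{2,8})*$")


def is_locale_identifier(value: str) -> bool:
    """Check a string against the syntactic approximation only."""
    return LOCALE_IDENTIFIER.fullmatch(value) is not None


for tag in ("en", "zh-Hant", "zh_Hant", "english", "en-x-test"):
    print(tag, is_locale_identifier(tag))
```

Because every subtag after the first must be two to eight characters long, the approximation rejects tags that contain single-character subtags, such as the private-use tag en-x-test. Implementations that need such tags would have to validate against [BCP47] directly.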
5 Validation Workflow
This section is informative.
An L10N document MAY declare validation in its
frontmatter to specify a command the translation tool runs after translation is
complete. The typical use is for developers to provide a script that checks the
LLM-generated translation against project-specific rules (length limits, glossary
compliance, balance constraints, formatting invariants) before the output is accepted.
The validation value MUST be an array of non-empty
strings. The first element is the command; subsequent elements are its arguments. The
shell is not involved: the array elements map directly to OS-level
exec arguments. The translation tool MUST execute
the command with its working directory set to the directory containing the L10N.md
document that declared the validation, unless the tool documents a different
default.
Before execution, the tool MUST set the following environment variables:
- L10N_SOURCE_PATH: absolute path to the source file
- L10N_TARGET_PATH: absolute path to the translated output file
- L10N_LOCALE: the target locale identifier (e.g. es, ja)
- L10N_DOC_PATH: absolute path to the L10N.md document whose validation triggered the command
Implementations MAY provide additional environment variables. The command MUST exit with code 0 to signal that the translation is acceptable; any non-zero exit MUST be treated as a failure. The tool SHOULD pass the command's stderr back to the translation agent so it can incorporate the diagnostics into a retry, and MAY capture stdout for structured reporting.
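One way a tool might run a validation command is sketched below. It is an illustration under stated assumptions, not a prescribed implementation: the function name `run_validation` and its parameters are hypothetical, and the sketch uses the default working directory described above (the directory of the declaring document).

```python
import os
import subprocess
import sys
from typing import List, Tuple


def run_validation(
    command: List[str], source_path: str, target_path: str, locale: str, doc_path: str
) -> Tuple[bool, str]:
    """Run the validation command; return (passed, stderr)."""
    doc_path = os.path.abspath(doc_path)
    env = dict(os.environ)
    env.update({
        "L10N_SOURCE_PATH": os.path.abspath(source_path),
        "L10N_TARGET_PATH": os.path.abspath(target_path),
        "L10N_LOCALE": locale,
        "L10N_DOC_PATH": doc_path,
    })
    result = subprocess.run(
        command,                        # argument list, so no shell is involved
        cwd=os.path.dirname(doc_path),  # directory of the declaring document
        env=env,
        capture_output=True,
        text=True,
    )
    # Exit code 0 is a pass; anything else is a failure.
    return result.returncode == 0, result.stderr


if __name__ == "__main__":
    # Stand-in command for demonstration; a real project would point at its own script.
    ok, stderr = run_validation(
        [sys.executable, "-c", "import sys; sys.exit(0)"],
        "messages.pot", "es/messages.po", "es", "L10N.md",
    )
    print("pass" if ok else f"fail: {stderr}")
```

Returning stderr alongside the pass flag lets the caller hand the diagnostics back to the translation agent for a retry.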
Validation commands compose through replacement: a scoped document's
validation replaces the root-level command for that scope's files, and a
locale overlay's validation replaces the scoped (or root) command for that
language only. When no validation is declared at any level, the translation
tool MAY apply built-in checks inferred from file type (e.g.
Markdown parse validity, gettext .mo compilation).
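The replacement rule can be read as "most specific declaration wins". The sketch below is a minimal illustration: the function name `effective_validation` is hypothetical, and it assumes the caller has already located the applicable root, scoped, and overlay declarations for a given file and locale (passing None where a level declares nothing).

```python
from typing import List, Optional


def effective_validation(
    root: Optional[List[str]],
    scoped: Optional[List[str]],
    overlay: Optional[List[str]],
) -> Optional[List[str]]:
    """Return the command that applies, or None when no level declares one."""
    # Most specific first: overlay replaces scoped, scoped replaces root.
    for candidate in (overlay, scoped, root):
        if candidate is not None:
            return list(candidate)
    return None


root_cmd = ["./scripts/validate-gettext.sh", "--strict"]
es_cmd = ["./scripts/check-es.sh", "--max-length", "120"]
print(effective_validation(root_cmd, None, es_cmd))  # overlay command applies for es
print(effective_validation(root_cmd, None, None))    # root command applies otherwise
```

A None result corresponds to the case where the tool may fall back to its built-in checks.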
6 Conformance
A conforming document is a Markdown file whose path matches one of the document patterns in Section 3 and that validates against the version 1 entry schema (Section 4.1).
A conforming repository MUST contain at most one root
document, located at L10N.md, and every file whose path matches a document
pattern MUST be a conforming document. A validating tool
MUST treat a schema validation failure as an error and
SHOULD report the offending path and the failing constraint.
7 References
7.1 Normative References
- [RFC2119]
- Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
- [RFC8174]
- Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, May 2017.
- [BCP47]
- Phillips, A., and M. Davis, "Tags for Identifying Languages", BCP 47, RFC 5646, September 2009.
- [JSON-SCHEMA]
- Wright, A., et al., "JSON Schema: A Media Type for Describing JSON Documents", Draft 2020-12.
- [YAML]
- Ben-Kiki, O., et al., "YAML Ain't Markup Language (YAML) Version 1.2".
- [COMMONMARK]
- MacFarlane, J., "CommonMark Spec".
7.2 Informative References
- [MISE]
- "mise-en-place", https://mise.jdx.dev/.
- [AUBE]
- "aube, a fast Node.js package manager", https://aube.en.dev/.
- [ELEVENTY]
- "Eleventy", https://www.11ty.dev/.
A Examples
This appendix is informative.
A.1 Minimal Root Document
L10N.md
---
source_language: "en"
sources:
"site/_content/v1/en/*.md": "site/_content/v1/{locale}/*.md"
targets:
- es
- ja
---
# Global Translation Context
## Purpose
- This repository defines a lightweight `L10N.md` convention by which translators provide the context that AI-based translation systems need before they translate strings.
- The format is intentionally simple: YAML frontmatter for machine-readable settings, followed by Markdown prose for nuance that does not fit neatly into key-value pairs.
- This repository dog-foods the convention: the L10N.md specification text itself is split into per-section Markdown files under `site/_content/v1/en/`, and is translated into Spanish and Japanese using the same workflow.
## Structure
- Use a root `L10N.md` for repository-wide guidance.
- Add scoped `L10N.md` files inside product or package directories when a subset of strings needs extra instructions.
- Add locale overlays in `L10N/<locale>.md` when one language needs special terminology, grammar, or cultural notes.
## Brand & Terminology
- "Glossia" is a proper noun and should never be translated.
- Keep product names, command names, file names, and API identifiers exactly as written unless a scoped document says otherwise.
- Treat `L10N.md` as the canonical name of the convention in prose, code comments, and documentation.
- Keep all-caps BCP 14 keywords (`MUST`, `MUST NOT`, `SHOULD`, `MAY`, etc.) in English. They are normative tokens and must remain recognizable across locales.
- Keep field names (`source_language`, `sources`, `targets`, `validation`, `locale`, `path`, `body`), file names (`L10N.md`), and JSON Schema identifiers untranslated.
## Formatting Rules
- Preserve Markdown structure unless the target format requires a documented transformation.
- Preserve placeholders, interpolation markers, code fences, URLs, email addresses, and file paths exactly as they appear in the source.
- Preserve inline HTML in spec content (`<span class="kw">`, `<code>`, `<a href="...">`, table markup). Translate only the visible text.
- Preserve heading attribute syntax `{:#id:}` exactly. The identifier is the anchor used by cross-references and must not change across locales.
- Do not invent punctuation, headings, or examples that were not present in the source string or in the surrounding context.
- Do not use em dashes. Prefer a hyphen or rewrite the sentence.
## Tone
- Documentation should be precise, direct, and easy to scan.
- Spec prose should read as a formal technical specification, not as marketing copy.
- Interface copy should be concise and action-oriented.
- Error messages should be calm, concrete, and non-blaming.
## Authoring Notes
- Keep frontmatter small and stable.
- Put project-specific nuance in headings and bullet lists where humans can maintain it comfortably.
- Reach for scoped files before the root file becomes a dumping ground for unrelated product details.
- When translating a section file, keep the frontmatter `id`, `kind`, and `number` fields exactly as the source uses them; translate only `title` and `label`.
A.2 Single-File Repository Root
A root document that also carries workflow frontmatter, so the repository needs no nested scopes.
examples/single-repo/L10N.md
---
source_language: "en"
validation:
- "./scripts/validate-gettext.sh"
- "--strict"
sources:
"priv/gettext/*.pot": "priv/gettext/{locale}/LC_MESSAGES"
targets:
- es
- ja
- zh_Hant
---
# Project Translation Context
This repository holds a single small product. There is no need for nested
scopes; the root document defines both the repository-wide voice and the
translation workflow.
## Voice
- Friendly and direct. Prefer plain words over jargon.
- Use the second person ("you") for user-facing copy.
- Avoid exclamation marks outside of celebratory states.
## Product Language
- "Workspace" refers to the team's shared area.
- "Run" is the unit of execution; do not translate to a localized verb form.
A.3 Scoped Document
examples/app/L10N.md
---
validation:
- "./scripts/validate-gettext.sh"
- "--strict"
sources:
"priv/gettext/*.pot": "priv/gettext/{locale}/LC_MESSAGES"
targets:
- es
- ja
- zh_Hant
---
# App Translation Context
The app includes marketing copy, onboarding, and a signed-in dashboard for localization managers.
## Translation Domains
- `marketing`: Public-facing copy. Tone should feel clear, credible, and inviting.
- `onboarding`: New user setup. Keep copy short and confidence-building.
- `dashboard`: Logged-in interface. Prefer compact labels and direct verbs.
- `errors`: Recovery-oriented error messages. Explain what happened and what to do next.
## Product Language
- "Workspace" refers to the translator's team space and should remain a singular concept.
- "String set" refers to a grouped collection of related translations.
- "Ship review" means the final approval pass before a locale is published.
A.4 Spanish Locale Overlay
examples/app/L10N/es.md
---
locale: es
validation:
- "./scripts/check-es.sh"
- "--max-length"
- "120"
---
## Spanish-specific translation rules
- Translate "workspace" as "espacio de trabajo".
- Translate "string set" as "conjunto de cadenas".
- Keep "ship review" in English when it appears as a feature label.
A.5 Japanese Locale Overlay
examples/app/L10N/ja.md
## Japanese-specific translation rules
- Prefer a neutral and professional register over playful copy.
- Keep button labels compact so they fit narrow mobile layouts.
- Leave "ship review" untranslated when it is used as the name of a release workflow.