This page describes the very basics of YAML, as a quick reference.

It doesn’t take much to be able to use the YAML format, but subtleties and corner cases appear very quickly. A few gotchas are listed below, along with ways to avoid them.

YAML is a generic serialization format similar to JSON: it describes an arbitrary data structure in a “human-friendly” Unicode format. The meaning of the data structure is up to the program reading the file. YAML has a schema specification, though it looks like it’s rarely used (or used implicitly).

YAML specification: https://yaml.org/spec/1.2.2

Overview

YAML defines three types of structures:

  • Scalars: strings, possibly interpreted differently by the parser (e.g. as a number)
  • Sequences: ordered lists
  • Mappings: key (usually a string) to arbitrary value, including another sequence or mapping

Comments start with a hash: #

By default, YAML 1.2 parsers are supposed to follow the core schema, an extension of the JSON schema, which resolves a few scalars to non-string types: true, false, null, integers/floats. Everything else is treated as a string. Well, except when it’s not: parsers that still follow the older YAML 1.1 rules treat ‘yes’ the same as ‘true’ (see section Scalars below).

YAML files can contain multiple “documents” (i.e. distinct data structures). A triple dash (---) marks the start of a document and a triple dot (...) marks its end. Both markers are optional when the file holds a single document, so parsers (including Python’s) accept files without them.
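As a rough illustration of multi-document files, the sketch below assumes the third-party PyYAML package (imported as yaml), which is presumably the “Python parser” mentioned on this page. The document contents are made up for the example.

```python
import yaml  # PyYAML, assumed to be installed (pip install pyyaml)

# Two documents in one stream: "---" starts a document, "..." ends one.
stream = """\
---
name: first
...
---
name: second
"""

# safe_load_all yields one Python object per document.
docs = list(yaml.safe_load_all(stream))
print(docs)

# A single document without any marker is accepted too.
bare = yaml.safe_load("name: only\n")
print(bare)
```

Note that safe_load (as opposed to safe_load_all) expects exactly one document and complains if the stream holds more.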

Scalars

  • number literals: 1, 1.2, 0x12d4
  • boolean: true, false. The Python parser seems to accept yes, no, on, off, and also True, TRUE, Yes, YES, but not TRue or yeS (these are strings)
  • null/none: null, ~
  • strings: written between single or double quotes, and a lot of other weird ways (see dedicated section below)
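A quick way to check how a given parser interprets scalars is to load a list of them and look at the resulting types. The sketch below assumes PyYAML; other parsers, in particular YAML 1.2 ones, may resolve yes differently.

```python
import yaml  # PyYAML, assumed to be installed

text = """\
- 1
- 1.2
- 0x12d4
- true
- yes
- yeS
- null
- ~
- "1"
"""

values = yaml.safe_load(text)

# Print each value next to the Python type the parser picked for it.
for value in values:
    print(repr(value), type(value).__name__)

type_names = [type(value).__name__ for value in values]
```

With PyYAML, yes comes back as a bool while yeS and the quoted "1" stay strings.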

Mappings

aka dictionary, hash table.

Example YAML file:

key1: value1
key2: value2

Corresponding json:

{
    "key1": "value1",
    "key2": "value2"
}

The same can be written “inline”: {key1: value1, key2: value2} It doesn’t have to be on a single line, but splitting it is probably not a good idea.
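To check that the two notations really describe the same data, one can load both and compare the results. This is a minimal sketch assuming PyYAML; the variable names are only for illustration.

```python
import yaml  # PyYAML, assumed to be installed

# Block notation: one "key: value" pair per line.
block = yaml.safe_load("key1: value1\nkey2: value2\n")

# Inline (flow) notation: braces and commas, as in JSON.
flow = yaml.safe_load("{key1: value1, key2: value2}")

print(block)
print(block == flow)
```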

Sequences

aka lists

Example YAML file with a list:

- string 1
- 2
- 3 this is a string too

This is the same list of strings and numbers:

[string 1, 2, 3 this is a string too]

Though it’s probably better to always add double quotes to avoid issues with inline commas: ["string 1", 2, "3 this is a string too"]

Equivalent in JSON:

["string 1", 2, "3 this is a string too"]
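The comma issue is easy to reproduce: inside an inline list, an unquoted comma always separates two items. The sketch below assumes PyYAML, and the hello/world strings are made up for the example.

```python
import yaml  # PyYAML, assumed to be installed

# Unquoted items work as long as they contain no comma.
unquoted = yaml.safe_load("[string 1, 2, 3 this is a string too]")
print(unquoted)

# An unquoted comma splits the text into two items...
split = yaml.safe_load("[hello, world]")
print(split)

# ...while double quotes keep it as a single string.
kept = yaml.safe_load('["hello, world"]')
print(kept)
```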

Nesting

You can of course nest mappings and sequences.

key1: value1
key2:
  - string 1
  - 2
  - 3 this is a string too

Equivalent JSON:

{
    "key1": "value1",
    "key2": ["string 1", 2, "3 this is a string too"]
}
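The JSON equivalent of a YAML file can be produced mechanically, which is handy for checking what a parser actually understood. A minimal sketch, assuming PyYAML for the parsing and the standard json module for the output:

```python
import json

import yaml  # PyYAML, assumed to be installed

text = """\
key1: value1
key2:
  - string 1
  - 2
  - 3 this is a string too
"""

data = yaml.safe_load(text)

# Dump the parsed structure as JSON to see the types the parser chose.
as_json = json.dumps(data)
print(as_json)
```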

Strings

TL;DR: It’s madness. See https://brettweir.com/blog/yaml-strings/ for the madness in all its glory.

This section shows robust ways of writing strings. Some information here may be specific to the Python parser.

Single lines

# YAML (all equivalent)
# Good practice: always quote unless it's a single word
key: hello world
key: 'hello world'
key: "hello world"
# JSON
{'key': 'hello world'}

Multiple lines

The > (folded) style concatenates all the lines, putting a single space between them:

# Note the \n at the end of the string
# JSON: {'key': 'hello world\n'}
key: > 
  hello
  world

# Note the LACK of \n at the end of the string
# JSON: {'key': 'hello world'}
key: >-
  hello
  world

The | (literal) style keeps the lines as they are, with a single \n between them, and strips the common indentation, similar to Python’s dedent.

As above, the dash in |- removes the final \n.

# JSON: {'key': 'hello\nworld\n'}    Note both \n
key: | 
  hello
  world

# JSON: {'key': 'hello\n  world'}
key: |-
  hello
    world
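The four variants can be compared side by side by loading them in one file. This is a sketch assuming PyYAML; the key names (folded, folded_strip, literal, literal_strip) are made up for the example.

```python
import yaml  # PyYAML, assumed to be installed

text = """\
folded: >
  hello
  world
folded_strip: >-
  hello
  world
literal: |
  hello
  world
literal_strip: |-
  hello
    world
"""

data = yaml.safe_load(text)

# repr() makes the spaces and newlines visible.
for key, value in data.items():
    print(key, repr(value))
```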

Strings Gotchas

Now for some madness.

TL;DR:

  • always quote strings containing a space
  • avoid splitting a string when it starts on the same line as a key.

On the pitfall of not quoting a string: everything after a space followed by # is treated as an inline comment and dropped.

# JSON: {'key': 'mambo'}
key: mambo #5
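The comment rule can be checked directly. The sketch below assumes PyYAML and shows that the hash only starts a comment when it follows whitespace, and that quoting protects it.

```python
import yaml  # PyYAML, assumed to be installed

# " #" starts a comment: the "#5" part is lost.
truncated = yaml.safe_load("key: mambo #5")
print(truncated)

# Quoting keeps the whole string.
quoted = yaml.safe_load('key: "mambo #5"')
print(quoted)

# Without a space before it, "#" is just a regular character.
glued = yaml.safe_load("key: mambo#5")
print(glued)
```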

An unquoted string that continues on indented lines is folded into a single one-line string (similar to the >- case above), even if one of the continuation lines starts with a dash:

# JSON: {'key': 'this is a single line of text'}
key: this
  is 
  a single
  line of text
  
# JSON: {'key': 'this is - a single line of text'}
key: this
  is
  - a single
  line of text

But watch out:

# JSON: {'key': '-is - a single - line of text'}
key:
  -is 
  - a single
  - line of text

# JSON: {'key': ['is', 'a single', 'line of text']}
key:
  - is 
  - a single
  - line of text

# Syntax error
key:
  - is 
  -a single
  - line of text
 
# Syntax error
key: - hello
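The dash cases above can be reproduced with a small helper that reports either the parsed value or the fact that parsing failed. This is a sketch assuming PyYAML; try_parse is a hypothetical helper written for this example, not part of the library.

```python
import yaml  # PyYAML, assumed to be installed


def try_parse(text):
    """Return ("ok", value) or ("error", exception class name)."""
    try:
        return ("ok", yaml.safe_load(text))
    except yaml.YAMLError as exc:
        return ("error", type(exc).__name__)


# "-is" has no space after the dash, so everything is one string.
no_space = try_parse("key:\n  -is\n  - a single\n  - line of text\n")

# A space after each dash makes it a proper list.
with_space = try_parse("key:\n  - is\n  - a single\n  - line of text\n")

# Mixing both forms is rejected.
mixed = try_parse("key:\n  - is\n  -a single\n  - line of text\n")

# A list cannot start on the same line as its key.
inline = try_parse("key: - hello\n")

for result in (no_space, with_space, mixed, inline):
    print(result)
```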

Beware of implicit conversions, like yes -> true.

# May be specific to the Python parser 
# JSON: {'key': [true, 'we', 'can']}
key: 
  - yes
  - we
  - can
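Two ways around the implicit conversion, sketched here assuming PyYAML: quote the value, or load the file with yaml.BaseLoader, which leaves every scalar as a string. Whether the second option is acceptable depends on the program, since numbers are then strings too.

```python
import yaml  # PyYAML, assumed to be installed

text = "key:\n  - yes\n  - we\n  - can\n"

# Default resolution: "yes" becomes the boolean True.
converted = yaml.safe_load(text)
print(converted)

# Quoting the value keeps it a string.
quoted = yaml.safe_load('key:\n  - "yes"\n  - we\n  - can\n')
print(quoted)

# BaseLoader performs no type resolution at all: everything is a string.
as_strings = yaml.load(text, Loader=yaml.BaseLoader)
print(as_strings)
```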

More corner cases

# JSON: {'key': 'hello "world"'}
key: hello
  "world"

# But: syntax error
key: "hello"
  "world"
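The same kind of check works for these two cases. In this sketch (assuming PyYAML), the error is caught through yaml.YAMLError, the base class of the library’s parsing errors; the variable names are only for illustration.

```python
import yaml  # PyYAML, assumed to be installed

# An unquoted string may continue with a quoted-looking word:
# the quotes simply become part of the text.
folded = yaml.safe_load('key: hello\n  "world"\n')
print(folded)

# A quoted string is complete once its closing quote is reached,
# so a second quoted string on the next line is rejected.
try:
    yaml.safe_load('key: "hello"\n  "world"\n')
    failed = False
except yaml.YAMLError as exc:
    failed = True
    print("syntax error:", type(exc).__name__)
```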