API reference

The complete public surface of yarutsk on a single page. Authoritative signatures live in python/yarutsk/__init__.pyi.

Loading and dumping

Function                                          Purpose
load(stream, *, schema=None)                      Load the first document from a file-like object
loads(text, *, schema=None)                       Load the first document from a string or UTF-8 bytes
load_all(stream, *, schema=None)                  Load every document from a multi-doc stream
loads_all(text, *, schema=None)                   Load every document from a multi-doc string or UTF-8 bytes
iter_load_all(stream, *, schema=None)             Iterator over documents in a stream
iter_loads_all(text, *, schema=None)              Iterator over documents in a string or UTF-8 bytes
dump(doc, stream, *, schema=None, indent=2)       Emit a single document to a file-like object
dumps(doc, *, schema=None, indent=2)              Emit a single document to a string
dump_all(docs, stream, *, schema=None, indent=2)  Emit multiple documents to a stream
dumps_all(docs, *, schema=None, indent=2)         Emit multiple documents to a string

load / loads return a YamlMapping, YamlSequence, or YamlScalar (for a top-level scalar document), or None for empty input. Nested container nodes are YamlMapping or YamlSequence; scalar leaves inside mappings and sequences are returned as native Python primitives (int, float, bool, str, bytes, datetime.datetime, datetime.date, or None).

dump / dumps accept YamlMapping, YamlSequence, and YamlScalar objects (preserving comments, styles, and tags), but also plain Python types: dict, list, tuple, set, frozenset, bytes, bytearray, scalar primitives, and any collections.abc.Mapping or iterable. Plain types are auto-converted with default formatting.

iter_load_all / iter_loads_all return a YamlIter object that drives the parser on demand and yields documents one at a time — never accumulating all documents in memory:

import io
import yarutsk

stream = io.StringIO("---\na: 1\n---\nb: 2\n---\nc: 3\n")
for doc in yarutsk.iter_load_all(stream):
    print(doc)   # {'a': 1}, then {'b': 2}, then {'c': 3}

load / load_all also stream from IO in 8 KB chunks rather than reading the entire input first, but they still build and return the full document tree.
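Chunked reading of this kind can be sketched with the standard library alone. The snippet below is only an illustration of the idea, not yarutsk's reader: the helper name `read_in_chunks` is hypothetical, and it counts characters of a text stream, whereas the library's 8 KB figure may refer to bytes.

```python
import io
from functools import partial

CHUNK_SIZE = 8 * 1024  # 8 KB, the chunk size mentioned above


def read_in_chunks(stream, size=CHUNK_SIZE):
    """Yield successive chunks from a text stream until it is exhausted.

    Hypothetical helper for illustration; not part of yarutsk.
    """
    # iter(callable, sentinel) keeps calling stream.read(size)
    # until it returns the empty string.
    yield from iter(partial(stream.read, size), "")


stream = io.StringIO("a: 1\n" * 5000)  # 25,000 characters in total
sizes = [len(chunk) for chunk in read_in_chunks(stream)]
print(sizes)  # [8192, 8192, 8192, 424]
```

A parser fed this way never needs the whole input in memory at once, even though the resulting document tree is still built in full.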

loads / loads_all / iter_loads_all accept either str or UTF-8 bytes/bytearray — useful for feeding raw process output directly:

import subprocess

out = subprocess.run([...], capture_output=True, check=True).stdout  # stdout is bytes
doc = yarutsk.loads(out)

Non-UTF-8 bytes raise UnicodeDecodeError; any other type raises TypeError.
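The input contract described above can be modelled in a few lines of standard-library Python. This is a sketch of the documented behaviour only, not yarutsk's implementation, and the helper name `to_text` is hypothetical.

```python
def to_text(data):
    """Model of the documented input contract for loads / loads_all / iter_loads_all.

    Hypothetical helper for illustration; not part of yarutsk.
    """
    if isinstance(data, str):
        return data
    if isinstance(data, (bytes, bytearray)):
        # Strict decoding: invalid UTF-8 raises UnicodeDecodeError.
        return bytes(data).decode("utf-8")
    raise TypeError(f"expected str, bytes, or bytearray, got {type(data).__name__}")


print(to_text("a: 1\n"))              # str passes through unchanged
print(to_text(b"a: 1\n"))             # bytes are decoded as UTF-8
print(to_text(bytearray(b"a: 1\n")))  # bytearray is handled like bytes
```

Callers that receive bytes in another encoding would need to decode them to str themselves before calling loads.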

Type conversions

Implicit coercion

Plain YAML values (no tag) are converted to Python types automatically:

Value pattern                     Python type  Examples
Decimal integer                   int          42, -7
Hex / octal integer               int          0xFF → 255, 0o17 → 15
Float                             float        3.14, 1.5e2, .inf, -.inf, .nan
true / false (any case)           bool         True, FALSE
yes / no / on / off (any case)    bool         YAML 1.1 booleans
null, Null, NULL, ~, empty value  None
Anything else                     str          hello, "quoted"

Non-canonical forms are reproduced as written on dump — yes stays yes, 0xFF stays 0xFF, ~ stays ~.
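As a way to read the coercion table, here is a rough standard-library model of it for an untagged plain scalar. The function `coerce` and its regular expressions are assumptions made for illustration; yarutsk's real parser may accept or reject edge cases differently, and quoted scalars always stay str.

```python
import math
import re

# Patterns approximating the rows of the table above (illustrative only).
_INT_DEC = re.compile(r"[-+]?[0-9]+")
_INT_HEX = re.compile(r"0x[0-9a-fA-F]+")
_INT_OCT = re.compile(r"0o[0-7]+")
_FLOAT = re.compile(r"[-+]?(?:[0-9]+\.[0-9]*|\.[0-9]+|[0-9]+)(?:[eE][-+]?[0-9]+)?")
_SPECIAL_FLOATS = {".inf": math.inf, "-.inf": -math.inf, ".nan": math.nan}


def coerce(raw):
    """Map the text of a plain scalar to a Python value, following the table."""
    if raw in ("null", "Null", "NULL", "~", ""):
        return None
    lowered = raw.lower()
    if lowered in ("true", "yes", "on"):
        return True
    if lowered in ("false", "no", "off"):
        return False
    if _INT_DEC.fullmatch(raw):
        return int(raw)
    if _INT_HEX.fullmatch(raw):
        return int(raw, 16)
    if _INT_OCT.fullmatch(raw):
        return int(raw, 8)
    if raw in _SPECIAL_FLOATS:
        return _SPECIAL_FLOATS[raw]
    if _FLOAT.fullmatch(raw):
        return float(raw)
    return raw  # anything else stays a string


for raw in ["42", "-7", "0xFF", "0o17", "3.14", "1.5e2", "TRUE", "yes", "~", "hello"]:
    print(f"{raw!r:>8} -> {coerce(raw)!r}")
```

The model covers only the value returned on load; it does not capture the fact that the original spelling is kept for dumping.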

Explicit tags

A !!tag overrides implicit coercion and controls which Python type is returned:

Tag          Python type                         Notes
!!str        str                                 Forces string even if the value looks like an int, bool, or null
!!int        int                                 Parses decimal, hex (0xFF), and octal (0o17)
!!float      float                               Promotes integer literals (!!float 1 → 1.0)
!!bool       bool
!!null       None                                Forces null regardless of content (!!null "" → None)
!!binary     bytes                               Base64-decoded on load; base64-encoded on dump
!!timestamp  datetime.datetime or datetime.date  Date-only values return date; datetime values return datetime

import datetime, yarutsk

# !!binary
doc = yarutsk.loads("data: !!binary aGVsbG8=\n")
doc["data"]                            # b'hello'

# !!timestamp
doc = yarutsk.loads("ts: !!timestamp 2024-01-15T10:30:00\n")
doc["ts"]                              # datetime.datetime(2024, 1, 15, 10, 30)

# !!float promotes integers
doc = yarutsk.loads("x: !!float 1\n")
doc["x"]                               # 1.0  (float, not int)

Dumping Python bytes / datetime auto-applies the appropriate tag.
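The payloads behind !!binary and !!timestamp can be checked with the standard library, which is handy when writing tests against expected YAML text. This sketch uses only base64 and datetime; it does not call yarutsk, and the exact text yarutsk emits for datetime values (as opposed to dates) is not specified on this page.

```python
import base64
import datetime

# !!binary: the scalar text is the base64 encoding of the raw bytes.
encoded = base64.b64encode(b"hello").decode("ascii")
decoded = base64.b64decode(encoded)
print(encoded)  # aGVsbG8=
print(decoded)  # b'hello'

# !!timestamp: a date-only value corresponds to datetime.date,
# a value with a time part corresponds to datetime.datetime.
date_text = datetime.date(2024, 1, 15).isoformat()
parsed = datetime.datetime.fromisoformat("2024-01-15T10:30:00")
print(date_text)     # 2024-01-15
print(repr(parsed))  # datetime.datetime(2024, 1, 15, 10, 30)
```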

Schema — custom types

Schema lets you register loaders (tag → Python object, fired on load) and dumpers (Python type → tag + data, fired on dump). Pass it as a keyword argument to any load or dump function.

Mapping types

Loader receives a YamlMapping; dumper returns a (tag, dict) tuple:

import yarutsk

class Point:
    def __init__(self, x, y): self.x, self.y = x, y

schema = yarutsk.Schema()
schema.add_loader("!point", lambda d: Point(d["x"], d["y"]))
schema.add_dumper(Point, lambda p: ("!point", {"x": p.x, "y": p.y}))

doc = yarutsk.loads("origin: !point\n  x: 0\n  y: 0\n", schema=schema)
doc["origin"]                          # Point(0, 0)

Scalar types

Loader receives the raw scalar string; dumper returns a (tag, str) tuple:

class Color:
    def __init__(self, r, g, b): self.r, self.g, self.b = r, g, b

schema = yarutsk.Schema()
schema.add_loader("!color", lambda s: Color(*[int(x) for x in s.split(",")]))
schema.add_dumper(Color, lambda c: ("!color", f"{c.r},{c.g},{c.b}"))

A dumper can return a YamlScalar, YamlMapping, or YamlSequence as the second tuple element to control the emitted style — the tag from the first element is stamped on top. Returning a YamlMapping(style="flow") or YamlSequence(style="flow") emits the container in flow style.

Overriding built-in tags

Registering a loader for !!int, !!float, !!bool, !!null, or !!str bypasses the built-in coercion. The callable receives the raw YAML string rather than the already-converted Python value:

schema = yarutsk.Schema()
schema.add_loader("!!int", lambda raw: int(raw, 0))  # parses 0xFF, 0o77, etc.
doc = yarutsk.loads("x: !!int 0xFF\n", schema=schema)
doc["x"]                               # 255

Multiple dumpers for the same type are checked in registration order; the first isinstance match wins.
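The first-match rule matters most for class hierarchies. The following standard-library model shows the consequence; it is not yarutsk's dispatch code, and `dispatch`, `Shape`, and `Circle` are hypothetical names used only for illustration.

```python
class Shape:
    def __init__(self, name):
        self.name = name


class Circle(Shape):
    pass


def shape_dumper(shape):
    return ("!shape", shape.name)


def circle_dumper(circle):
    return ("!circle", circle.name)


def dispatch(dumpers, obj):
    """Call the first registered dumper whose type matches obj via isinstance."""
    for registered_type, dumper in dumpers:
        if isinstance(obj, registered_type):
            return dumper(obj)
    raise RuntimeError(f"no dumper registered for {type(obj).__name__}")


# Registration order is list order in this model.
base_first = [(Shape, shape_dumper), (Circle, circle_dumper)]
subclass_first = [(Circle, circle_dumper), (Shape, shape_dumper)]

print(dispatch(base_first, Circle("c")))      # ('!shape', 'c'): the base class shadows Circle
print(dispatch(subclass_first, Circle("c")))  # ('!circle', 'c')
```

If yarutsk applies the rule exactly as stated, registering dumpers for subclasses before their base classes avoids the shadowing shown in the first call.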

Worked examples for plugging yarutsk into pydantic / msgspec / cattrs live on the Library integrations page.

YamlScalar

Top-level scalar documents are wrapped in a YamlScalar node:

doc = yarutsk.loads("42")
doc.value                              # 42 (Python int)
doc.to_python()                        # same as .value

# .value applies built-in tag handling
doc = yarutsk.loads("!!binary aGVsbG8=")
doc.value                              # b'hello'
doc = yarutsk.loads("!!timestamp 2024-01-01")
doc.value                              # datetime.date(2024, 1, 1)

# Style
doc = yarutsk.loads("---\n'hello'\n")
doc.style                              # 'single'
doc.style = "double"                   # 'plain'|'single'|'double'|'literal'|'folded'

# Tag
doc = yarutsk.loads("!!str 42")
doc.tag                                # '!!str'

# Anchor (demonstrated on a scalar root)
doc = yarutsk.loads("&root 42\n")
doc.anchor                             # 'root'

YamlScalar can be constructed directly to control emission when assigning into a mapping or sequence:

# Constructor: YamlScalar(value, *, style="plain", tag=None)
import datetime

doc = yarutsk.YamlMapping()
doc["x"] = yarutsk.YamlScalar("hello", style="double")    # 'x: "hello"\n'
doc["x"] = yarutsk.YamlScalar("42", tag="!!str")          # 'x: !!str 42\n'
doc["x"] = yarutsk.YamlScalar(b"hello")                   # 'x: !!binary aGVsbG8=\n'
doc["x"] = yarutsk.YamlScalar(datetime.date(2024, 1, 15)) # 'x: !!timestamp 2024-01-15\n'

  • value: bool, int, float, str, None, bytes, bytearray, datetime.datetime, or datetime.date
  • style: "plain" (default), "single", "double", "literal", or "folded"
  • tag: YAML tag string, or None. Defaults to "!!binary" for bytes and to "!!timestamp" for datetime and date values

Comments and blank lines

Comments and blank-lines-before live directly on each node. Reach the child via parent.node(key) (or parent.node(index) for a sequence) and read/write the attribute directly:

doc = yarutsk.loads("port: 5432  # db port\n")
doc.node("port").comment_inline          # 'db port'

doc.node("port").comment_inline = "updated"
yarutsk.dumps(doc)                       # 'port: 5432  # updated\n'

doc.node("port").blank_lines_before = 2  # int property, clamped 0–255

For bare-scalar documents, comment_before and comment_inline are both preserved on the scalar:

doc = yarutsk.loads("# hello\n42  # answer\n")
doc.comment_before                       # 'hello'
doc.comment_inline                       # 'answer'

YamlMapping

YamlMapping is a subclass of dict with insertion-ordered keys. Constructor:

# YamlMapping(mapping=None, *, style="block", tag=None)
m = yarutsk.YamlMapping({"a": 1, "b": 2}, style="flow")
yarutsk.dumps(m)                       # '{a: 1, b: 2}\n'

The full method surface, grouped by concern:

Read / write

Every standard dict method works unchanged: doc[k], doc[k] = v, del doc[k], in, len, get, pop, setdefault, update, keys / values / items, iteration, equality, json.dumps(doc). Setting an existing key preserves its position.

Also:

  • doc.to_python(): deep conversion to a plain Python dict / list / primitive tree (loses all style metadata). Applies built-in tag handling (!!binary → bytes, !!timestamp → datetime/date)
  • doc.node(key): returns the underlying YamlScalar / YamlMapping / YamlSequence preserving style/tag/anchor; KeyError if absent
  • doc.nodes(): returns [(key, node)] pairs with metadata preserved

Per-child metadata — use node(key)

Style, comments, and blank-lines-before live on each child node. Reach the child with doc.node(key) and read/write the attribute directly:

doc["nested"] = yarutsk.YamlMapping(style="flow")
doc["nested"]["x"] = 1
doc["nested"].node("x").style = "double"           # scalar style

doc.node("key").comment_inline = "hi"              # comment on a child
doc.node("key").comment_inline = None              # clear
doc.node("key").comment_before = "block\ncomment"
doc.node("key").blank_lines_before = 2             # int, clamped 0–255

node(key) returns a live handle: setter calls propagate to the parent, so the change is visible on the next dumps(doc).

Whole-mapping properties

  • doc.style / doc.style = "block" | "flow" — container style of this mapping itself
  • doc.tag / doc.tag = "!!map" — YAML tag
  • doc.anchor / doc.anchor = "myanchor" — emits &myanchor before the mapping
  • doc.blank_lines_before → int, clamped 0–255
  • doc.trailing_blank_lines = 1 — blank lines after all entries
  • Top-level-only: explicit_start, explicit_end, yaml_version, tag_directives

Aliases

doc = yarutsk.loads("base: &val 1\nref: *val\n")
doc.get_alias("ref")                   # 'val'
doc.get_alias("base")                  # None (has anchor, not alias)
doc["ref"]                             # 1  (resolved value always accessible)

doc.set_alias("other", "anchor")       # mark value as emitting *anchor

Sorting

doc.sort_keys()                        # alphabetical, in-place
doc.sort_keys(reverse=True)
doc.sort_keys(key=lambda k: len(k))    # custom key
doc.sort_keys(recursive=True)          # also sort nested mappings

Sorting preserves per-entry comments — each entry carries its inline and before-key comments with it.
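One way to picture this is to treat each entry as a record that bundles the key, the value, and its comments, so that reordering moves them together. The dataclass below is a hypothetical model for illustration, not yarutsk's internal representation.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Entry:
    """Illustrative stand-in for one mapping entry and its comment metadata."""
    key: str
    value: Any
    comment_before: Optional[str] = None
    comment_inline: Optional[str] = None


entries = [
    Entry("port", 5432, comment_inline="db port"),
    Entry("host", "localhost", comment_before="connection settings"),
]

# Sorting reorders whole records, so comments stay attached to their keys.
entries.sort(key=lambda entry: entry.key)

for entry in entries:
    print(entry.key, entry.comment_before, entry.comment_inline)
```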

Copy

  • doc.copy(): metadata-preserving shallow copy
  • copy.copy(doc) / copy.deepcopy(doc): also preserve metadata

Format

See Normalizing formatting.

YamlSequence

YamlSequence is a subclass of list. Everything on YamlMapping applies, keyed by integer index instead of string key. Constructor:

# YamlSequence(iterable=None, *, style="block", tag=None)
s = yarutsk.YamlSequence([1, 2, 3], style="flow")
yarutsk.dumps(s)                       # '[1, 2, 3]\n'

All standard list operations work: indexing (negative supported), slicing, append, insert, pop, remove, extend, index, count, reverse, in, len, iteration, equality, json.dumps.

Per-item metadata is reached the same way as for mappings, via seq.node(i), which raises IndexError for out-of-range indices.

# Underlying node access
doc.node(0)                              # YamlScalar / YamlMapping / YamlSequence
doc.nodes()                              # [node, node, ...] preserving metadata

# Style
doc.node(0).style = "double"             # scalar: plain|single|double|literal|folded
doc.node(1).style = "flow"               # container: block|flow
doc[0] = yarutsk.YamlScalar("item", style="single")

# Comments
doc.node(0).comment_inline = "first item"
doc.node(2).comment_before = "group B"

# Blank lines
doc.node(0).blank_lines_before = 1

# Aliases
doc.get_alias(idx)                       # anchor name if alias, else None
doc.set_alias(idx, "anchor")

# Sorting (preserves comment metadata)
doc.sort()
doc.sort(reverse=True)
doc.sort(key=lambda v: len(v))
doc.sort(recursive=True)

Normalizing formatting

format() strips all cosmetic metadata and resets the document to clean YAML defaults. Available on YamlMapping, YamlSequence, and YamlScalar; recurses into nested containers.

src = """\
# Config
server:
  host: 'localhost'  # primary
  port: 8080

  debug: yes
"""
doc = yarutsk.loads(src)
doc.format()
print(yarutsk.dumps(doc))
# server:
#   host: localhost
#   port: 8080
#   debug: yes

Three keyword flags (all True by default) control what resets:

Flag              Effect
styles=True       Scalar quoting → plain (multiline strings → literal block |); container style → block; non-canonical originals (0xFF, 1.5e10) cleared
comments=True     comment_before and comment_inline cleared on every entry/item
blank_lines=True  blank_lines_before and trailing_blank_lines zeroed

Tags, anchors, and document-level markers (explicit_start, yaml_version, etc.) are always preserved — they are semantic, not cosmetic.

Exceptions

Class         Raised when
YarutskError  Base class for all library errors
ParseError    YAML input is malformed
LoaderError   Schema loader callable raised
DumperError   Schema dumper raised or returned the wrong type

Standard Python errors also surface naturally: RuntimeError for unsupported Python types without a registered dumper, KeyError for missing mapping keys, IndexError for out-of-range sequence indices.

See Error handling for worked examples.