API reference¶
The complete public surface of yarutsk on a single page. Authoritative signatures live in python/yarutsk/__init__.pyi.
Loading and dumping¶
| Function | Purpose |
|---|---|
| `load(stream, *, schema=None)` | Load the first document from a file-like object |
| `loads(text, *, schema=None)` | Load the first document from a string or UTF-8 bytes |
| `load_all(stream, *, schema=None)` | Load every document from a multi-doc stream |
| `loads_all(text, *, schema=None)` | Load every document from a multi-doc string or UTF-8 bytes |
| `iter_load_all(stream, *, schema=None)` | Iterator over documents in a stream |
| `iter_loads_all(text, *, schema=None)` | Iterator over documents in a string or UTF-8 bytes |
| `dump(doc, stream, *, schema=None, indent=2)` | Emit a single document to a file-like object |
| `dumps(doc, *, schema=None, indent=2)` | Emit a single document to a string |
| `dump_all(docs, stream, *, schema=None, indent=2)` | Emit multiple documents to a stream |
| `dumps_all(docs, *, schema=None, indent=2)` | Emit multiple documents to a string |
load / loads return a YamlMapping, YamlSequence, or YamlScalar (for a top-level scalar document), or None for empty input. Nested container nodes are YamlMapping or YamlSequence; scalar leaves inside mappings and sequences are returned as native Python primitives (int, float, bool, str, bytes, datetime.datetime, datetime.date, or None).
dump / dumps accept YamlMapping, YamlSequence, and YamlScalar objects (preserving comments, styles, and tags), but also plain Python types: dict, list, tuple, set, frozenset, bytes, bytearray, scalar primitives, and any collections.abc.Mapping or iterable. Plain types are auto-converted with default formatting.
iter_load_all / iter_loads_all return a YamlIter object that drives the parser on demand and yields documents one at a time — never accumulating all documents in memory:
import io
import yarutsk
stream = io.StringIO("---\na: 1\n---\nb: 2\n---\nc: 3\n")
for doc in yarutsk.iter_load_all(stream):
    print(doc)  # {'a': 1}, then {'b': 2}, then {'c': 3}
load / load_all also stream from IO in 8 KB chunks rather than reading the entire input first, but they still build and return the full document tree.
loads / loads_all / iter_loads_all accept either str or UTF-8 bytes/bytearray, which is useful for feeding raw process output directly.
Non-UTF-8 bytes raise UnicodeDecodeError; any other type raises TypeError.
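As a rough illustration of the bytes path, the sketch below captures a child process's stdout, which arrives as bytes, and shows the decoding failure that invalid UTF-8 triggers in plain Python. It uses only the standard library. The child command is a made-up stand-in, and the hand-off to yarutsk.loads is left as a comment rather than executed.

```python
import subprocess
import sys

# Hypothetical child process that prints a one-line YAML document.
result = subprocess.run(
    [sys.executable, "-c", "print('a: 1')"],
    capture_output=True,  # without text=True, stdout is captured as raw bytes
    check=True,
)
raw = result.stdout
print(type(raw).__name__, raw.strip())

# Per the text above, these bytes could be passed to yarutsk.loads(raw)
# as-is, with no manual decode step.

# Invalid UTF-8 is what makes a decode fail with UnicodeDecodeError:
decode_error = None
try:
    b"\xff\xfe".decode("utf-8")
except UnicodeDecodeError as exc:
    decode_error = exc
    print("not UTF-8:", exc.reason)
```

If the producing process may emit another encoding, decoding it explicitly and passing the resulting str is one way to avoid the UnicodeDecodeError.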
Type conversions¶
Implicit coercion¶
Plain YAML values (no tag) are converted to Python types automatically:
| Value pattern | Python type | Examples |
|---|---|---|
| Decimal integer | `int` | `42`, `-7` |
| Hex / octal integer | `int` | `0xFF` → 255, `0o17` → 15 |
| Float | `float` | `3.14`, `1.5e2`, `.inf`, `-.inf`, `.nan` |
| `true` / `false` (any case) | `bool` | `True`, `FALSE` |
| `yes` / `no` / `on` / `off` (any case) | `bool` | YAML 1.1 booleans |
| `null`, `Null`, `NULL`, `~`, empty value | `None` | - |
| Anything else | `str` | `hello`, `"quoted"` |
Non-canonical forms are reproduced as written on dump — yes stays yes, 0xFF stays 0xFF, ~ stays ~.
Explicit tags¶
A !!tag overrides implicit coercion and controls which Python type is returned:
| Tag | Python type | Notes |
|---|---|---|
| `!!str` | `str` | Forces string even if the value looks like an int, bool, or null |
| `!!int` | `int` | Parses decimal, hex (`0xFF`), and octal (`0o17`) |
| `!!float` | `float` | Promotes integer literals (`!!float 1` → `1.0`) |
| `!!bool` | `bool` | - |
| `!!null` | `None` | Forces null regardless of content (`!!null ""` → `None`) |
| `!!binary` | `bytes` | Base64-decoded on load; base64-encoded on dump |
| `!!timestamp` | `datetime.datetime` or `datetime.date` | Date-only values return `date`; datetime values return `datetime` |
import datetime, yarutsk
# !!binary
doc = yarutsk.loads("data: !!binary aGVsbG8=\n")
doc["data"] # b'hello'
# !!timestamp
doc = yarutsk.loads("ts: !!timestamp 2024-01-15T10:30:00\n")
doc["ts"] # datetime.datetime(2024, 1, 15, 10, 30)
# !!float promotes integers
doc = yarutsk.loads("x: !!float 1\n")
doc["x"] # 1.0 (float, not int)
Dumping Python bytes / datetime auto-applies the appropriate tag.
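The values in the examples above can be cross-checked with the standard library alone. The sketch below does not call yarutsk. It only shows that the !!binary payload is ordinary base64 and that date-only and date-time text map to two different stdlib types. It assumes ISO 8601 input, and yarutsk's own timestamp parser may accept more forms than fromisoformat does.

```python
import base64
import datetime

# A !!binary payload is plain base64 text.
encoded = "aGVsbG8="
decoded = base64.b64decode(encoded)
print(decoded)                             # b'hello'
print(base64.b64encode(decoded).decode())  # aGVsbG8=

# Date-only text and date-time text produce different types.
d = datetime.date.fromisoformat("2024-01-15")
dt = datetime.datetime.fromisoformat("2024-01-15T10:30:00")
print(type(d).__name__, type(dt).__name__)  # date datetime
```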
Schema — custom types¶
Schema lets you register loaders (tag → Python object, fired on load) and dumpers (Python type → tag + data, fired on dump). Pass it as a keyword argument to any load or dump function.
Mapping types¶
Loader receives a YamlMapping; dumper returns a (tag, dict) tuple:
import yarutsk
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
schema = yarutsk.Schema()
schema.add_loader("!point", lambda d: Point(d["x"], d["y"]))
schema.add_dumper(Point, lambda p: ("!point", {"x": p.x, "y": p.y}))
doc = yarutsk.loads("origin: !point\n x: 0\n y: 0\n", schema=schema)
doc["origin"] # Point(0, 0)
Scalar types¶
Loader receives the raw scalar string; dumper returns a (tag, str) tuple:
class Color:
    def __init__(self, r, g, b):
        self.r, self.g, self.b = r, g, b
schema = yarutsk.Schema()
schema.add_loader("!color", lambda s: Color(*[int(x) for x in s.split(",")]))
schema.add_dumper(Color, lambda c: ("!color", f"{c.r},{c.g},{c.b}"))
A dumper can return a YamlScalar, YamlMapping, or YamlSequence as the second tuple element to control the emitted style — the tag from the first element is stamped on top. Returning a YamlMapping(style="flow") or YamlSequence(style="flow") emits the container in flow style.
Overriding built-in tags¶
Registering a loader for !!int, !!float, !!bool, !!null, or !!str bypasses the built-in coercion. The callable receives the raw YAML string rather than the already-converted Python value:
schema = yarutsk.Schema()
schema.add_loader("!!int", lambda raw: int(raw, 0)) # parses 0xFF, 0o77, etc.
doc = yarutsk.loads("x: !!int 0xFF\n", schema=schema)
doc["x"] # 255
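The loader above leans on Python's own base detection: passing 0 as the base makes int() infer it from the literal's prefix. A quick standard-library check, independent of yarutsk, shows what such a loader would accept and one input it would reject:

```python
# Base 0 tells int() to infer the base from the literal's prefix.
samples = ["42", "-7", "0xFF", "0o17", "0b101"]
parsed = {raw: int(raw, 0) for raw in samples}
print(parsed)

# A leading zero without a prefix is rejected rather than read as octal.
try:
    int("017", 0)
    leading_zero_ok = True
except ValueError:
    leading_zero_ok = False
print(leading_zero_ok)  # False
```

A loader written this way would therefore surface a ValueError for input such as 017, which may or may not be the behaviour you want from a custom !!int handler.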
Multiple dumpers for the same type are checked in registration order; the first isinstance match wins.
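The practical consequence of first-match-wins is that a base-class dumper registered early shadows a subclass dumper registered later. The following is a plain-Python model of that rule, not yarutsk's actual implementation. Shape, Circle, and pick_dumper are hypothetical names used only for illustration.

```python
class Shape:
    pass


class Circle(Shape):
    pass


def pick_dumper(registry, obj):
    """Return the first registered dumper whose type matches obj."""
    for py_type, dumper in registry:
        if isinstance(obj, py_type):
            return dumper
    return None


# Registration order is the list order.
base_first = [
    (Shape, lambda s: ("!shape", "generic")),
    (Circle, lambda c: ("!circle", "round")),
]
specific_first = list(reversed(base_first))

circle = Circle()
print(pick_dumper(base_first, circle)(circle))      # ('!shape', 'generic')
print(pick_dumper(specific_first, circle)(circle))  # ('!circle', 'round')
```

Under this model, registering the most specific types first keeps subclass dumpers reachable.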
Worked examples for plugging yarutsk into pydantic / msgspec / cattrs live on the Library integrations page.
YamlScalar¶
Top-level scalar documents are wrapped in a YamlScalar node:
doc = yarutsk.loads("42")
doc.value # 42 (Python int)
doc.to_python() # same as .value
# .value applies built-in tag handling
doc = yarutsk.loads("!!binary aGVsbG8=")
doc.value # b'hello'
doc = yarutsk.loads("!!timestamp 2024-01-01")
doc.value # datetime.date(2024, 1, 1)
# Style
doc = yarutsk.loads("---\n'hello'\n")
doc.style # 'single'
doc.style = "double" # 'plain'|'single'|'double'|'literal'|'folded'
# Tag
doc = yarutsk.loads("!!str 42")
doc.tag # '!!str'
# Anchor (demonstrated on a scalar root)
doc = yarutsk.loads("&root 42\n")
doc.anchor # 'root'
YamlScalar can be constructed directly to control emission when assigning into a mapping or sequence:
# Constructor: YamlScalar(value, *, style="plain", tag=None)
doc["x"] = yarutsk.YamlScalar("hello", style="double") # 'x: "hello"\n'
doc["x"] = yarutsk.YamlScalar("42", tag="!!str") # 'x: !!str 42\n'
doc["x"] = yarutsk.YamlScalar(b"hello") # 'x: !!binary aGVsbG8=\n'
doc["x"] = yarutsk.YamlScalar(datetime.date(2024, 1, 15)) # 'x: !!timestamp 2024-01-15\n'
- `value`: `bool`, `int`, `float`, `str`, `None`, `bytes`, `bytearray`, `datetime.datetime`, or `datetime.date`
- `style`: `"plain"` (default), `"single"`, `"double"`, `"literal"`, `"folded"`
- `tag`: YAML tag string, or `None`. Defaults to `"!!binary"` for `bytes` and to `"!!timestamp"` for datetime values
Comments and blank lines¶
Comments and blank-lines-before live directly on each node. Reach the child via parent.node(key) (or parent.node(index) for a sequence) and read/write the attribute directly:
doc = yarutsk.loads("port: 5432 # db port\n")
doc.node("port").comment_inline # 'db port'
doc.node("port").comment_inline = "updated"
yarutsk.dumps(doc) # 'port: 5432 # updated\n'
doc.node("port").blank_lines_before = 2 # int property, clamped 0–255
For bare-scalar documents, comment_before and comment_inline are both preserved on the scalar:
doc = yarutsk.loads("# hello\n42 # answer\n")
doc.comment_before # 'hello'
doc.comment_inline # 'answer'
YamlMapping¶
YamlMapping is a subclass of dict with insertion-ordered keys. Constructor:
# YamlMapping(mapping=None, *, style="block", tag=None)
m = yarutsk.YamlMapping({"a": 1, "b": 2}, style="flow")
yarutsk.dumps(m) # '{a: 1, b: 2}\n'
The full method surface, grouped by concern:
Read / write¶
Every standard dict method works unchanged: doc[k], doc[k] = v, del doc[k], in, len, get, pop, setdefault, update, keys / values / items, iteration, equality, json.dumps(doc). Setting an existing key preserves its position.
Also:
- `doc.to_python()`: deep conversion to a plain Python `dict` / `list` / primitive tree (loses all style metadata). Applies built-in tag handling (`!!binary` → `bytes`, `!!timestamp` → `datetime` / `date`)
- `doc.node(key)`: returns the underlying `YamlScalar` / `YamlMapping` / `YamlSequence`, preserving style/tag/anchor; raises `KeyError` if absent
- `doc.nodes()`: `[(key, node)]` pairs with metadata preserved
Per-child metadata — use node(key)¶
Style, comments, and blank-lines-before live on each child node. Reach the child with doc.node(key) and read/write the attribute directly:
doc["nested"] = yarutsk.YamlMapping(style="flow")
doc["nested"]["x"] = 1
doc["nested"].node("x").style = "double" # scalar style
doc.node("key").comment_inline = "hi" # comment on a child
doc.node("key").comment_inline = None # clear
doc.node("key").comment_before = "block\ncomment"
doc.node("key").blank_lines_before = 2 # int, clamped 0–255
node(key) returns a live handle: setter calls propagate to the parent, so the change is visible on the next dumps(doc).
Whole-mapping properties¶
- `doc.style` / `doc.style = "block" | "flow"`: container style of this mapping itself
- `doc.tag` / `doc.tag = "!!map"`: YAML tag
- `doc.anchor` / `doc.anchor = "myanchor"`: emits `&myanchor` before the mapping
- `doc.blank_lines_before`: `int`, clamped 0–255
- `doc.trailing_blank_lines = 1`: blank lines after all entries
- Top-level-only: `explicit_start`, `explicit_end`, `yaml_version`, `tag_directives`
Aliases¶
doc = yarutsk.loads("base: &val 1\nref: *val\n")
doc.get_alias("ref") # 'val'
doc.get_alias("base") # None (has anchor, not alias)
doc["ref"] # 1 (resolved value always accessible)
doc.set_alias("other", "anchor") # mark value as emitting *anchor
Sorting¶
doc.sort_keys() # alphabetical, in-place
doc.sort_keys(reverse=True)
doc.sort_keys(key=lambda k: len(k)) # custom key
doc.sort_keys(recursive=True) # also sort nested mappings
Sorting preserves per-entry comments — each entry carries its inline and before-key comments with it.
Copy¶
- `doc.copy()`: metadata-preserving shallow copy
- `copy.copy(doc)` / `copy.deepcopy(doc)`: likewise preserve metadata
Format¶
doc.format() resets cosmetic metadata (styles, comments, blank lines) on this mapping and its nested containers; see Normalizing formatting.
YamlSequence¶
YamlSequence is a subclass of list. Everything on YamlMapping applies, keyed by integer index instead of string key. Constructor:
# YamlSequence(iterable=None, *, style="block", tag=None)
s = yarutsk.YamlSequence([1, 2, 3], style="flow")
yarutsk.dumps(s) # '[1, 2, 3]\n'
All standard list operations work: indexing (negative supported), slicing, append, insert, pop, remove, extend, index, count, reverse, in, len, iteration, equality, json.dumps.
Per-item metadata is reached the same way as for mappings, via seq.node(i). Out-of-range indices raise IndexError.
# Underlying node access
doc.node(0) # YamlScalar / YamlMapping / YamlSequence
doc.nodes() # [node, node, ...] preserving metadata
# Style
doc.node(0).style = "double" # scalar: plain|single|double|literal|folded
doc.node(1).style = "flow" # container: block|flow
doc[0] = yarutsk.YamlScalar("item", style="single")
# Comments
doc.node(0).comment_inline = "first item"
doc.node(2).comment_before = "group B"
# Blank lines
doc.node(0).blank_lines_before = 1
# Aliases
doc.get_alias(idx) # anchor name if alias, else None
doc.set_alias(idx, "anchor")
# Sorting (preserves comment metadata)
doc.sort()
doc.sort(reverse=True)
doc.sort(key=lambda v: len(v))
doc.sort(recursive=True)
Normalizing formatting¶
format() strips all cosmetic metadata and resets the document to clean YAML defaults. Available on YamlMapping, YamlSequence, and YamlScalar; recurses into nested containers.
src = """\
# Config
server:
  host: 'localhost' # primary
  port: 8080
debug: yes
"""
doc = yarutsk.loads(src)
doc.format()
print(yarutsk.dumps(doc))
# server:
#   host: localhost
#   port: 8080
# debug: yes
Three keyword flags (all True by default) control what resets:
| Flag | Effect |
|---|---|
| `styles=True` | Scalar quoting → plain (multiline strings → literal block `\|`); container style → block; non-canonical originals (`0xFF`, `1.5e10`) cleared |
| `comments=True` | `comment_before` and `comment_inline` cleared on every entry/item |
| `blank_lines=True` | `blank_lines_before` and `trailing_blank_lines` zeroed |
Tags, anchors, and document-level markers (explicit_start, yaml_version, etc.) are always preserved — they are semantic, not cosmetic.
Exceptions¶
| Class | Raised when |
|---|---|
| `YarutskError` | Base class for all library errors |
| `ParseError` | YAML input is malformed |
| `LoaderError` | Schema loader callable raised |
| `DumperError` | Schema dumper raised or returned the wrong type |
Standard Python errors also surface naturally: RuntimeError for unsupported Python types without a registered dumper, KeyError for missing mapping keys, IndexError for out-of-range sequence indices.
See Error handling for worked examples.