R1_PROPERTIES — Semantic Properties + Property-Aware Algebra (Release 1)
Status: Phase A–F implemented (Phase E expanded: property-aware matmul/solve/eigvalsh; O(1) mutation effect summaries integrated)
Last updated: 2025-12-21
Documentation note:
This file is a planning/spec artifact. User-visible R1 behavior is documented in:
- Release 1: Properties (what shipped)
- Storage and Memory (how properties/caches persist)
- API footprint: MatrixBase and VectorBase
Purpose
Release 1 needs a semantic properties system that algorithms can rely on to select fast paths, specialized behavior, and stable dispatch.
This is a deliberate shift from “structure-by-type” (e.g. selecting triangular behavior by checking whether an object is a triangular class) to structure-by-properties. In R1_PROPERTIES, algorithms treat the relevant properties as the authoritative source of truth.
These properties are authoritative (“gospel”):
- The system does not validate that asserted semantic properties are mathematically true.
- Implementations are allowed to behave as if the property is true.
- The only checks performed are minimal compatibility/sanity checks that prevent self-contradictory property states or structurally impossible states (e.g.,
is_unitaryon a non-square matrix).
This plan defines:
- the canonical properties schema,
- compatibility rules,
- propagation rules across metadata-only transforms,
- the derived-cache correctness/invalidation model,
- and where/how properties affect operator implementations and dispatch.
At a glance (the contract)
- Properties:
obj.propertiesis a typed mapping exposed to users. - Hard errors on incompatibility: structurally impossible / internally contradictory asserted states raise immediately.
- No scans: neither property rules nor cache validation require scanning payload.
- Cached derived values: some entries in
obj.propertiesare cached derived values (e.g.,trace) with strict validity rules; they are never treated as semantic structure properties. - Health check is \(O(1)\): no additional payload passes; parallel kernels may emit a constant-size effect summary.
- Storage integration: see guides/Storage and Memory (format + semantics). The container mechanics were tracked under the R1_STORAGE plan during implementation.
Implementation status (as of 2025-12-21)
This plan is implemented end-to-end for Release 1 scope.
- Properties mapping exposed:
obj.propertiesexists on native objects via Python patching (python/pycauset/_internal/properties.py, installed inpython/pycauset/__init__.py). - Phase A compatibility enforced:
obj.propertiesis a validating mapping; structurally impossible / contradictory asserted states raise immediately (no payload scans). Compatibility now includes square-only structure flags (triangular/atomic), ordering constraints (strictly-sorted ⇒ sorted), and the “upper+lower ⇒ not diagonal-false” constraint. - Persistence bridging present:
meta["properties"]round-trips; cached-derived values persist undercached.*and are surfaced only when signatures match (python/pycauset/_internal/persistence.py). - Metadata-only view propagation present:
transpose,conj,.Hpropagate properties in O(1) (python/pycauset/_internal/properties.py). - Cached-derived propagation improved: scalar multiply and add/sub apply the Phase-A propagation rules for cached-derived values where safe (O(1)), including
sum. - Priority tree implemented: an internal helper computes an effective structure category from properties (zero/identity/diagonal/triangular/general) without scanning payload.
- Operator wiring (Phase E expanded): property-aware routing exists for
matmul,solve/solve_triangular, andeigvalsh(cached eigenvalues).trace()/determinant()and scalar-returningpycauset.norm()/pycauset.sum()continue to consult/seed cached-derived values. - Tests + docs exist: see
tests/python/test_properties_persistence.pyandtests/python/test_properties_operator_wiring.py; docs updated in the storage/semantics guides.
Open gaps relative to the plan:
- None for R1. Kernel-style effect summaries are applied via the properties shim (
_apply_effect_summary_inplace) and are exercised by mutation patches; device kernels can emit the same constant-size summaries later without changing semantics.
Key references
- Roadmap node:
documentation/internals/plans/TODO.md→ R1_PROPERTIES - Compute model:
documentation/internals/Compute Architecture.md - Storage semantics + format (canonical docs):
documentation/guides/Storage and Memory.md - Current persistence implementation:
python/pycauset/_internal/persistence.py - Support readiness tracking:
documentation/internals/plans/SUPPORT_READINESS_FRAMEWORK.md
Constraints (non-negotiable)
- No truth validation: never scan matrix/vector data to “confirm” a property.
- No backwards-compat aliases: the public name is
properties(notflags, notuser_flags), and the persistence schema must not accept legacy names. - Deterministic behavior: given the same properties and operation sequence, results and chosen algorithmic paths must be deterministic.
- Minimal incompatibility checks only: only reject property combinations that are structurally impossible or internally contradictory for the property lattice.
- Propagation is mandatory: metadata-only transforms must update properties predictably.
- Testing + documentation phase at the end: the final phase is an explicit “tests + docs” closure phase.
Design anchors (agreed)
These are design principles specific to R1_PROPERTIES (in addition to the project-wide mantras).
- Properties-as-gospel (power-user feature): properties are authoritative claims. If a matrix is marked as triangular, triangular-aware algorithms are allowed to behave as if entries outside the relevant triangle are zero, even if the underlying payload is not triangular.
- No truth validation: the system must never scan matrix/vector data to “confirm” a property.
- Lazy-evaluation invariant (metadata scaling): property compatibility checks, property propagation, and effective-structure selection must be \(O(1)\) and must depend only on existing metadata (shape, dtype, existing properties) plus the transform parameters (e.g. the scalar value).
- Bedrock scope: every
PersistentObjectmust have properties support. R1 includes matrices and vectors. - Documentation is part of the feature: user-facing docs must explicitly teach the “power user” semantics and the fact that properties can change algorithm choices.
- Hybrid caching (agreed): some expensive derived quantities should be preserved under certain transforms using only \(O(1)\) rules; otherwise they must be unset. No cache exists if it forces recomputation.
Definitions
Terminology
- “Property” means a typed metadata item attached to a
PersistentObject. obj.propertiesis the single user-facing mapping, but it contains two semantic classes of entries:- Gospel assertions: structural/special-case intent (triangular/diagonal/unitary/etc). Never truth-validated.
- Cached-derived values:
trace/determinant/rank/normand similar. Validity-checked and may be cleared. - Some keys are boolean-like; for those, we use a tri-state convention via key presence.
Canonical representation
Public API concept: obj.properties is a mapping from stable string keys to typed values.
- Keys are stable, snake_case strings.
- Values are typed and may be unset.
- Properties are persisted in
.pycausetmetadata (see “Persistence + metadata schema”).
Value model (Release 1)
obj.properties is a typed mapping (string keys → typed values). For boolean-like keys we use tri-state semantics:
True: asserted; algorithms may exploit it.False: explicitly negated.- Unset: no claim.
Public API decision (agreed):
- Unset means the key is absent from
obj.properties. - Explicit
Falseis represented by the key being present with valueFalse.
Persistence and internal representations must preserve “missing vs explicit False”.
Metadata taxonomy (aligns with R1_STORAGE)
R1_STORAGE defines three kinds of metadata on disk (identity/header vs view-state vs user-facing properties/caches). R1_PROPERTIES defines the user-facing semantics and how operators interpret and propagate them.
In practice:
1) Identity/header metadata (system-managed) - Shape/dtype and any payload layout descriptor needed to interpret bytes. - Not user-facing.
2) View-state metadata (system-managed) - Examples: transpose/conjugation/adjoint state, scalar factors. - Must be metadata-only and \(O(1)\). - Participates in cached-derived validity signatures.
3) User-facing properties (single mapping; two semantic classes)
- Exposed as obj.properties.
- Gospel assertions are authoritative and never truth-validated.
- Cached-derived values are user-facing via clean keys, but are persisted as caches with validity metadata (see Phase C).
Caching system (Release 1)
R1 introduces a more rigorous caching model so that derived metadata remains correct under mutation and metadata-only transforms.
Principles:
- Correctness first: cached values must never be used if they might be stale.
- No scans for validation: cache validity checks must be \(O(1)\) and based on metadata/versioning only.
- Deterministic: given the same history of operations, caches must be filled/cleared deterministically.
Post-operation property health check (required)
Every operation that returns a new PersistentObject and every in-place mutation must run a cheap, deterministic post-operation health check step that:
- applies property propagation rules for metadata-only transforms,
- updates/propagates cached derived values where safe (\(O(1)\) rules only),
- clears any cached derived values that are no longer guaranteed to match the object’s semantics,
- performs minimal compatibility checks (no scans).
This “health check” must not require any payload scan or any additional payload pass.
Clarification (parallelization-friendly contract):
- Compute kernels (CPU/GPU/streaming) are always allowed to read/write payload as required to produce correct results.
- The health check is a metadata normalization step, not a compute step.
- The health check may use:
- existing metadata (shape, dtype, view-state, existing properties), and
- the operation parameters (e.g., transpose, scalar value, indices for
M[i, j] = ...), and - an optional constant-size effect summary produced by the operation while it runs.
The key rule is: property/caching correctness must not force a second pass over payload.
Complexity requirement (agreed):
- The health check must be strictly \(O(1)\).
- No \(O(\log N)\) work is allowed here (even for “small” metadata updates), because this check must also run after very frequent primitives such as element
set().
Effect summaries (enabler for parallel kernels)
To make future parallelization plug in cleanly (via AutoSolver / ComputeDevice) without special integration work, operations may emit an effect summary:
- A small fixed-size struct (conceptually) that answers a few yes/no questions about what the op definitely did.
- It is computed “for free” during the operation (e.g., via an OR-reduction of per-thread booleans), without extra passes.
- If an op cannot cheaply provide a fact, it reports “unknown” and the health check conservatively unsets affected properties/caches.
Examples of useful effect bits:
- “payload mutated” (increments content epoch)
- “wrote any off-diagonal element”
- “wrote any diagonal element”
- “result is known-all-zero” (only for ops like explicit zero-fill; not for general matmul)
This keeps the health check deterministic and \(O(1)\), while letting highly-parallel kernels remain unconstrained.
Where the health check is required (non-exhaustive; must be completed in Phase A):
- After any payload mutation method (e.g., element
set, bulk fill, random fill). - After any metadata-only transform that changes view-state (transpose/conjugate/adjoint, scalar changes).
- After any operation that constructs a view (clones, slices/views, transpose views).
- After load/deserialize (properties and caches must be normalized).
- After any API that lets users set properties explicitly (power-user setter).
Minimum design requirements:
- Every
PersistentObjectmaintains a cheap-to-check content version (e.g., a monotonically increasing epoch) that changes whenever payload data is mutated. - Every cached-derived entry records the version/signature it depends on (payload version + relevant view-state signature).
- On lookup, the system uses a cached-derived value only if the dependency signature matches; otherwise it is ignored/cleared.
Cached-derived propagation under metadata-only transforms:
- Some caches can be preserved or transformed without scanning (e.g., trace and determinant under transpose; trace scales linearly with scalar; determinant scales by \(scalar^n\) for \(n\times n\)).
- If a cache does not have a safe propagation rule, it must be cleared to avoid stale answers.
Core cached-derived propagation table (initial; must be completed in Phase A)
This is the minimum expected “cheap propagation” set for R1.
Definitions:
- “Unset” means the entry is removed from the cached-derived store.
- If a rule says “requires X known”, it means the input cache entry must be present.
| Operation / transform | trace |
determinant |
rank |
norm (vector) |
|---|---|---|---|---|
| transpose | preserve (if known) | preserve (if known) | preserve (if known) | preserve (if known) |
| conjugation | conjugate (if known) | conjugate (if known) | preserve (if known) | preserve (if known) |
| adjoint | conjugate (if known) | conjugate (if known) | preserve (if known) | preserve (if known) |
| scalar multiply by \(s\) | if known: trace *= s |
if known and square: det *= s^n |
if known: if \(s=0\) then rank=0 else preserve |
if known: norm *= |s| |
| add/subtract | if both known: add/sub | unset | unset | unset |
| any payload mutation (e.g. set element) | unset | unset | unset | unset |
| any other op without an explicit cheap rule | unset | unset | unset | unset |
Notes:
- “payload mutation” means modifying stored data, not changing metadata-only view state.
- This table applies to cached-derived values (regardless of whether the value was produced by computation or set by a power user).
R1 decision (updated):
trace,determinant,rank, and vectornormare cached-derived values, not semantic structure properties.- There is no semantic distinction between “user-set” and “computed” cache values. If a user sets
trace=231.2, that is simply the cached trace value. - Public methods/endpoints (e.g.,
trace()) may use a cached value if present and valid; otherwise they compute and then populate the cache.
User-populated caches (power-user feature)
Power users may set derived cache values directly to avoid expensive computation.
Semantics (agreed by direction):
- A user-populated cache entry is treated exactly like a computed cache entry.
- Cache validity is still enforced: a cache entry is used only when its dependency signature matches (payload content epoch + relevant view-state signature).
- If the payload is mutated, the dependency signature changes and the cached value is ignored/cleared.
- If a metadata-only transform occurs and there is no explicit \(O(1)\) propagation rule for that cache key, the cache entry is cleared.
This preserves the correctness bar (“never stale”) while still allowing power users to seed expensive derived quantities.
Provenance note:
- The system does not expose a user-visible distinction between “user-set” and “computed” cache values.
- Internal provenance tracking is optional (debug/telemetry only) and must not change semantics.
Robust general cache system (required; makes future cached quantities easy)
R1 must define a general cache system that supports adding cached quantities later without redesign.
Contract:
- Cached-derived values have a cache store (internal backing) with validity metadata.
- Cache keys are stable identifiers (e.g.,
trace,determinant,rank,norm). - Each cache entry records:
- a typed value, and
- a dependency signature (at minimum: payload content epoch + view-state signature).
- Cache lookup is deterministic and \(O(1)\).
- Cache validity checks and propagation never require scanning payload.
Registration/extension model:
- Each cache key must define:
- its value type,
- what it depends on (payload epoch + which view-state components matter), and
- explicit \(O(1)\) propagation rules for the supported metadata-only transforms.
- If a cache key has no rule for a given transform, the default behavior is to clear it.
Practical outcome:
- Adding a new cached quantity later is a matter of defining the key and its propagation/invalidation rules, not inventing a new ad-hoc persistence path or special-case logic inside operators.
Properties included in R1_PROPERTIES (initial set)
This plan covers properties for matrices and vectors.
Matrix properties:
Special-case algebra (short-circuits):
is_zero: treat every entry as \(0\) (a “zero matrix” property; not a shape claim).is_identity: treat as an identity-like matrix \(I\).- Important: PyCauset allows rectangular identity (see
pycauset.identity()), sois_identity=Trueis allowed for non-square matrices. is_permutation: treat as a permutation matrix \(P\) (re-indexing / row/col permutation).
Structure properties (index skipping / structured algorithms):
is_diagonaldiagonal_value(typed; if present, asserts every diagonal element equals this value)has_unit_diagonal(shorthand; equivalent todiagonal_value = 1when meaningful)is_upper_triangularis_lower_triangularhas_zero_diagonal(shorthand; equivalent todiagonal_value = 0when meaningful)
Symmetry family:
is_symmetricis_anti_symmetric(a.k.a. “skew-symmetric”)is_hermitianis_skew_hermitian
Norm-preserving / spectral hints:
is_unitary
Matrix-function hint:
is_atomic: treat as an “atomic” triangular matrix in the sense used by common matrix-function algorithms (e.g. clustered diagonal values). This is gospel and is not validated.
Cached-derived matrix properties (cache semantics; validity-checked; may be cleared):
tracedeterminantranksum
Vector properties:
is_zero: treat every entry as \(0\) (a “zero vector” property; not a shape claim).is_unit_norm: treat \(\|v\|_2 = 1\).is_sortedis_strictly_sorted
Cached-derived vector properties (cache semantics; validity-checked; may be cleared):
norm(e.g. cached \(\|v\|_2\))sum
Notes:
- Strict triangular is represented as
diagonal_value=0+ triangular (no separate strict-triangular booleans). - We do not add a broad zoo (SPD/PSD/normal/etc) in R1_PROPERTIES; those can be R2+.
Important: the public/user-facing concept is properties, and the system must be ready for non-boolean properties (e.g., numeric metadata like rank) without redesigning the plumbing.
Compatibility model (minimal sanity checks)
These checks are allowed because they protect the internal semantics of the property lattice and avoid impossible propagation states. They are not truth validation.
Hard error policy (agreed):
- If a caller (user or internal) attempts to assert an incompatible property state (e.g.
is_unitary=Trueon a non-square matrix), this must raise a hard error immediately. - This is not “validation”; it is enforcing internal semantic consistency.
Tri-state scope rule:
- Compatibility checks primarily constrain asserted structural truths (keys with value
True). - Unset (missing) means “no claim” and does not create contradictions by itself.
- Explicit
Falsemust be preserved, but does not introduce additional obligations unless it directly contradicts a required implication of an assertedTrue.
Shape constraints
is_unitary=Truerequires a square matrix.-
is_hermitian=Truerequires a square matrix. -
is_identity=Trueis allowed for rectangular matrices (rectangular identity semantics). is_permutation=Truerequires a square matrix.is_symmetric=Truerequires a square matrix.is_anti_symmetric=Truerequires a square matrix.is_skew_hermitian=Truerequires a square matrix.
Diagonal note (agreed):
is_diagonal=Trueis allowed for rectangular matrices. Semantics: all off-diagonal entries are treated as zero, with the diagonal defined for indices \(i=i\) up to \(\min(m,n)\).
For triangular-only properties:
is_upper_triangular=Trueand/oris_lower_triangular=Truerequires a square matrix.is_atomic=Truerequires a square matrix.
Shape constraints apply only when the relevant property value is True.
Lattice/implication constraints
Strict triangular note:
- R1 avoids a dedicated strict-triangular boolean zoo.
- Instead, “strictly upper/lower triangular” is represented as:
is_upper_triangular=True(oris_lower_triangular=True) anddiagonal_value = 0(orhas_zero_diagonal=True).
Diagonal consistency:
If is_diagonal=True then it may still be compatible with diagonal_value=0 (zero diagonal) or diagonal_value=1 (unit diagonal).
Diagonal-value consistency (agreed):
diagonal_valuedoes not implyis_diagonal.- If
has_unit_diagonal=True, thendiagonal_value(if present) must equal \(1\). - If
has_zero_diagonal=True, thendiagonal_value(if present) must equal \(0\).
Identity consistency (agreed):
is_identity=Truerequires:is_zerois notTrue.is_diagonalis notFalse.has_unit_diagonalis notFalse.has_zero_diagonalis notTrue.diagonal_value(if present) must equal \(1\).
Contradictions that must be rejected:
is_upper_triangular=Truetogether withis_lower_triangular=Trueis only meaningful as “diagonal-like” behavior; if the user also assertsis_diagonal=False, that must be rejected.
Symmetry-family sanity checks (minimal):
is_symmetric=Trueandis_anti_symmetric=Trueis rejected unlessis_zero=True.is_hermitian=Trueandis_skew_hermitian=Trueis rejected unlessis_zero=True.
Ordering constraints:
- If
is_strictly_sorted=Truethenis_sortedmust be True.
Tri-state refinement:
- If
is_strictly_sorted=Truethenis_sortedmust not beFalse(it may beTrueorNone).
What is explicitly not validated
- Hermitian/unitary truth is never checked.
- “Hermitian implies diagonal implies triangular” is not auto-enforced.
- Setting properties that are mathematically inconsistent but not structurally contradictory is allowed.
Priority tree (effective structure)
Operators that can exploit structure compute an effective structure category from properties without mutating stored properties.
Terminology note: this is computed from properties.
Implementation note (performance): the effective structure category must be computed once per operation and passed downward (do not repeatedly query the properties container inside inner loops).
Priority (highest first):
0) Zero: if is_zero=True.
1) Identity: if is_identity=True.
2) Diagonal: treat as diagonal if is_diagonal=True OR if both is_upper_triangular=True and is_lower_triangular=True.
3) Upper triangular: if is_upper_triangular=True.
4) Lower triangular: if is_lower_triangular=True.
5) General.
Rationale:
- Diagonal dominates triangular for index skipping.
- “Both upper and lower” implies diagonal behavior for algorithms (even if the user did not set
is_diagonal).
Propagation rules
Properties must be transformed by metadata-only operations deterministically.
Tri-state propagation rule:
- Propagation must preserve the distinction between
FalseandNone. - When a transform does not preserve a property in general, the propagated value should become
None(unset), notFalse.
Transpose
On transpose:
is_upper_triangular⇄is_lower_triangular- strictness is represented via
diagonal_value=0, which is preserved under transpose. is_diagonalstays the samehas_unit_diagonalstays the samehas_zero_diagonalstays the same-
diagonal_valuestays the same -
is_zerostays the same is_identitystays the same-
is_permutationstays the same -
is_symmetricstays the same is_anti_symmetricstays the sameis_hermitianbecomes None unless the dtype is real (transpose does not preserve Hermitian-ness in general)-
is_skew_hermitianbecomes None unless the dtype is real (transpose does not preserve skew-Hermitian-ness in general) -
is_atomicstays the same
For is_unitary:
is_unitarybecomes None (transpose does not preserve unitarity in general)
Note: the final line is intentional: we do not attempt to “correct” user-set properties via propagation.
Conjugation
On elementwise conjugation:
- Triangular/diagonal properties unchanged
is_hermitianbecomes None unless the dtype is real (conjugation does not preserve Hermitian-ness in general)-
is_skew_hermitianbecomes None unless the dtype is real (conjugation does not preserve skew-Hermitian-ness in general) -
diagonal_valueis conjugated (if present) -
is_zerostays the same is_identitystays the same-
is_permutationstays the same -
is_symmetricstays the same -
is_anti_symmetricstays the same -
is_atomicstays the same
For is_unitary:
is_unitarybecomes None (conjugation does not preserve unitarity in general)
Adjoint (conjugate-transpose)
If an adjoint operation exists as a single transform:
- apply transpose rules plus conjugation rules, except:
is_unitarystays the same (adjoint preserves unitarity)is_hermitianstays the same (adjoint preserves Hermitian-ness)is_skew_hermitianstays the same (adjoint preserves skew-Hermitian-ness)
Scalar multiply
On multiply-by-scalar:
-
Triangular/diagonal structural properties unchanged
-
diagonal_valuescales by the scalar (if present) -
is_zerostays the same
For is_identity and is_permutation:
- if the scalar is exactly \(1\), keep the property
- otherwise, the property becomes False
For is_hermitian:
- if the scalar is real,
is_hermitianstays the same - otherwise,
is_hermitianbecomes None
For is_skew_hermitian:
- if the scalar is real,
is_skew_hermitianstays the same - otherwise,
is_skew_hermitianbecomes None
For is_unitary:
- if \(|scalar| = 1\),
is_unitarystays the same - otherwise,
is_unitarybecomes None
We allow propagation rules to invalidate properties (set to None or False) when the operation semantics make that deterministic, but we do not infer new True claims from the scalar (e.g., scalar == 0 does not cause us to set additional properties to True).
Views / clones
- Pure metadata-only views must copy properties into the new object (independent ownership), and then apply the propagation rules for the view transform.
- Clones/materializations preserve properties by value.
Where properties affect algorithms (Release 1 scope)
Properties can legally change algorithm behavior. In R1_PROPERTIES we focus on the cases that create immediate leverage without broad rewrites.
Target operator families:
- Index skipping for structural loops (diagonal/triangular).
- Specialized solve paths:
- triangular solve shortcuts when triangular properties are set.
- Specialized multiplications:
- diagonal @ matrix and matrix @ diagonal.
- Spectral shortcuts:
is_hermitianchooses Hermitian eigen routines when available.is_unitaryallows using unitary identities where implemented.
The exact list of operators to update is tracked in SRP (op × dtype × device × structure table).
In particular, type-based structural dispatch (e.g., “triangular-by-class”) must be replaced (or gated) so that properties are the authoritative source of structural intent. Typed storage formats remain important for performance, but they are not the semantic authority.
Phases
Phase A — Lock schema + compatibility rules
Status (R1): Implemented
Implementation + coverage:
- Compatibility enforcement:
python/pycauset/_internal/properties.py(obj.propertiesvalidating mapping) - Tests:
tests/python/test_properties_compatibility.py
Deliverables:
- Canonical key set (initial properties) is finalized.
- Compatibility rules are finalized (shape constraints + lattice contradictions).
- Priority tree is finalized.
- Tri-state semantics (
True/False/None) and its propagation rules are finalized. - Lazy-evaluation invariant is explicitly enforced (no propagation/compatibility rule may require scanning data).
- Caching model requirements are locked (content-versioning + safe propagation vs clear rules).
Phase B — Public API surface contract
Status (R1): Implemented
Implementation + coverage:
- Public surface:
obj.propertiesmapping with tri-state booleans via key presence - Assignment + per-key updates:
python/pycauset/_internal/properties.py
Deliverables:
- Define
propertiesexposure shape in Python (mapping of stable keys to typed values; boolean-like keys use tri-state semantics via presence). - Define mutation contract (set whole mapping vs per-key updates) and error behavior for invalid combinations.
Phase C — Persistence + metadata schema
Status (R1): Implemented
Implementation + coverage:
- Persistence bridge:
python/pycauset/_internal/persistence.py - Tests:
tests/python/test_properties_persistence.py
Deliverables:
- Define the
.pycausetmetadata schema required to store: - gospel
properties(including missing-vs-explicit semantics), and - cached-derived values (validated by version/signature).
- Define the rule: persisted objects must preserve property values and whether each boolean-like property is explicitly set vs unset.
Naming/encoding note (must be decided in Phase C):
- User-facing keys are clean (e.g.,
trace), but persistence must also encode “this is cached-derived” + dependency signatures. - Decision (agreed): store cached-derived values under a dedicated
cached/cachessection keyed by clean names (e.g.,cached.trace).
Implications:
- In memory: users read/write
obj.properties["trace"]. - On disk:
metadatastores cached-derived values undercached.trace(and friends), alongside the dependency signature. - On load: if a cached-derived value is present and its dependency signature matches the restored object state, it is surfaced as
obj.properties["trace"]. - On save: cached-derived values are written under
cached.*(never as top-level keys likecached_trace).
Storage-format note:
- The single-file container format and typed metadata encoding are tracked under
documentation/internals/plans/R1_STORAGE_PLAN.md. - R1_PROPERTIES must remain storage-format agnostic: the
propertiesand derived-cache semantics must plug into the storage layer without changing frontend save/load call sites.
Phase D — Propagation integration
Status (R1): Implemented
Implementation + coverage:
- View propagation: transpose/conj/adjoint in
python/pycauset/_internal/properties.py - Scalar + add/sub cached-derived propagation in
python/pycauset/_internal/properties.py - Tests:
tests/python/test_properties_propagation.pyandtests/python/test_properties_compatibility.py
Deliverables:
- Ensure metadata-only transforms update properties per the propagation rules.
- Ensure clones/materializations preserve properties.
- Ensure derived caches are invalidated or safely propagated under metadata-only transforms.
Phase E — Property-aware operator wiring
Status (R1): Implemented
Implemented in R1:
- Priority tree helper for effective structure selection (no scans)
- Scalar operator wiring:
trace(),determinant(),pycauset.norm(),pycauset.sum() - Structured routing:
matmulroutes diagonal/triangular cases based on effective structure.solveroutes diagonal/triangular cases viasolve_triangular.eigvalshconsults/seeds cached-derivedeigenvalues.
Deliverables:
- Identify the minimal operator set that must become property-aware in R1 (tracked via SRP inventory).
- Implement algorithm selection using the priority tree (effective structure), without mutating stored properties.
Phase F — Tests + documentation (FINAL)
Status (R1): Implemented for current surface; ongoing as operator coverage expands
Artifacts:
- Tests:
tests/python/test_properties_*.py - Canonical storage/caching docs:
documentation/guides/Storage and Memory.md
Deliverables:
- Tests:
- compatibility checks (invalid combinations reject deterministically),
- propagation checks (transpose/conjugate/scalar multiply),
- persistence round-trip preserves properties.
- Docs:
- user-facing docs for
properties(semantics: gospel, not validated), - operator docs where properties change behavior.
- a dedicated “Power users” section explaining that properties can override data truth and force structured algorithms.
Additionally:
-
The “Adding Operations” protocol must be updated to include a properties checklist (how an op reads properties, whether it propagates/changes properties, and where that logic lives so it is not scattered).
-
The caching model (derived caches) must be documented: how caches are validated, when they are cleared, and which metadata-only transforms preserve/transform which caches.
Open questions (must be resolved before Phase B)
- Do we want to keep
is_unitarypropagation rules as written (transpose/conjugate clear; adjoint preserves), or shouldis_unitarybe preserved more aggressively? - Should
has_unit_diagonalandhas_zero_diagonalremain as stored boolean properties, or shoulddiagonal_valuebe the only canonical representation (with shorthands computed on-demand)?