Codebase Structure (Overview)
This page explains where things live in the repository and how the pieces fit together.
Core philosophy (non-negotiable)
- PyCauset is “NumPy for causal sets”. Users interact with top-level Python APIs:
pycauset.matrix,pycauset.causal_matrix,pycauset.matmul, etc. - Storage, dtype selection, backend dispatch (CPU/GPU), and performance optimizations are automatic and behind the scenes.
- Internal code organization may change, but the public surface stays stable.
Repository map
Python package (user-facing)
python/pycauset/- Public Python API and higher-level convenience wrappers.
python/pycauset/__init__.pyis the public facade (stablepycauset.*entrypoints).python/pycauset/_internal/contains non-public implementation modules that the facade delegates to.- Rule of thumb: if you’re adding a new helper that is not meant to be imported by end users, it belongs in
_internal/. - See dev/Python Internals for the current internal module layout and extension rules.
- Rule of thumb: if you’re adding a new helper that is not meant to be imported by end users, it belongs in
- The native extension is imported as
pycauset._pycauset. - Note: native exports can vary by build configuration; Python code should avoid import-time crashes by guarding optional symbols.
C++ core (engine)
include/pycauset/— public C++ headers (engine API)src/— C++ implementationssrc/core/— memory mapping, storage utils, I/O accelerator, system utilssrc/matrix/— matrix types (dense, triangular, bit, etc.)src/vector/— vector typessrc/compute/— compute dispatch architectureComputeContext(singleton)AutoSolver(routes CPU vs GPU)cpu/(CPU device + solvers)
src/accelerators/cuda/— optional CUDA plugin (loaded dynamically)src/bindings.cpp— Python extension entrypoint (pybind11 module)src/bindings/— modular binding translation units (e.g.bind_matrix.cpp)
Tests & benchmarks
tests/python/— Python interface + integration teststests/*.cpp— C++ unit tests (engine-level)benchmarks/— performance scripts and comparison suites
Build system
pyproject.toml— canonical Python build entry (scikit-build-core)CMakeLists.txt— C++ build configuration and compiler flagsbuild.ps1— thin wrapper around the canonical pip build commands
How a user call flows through the stack
Example: C = A @ B (matrix multiplication)
- User code calls
A @ B(Python operator) orpycauset.matmul(A, B). - Python calls into the native extension (
pycauset._pycauset) or a thin Python wrapper. - Native code performs:
- type resolution / allocation,
- device dispatch (CPU vs GPU) via
ComputeContext+AutoSolver, - algorithm selection (direct vs streaming) via
MemoryGovernorand solver logic. - The result is returned as a Python object that wraps a C++
MatrixBaseimplementation.
Example: out-of-core optimization
When a matrix is disk-backed (memory-mapped), solvers can: - emit access-pattern hints (sequential/strided), - trigger prefetching via the I/O accelerator, - avoid page-fault thrashing during streaming kernels.
Ownership rules (where new code should go)
If you add/change a user-facing Python API
- Add docs under
documentation/docs/+documentation/guides/. - Keep the public entrypoint top-level (
pycauset.*). - Prefer a thin wrapper that delegates to the C++ engine.
If you add/change an engine feature
- Update the relevant C++ subsystem:
- storage/memory:
src/core/,include/pycauset/core/ - matrix/vector types:
src/matrix/,src/vector/ - compute/dispatch:
src/compute/ - CUDA:
src/accelerators/cuda/ - Update internals docs under
documentation/internals/.
If you add/change bindings
- Update bindings code and ensure tests cover the Python-level behavior.
- Add/update the binding checklist in dev/Bindings & Dispatch.
Roadmap constraint: NxM matrices
Today, much of the engine assumes square matrices (NxN). The roadmap includes NxM support for all matrix types (dense, triangular, symmetric, bit, etc.), while explicitly avoiding N-D arrays.
When reorganizing or refactoring, avoid hard-coding “square-only” assumptions into new architecture layers.