pycauset.convert_file
pycauset.convert_file(src_path, dst_path, *, dst_format=None, allow_huge=False, dtype=None, npz_key=None)
Convert between PyCauset snapshots and NumPy container formats.
- Supported formats: .pycauset (canonical snapshot), .npy, .npz.
- dst_format defaults from dst_path when omitted; must be one of the supported suffixes.
- .npz imports default to the first key; set npz_key to choose a specific array name.
- Exports honor the NumPy materialization guard: pass allow_huge=True only when you intentionally want to load spill/file-backed operands into RAM.
- Optional dtype casts on export (to NumPy formats) before writing.
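The opt-in guard described above can be pictured with a small stand-alone sketch. Nothing here is PyCauset code: the exception class, the function name, and the is_file_backed flag are all hypothetical, and this page does not say which exception type the real guard raises.

```python
class MaterializationRefused(RuntimeError):
    """Hypothetical error type; the real exception class is not named in this page."""


def check_export(is_file_backed, allow_huge=False):
    """Refuse to pull a file-backed object into RAM unless the caller opts in."""
    if is_file_backed and not allow_huge:
        raise MaterializationRefused(
            "export would load a spill/file-backed object into RAM; "
            "pass allow_huge=True to proceed"
        )
    return True


# In-memory objects export freely; file-backed ones need the explicit opt-in.
check_export(is_file_backed=False)
check_export(is_file_backed=True, allow_huge=True)
try:
    check_export(is_file_backed=True)
except MaterializationRefused as exc:
    print("refused:", exc)
```

The point of the pattern is that the default path never silently allocates memory for a large on-disk operand; the caller has to ask for it.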
Exceptions / warnings
- ValueError if source or destination format is not one of .pycauset, .npy, .npz (or cannot be inferred from the suffix).
- RuntimeError if NumPy is unavailable when exporting to .npy/.npz.
- Materialization guard: exporting spill/file-backed objects to NumPy formats raises unless allow_huge=True.
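The suffix rule behind the ValueError can be sketched in a few lines of standard-library Python. This is an illustration, not the library's implementation: infer_format and SUPPORTED_FORMATS are hypothetical names, and accepting an override with a leading dot is an assumption.

```python
from pathlib import Path

# Format names as listed in this page.
SUPPORTED_FORMATS = ("pycauset", "npy", "npz")


def infer_format(path, dst_format=None):
    """Hypothetical helper: choose a format from an explicit override or the suffix."""
    if dst_format is not None:
        # Assumption: tolerate both "npz" and ".npz" as an override.
        name = dst_format.lstrip(".")
    else:
        name = Path(path).suffix.lstrip(".")
    if name not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported or missing format for {str(path)!r}: {name!r}")
    return name


print(infer_format("A.npy"))
print(infer_format("out.bin", dst_format="npz"))
try:
    infer_format("notes.txt")
except ValueError as exc:
    print("rejected:", exc)
```

An explicit override takes priority over the suffix, which is why the second call succeeds even though .bin is not a supported suffix.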
Parameters
- src_path (str | Path): Source file path; suffix must be .pycauset, .npy, or .npz.
- dst_path (str | Path): Destination file path; suffix or dst_format selects the output format.
- dst_format (str, optional): Override destination format ("pycauset", "npy", or "npz"). If omitted, inferred from dst_path.
- allow_huge (bool, default False): Forwarded to NumPy export helpers; required when exporting spill/file-backed objects to avoid surprise materialization.
- dtype (optional): Override dtype on export to NumPy formats.
- npz_key (str, optional): Key to read from/write to when the source or destination is .npz. Defaults to the first key on import and to "array" on export.
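When choosing a value for npz_key, it can help to see which array names an archive holds. An .npz file is a zip archive in which each array is stored as a member named <key>.npy, so the names can be listed with the standard library alone. The sketch below is not part of PyCauset: npz_keys is a hypothetical helper, the archive is an in-memory stand-in with placeholder payloads, and whether the "first key" used on import follows this stored order is an assumption.

```python
import io
import zipfile


def npz_keys(file):
    """List array names in an .npz archive, in stored order, without loading data."""
    with zipfile.ZipFile(file) as archive:
        return [name[:-4] for name in archive.namelist() if name.endswith(".npy")]


# Build a stand-in archive in memory; the payloads are placeholders, not real arrays.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("vector0.npy", b"placeholder")
    archive.writestr("matrix1.npy", b"placeholder")
buffer.seek(0)

keys = npz_keys(buffer)
print(keys)
```

With a real file, npz_keys("bundle.npz") would return the candidate names to pass as npz_key.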
Returns
- Path: The destination path.
Examples
import pycauset as pc
# Snapshot (.pycauset) -> NumPy .npy -> snapshot
pc.convert_file("A.pycauset", "A.npy")
pc.convert_file("A.npy", "A_roundtrip.pycauset")
# Extract from an npz archive into a snapshot
pc.convert_file("bundle.npz", "vec.pycauset", npz_key="vector0")
# Export to npz with explicit dtype and large-export opt-in
pc.convert_file("big.pycauset", "big.npz", allow_huge=True, dtype="float32", npz_key="arr")
Future format targets (not implemented yet)
These are under consideration for later releases; they are not supported by convert_file today:
- MatrixMarket .mtx (sparse/text interchange)
- MATLAB .mat (engineering/scientific interop)
- Parquet / Arrow / CSV (tabular pipelines; CSV mainly for debugging)
- HDF5/NetCDF (only if a low-maintenance reader fits the budget)
See Also
- pycauset.save
- pycauset.load
- Storage and Memory
- NumPy Integration
- NumPy helpers in the Python API: load_npy, load_npz, save_npy, save_npz