Python API
The Vortex Python API provides a Pythonic interface to the Vortex library via PyO3 bindings. It supports reading and writing Vortex files, compressing data, and integrating with the broader Python data ecosystem including PyArrow, Pandas, Polars, DuckDB, and Ray.
Installation
pip install vortex-data
Optional integrations can be installed as extras:
pip install vortex-data[polars,pandas,numpy,duckdb,ray]
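To confirm which optional integrations are importable in the current environment, a small standard-library check can help. This is a minimal sketch, not part of the Vortex API: the helper name `available_integrations` is hypothetical, and it assumes each extra's import name matches the extra name shown in the install command.

```python
from importlib.util import find_spec

# Assumption: the import name of each integration equals its extra name.
OPTIONAL_MODULES = ["polars", "pandas", "numpy", "duckdb", "ray"]


def available_integrations(modules=OPTIONAL_MODULES):
    """Map each top-level module name to True if it can be found for import."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in available_integrations().items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

The check only looks up each module without importing it, so it is cheap to run before choosing which integration code path to use.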
Compatibility
The Python bindings require Python 3.11 or newer. Pre-built wheels are available for:
x86_64 Linux
ARM64 Linux
Apple Silicon macOS
The Linux wheels support any distribution with glibc 2.17 or newer. This includes:
Amazon Linux 2 or newer
Ubuntu 14.04 or newer
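The requirements above (Python 3.11 or newer, and glibc 2.17 or newer on Linux) can be checked from Python before installing. The following is a rough sketch using only the standard library. The helper names `python_ok`, `glibc_ok`, and `parse_version` are hypothetical and not part of Vortex, and `platform.libc_ver()` is a best-effort probe that may return empty strings on non-glibc systems such as macOS.

```python
import platform
import sys

MIN_PYTHON = (3, 11)  # minimum Python version stated above
MIN_GLIBC = (2, 17)   # minimum glibc version stated above


def parse_version(text):
    """Turn a dotted version such as '2.17' into (2, 17), skipping non-numeric parts."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def python_ok(version_info=sys.version_info):
    """True if the interpreter meets the minimum Python version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def glibc_ok(libc=None):
    """True or False for glibc systems; None when glibc cannot be detected."""
    name, version = libc if libc is not None else platform.libc_ver()
    if name != "glibc" or not version:
        return None
    return parse_version(version) >= MIN_GLIBC


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print("glibc OK:", glibc_ok())
```

A result of `None` from `glibc_ok()` means only that glibc was not detected. On macOS that is expected and does not indicate incompatibility.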
Usage Example
Here’s a basic example of using the Vortex Python API to write and read a Vortex file:
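import pyarrow as pa
import vortex
# Build a small PyArrow table to write (illustrative data)
my_table = pa.table({"id": [1, 2, 3], "name": ["a", "b", "c"]})
# Write a Vortex file from a PyArrow table
vortex.io.write_path(my_table, "data.vortex")
# Read a Vortex file
dataset = vortex.dataset("data.vortex")
table = dataset.to_arrow()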
API Reference
- Data Types
- Scalars
- Arrays
- Factory Functions
- Base Class
  - Array
    - Array.__len__()
    - Array.display_tree()
    - Array.dtype
    - Array.filter()
    - Array.from_arrow()
    - Array.from_range()
    - Array.id
    - Array.nbytes
    - Array.scalar_at()
    - Array.take()
    - Array.to_arrow_array()
    - Array.to_arrow_table()
    - Array.to_numpy()
    - Array.to_pandas()
    - Array.to_polars_dataframe()
    - Array.to_polars_series()
    - Array.to_pylist()
- Canonical Encodings
- Utility Encodings
- Compressed Encodings
- Pluggable Encodings
- Registry and Serde
- Streams and Iterators
- Expressions
- Compression
- Input and Output
- Object Store support
- Dataset
  - VortexDataset
    - VortexDataset.count_rows()
    - VortexDataset.filter()
    - VortexDataset.get_fragments()
    - VortexDataset.head()
    - VortexDataset.join()
    - VortexDataset.join_asof()
    - VortexDataset.replace_schema()
    - VortexDataset.scanner()
    - VortexDataset.schema
    - VortexDataset.sort_by()
    - VortexDataset.take()
    - VortexDataset.to_batches()
    - VortexDataset.to_record_batch_reader()
    - VortexDataset.to_table()
  - VortexFragment
  - VortexScanner
- Type Aliases