Vortex Python

:::{warning}
The Python API surface is not yet complete and is subject to change. Many operations available in the Rust API are not yet exposed. See the {doc}`/api/python/index` for the full reference.
:::

Installation

```bash
pip install vortex-data
```

```bash
uv add vortex-data
```

Creating Arrays

{func}`~vortex.array` constructs a Vortex array from Python values:

>>> import vortex as vx
>>> arr = vx.array([1, 2, 3, 4])
>>> arr.dtype
int(64, nullable=False)
>>> len(arr)
4

Python's {obj}`None` represents a missing value and makes the dtype nullable:

>>> arr = vx.array([1, 2, None, 4])
>>> arr.dtype
int(64, nullable=True)

A list of {class}`dict` objects produces a struct array. Missing values may appear at any level:

>>> arr = vx.array([
...   {'name': 'Joseph', 'age': 25},
...   {'name': None, 'age': 31},
...   None,
... ])
>>> arr.dtype
struct({"age": int(64, nullable=True), "name": utf8(nullable=True)}, nullable=True)

{func}`~vortex.array` also accepts {class}`pyarrow.Array`, {class}`pyarrow.Table`, {class}`pandas.DataFrame`, and {class}`range` objects.

DTypes

DType factory functions are available at the top level of the vortex module:

>>> vx.int_(32)
int(32, nullable=False)
>>> vx.utf8(nullable=True)
utf8(nullable=True)
>>> vx.list_(vx.float_(64))
list(float(64, nullable=False), nullable=False)
>>> vx.struct({'x': vx.int_(32), 'y': vx.int_(32)})
struct({"x": int(32, nullable=False), "y": int(32, nullable=False)}, nullable=False)

Available types: {func}`~vortex.null`, {func}`~vortex.bool_`, {func}`~vortex.int_`, {func}`~vortex.uint`, {func}`~vortex.float_`, {func}`~vortex.decimal`, {func}`~vortex.utf8`, {func}`~vortex.binary`, {func}`~vortex.struct`, {func}`~vortex.list_`, {func}`~vortex.fixed_size_list`, {func}`~vortex.date`, {func}`~vortex.time`, {func}`~vortex.timestamp`.

Array Operations

Element Access

>>> arr = vx.array([10, 20, 30, 40, 50])
>>> arr.scalar_at(0).as_py()
10
>>> arr.to_arrow_array().to_pylist()
[10, 20, 30, 40, 50]

Slicing and Selection

>>> arr.slice(1, 3).to_arrow_array().to_pylist()
[20, 30]
>>> indices = vx.array([0, 2, 4])
>>> arr.take(indices).to_arrow_array().to_pylist()
[10, 30, 50]

Filtering

>>> mask = vx.array([True, False, True, False, True])
>>> arr.filter(mask).to_arrow_array().to_pylist()
[10, 30, 50]

Comparisons

>>> other = vx.array([10, 25, 25, 45, 50])
>>> (arr > other).to_arrow_array().to_pylist()
[False, False, True, False, False]
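
As a mental model only, the four operations above can be mirrored on plain Python lists. The sketch below is not how Vortex implements them, and the helper names (`slice_`, `take`, `filter_`, `greater_than`) are hypothetical stand-ins chosen for illustration. It reuses the same input values as the examples above.

```python
values = [10, 20, 30, 40, 50]


def slice_(xs, start, stop):
    # Half-open range [start, stop), consistent with the [20, 30] result above.
    return xs[start:stop]


def take(xs, indices):
    # Gather the elements at the given positions, in the order requested.
    return [xs[i] for i in indices]


def filter_(xs, mask):
    # Keep the elements whose mask entry is True; lengths must match.
    if len(xs) != len(mask):
        raise ValueError("mask must have the same length as the values")
    return [x for x, keep in zip(xs, mask) if keep]


def greater_than(xs, ys):
    # Element-wise comparison producing one boolean per position.
    return [x > y for x, y in zip(xs, ys)]


print(slice_(values, 1, 3))                               # [20, 30]
print(take(values, [0, 2, 4]))                            # [10, 30, 50]
print(filter_(values, [True, False, True, False, True]))  # [10, 30, 50]
print(greater_than(values, [10, 25, 25, 45, 50]))         # [False, False, True, False, False]
```

Note that `take` and `filter` return the same elements here because the mask is `True` exactly at indices 0, 2, and 4.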

Expressions

The `vortex.expr` module provides expressions for filtering and projecting. These are primarily used with {meth}`.VortexFile.scan` and {meth}`.VortexFile.to_arrow` but can also be applied directly:

>>> import vortex.expr as ve
>>> arr = vx.array([
...     {'name': 'Alice', 'age': 30},
...     {'name': 'Bob', 'age': 25},
...     {'name': 'Carol', 'age': 35},
... ])
>>> expr = ve.column('age') > 28
>>> arr.apply(expr).to_arrow_array().to_pylist()
[True, False, True]
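
The comparison `ve.column('age') > 28` does not compute a boolean immediately. It builds an expression object that is evaluated later, when it is applied to data. A minimal pure-Python sketch of that deferred-evaluation idea follows. The `Column` and `GreaterThan` classes are hypothetical and are not part of the Vortex API; they only illustrate the pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical expression node that refers to a field by name."""
    name: str

    def __gt__(self, literal):
        # Build a new expression node instead of comparing right away.
        return GreaterThan(self, literal)

    def evaluate(self, rows):
        return [row[self.name] for row in rows]


@dataclass(frozen=True)
class GreaterThan:
    """Hypothetical expression node for `column > literal`."""
    column: Column
    literal: object

    def evaluate(self, rows):
        return [value > self.literal for value in self.column.evaluate(rows)]


rows = [
    {'name': 'Alice', 'age': 30},
    {'name': 'Bob', 'age': 25},
    {'name': 'Carol', 'age': 35},
]
expr = Column('age') > 28   # an expression tree, not a bool
print(expr.evaluate(rows))  # [True, False, True]
```

Because the expression is a plain value, the same object can be reused against different inputs, which is what makes it suitable as a filter argument to a scan.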

VortexFile

{func}`~vortex.open` lazily opens a Vortex file for reading:

>>> import pyarrow.parquet as pq
>>> vx.io.write(pq.read_table("_static/example.parquet"), 'example.vortex')
>>>
>>> f = vx.open('example.vortex')
>>> len(f)
1000

Use {meth}`.VortexFile.scan` to read data with optional projection, filtering, and limit:

>>> result = f.scan(['tip_amount'], limit=3).read_all()
>>> result.to_arrow_array()
<pyarrow.lib.StructArray object at ...>
-- is_valid: all not null
-- child 0 type: double
  [
    0,
    5.1,
    16.54
  ]

ArrayIterator

{class}`.ArrayIterator` streams batches of arrays from a scan or other source. It supports iteration, collecting into a single array, and conversion to Arrow.

{meth}`.ArrayIterator.read_all` collects all batches into a single in-memory {class}`.Array`:

>>> arr = f.scan(['tip_amount'], limit=5).read_all()
>>> len(arr)
5

{meth}`.ArrayIterator.to_arrow` converts to a {class}`pyarrow.RecordBatchReader` for use with Arrow-based tools:

>>> reader = f.scan(['tip_amount']).to_arrow()
>>> reader.schema
tip_amount: double
>>> table = reader.read_all()
>>> len(table)
1000
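
The difference between streaming batches and collecting them can be sketched with a plain Python generator. This is an illustrative model under stated assumptions, not the Vortex implementation: `scan_batches`, `read_all`, and the `batch_size` parameter are hypothetical names, and real batch boundaries are chosen by the library.

```python
from itertools import islice


def scan_batches(values, batch_size, limit=None):
    """Yield lists of at most `batch_size` items, stopping after `limit` items."""
    source = iter(values) if limit is None else islice(values, limit)
    while True:
        batch = list(islice(source, batch_size))
        if not batch:
            return
        yield batch


def read_all(batches):
    """Concatenate every batch into a single in-memory list."""
    return [value for batch in batches for value in batch]


# Streaming: only one batch is materialized at a time.
for batch in scan_batches(range(10), batch_size=4):
    print(batch)
# [0, 1, 2, 3]
# [4, 5, 6, 7]
# [8, 9]

# Collecting: all batches end up in one list, here capped by the limit.
print(read_all(scan_batches(range(1000), batch_size=256, limit=5)))  # [0, 1, 2, 3, 4]
```

Iterating keeps memory bounded by the batch size, whereas collecting trades memory for the convenience of a single array.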

Conversion

Arrays convert to other formats:

| Method | Result |
| --- | --- |
| {meth}`.Array.to_arrow_array` | {class}`pyarrow.Array` |
| {meth}`.Array.to_arrow_table` | {class}`pyarrow.Table` |
| {meth}`.Array.to_numpy` | {class}`numpy.ndarray` |
| {meth}`.Array.to_pandas` | {class}`pandas.DataFrame` |
| {meth}`.Array.to_pylist` | {class}`list` |
