tiledbsoma.SparseNDArray

class tiledbsoma.SparseNDArray(handle: _WrapperType_co | DataFrameWrapper | DenseNDArrayWrapper | SparseNDArrayWrapper, *, _dont_call_this_use_create_or_open_instead: str = 'unset')

SparseNDArray is a sparse, N-dimensional array, with offset (zero-based) integer indexing on each dimension. SparseNDArray has a user-defined schema, which includes:

  • The element type, expressed as an Arrow type, indicating the type of data contained within the array.

  • The shape of the array, i.e., the number of dimensions and the length of each dimension.

Every dimension must have a positive length, and the array must have at least one dimension. Elements that are not explicitly stored in the array are assumed to have a value of zero.

Where explicitly referenced in the API, the dimensions are named soma_dim_N, where N is the dimension number (e.g., soma_dim_0), and elements are named soma_data.
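The data model described above can be illustrated with a small pure-Python sketch. It does not use tiledbsoma, and the helper names `column_names` and `densify` are hypothetical, introduced only for illustration. It represents a 2-D array as coordinate/value records that follow the soma_dim_N / soma_data naming convention, and treats every coordinate without a record as zero.

```python
def column_names(ndim):
    """Column names for an ndim-dimensional array: one per dimension, then the data column."""
    if ndim < 1:
        raise ValueError("there must be 1 or more dimensions")
    return [f"soma_dim_{n}" for n in range(ndim)] + ["soma_data"]


def densify(records, shape):
    """Expand 2-D coordinate/value records into nested lists.

    Coordinates that have no record are implicitly zero.
    """
    rows, cols = shape
    dense = [[0.0] * cols for _ in range(rows)]
    for rec in records:
        dense[rec["soma_dim_0"]][rec["soma_dim_1"]] = rec["soma_data"]
    return dense


# Two explicitly stored elements in a 2 x 3 array (illustrative data).
records = [
    {"soma_dim_0": 0, "soma_dim_1": 2, "soma_data": 1.5},
    {"soma_dim_0": 1, "soma_dim_1": 0, "soma_data": -2.0},
]

names = column_names(2)
dense = densify(records, shape=(2, 3))
print(names)
print(dense)
```

Only two of the six positions are stored; the remaining four appear as zero when the array is expanded.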

Lifecycle

Experimental.

Examples

>>> import tiledbsoma
>>> import pyarrow as pa
>>> import numpy as np
>>> import scipy.sparse
>>> with tiledbsoma.SparseNDArray.create(
...     "./test_sparse_ndarray", type=pa.float32(), shape=(1000, 100)
... ) as arr:
...     data = pa.SparseCOOTensor.from_scipy(
...         scipy.sparse.random(1000, 100, format="coo", dtype=np.float32)
...     )
...     arr.write(data)
...
>>> with tiledbsoma.SparseNDArray.open("./test_sparse_ndarray") as arr:
...     print(arr.schema)
...     print('---')
...     print(arr.read().coos().concat())
...
soma_dim_0: int64
soma_dim_1: int64
soma_data: float
---
<pyarrow.SparseCOOTensor>
type: float
shape: (1000, 100)

__init__(handle: _WrapperType_co | DataFrameWrapper | DenseNDArrayWrapper | SparseNDArrayWrapper, *, _dont_call_this_use_create_or_open_instead: str = 'unset')

Internal-only common initializer steps.

This function is internal; users should open TileDB SOMA objects using the create() and open() factory class methods.

Methods

__init__(handle, *[, ...])

Internal-only common initializer steps.

close()

Release any resources held while the object is open.

create(uri, *, type, shape[, ...])

Creates a SOMA NDArray at the given URI.

exists(uri[, context, tiledb_timestamp])

Finds whether an object of this type exists at the given URI.

non_empty_domain()

Retrieves the non-empty domain for each dimension, namely the smallest and largest indices in each dimension for which the array has stored data.

open(uri[, mode, tiledb_timestamp, context, ...])

Opens this specific type of SOMA object.

read([coords, result_order, batch_size, ...])

Reads a user-defined slice of the SparseNDArray.

reshape(shape)

Unsupported operation for this object type.

used_shape()

Retrieves the range of indices in each dimension that were explicitly written.

verify_open_for_writing()

Raises an error if the object is not open for writing.

write(values, *[, platform_config])

Writes an Arrow object to the SparseNDArray.
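As a note on non_empty_domain() in the list above: the idea of "smallest and largest indices in each dimension" can be sketched in plain Python. This is a conceptual model only, not the library's implementation; the function name `non_empty_domain_sketch` and the sample coordinates are hypothetical, and the handling of an array with no stored data is not modelled here.

```python
def non_empty_domain_sketch(coords):
    """Return one (min, max) pair per dimension over the stored coordinates.

    `coords` is a list of equal-length index tuples, one per stored element.
    """
    per_dimension = zip(*coords)  # regroup indices by dimension
    return tuple((min(d), max(d)) for d in per_dimension)


# Three stored elements in a 2-D array (illustrative data).
coords = [(3, 7), (10, 2), (5, 5)]
domain = non_empty_domain_sketch(coords)
print(domain)
```

The result is bounded by the data that was stored, not by the declared shape of the array, so it can be much smaller than the shape.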

Attributes

closed

True if the object has been closed.

context

A value storing implementation-specific configuration information.

is_sparse

True if the array is sparse, False if it is dense.

metadata

The metadata of this SOMA object.

mode

The mode this object was opened in, either r or w.

ndim

The number of dimensions in this array.

nnz

The number of stored values in the array, including explicitly stored zeros.

schema

Returns data schema, in the form of an Arrow Schema.

shape

Returns capacity of each dimension, always a list of length ndim.

soma_type

A string describing the SOMA type of this object.

tiledb_timestamp

The time, in UTC, at which this object was opened.

tiledb_timestamp_ms

The time this object was opened, in milliseconds since the Unix epoch.

uri

Accessor for the object's storage URI.
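As a note on the nnz attribute listed above: because it counts stored values, including explicitly stored zeros, it can exceed the number of values that are actually non-zero. The following pure-Python sketch models the distinction with a dictionary of stored elements. It does not call tiledbsoma, and the variable names and data are hypothetical.

```python
# Stored elements of a sparse 2-D array, keyed by coordinate (illustrative data).
# The element at (0, 3) is a zero that was written explicitly.
stored = {
    (0, 0): 1.0,
    (0, 3): 0.0,
    (2, 1): 4.5,
}

# Count of stored values, in the sense described for nnz.
stored_count = len(stored)

# Count of values that are numerically non-zero.
nonzero_count = sum(1 for value in stored.values() if value != 0)

print(stored_count, nonzero_count)
```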