TileDB Python API Reference

Warning

The Python interface to TileDB is still under development and the API is subject to change.

Modules

Typical usage of the Python interface to TileDB will use the top-level module tiledb, e.g.

import tiledb

There is also a submodule libtiledb which contains the necessary bindings to the underlying TileDB native library. Most of the time you will not need to interact with tiledb.libtiledb unless you need native-library specific information, e.g. the version number:

import tiledb
tiledb.libtiledb.version()  # Native TileDB library version number

Exceptions

exception tiledb.TileDBError

TileDB Error Exception

Captures and raises error return code (TILEDB_ERR) messages when calling libtiledb functions. The error message that is raised is the last error set for the tiledb.Ctx.

A Python MemoryError is raised on TILEDB_OOM.

message

The TileDB error message string

Return type:str
Returns:error message

Context

class tiledb.Ctx(config=None)

Class representing a TileDB context.

A TileDB context wraps a TileDB storage manager.

Parameters:config (tiledb.Config or dict) – Initialize Ctx with given config parameters
config(self)

Returns the Config instance associated with the Ctx

Config

class tiledb.Config(params=None, path=None)

TileDB Config class

Valid parameters (unknown parameters will be ignored):

  • sm.tile_cache_size
    The tile cache size in bytes. Any uint64_t value is acceptable. Default: 10,000,000
  • sm.array_schema_cache_size
    The array schema cache size in bytes. Any uint64_t value is acceptable. Default: 10,000,000
  • sm.fragment_metadata_cache_size
    The fragment metadata cache size in bytes. Any uint64_t value is acceptable. Default: 10,000,000
  • sm.enable_signal_handlers
    Whether or not TileDB will install signal handlers. Default: true
  • sm.number_of_threads
    The number of allocated threads per TileDB context. Default: number of cores
  • vfs.max_parallel_ops
    The maximum number of VFS parallel operations. Default: number of cores
  • vfs.min_parallel_size
    The minimum number of bytes in a parallel VFS operation. (Does not affect parallel S3 writes.) Default: 10MB
  • vfs.s3.region
    The S3 region, if S3 is enabled. Default: us-east-1
  • vfs.s3.scheme
    The S3 scheme (http or https), if S3 is enabled. Default: https
  • vfs.s3.endpoint_override
    The S3 endpoint, if S3 is enabled. Default: “”
  • vfs.s3.use_virtual_addressing
    The S3 use of virtual addressing (true or false), if S3 is enabled. Default: true
  • vfs.s3.multipart_part_size
    The part size (in bytes) used in S3 multipart writes, if S3 is enabled. Any uint64_t value is acceptable. Note: vfs.s3.multipart_part_size * vfs.max_parallel_ops bytes will be buffered before issuing multipart uploads in parallel. Default: 5*1024*1024
  • vfs.s3.connect_timeout_ms
    The connection timeout in ms. Any long value is acceptable. Default: 3000
  • vfs.s3.connect_max_tries
    The maximum tries for a connection. Any long value is acceptable. Default: 5
  • vfs.s3.connect_scale_factor
    The scale factor for exponential backoff when connecting to S3. Any long value is acceptable. Default: 25
  • vfs.s3.request_timeout_ms
    The request timeout in ms. Any long value is acceptable. Default: 3000
  • vfs.hdfs.name_node
    Name node for HDFS. Default: “”
  • vfs.hdfs.username
    HDFS username. Default: “”
  • vfs.hdfs.kerb_ticket_cache_path
    HDFS kerb ticket cache path. Default: “”
Parameters:
  • params (dict) – Set parameter values from dict like object
  • path (str) – Set parameter values from persisted Config parameter file
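The note on vfs.s3.multipart_part_size above implies a simple memory estimate. The following is a minimal sketch of that arithmetic in plain Python. The helper name multipart_buffer_bytes and the value 8 for vfs.max_parallel_ops are assumptions for illustration (the documented default is the number of cores); nothing here calls TileDB.

```python
def multipart_buffer_bytes(part_size, max_parallel_ops):
    """Bytes buffered before multipart uploads are issued in parallel."""
    return part_size * max_parallel_ops


# Documented default part size; 8 parallel operations is an assumed core count.
default_part_size = 5 * 1024 * 1024
buffered = multipart_buffer_bytes(default_part_size, 8)
print(buffered)  # 41943040 bytes, i.e. 40 MiB
```

Raising either parameter increases the amount of data held in memory before an upload starts, so the two are best tuned together.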
clear(self)

Unsets all Config parameters (returns them to their default values)

dict(self, prefix=u'')

Returns a dict representation of a Config object

Parameters:prefix (str) – return only parameters with a given prefix
Return type:dict
Returns:Config parameter / values as a Python dict
from_file(self, path)

Update a Config object from a persisted config file

Parameters:path – A local Config file path
get(self, key, *args)

Gets the value of a config parameter, or a default value.

Parameters:
  • key (str) – Config parameter
  • args – return arg if Config does not contain parameter key
Returns:

Parameter value, arg or None.
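
The return contract of get ("Parameter value, arg or None") can be modelled with a plain dict. This is a sketch of the documented behaviour only, not the library's implementation. The function name config_get is hypothetical, and rejecting more than one fallback argument is an assumption.

```python
def config_get(params, key, *args):
    """Return the value for key, else the fallback argument, else None."""
    if len(args) > 1:
        # Assumption: at most one fallback value makes sense.
        raise TypeError("expected at most one default argument")
    if key in params:
        return params[key]
    return args[0] if args else None


params = {"sm.tile_cache_size": "10000000"}
print(config_get(params, "sm.tile_cache_size"))         # 10000000
print(config_get(params, "vfs.s3.region", "us-east-1"))  # us-east-1
print(config_get(params, "vfs.s3.region"))               # None
```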

items(self, prefix=u'')

Returns an iterator object over Config parameters, values

Parameters:prefix (str) – return only parameters with a given prefix
Return type:ConfigItems
Returns:iterator over Config parameter, value tuples
keys(self, prefix=u'')

Returns an iterator object over Config parameters (keys)

Parameters:prefix (str) – return only parameters with a given prefix
Return type:ConfigKeys
Returns:Iterator over Config parameter string keys
static load(uri)

Constructs a Config class instance from config parameters loaded from a local Config file

Parameters:uri (str) – a local URI config file path
Return type:tiledb.Config
Returns:A TileDB Config instance with persisted parameter values
Raises:TypeError – uri cannot be converted to a unicode string
Raises:tiledb.TileDBError
save(self, uri)

Persist Config parameter values to a config file

Parameters:uri (str) – a local URI config file path
Raises:TypeError – uri cannot be converted to a unicode string
Raises:tiledb.TileDBError
update(self, odict)

Update a config object with parameter, values from a dict like object

Parameters:odict – dict-like object containing parameter, values to update Config.
values(self, prefix=u'')

Returns an iterator object over Config values

Parameters:prefix (str) – return only parameters with a given prefix
Return type:ConfigValues
Returns:Iterator over Config string values

Array Schema

class tiledb.ArraySchema(Ctx ctx, domain=None, attrs=(), cell_order='row-major', tile_order='row-major', capacity=0, coords_compressor=None, offsets_compressor=None, sparse=False)

Schema class for TileDB dense / sparse array representations

Parameters:
  • ctx (tiledb.Ctx) – A TileDB Context
  • attrs (tuple(tiledb.Attr, ..)) – one or more array attributes
  • cell_order ('row-major' or 'C', 'col-major' or 'F') – TileDB label for cell layout
  • tile_order ('row-major' or 'C', 'col-major' or 'F', 'unordered') – TileDB label for tile layout
  • capacity (int) – tile cell capacity
  • coords_compressor (tuple(str, int)) – compressor label, level for (sparse) coordinates
  • offsets_compressor – compressor label, level for varnum attribute cells
  • sparse (bool) – True if schema is sparse, else False (set by SparseArray and DenseArray derived classes)
Raises:

TypeError – cannot convert uri to unicode string

Raises:

tiledb.TileDBError

attr(self, key)

Returns an Attr instance given an int index or string label

Parameters:key (int or str) – attribute index (positional or associative)
Return type:tiledb.Attr
Returns:The ArraySchema attribute at index or with the given name (label)
Raises:TypeError – invalid key type
capacity

The array capacity

Return type:int
Raises:tiledb.TileDBError
cell_order

The cell order layout of the array.

coords_compressor

The compressor label and level for the array’s coordinates.

Return type:tuple(str, int)
Raises:tiledb.TileDBError
domain

The Domain associated with the array.

Return type:tiledb.Domain
Raises:tiledb.TileDBError
dump(self)

Dumps a string representation of the array object to standard output (stdout)

static load(Ctx ctx, uri)
nattr

The number of array attributes.

Return type:int
Raises:tiledb.TileDBError
ndim

The number of array domain dimensions.

Return type:int
offsets_compressor

The compressor label and level for the array’s variable-length attribute offsets.

Return type:tuple(str, int)
Raises:tiledb.TileDBError
shape

The array’s shape

Return type:tuple(numpy scalar, numpy scalar)
Raises:TypeError – floating point (inexact) domain
sparse

True if the array is a sparse array representation

Return type:bool
Raises:tiledb.TileDBError
tile_order

The tile order layout of the array.

Return type:str
Raises:tiledb.TileDBError

Attribute

class tiledb.Attr(Ctx ctx, name=u'', dtype=np.float64, compressor=None)

Class representing a TileDB array attribute.

Parameters:
  • ctx (tiledb.Ctx) – A TileDB Context
  • name (str) – Attribute name, empty if anonymous
  • dtype (numpy.dtype object or type or string) – Attribute value datatypes
  • compressor (tuple(str, int)) –

    The compressor name and level for attribute values. Available compressors:

    • ”gzip”
    • ”zstd”
    • ”lz4”
    • ”blosc-lz”
    • ”blosc-lz4”
    • ”blosc-lz4hc”
    • ”blosc-snappy”
    • ”blosc-zstd”
    • ”rle”
    • ”bzip2”
    • ”double-delta”
Raises:

TypeError – invalid dtype

Raises:

tiledb.TileDBError

compressor

String label of the attribute’s compressor and compressor level

Return type:tuple(str, int)
Raises:tiledb.TileDBError
dtype

Return numpy dtype object representing the Attr type

Return type:numpy.dtype
dump(self)

Dumps a string representation of the Attr object to standard output (stdout)

isanon

True if attribute is an anonymous attribute

Return type:bool
isvar

True if the attribute is variable length

Return type:bool
Raises:tiledb.TileDBError
name

Attribute string name, empty string if the attribute is anonymous

Return type:str
Raises:tiledb.TileDBError
ncells

The number of cells (scalar values) for a given attribute value

Return type:int
Raises:tiledb.TileDBError

Dimension

class tiledb.Dim(Ctx ctx, name=u'', domain=None, tile=0, dtype=np.uint64)

Class representing a dimension of a TileDB Array.

Parameters:
Dtype:

the Dim numpy dtype object, type object, or string that can be coerced into a numpy dtype object

Raises:
  • ValueError – invalid domain or tile extent
  • TypeError – invalid domain, tile extent, or dtype type
Raises:

TileDBError

domain

The dimension (inclusive) domain.

The dimension’s domain is defined by a (lower bound, upper bound) tuple.

Return type:tuple(numpy scalar, numpy scalar)
dtype

Numpy dtype representation of the dimension type.

Return type:numpy.dtype
isanon

True if the dimension is anonymous

Return type:bool
name

The dimension label string.

Anonymous dimensions return a default string representation based on the dimension index.

Return type:str
shape

The shape of the dimension given the dimension’s domain.

Note: The shape is only valid for integer dimension domains.

Return type:tuple(numpy scalar, numpy scalar)
Raises:TypeError – floating point (inexact) domain
size

The size of the dimension domain (number of cells along dimension).

Return type:int
Raises:TypeError – floating point (inexact) domain
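
Because a dimension's domain is inclusive on both ends, its size is one more than the difference of the bounds. Here is a minimal plain-Python sketch of that rule, including the documented TypeError for floating point domains. The function name dim_size is hypothetical and the code does not call TileDB.

```python
def dim_size(lower, upper):
    """Number of cells along a dimension with an inclusive integer domain."""
    if isinstance(lower, float) or isinstance(upper, float):
        # Mirrors the documented TypeError for floating point (inexact) domains.
        raise TypeError("size is undefined for a floating point domain")
    return upper - lower + 1


print(dim_size(0, 9))   # 10
print(dim_size(-5, 5))  # 11
```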
tile

The tile extent of the dimension.

Return type:numpy scalar

Domain

class tiledb.Domain(Ctx ctx, *dims)

Class representing the domain of a TileDB Array.

Parameters:
  • ctx (tiledb.Ctx) – A TileDB Context
  • *dims

    one or more tiledb.Dim objects up to the Domain’s ndim

Raises:

TypeError – All dimensions must have the same dtype

Raises:

TileDBError

dim(self, int idx)

Returns a Dim object from the domain given the dimension’s index.

Parameters:idx (int) – dimension index
Raises:tiledb.TileDBError
dtype

The numpy dtype of the domain’s dimension type.

Return type:numpy.dtype
dump(self)

Dumps a string representation of the domain object to standard output (stdout)

ndim

The number of dimensions of the domain.

Return type:int
shape

The domain’s shape, valid only for integer domains.

Return type:tuple
Raises:TypeError – floating point (inexact) domain
size

The domain’s size (number of cells), valid only for integer domains.

Return type:int
Raises:TypeError – floating point (inexact) domain
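
For an integer domain, the shape collects the per-dimension cell counts and the size is their product. The sketch below illustrates this with plain tuples standing in for Dim domains. The helper names domain_shape and domain_size are hypothetical.

```python
import math


def domain_shape(dim_domains):
    """Cells per dimension, given inclusive (lower, upper) integer bounds."""
    return tuple(upper - lower + 1 for lower, upper in dim_domains)


def domain_size(dim_domains):
    """Total number of cells in the domain."""
    return math.prod(domain_shape(dim_domains))


dims = [(0, 1), (0, 9)]
print(domain_shape(dims))  # (2, 10)
print(domain_size(dims))   # 20
```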

Array

class tiledb.libtiledb.Array(Ctx ctx, uri, mode='r')

Base class for TileDB array objects.

Defines common properties/functionality for the different array types. When an Array instance is initialized, the array is opened with the specified mode.

Parameters:
  • ctx (Ctx) – TileDB context
  • uri (str) – URI of array to open
  • mode (str) – Mode (‘r’ or ‘w’) to open the array with.
attr(self, key)

Returns an Attr instance given an int index or string label

Parameters:key (int or str) – attribute index (positional or associative)
Return type:Attr
Returns:The array attribute at index or with the given name (label)
Raises:TypeError – invalid key type
close(self)

Closes this array, flushing all buffered data.

consolidate(self)

Consolidates fragments of an array object for increased read performance.

Raises:tiledb.TileDBError
coords_dtype

Returns the numpy record array dtype of the SparseArray coordinates

Return type:numpy.dtype
Returns:coord array record dtype
static create(uri, ArraySchema schema)

Creates a persistent TileDB Array at the given URI

Parameters:
  • uri (str) – URI at which to create the new empty array.
  • schema (ArraySchema) – Schema for the array
domain

The Domain of this array.

dump(self)
isopen

True if this array is currently open.

mode

The mode this array was opened with.

nattr

The number of attributes of this array.

ndim

The number of dimensions of this array.

nonempty_domain(self)

Return the minimum bounding domain which encompasses nonempty values.

Return type:tuple(tuple(numpy scalar, numpy scalar), ..)
Returns:A list of (inclusive) domain extent tuples, that contain all nonempty cells
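
Conceptually, the nonempty domain is the smallest inclusive bounding box around the cells that hold data. Below is a minimal sketch of that idea over a list of coordinate tuples. The function name bounding_domain is hypothetical, and returning None when there are no cells is an assumption, not documented behaviour.

```python
def bounding_domain(coords):
    """Smallest inclusive (lower, upper) extent per dimension covering coords."""
    if not coords:
        return None  # assumption: nothing written, so no bounding domain
    return tuple((min(axis), max(axis)) for axis in zip(*coords))


print(bounding_domain([(0, 0), (2, 3)]))  # ((0, 2), (0, 3))
```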
reopen(self)

Reopens this array.

This is useful when the array is updated after it was opened. To sync-up with the updates, the user must either close the array and open again, or just use reopen() without closing. Reopening will be generally faster than the former alternative.

schema

The ArraySchema for this array.

shape

The shape of this array.

subarray(self, selection, coords=False, attrs=None, order=None)
tiledb.consolidate(Ctx ctx, path)

Consolidates updates to a TileDB Array for improved read performance

Parameters:
  • ctx (tiledb.Ctx) – The TileDB Context
  • path (str) – path (URI) to the TileDB Array
Return type:

str

Returns:

path (URI) to the consolidated TileDB Array

Raises:

TypeError – cannot convert path to unicode string

Raises:

tiledb.TileDBError

Dense Array

class tiledb.DenseArray(*args, **kw)

Class representing a dense TileDB array.

Inherits properties and methods of tiledb.Array.

__getitem__(selection)

Retrieve data cells for an item or region of the array.

Parameters:selection (tuple) – An int index, slice or tuple of integer/slice objects, specifying the selected subarray region for each dimension of the DenseArray.
Return type:numpy.ndarray or collections.OrderedDict
Returns:If the dense array has a single attribute then a Numpy array of corresponding shape/dtype is returned for that attribute. If the array has multiple attributes, a collections.OrderedDict is returned with dense Numpy subarrays for each attribute.
Raises:IndexError – invalid or unsupported index selection
Raises:tiledb.TileDBError

Example:

>>> import tiledb, numpy as np, tempfile
>>> ctx = tiledb.Ctx()
>>> with tempfile.TemporaryDirectory() as tmp:
...     # Creates array 'array' on disk.
...     A = tiledb.DenseArray.from_numpy(ctx, tmp + "/array",  np.ones((100, 100)))
...     # Many aspects of Numpy's fancy indexing are supported:
...     A[1:10, ...].shape
...     A[1:10, 20:99].shape
...     A[1, 2].shape
(9, 100)
(9, 79)
()
>>> # Subselect on attributes when reading:
>>> with tempfile.TemporaryDirectory() as tmp:
...     dom = tiledb.Domain(ctx, tiledb.Dim(ctx, domain=(0, 9), tile=2, dtype=np.uint64))
...     schema = tiledb.ArraySchema(ctx, domain=dom,
...         attrs=(tiledb.Attr(ctx, name="a1", dtype=np.int64),
...                tiledb.Attr(ctx, name="a2", dtype=np.int64)))
...     tiledb.DenseArray.create(tmp + "/array", schema)
...     with tiledb.DenseArray(ctx, tmp + "/array", mode='w') as A:
...         A[0:10] = {"a1": np.zeros((10)), "a2": np.ones((10))}
...     with tiledb.DenseArray(ctx, tmp + "/array", mode='r') as A:
...         # Access specific attributes individually.
...         A[0:5]["a1"]
...         A[0:5]["a2"]
array([0, 0, 0, 0, 0])
array([1, 1, 1, 1, 1])
__setitem__(selection, value)

Set / update dense data cells

Parameters:
  • selection (tuple) – An int index, slice or tuple of integer/slice objects, specifying the selected subarray region for each dimension of the DenseArray.
  • value (dict or numpy.ndarray) – a dictionary of array attribute values; values must be able to be converted to n-d numpy arrays. If the number of attributes is one, then an n-d numpy array is accepted.
Raises:
  • IndexError – invalid or unsupported index selection
  • ValueError – value / coordinate length mismatch
Raises:

tiledb.TileDBError

Example:

>>> import tiledb, numpy as np, tempfile
>>> ctx = tiledb.Ctx()
>>> # Write to single-attribute 2D array
>>> with tempfile.TemporaryDirectory() as tmp:
...     # Create an array initially with all zero values
...     with tiledb.DenseArray.from_numpy(ctx, tmp + "/array",  np.zeros((2, 2))) as A:
...         pass
...     with tiledb.DenseArray(ctx, tmp + "/array", mode='w') as A:
...         # Write to the single (anonymous) attribute
...         A[:] = np.array(([1,2], [3,4]))
>>>
>>> # Write to multi-attribute 2D array
>>> with tempfile.TemporaryDirectory() as tmp:
...     dom = tiledb.Domain(ctx,
...         tiledb.Dim(ctx, domain=(0, 1), tile=2, dtype=np.uint64),
...         tiledb.Dim(ctx, domain=(0, 1), tile=2, dtype=np.uint64))
...     schema = tiledb.ArraySchema(ctx, domain=dom,
...         attrs=(tiledb.Attr(ctx, name="a1", dtype=np.int64),
...                tiledb.Attr(ctx, name="a2", dtype=np.int64)))
...     tiledb.DenseArray.create(tmp + "/array", schema)
...     with tiledb.DenseArray(ctx, tmp + "/array", mode='w') as A:
...         # Write to each attribute
...         A[0:2, 0:2] = {"a1": np.array(([-3, -4], [-5, -6])),
...                        "a2": np.array(([1, 2], [3, 4]))}
static from_numpy(Ctx ctx, uri, ndarray array, **kw)

Persists a given numpy array as a TileDB DenseArray, returns a readonly DenseArray class instance.

Parameters:
  • ctx (tiledb.Ctx) – A TileDB Context
  • uri (str) – URI for the TileDB array resource
  • array (numpy.ndarray) – dense numpy array to persist
  • **kw – additional arguments to pass to the DenseArray constructor
Return type:

tiledb.DenseArray

Returns:

A DenseArray with a single anonymous attribute

Raises:

TypeError – cannot convert uri to unicode string

Raises:

tiledb.TileDBError

Example:

>>> import tiledb, numpy as np, tempfile
>>> ctx = tiledb.Ctx()
>>> with tempfile.TemporaryDirectory() as tmp:
...     # Creates array 'array' on disk.
...     A = tiledb.DenseArray.from_numpy(ctx, tmp + "/array",  np.array([1.0, 2.0, 3.0]))
query(self, attrs=None, coords=False, order='C')

Construct a proxy Query object for easy subarray queries of cells for an item or region of the array across one or more attributes.

Optionally subselect over attributes, return dense result coordinate values, and specify a result layout / cell-order.

Parameters:
  • attrs – the DenseArray attributes to subselect over. If attrs is None (default) all array attributes will be returned. Array attributes can be defined by name or by positional index.
  • coords – if True, return array of coordinate value (default False).
  • order – ‘C’, ‘F’, or ‘G’ (row-major, col-major, tiledb global order)
Returns:

A proxy Query object that can be used for indexing into the DenseArray over the defined attributes, in the given result layout (order).

Raises:

ValueError – array is not opened for reads (mode = ‘r’)

Raises:

tiledb.TileDBError

Example:

>>> # Subselect on attributes when reading:
>>> with tempfile.TemporaryDirectory() as tmp:
...     dom = tiledb.Domain(ctx, tiledb.Dim(ctx, domain=(0, 9), tile=2, dtype=np.uint64))
...     schema = tiledb.ArraySchema(ctx, domain=dom,
...         attrs=(tiledb.Attr(ctx, name="a1", dtype=np.int64),
...                tiledb.Attr(ctx, name="a2", dtype=np.int64)))
...     tiledb.DenseArray.create(tmp + "/array", schema)
...     with tiledb.DenseArray(ctx, tmp + "/array", mode='w') as A:
...         A[0:10] = {"a1": np.zeros((10)), "a2": np.ones((10))}
...     with tiledb.DenseArray(ctx, tmp + "/array", mode='r') as A:
...         # Access specific attributes individually.
...         A.query(attrs=("a1",))[0:5]
array([0, 0, 0, 0, 0])
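
The order parameter controls how result cells are laid out. The sketch below shows the difference between row-major ('C') and col-major ('F') ordering on a nested list. The function name flatten is hypothetical, and the TileDB global order ('G') is not modelled because it depends on the array's tile layout.

```python
def flatten(rows, order="C"):
    """Flatten a 2D list in row-major ('C') or col-major ('F') order."""
    if order == "C":
        return [value for row in rows for value in row]
    if order == "F":
        return [value for column in zip(*rows) for value in column]
    raise ValueError("order must be 'C' or 'F'")


cells = [[1, 2], [3, 4]]
print(flatten(cells, "C"))  # [1, 2, 3, 4]
print(flatten(cells, "F"))  # [1, 3, 2, 4]
```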
read_direct(self, unicode name=None)

Read attribute directly with minimal overhead, returns a numpy ndarray over the entire domain

Parameters:attr_name (str) – read directly to an attribute name (default <anonymous>)
Return type:numpy.ndarray
Returns:numpy.ndarray of attr_name values over the entire array domain
Raises:tiledb.TileDBError
subarray(self, selection, coords=False, attrs=None, order=None)

Retrieve data cells for an item or region of the array.

Optionally subselect over attributes, return dense result coordinate values, and specify a result layout / cell-order.

Parameters:
  • selection – tuple of scalar and/or slice objects
  • coords – if True, return array of coordinate value (default False).
  • attrs – the DenseArray attributes to subselect over. If attrs is None (default) all array attributes will be returned. Array attributes can be defined by name or by positional index.
  • order – ‘C’, ‘F’, or ‘G’ (row-major, col-major, tiledb global order)
Returns:

If the dense array has a single attribute then a Numpy array of corresponding shape/dtype is returned for that attribute. If the array has multiple attributes, a collections.OrderedDict is returned with dense Numpy subarrays for each attribute.

Raises:

IndexError – invalid or unsupported index selection

Raises:

tiledb.TileDBError

write_direct(self, ndarray array)

Write directly to the given array attribute with minimal checks. Assumes that the numpy array is the same shape as the array’s domain.

Parameters:array (np.ndarray) – Numpy contiguous dense array of the same dtype and shape and layout of the DenseArray instance
Raises:ValueError – array is not contiguous
Raises:tiledb.TileDBError

Sparse Array

class tiledb.SparseArray(*args, **kw)

Class representing a sparse TileDB array.

Inherits properties and methods of tiledb.Array.

__getitem__(selection)

Retrieve nonempty cell data for an item or region of the array

Parameters:selection (tuple) – An int index, slice or tuple of integer/slice objects, specifying the selected subarray region for each dimension of the SparseArray.
Return type:collections.OrderedDict
Returns:An OrderedDict is returned with “coords” coordinate values being the first key. “coords” is a Numpy record array representation of the coordinate values of non-empty attribute cells. Nonempty attribute values are returned as Numpy 1-d arrays.
Raises:IndexError – invalid or unsupported index selection
Raises:tiledb.TileDBError

Example:

>>> import tiledb, numpy as np, tempfile
>>> ctx = tiledb.Ctx()
>>> # Write to multi-attribute 2D array
>>> with tempfile.TemporaryDirectory() as tmp:
...     dom = tiledb.Domain(ctx,
...         tiledb.Dim(ctx, name="y", domain=(0, 9), tile=2, dtype=np.uint64),
...         tiledb.Dim(ctx, name="x", domain=(0, 9), tile=2, dtype=np.uint64))
...     schema = tiledb.ArraySchema(ctx, domain=dom, sparse=True,
...         attrs=(tiledb.Attr(ctx, name="a1", dtype=np.int64),
...                tiledb.Attr(ctx, name="a2", dtype=np.int64)))
...     tiledb.SparseArray.create(tmp + "/array", schema)
...     with tiledb.SparseArray(ctx, tmp + "/array", mode='w') as A:
...         # Write in the two cells (0,0) and (2,3) only.
...         I, J = [0, 2], [0, 3]
...         # Write to each attribute
...         A[I, J] = {"a1": np.array([1, 2]),
...                    "a2": np.array([3, 4])}
...     with tiledb.SparseArray(ctx, tmp + "/array", mode='r') as A:
...         # Return an OrderedDict with cell coordinates
...         A[0:3, 0:10]
...         # Return the NumpyRecord array of TileDB cell coordinates
...         A[0:3, 0:10]["coords"]
...         # Return just the "x" coordinates values
...         A[0:3, 0:10]["coords"]["x"]
OrderedDict([('coords', array([(0, 0), (2, 3)], dtype=[('y', '<u8'), ('x', '<u8')])), ('a1', array([1, 2])), ('a2', array([3, 4]))])
array([(0, 0), (2, 3)], dtype=[('y', '<u8'), ('x', '<u8')])
array([0, 3], dtype=uint64)

With a floating-point array domain, index bounds are inclusive, e.g.:

>>> # Return nonempty cells within a floating point array domain (fp index bounds are inclusive):
>>> # A[5.0:579.9]
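
The read semantics above can be sketched in plain Python: only non-empty cells inside the region are returned, and "coords" is the first key of the result. Integer ranges are treated as half-open like Python slices, while the inclusive flag models the floating-point case. The function name select_cells and its arguments are hypothetical, and lists stand in for Numpy arrays.

```python
from collections import OrderedDict


def select_cells(coords, attrs, region, inclusive=False):
    """Return non-empty cells inside region, with "coords" as the first key.

    region holds one (start, stop) pair per dimension.
    """
    def inside(point):
        return all(lo <= x <= hi if inclusive else lo <= x < hi
                   for x, (lo, hi) in zip(point, region))

    keep = [i for i, point in enumerate(coords) if inside(point)]
    result = OrderedDict(coords=[coords[i] for i in keep])
    for name, values in attrs.items():
        result[name] = [values[i] for i in keep]
    return result


cells = select_cells([(0, 0), (2, 3)], {"a1": [1, 2], "a2": [3, 4]},
                     region=[(0, 3), (0, 10)])
print(list(cells.keys()))  # ['coords', 'a1', 'a2']
print(cells["coords"])     # [(0, 0), (2, 3)]

# Floating-point bounds are inclusive, so the cell at 579.9 is returned.
fp = select_cells([(5.0,), (579.9,)], {"a1": [1, 2]},
                  region=[(5.0, 579.9)], inclusive=True)
print(fp["a1"])            # [1, 2]
```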
__setitem__(selection, value)

Set / update sparse data cells

Parameters:
  • selection (tuple) – N coordinate value arrays (dim0, dim1, …) where N is the ndim of the SparseArray. The format follows numpy sparse (point) indexing semantics.
  • value (dict or numpy.ndarray) – a dictionary of nonempty array attribute values; values must be able to be converted to 1-d numpy arrays. If the number of attributes is one, then a 1-d numpy array is accepted.
Raises:
  • IndexError – invalid or unsupported index selection
  • ValueError – value / coordinate length mismatch
Raises:

tiledb.TileDBError

Example:

>>> import tiledb, numpy as np, tempfile
>>> ctx = tiledb.Ctx()
>>> # Write to multi-attribute 2D array
>>> with tempfile.TemporaryDirectory() as tmp:
...     dom = tiledb.Domain(ctx,
...         tiledb.Dim(ctx, domain=(0, 1), tile=2, dtype=np.uint64),
...         tiledb.Dim(ctx, domain=(0, 1), tile=2, dtype=np.uint64))
...     schema = tiledb.ArraySchema(ctx, domain=dom, sparse=True,
...         attrs=(tiledb.Attr(ctx, name="a1", dtype=np.int64),
...                tiledb.Attr(ctx, name="a2", dtype=np.int64)))
...     tiledb.SparseArray.create(tmp + "/array", schema)
...     with tiledb.SparseArray(ctx, tmp + "/array", mode='w') as A:
...         # Write in the corner cells (0,0) and (1,1) only.
...         I, J = [0, 1], [0, 1]
...         # Write to each attribute
...         A[I, J] = {"a1": np.array([1, 2]),
...                    "a2": np.array([3, 4])}
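
A sparse write pairs one coordinate per dimension with one value per attribute, so all of these sequences must have the same length. The following plain-Python sketch shows the kind of check that would produce the documented ValueError for a value / coordinate length mismatch. The function name check_sparse_write is hypothetical and this is not the library's validation code.

```python
def check_sparse_write(coords, values):
    """Return the number of cells, or raise ValueError on a length mismatch.

    coords is one sequence per dimension; values maps each attribute
    name to a 1-d sequence.
    """
    lengths = {len(axis) for axis in coords}
    if len(lengths) != 1:
        raise ValueError("coordinate arrays differ in length")
    ncells = lengths.pop()
    for name, data in values.items():
        if len(data) != ncells:
            raise ValueError("attribute %r has %d values for %d coordinates"
                             % (name, len(data), ncells))
    return ncells


I, J = [0, 1], [0, 1]
print(check_sparse_write((I, J), {"a1": [1, 2], "a2": [3, 4]}))  # 2
```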
query(self, attrs=None, coords=True, order='C')

Construct a proxy Query object for easy subarray queries of cells for an item or region of the array across one or more attributes.

Optionally subselect over attributes, return dense result coordinate values, and specify a result layout / cell-order.

Parameters:
  • attrs – the SparseArray attributes to subselect over. If attrs is None (default) all array attributes will be returned. Array attributes can be defined by name or by positional index.
  • coords – if True, return array of coordinate value (default True).
  • order – ‘C’, ‘F’, or ‘G’ (row-major, col-major, tiledb global order)
Returns:

A proxy Query object that can be used for indexing into the SparseArray over the defined attributes, in the given result layout (order).

Example:

>>> import tiledb, numpy as np, tempfile
>>> ctx = tiledb.Ctx()
>>> # Write to multi-attribute 2D array
>>> with tempfile.TemporaryDirectory() as tmp:
...     dom = tiledb.Domain(ctx,
...         tiledb.Dim(ctx, name="y", domain=(0, 9), tile=2, dtype=np.uint64),
...         tiledb.Dim(ctx, name="x", domain=(0, 9), tile=2, dtype=np.uint64))
...     schema = tiledb.ArraySchema(ctx, domain=dom, sparse=True,
...         attrs=(tiledb.Attr(ctx, name="a1", dtype=np.int64),
...                tiledb.Attr(ctx, name="a2", dtype=np.int64)))
...     tiledb.SparseArray.create(tmp + "/array", schema)
...     with tiledb.SparseArray(ctx, tmp + "/array", mode='w') as A:
...         # Write in the two cells (0,0) and (2,3) only.
...         I, J = [0, 2], [0, 3]
...         # Write to each attribute
...         A[I, J] = {"a1": np.array([1, 2]),
...                    "a2": np.array([3, 4])}
...     with tiledb.SparseArray(ctx, tmp + "/array", mode='r') as A:
...         A.query(attrs=("a1",), coords=False, order='G')[0:3, 0:10]
subarray(self, selection, coords=True, attrs=None, order=None)

Retrieve coordinate and data cells for an item or region of the array.

Optionally subselect over attributes, return sparse result coordinate values, and specify a result layout / cell-order.

Parameters:
  • selection – tuple of scalar and/or slice objects
  • coords – if True, return array of coordinate value (default True).
  • attrs – the SparseArray attributes to subselect over. If attrs is None (default) all array attributes will be returned. Array attributes can be defined by name or by positional index.
  • order – ‘C’, ‘F’, or ‘G’ (row-major, col-major, tiledb global order)
Returns:

An OrderedDict is returned with “coords” coordinate values being the first key. “coords” is a Numpy record array representation of the coordinate values of non-empty attribute cells. Nonempty attribute values are returned as Numpy 1-d arrays.

Key-value Schema

class tiledb.KVSchema(Ctx ctx, domain=None, attrs=(), capacity=None)

Schema class for TileDB key-value (associative) arrays.

Note: Only string-valued attributes are currently supported on KVs.

Parameters:
Raises:

tiledb.TileDBError

attr(self, key)

Returns an Attr instance given an int index or string label

Parameters:key (int or str) – attribute index (positional or associative)
Return type:tiledb.Attr
Returns:The KVSchema attribute at index or with the given name (label)
Raises:TypeError – invalid key type
capacity

The KV array capacity

Return type:int
Raises:tiledb.TileDBError
dump(self)

Dumps a string representation of the array object to standard output (stdout)

static load(Ctx ctx, uri)

Loads a persisted KV array at given URI, returns a KV class instance

nattr

The number of KV attributes

Return type:int
Raises:tiledb.TileDBError

Key-value

class tiledb.KV(Ctx ctx, uri, attrs=(), buffered_items=None)

Class representing a TileDB KV (key-value) array.

Parameters:
  • ctx (Ctx) – A TileDB Context
  • uri (str) – URI to persistent KV resource
  • attrs (tuple(tiledb.Attr, ..)) – one or more KV value attributes
Raises:

TypeError – invalid uri or attrs type

Raises:

tiledb.TileDBError

attr(self, key)

Returns an Attr instance given an int index or string label

Parameters:key (int or str) – attribute index (positional or associative)
Return type:tiledb.Attr
Returns:The KV attribute at index or with the given name (label)
Raises:TypeError – invalid key type
consolidate(self)

Consolidates KV array updates for increased read performance

Raises:tiledb.TileDBError
static create(Ctx ctx, uri, KVSchema schema)

Creates a persistent KV at the given URI, returns a KV class instance

dict(self)

Return a dict representation of the KV array object

Return type:dict
Returns:Python dict of keys and attribute value (tuples)
flush(self)

Flush any buffered writes to the KV array.

nattr

The number of KV array attributes

Return type:int
Raises:tiledb.TileDBError
reopen(self)

Reopens a key-value store

Reopening the array is useful when there were updates to the key-value store after it got opened.

Raises:tiledb.TileDBError
update(self, *args, **kw)

Update a KV object from dict/iterable.

Has the same semantics as Python dict’s .update() method
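
For reference, these are the update forms that Python's dict.update() accepts. The sketch uses a plain dict as a stand-in for a KV with string values. Whether a KV accepts every form shown is an assumption based on the statement above and the (*args, **kw) signature.

```python
kv = {"key1": "a"}          # plain dict standing in for a KV
kv.update({"key2": "b"})    # from a mapping
kv.update([("key3", "c")])  # from an iterable of (key, value) pairs
kv.update(key1="z")         # keyword arguments overwrite existing keys
print(kv)  # {'key1': 'z', 'key2': 'b', 'key3': 'c'}
```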

Object Management

tiledb.group_create(Ctx ctx, path)

Create a TileDB Group object at the specified path (URI)

Parameters:
  • ctx (tiledb.Ctx) – The TileDB Context
  • path (str) – path (URI) of the TileDB Group to be created
Return type:

str

Returns:

The path (URI) of the created TileDB Group

Raises:

TypeError – cannot convert path to unicode string

Raises:

tiledb.TileDBError

tiledb.object_type(Ctx ctx, path)

Returns the TileDB object type at the specified path (URI)

Parameters:
  • ctx (tiledb.Ctx) – The TileDB Context
  • path (str) – path (URI) of the TileDB resource
Return type:

str

Returns:

object type string

Raises:

TypeError – cannot convert path to unicode string

tiledb.remove(Ctx ctx, path)

Removes (deletes) the TileDB object at the specified path (URI)

Parameters:
  • ctx (tiledb.Ctx) – The TileDB Context
  • path (str) – path (URI) of the TileDB resource
Raises:

TypeError – path cannot be converted to a unicode string

Raises:

tiledb.TileDBError

tiledb.move(Ctx ctx, old_path, new_path)

Moves a TileDB resource (group, array, key-value).

Parameters:
  • ctx (tiledb.Ctx) – The TileDB Context
  • old_path (str) – path (URI) of the TileDB resource to move
  • new_path (str) – path (URI) of the destination
Raises:

TypeError – path cannot be converted to a unicode string

Raises:

tiledb.TileDBError

tiledb.ls(Ctx ctx, path, func)

Lists the TileDB resources that have a prefix of path (one level deep) and applies a callback to each.

Parameters:
  • ctx (tiledb.Ctx) – TileDB context
  • path (str) – URI of TileDB group object
  • func (function) – callback to execute on every listed TileDB resource; the resource URI and the object type label are passed as arguments to the callback
Raises:

TypeError – cannot convert path to unicode string

Raises:

tiledb.TileDBError
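
The callback contract described above (resource URI first, object type label second) can be sketched with a small collector. The helper name make_collector is hypothetical, and the URIs and the label strings "array" and "group" are made-up sample values, not confirmed return values of the library.

```python
def make_collector():
    """Return (callback, results); the callback records each (uri, type) pair."""
    results = []

    def callback(uri, obj_type):
        # Called once per listed resource with its URI and object type label.
        results.append((uri, obj_type))

    return callback, results


callback, results = make_collector()

# With a real context this would be passed as the func argument, e.g.
#   tiledb.ls(ctx, "path/to/group", callback)
# Here the callback is invoked by hand with made-up values.
callback("file:///tmp/my_group/array_1", "array")
callback("file:///tmp/my_group/sub_group", "group")
```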

tiledb.walk(Ctx ctx, path, func, order='preorder')

Recursively visits the TileDB resources that have a prefix of path and applies a callback to each.

Parameters:
  • ctx (tiledb.Ctx) – The TileDB context
  • path (str) – URI of TileDB group object
  • func (function) – callback to execute on every visited TileDB resource; the resource URI and the object type label are passed as arguments to the callback
  • order (str) – ‘preorder’ (default) or ‘postorder’ tree traversal
Raises:

tiledb.TileDBError
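
The difference between the two order values can be shown without TileDB by walking a nested dict that stands in for a group hierarchy. This is a sketch of generic preorder and postorder traversal, not the library's implementation; walk_tree, the hierarchy, and the type labels are all hypothetical.

```python
def walk_tree(tree, path, func, order="preorder"):
    """Visit every entry below `path`; `tree` maps a name to a sub-dict or None."""
    if order not in ("preorder", "postorder"):
        raise ValueError("order must be 'preorder' or 'postorder'")
    for name, children in tree.items():
        child_path = path + "/" + name
        obj_type = "group" if children is not None else "array"
        if order == "preorder":
            func(child_path, obj_type)  # parent first, then its children
        if children is not None:
            walk_tree(children, child_path, func, order)
        if order == "postorder":
            func(child_path, obj_type)  # children first, then the parent


# Hypothetical hierarchy: group "g" holds array "a" and group "h" with array "b".
hierarchy = {"g": {"a": None, "h": {"b": None}}}

pre, post = [], []
walk_tree(hierarchy, "root", lambda p, t: pre.append(p), order="preorder")
walk_tree(hierarchy, "root", lambda p, t: post.append(p), order="postorder")
```

Postorder is the natural choice when the callback deletes or finalizes resources, since every child is handled before the group that contains it.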

VFS

class tiledb.VFS(Ctx ctx, config=None)

TileDB VFS class

Encapsulates the TileDB VFS module instance with a specific configuration (config).

Parameters:
  • ctx (tiledb.Ctx) – The TileDB Context
  • config (tiledb.Config or dict) – Override ctx VFS configurations with updated values in config.
close(self, FileHandle fh)

Closes a VFS FileHandle object

Parameters:fh (FileHandle) – An opened VFS FileHandle
Return type:FileHandle
Returns:closed VFS FileHandle
Raises:tiledb.TileDBError
config(self)

Returns the Config instance associated with the VFS

create_bucket(self, uri)

Create an object store bucket at the given URI

Parameters:uri (str) – full URI of bucket resource to be created.
Return type:str
Returns:created bucket URI
Raises:TypeError – cannot convert uri to unicode string
Raises:tiledb.TileDBError
create_dir(self, uri)

Create a VFS directory at the given URI

Parameters:uri (str) – URI of directory to be created
Return type:str
Returns:URI of created VFS directory
Raises:TypeError – cannot convert uri to unicode string
Raises:tiledb.TileDBError
empty_bucket(self, uri)

Empty an object store bucket of all objects at the given URI

This function blocks until all objects are verified to be removed from the given bucket.

Parameters:uri (str) – URI of bucket resource to be emptied
Raises:TypeError – cannot convert uri to unicode string
Raises:tiledb.TileDBError
file_size(self, uri)

Returns the size (in bytes) of a VFS file at the given URI

Parameters:uri (str) – URI of a VFS file resource
Return type:int
Returns:file size in number of bytes
Raises:TypeError – cannot convert uri to unicode string
Raises:tiledb.TileDBError
is_bucket(self, uri)

Returns True if the URI resource is a valid object store bucket

Parameters:uri (str) – URI of bucket resource
Return type:bool
Returns:True if given URI is a valid object store bucket, False otherwise
Raises:TypeError – cannot convert uri to unicode string
Raises:tiledb.TileDBError
is_dir(self, uri)

Returns True if the given URI is a VFS directory object

Parameters:uri (str) – URI of the directory resource
Return type:bool
Returns:True if uri is a VFS directory, False otherwise
Raises:TypeError – cannot convert uri to unicode string
Raises:tiledb.TileDBError
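
A common pattern is to guard create_dir with is_dir so that repeated calls are harmless. The sketch below assumes only the two method signatures documented above; ensure_dir is a hypothetical helper, and vfs is any object exposing those methods (for instance a tiledb.VFS instance).

```python
def ensure_dir(vfs, uri):
    """Create the directory at `uri` unless it already exists; return the URI."""
    if vfs.is_dir(uri):
        return uri
    # create_dir is documented to return the URI of the created directory.
    return vfs.create_dir(uri)
```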
is_empty_bucket(self, uri)

Returns True if the object store bucket is empty (contains no objects).

If the bucket is versioned, this returns the status of the latest bucket version state.

Parameters:uri (str) – URI of bucket resource
Return type:bool
Returns:True if bucket at given URI is empty, False otherwise
Raises:TypeError – cannot convert uri to unicode string
Raises:tiledb.TileDBError
is_file(self, uri)

Returns True if the given URI is a VFS file object

Parameters:uri (str) – URI of the file resource
Return type:bool
Returns:True if uri is a VFS file, False otherwise
Raises:TypeError – cannot convert uri to unicode string
Raises:tiledb.TileDBError
move_dir(self, old_uri, new_uri)

Moves a VFS dir from old URI to new URI

Parameters:
  • old_uri (str) – Existing VFS file or directory resource URI
  • new_uri (str) – URI to move existing VFS resource to
  • force (bool) – if VFS resource at new_uri exists, delete the resource and overwrite
Return type:

str

Returns:

new URI of VFS resource

Raises:

TypeError – cannot convert old_uri/new_uri to unicode string

Raises:

tiledb.TileDBError

move_file(self, old_uri, new_uri)

Moves a VFS file from old URI to new URI

Parameters:
  • old_uri (str) – Existing VFS file or directory resource URI
  • new_uri (str) – URI to move existing VFS resource to
  • force (bool) – if VFS resource at new_uri exists, delete the resource and overwrite
Return type:

str

Returns:

new URI of VFS resource

Raises:

TypeError – cannot convert old_uri/new_uri to unicode string

Raises:

tiledb.TileDBError

open(self, uri, mode=None)

Opens a VFS file resource for reading / writing / appends at URI

If the file did not exist upon opening, a new file is created.

Parameters:
  • uri (str) – URI of VFS file resource
  • mode (str) – ‘r’ for opening the file to read, ‘w’ to write, ‘a’ to append
Return type:

FileHandle

Returns:

VFS FileHandle

Raises:

tiledb.TileDBError

read(self, FileHandle fh, offset, nbytes)

Read nbytes from an opened VFS FileHandle at a given offset

Parameters:
  • fh (FileHandle) – An opened VFS FileHandle in ‘r’ mode
  • offset (int) – offset position in bytes to read from
  • nbytes (int) – number of bytes to read
Return type:

bytes

Returns:

read bytes

Raises:

tiledb.TileDBError
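
Since read takes an explicit offset and byte count, a large file can be consumed piece by piece. The generator below is a minimal sketch that relies only on the open, file_size, read and close signatures documented in this section; read_in_chunks and its chunk_size parameter are hypothetical names.

```python
def read_in_chunks(vfs, uri, chunk_size=1024):
    """Yield the file at `uri` in pieces of at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = vfs.file_size(uri)
    fh = vfs.open(uri, "r")
    try:
        offset = 0
        while offset < total:
            # Never request bytes past the end of the file.
            nbytes = min(chunk_size, total - offset)
            yield vfs.read(fh, offset, nbytes)
            offset += nbytes
    finally:
        # Close the handle even if the consumer stops early or an error occurs.
        vfs.close(fh)
```

Usage would look like b"".join(read_in_chunks(vfs, uri)), with vfs being a tiledb.VFS instance or any object with the same methods.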

readinto(self, FileHandle fh, bytes buffer, offset, nbytes)

Read nbytes from an opened VFS FileHandle at a given offset into a preallocated bytes buffer

Parameters:
  • fh (FileHandle) – An opened VFS FileHandle in ‘r’ mode
  • buffer (bytes) – A preallocated bytes buffer object
  • offset (int) – offset position in bytes to read from
  • nbytes (int) – number of bytes to read
Returns:

bytes buffer

Raises:

ValueError – invalid offset or nbytes values

Raises:

tiledb.TileDBError

remove_bucket(self, uri)

Remove an object store bucket at the given URI

Parameters:uri (str) – URI of bucket resource to be removed.
Raises:TypeError – cannot convert uri to unicode string
Raises:tiledb.TileDBError
Note

Consistency is not enforced for bucket removal, so although this function returns immediately on success, the actual removal of the bucket may take some (indeterminate) amount of time.
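
The bucket methods can be combined into a cautious removal routine. The sketch below uses only is_bucket, is_empty_bucket, empty_bucket and remove_bucket as documented in this section; purge_bucket is a hypothetical helper, and emptying the bucket before removing it is an assumption about what the storage backend requires.

```python
def purge_bucket(vfs, uri):
    """Empty and remove the bucket at `uri`; return False if it does not exist."""
    if not vfs.is_bucket(uri):
        return False
    if not vfs.is_empty_bucket(uri):
        vfs.empty_bucket(uri)  # blocks until the objects are verified removed
    vfs.remove_bucket(uri)     # the removal itself may complete later
    return True
```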
remove_dir(self, uri)

Removes a VFS directory at the given URI

Parameters:uri (str) – URI of the directory resource to remove
Raises:TypeError – cannot convert uri to unicode string
Raises:tiledb.TileDBError
remove_file(self, uri)

Removes a VFS file at the given URI

Parameters:uri (str) – URI of a VFS file resource
Raises:TypeError – cannot convert uri to unicode string
Raises:tiledb.TileDBError
supports(self, scheme)

Returns True if the given URI scheme (storage backend) is supported

Parameters:scheme (str) – scheme component of a VFS resource URI (ex. ‘file’ / ‘hdfs’ / ‘s3’)
Return type:bool
Returns:True if the linked libtiledb version supports the storage backend, False otherwise
Raises:ValueError – VFS storage backend is not supported
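
supports expects only the scheme component, not a full URI. One way to extract it is with the standard library's urlparse, as sketched below. The helper uri_scheme is hypothetical, and treating a scheme-less path as 'file' is an assumption made for this example.

```python
from urllib.parse import urlparse


def uri_scheme(uri, default="file"):
    """Return the scheme component of `uri`, or `default` for plain paths."""
    scheme = urlparse(uri).scheme
    return scheme if scheme else default


# With a real VFS instance the result would be passed on, e.g.
#   vfs.supports(uri_scheme("s3://my-bucket/my-array"))
print(uri_scheme("s3://my-bucket/my-array"))
```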
sync(self, FileHandle fh)

Sync / flush an opened VFS FileHandle to storage backend

Parameters:fh (FileHandle) – An opened VFS FileHandle in ‘w’ or ‘a’ mode
Raises:tiledb.TileDBError
touch(self, uri)

Creates an empty VFS file at the given URI

Parameters:uri (str) – URI of a VFS file resource
Return type:str
Returns:URI of touched VFS file
Raises:TypeError – cannot convert uri to unicode string
Raises:tiledb.TileDBError
write(self, FileHandle fh, buff)

Writes buffer to opened VFS FileHandle

Parameters:
  • fh (FileHandle) – An opened VFS FileHandle in ‘w’ mode
  • buff – a Python object that supports the byte buffer protocol
Raises:

TypeError – cannot convert buff to bytes

Raises:

tiledb.TileDBError
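
Writing through the VFS follows an open, write, close sequence. The helper below is a minimal sketch built from the open, write, close and file_size signatures documented in this section; write_bytes is a hypothetical name, and vfs is assumed to be a tiledb.VFS instance or an object with the same methods.

```python
def write_bytes(vfs, uri, payload):
    """Write `payload` to a new file at `uri` and return the resulting file size."""
    fh = vfs.open(uri, "w")
    try:
        vfs.write(fh, payload)
    finally:
        # Always release the handle, even if the write raises.
        vfs.close(fh)
    return vfs.file_size(uri)
```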

Version

tiledb.libtiledb.version()

Return the version of the linked libtiledb shared library

Return type:tuple
Returns:Semver version (major, minor, rev)
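
Because the version is returned as a tuple, it can be compared directly against a minimum requirement using Python's ordinary tuple ordering. The sketch below uses hard-coded sample tuples; meets_minimum is a hypothetical helper and the version numbers are made up.

```python
def meets_minimum(version, minimum):
    """Return True if a (major, minor, rev) tuple is at least `minimum`."""
    return tuple(version) >= tuple(minimum)


# With the real library the first argument would be tiledb.libtiledb.version().
ok = meets_minimum((1, 10, 0), (1, 9, 0))
print(ok)
```

Comparing tuples of integers avoids the pitfall of string comparison, where "1.10.0" would sort before "1.9.0".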

Statistics

tiledb.stats_enable()

Enable TileDB internal statistics.

tiledb.stats_disable()

Disable TileDB internal statistics.

tiledb.stats_reset()

Reset all TileDB internal statistics to 0.

tiledb.stats_dump()

Prints all TileDB internal statistics values to standard output.
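
The four statistics functions lend themselves to a context manager that scopes collection to one block of code. This is a sketch under stated assumptions: collected_stats is a hypothetical helper, tdb is expected to be the imported tiledb module (or any object exposing the four functions above), and the call order shown is one reasonable choice rather than a documented requirement.

```python
from contextlib import contextmanager


@contextmanager
def collected_stats(tdb):
    """Collect statistics for the duration of a block, then dump and disable them."""
    tdb.stats_reset()   # start from zeroed counters
    tdb.stats_enable()
    try:
        yield
    finally:
        tdb.stats_dump()     # print the collected values to standard output
        tdb.stats_disable()
```

With the real module this would be used as: with collected_stats(tiledb): followed by the reads or writes to be measured.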