Using TileDB Statistics

A lot of performance optimization for TileDB programs involves minimizing wasted work. TileDB comes with an internal statistics reporting system that can help identify potential areas of performance improvement for your TileDB programs, including reducing wasted work.

Full programs
Program Links
using_tiledb_stats statscpp statspy

The TileDB statistics can be enabled and disabled at runtime, and a report can be dumped at any point. A typical situation is to enable the statistics immediately before submitting a query, submit the query, and then immediately dump the report. This can be done like so:

C++

tiledb::Stats::enable();
query.submit();
tiledb::Stats::dump(stdout);
tiledb::Stats::disable();

Python

tiledb.stats_enable()
data = A[:]
tiledb.stats_dump()
tiledb.stats_disable()

With the dump call, a report containing the gathered statistics will be printed. The report prints values of many individual counters, followed by a high-level summary printed at the end of the report. Typically the summary contains the necessary information to make high-level performance tuning decisions. An example summary is shown below:

Summary:
--------
Hardware concurrency: 4
Reads:
  Read query submits: 1
  Tile cache hit ratio: 0 / 1  (0.0%)
  Fixed-length tile data copy-to-read ratio: 4 / 1000 bytes (0.0%)
  Var-length tile data copy-to-read ratio: 0 / 0 bytes
  Total tile data copy-to-read ratio: 4 / 1000 bytes (0.0%)
  Read compression ratio: 1245 / 1274 bytes (1.0x)
Writes:
  Write query submits: 0
  Tiles written: 0
  Write compression ratio: 0 / 0 bytes

Each item is explained separately below.

Hardware concurrency
The amount of available hardware-level concurrency (cores plus hardware threads).
Read query submits
The number of times a read query submit call was made.
Tile cache hit ratio
Ratio expressing utilization of the tile cache. The denominator indicates how many tiles in total were fetched from disk to satisfy a read query. After a tile is fetched once, it is placed in the internal tile cache; the numerator indicates how many tiles were requested in order to satisfy a read query that were hits in the tile cache. In the above example summary, a single tile was fetched (which missed the cache because it was the first access to that tile). If the same tile was accessed again for a subsequent query, it could hit in the cache, increasing the ratio 1/2. Higher ratios are better.
Fixed-length tile data copy-to-read ratio
Ratio expressing a measurement of wasted work for reads of fixed-length attributes. The denominator is the total number of (uncompressed) bytes of fixed-length attribute data fetched from disk. The numerator is the number of those bytes that were actually copied into the query buffers to satisfy a read query. In the above example, 1000 bytes of fixed-length data were read from disk and only 4 of those bytes were used to satisfy a read query, indicating a large amount of wasted work. Higher ratios are better.
Var-length tile data copy-to-read ratio
This ratio is the same as the previous, but for variable-length attribute data. This is tracked separately because variable-length attributes can often be read performance bottlenecks, as their size is by nature unpredictable. Higher ratios are better.
Total tile data copy-to-read ratio
The overall copy-to-read ratio, where the numerator is the sum of the fixed- and variable-length numerators, and the denominator is the sum of the fixed- and variable-length denominators.
Read compression ratio
Ratio expressing the effective compression ratio of data read from disk. The numerator is the total number of bytes returned from disk reads after decompressing. The denominator is the total number of bytes read from disk, whether compressed or decompressed. This is different than the tile copy-to-read ratios due to extra read operations for array and fragment metadata.
Write query submits
The number of times a write query submit call was made.
Tiles written
The total number of tiles written, over all write queries.
Write compression ratio
Ratio expressing the effective compression ratio of data written to disk. The numerator is the total number of uncompressed bytes requested to be written to disk. The denominator is the total number of bytes written from disk, after compression.

Read example

As a simple example, we will examine the effect of different dense array tiling configurations on read queries. Intuitively, the closer the tile shape aligns with the shape of read queries to the array, the higher performance we expect to see. This is because TileDB fetches from storage only tiles overlapping with the subarray query. The bigger the overlap between the subarray query and the tile(s), the less the wasted work. We will use the statistics report to gain an exact understanding of the wasted work.

For each of the following experiments, we will read from a 2D dense array containing about 0.5GB of data. The array will be 12,000 rows by 12,000 columns with a single uncompressed int32 attribute. The array is created as follows:

C++

Context ctx;
ArraySchema schema(ctx, TILEDB_DENSE);
Domain dom(ctx);
dom.add_dimension(
       Dimension::create<uint32_t>(ctx, "row", {{1, 12000}}, row_tile_extent))
    .add_dimension(
        Dimension::create<uint32_t>(ctx, "col", {{1, 12000}}, col_tile_extent));
schema.set_domain(dom);
schema.add_attribute(
    Attribute::create<int32_t>(ctx, "a", {TILEDB_NO_COMPRESSION, -1}));
Array::create(array_uri, schema);

Python

ctx = tiledb.Ctx()
dom = tiledb.Domain(ctx,
                   tiledb.Dim(ctx, name="rows", domain=(1, 12000), tile=row_tile_extent, dtype=np.int32),
                   tiledb.Dim(ctx, name="cols", domain=(1, 12000), tile=col_tile_extent, dtype=np.int32))

schema = tiledb.ArraySchema(ctx, domain=dom, sparse=False,
                           attrs=[tiledb.Attr(ctx, name="a", dtype=np.int32)])

# Create the (empty) array on disk.
tiledb.DenseArray.create(array_name, schema)

The total array size on disk then is 12000 * 12000 * 4 bytes, about 550 MB.

As a first example, suppose we configured the schema such that the array is composed of a single tile, i.e.:

C++

dom.add_dimension(
      Dimension::create<uint32_t>(ctx, "row", {{1, 12000}}, 12000))
   .add_dimension(
       Dimension::create<uint32_t>(ctx, "col", {{1, 12000}}, 12000));

Python

dom = tiledb.Domain(ctx,
                   tiledb.Dim(ctx, name="rows", domain=(1, 12000), tile=12000, dtype=np.int32),
                   tiledb.Dim(ctx, name="cols", domain=(1, 12000), tile=12000, dtype=np.int32))

With this array schema, the entire array is composed of a single tile. Thus, any read query (regardless of the subarray) will fetch the entire array from disk. We will issue a read query of the first 3,000 rows (subarray [1:3000, 1:12000]) which is 25% of the cells in the array:

C++

std::vector<uint32_t> subarray = {1, 3000, 1, 12000};
Context ctx;
Array array(ctx, array_name, TILEDB_READ);
Query query(ctx, array);
std::vector<int32_t> values(
    array.max_buffer_elements(subarray)["a"].second);
query.set_subarray(subarray).set_buffer("a", values);

Stats::enable();
query.submit();
Stats::dump(stdout);
Stats::disable();

Python

ctx = tiledb.Ctx()
with tiledb.DenseArray(ctx, array_name, mode='r') as A:
    # Read a slice of 3,000 rows.
    tiledb.stats_enable()
    data = A[1:3001, 1:12001]
    tiledb.stats_dump()
    tiledb.stats_disable()

The report printed for this experiment is:

Summary:
--------
Hardware concurrency: 4
Reads:
  Read query submits: 1
  Tile cache hit ratio: 0 / 1  (0.0%)
  Fixed-length tile data copy-to-read ratio: 144000000 / 576000000 bytes (25.0%)
  Var-length tile data copy-to-read ratio: 0 / 0 bytes
  Total tile data copy-to-read ratio: 144000000 / 576000000 bytes (25.0%)
  Read compression ratio: 576000245 / 576000274 bytes (1.0x)
Writes:
  Write query submits: 0
  Tiles written: 0
  Write compression ratio: 0 / 0 bytes

We can see that during the time the statistics were being gathered, there was a single read query submitted (our read query). The denominator of the tile cache hit ratio tells us that the single read query accessed a single tile, as expected (since the entire array is a single tile).

The “fixed-length tile data copy-to-read ratio” metric expresses the “wasted work” measurement, namely the number of bytes copied into our query buffers to fulfill the read query, divided by the number of bytes read from disk. In this experiment, 144,000,000 bytes (the int32_t fixed-length attribute values for the subarray [1:3000, 1:12000]) were copied to the query buffers, but we read 576,000,000 tile data bytes from disk (576,000,000 = 12000 * 12000 * 4 bytes). This copy-to-read ratio tells us 25% of the work done by TileDB to satisfy the read query was useful.

Now let’s modify the array such that each row corresponds to a single tile, i.e.:

C++

dom.add_dimension(
      Dimension::create<uint32_t>(ctx, "row", {{1, 12000}}, 1))
   .add_dimension(
       Dimension::create<uint32_t>(ctx, "col", {{1, 12000}}, 12000));

Python

dom = tiledb.Domain(ctx,
                   tiledb.Dim(ctx, name="rows", domain=(1, 12000), tile=1, dtype=np.int32),
                   tiledb.Dim(ctx, name="cols", domain=(1, 12000), tile=12000, dtype=np.int32))

When reading the subarray [1:3000, 1:12000] as in the previous experiment, we see the following statistics:

Reads:
  Read query submits: 1
  Tile cache hit ratio: 0 / 3000  (0.0%)
  Fixed-length tile data copy-to-read ratio: 144000000 / 144000000 bytes (100.0%)
  Var-length tile data copy-to-read ratio: 0 / 0 bytes
  Total tile data copy-to-read ratio: 144000000 / 144000000 bytes (100.0%)
  Read compression ratio: 144384213 / 144035239 bytes (1.0x)

Now the denominator of the tile cache hit ratio tells us that 3,000 tiles were accessed, which is as expected because we requested 3,000 rows. Note also the difference in the copy-to-read ratio. We still copy 144,000,000 bytes (since the query is the same), but the amount of data is reduced from the entire array to only the tiles (rows) required, which is 12000 * 3000 * 4 = 144,000,000 bytes. This yields a 100% useful work (no wasted work) metric.

You may notice the “read compression ratio” metric reports more bytes read and used than just the tile data. The difference is accounted for by the array and fragment metadata, which TileDB must also read in order to determine which tiles should be read and decompressed. Metadata reads are not included in the “copy-to-read” ratios.

Finally, we will issue two overlapping queries back-to-back, first the same [1:3000, 1:12000] subarray followed by subarray [2000:4000, 1:12000], i.e.:

C++

Context ctx;
std::vector<uint32_t> subarray1 = {1, 3000, 1, 12000},
                      subarray2 = {2000, 4000, 1, 12000};
Array array(ctx, array_name, TILEDB_READ);
std::vector<int32_t> values1(array.max_buffer_elements(subarray1)["a"].second),
                     values2(array.max_buffer_elements(subarray2)["a"].second);
Query query1(ctx, array), query2(ctx, array);
query1.set_subarray(subarray1).set_buffer("a", values1);
query2.set_subarray(subarray2).set_buffer("a", values2);

Stats::enable();
query1.submit();
query2.submit();
Stats::dump(stdout);
Stats::disable();

Python

ctx = tiledb.Ctx()
with tiledb.DenseArray(ctx, array_name, mode='r') as A:
    tiledb.stats_enable()
    data1 = A[1:3001, 1:12001]
    data2 = A[2000:4001, 1:12001]
    tiledb.stats_dump()
    tiledb.stats_disable()

This yields the following report:

Reads:
  Read query submits: 2
  Tile cache hit ratio: 1 / 5001  (0.0%)
  Fixed-length tile data copy-to-read ratio: 240048000 / 240000000 bytes (100.0%)
  Var-length tile data copy-to-read ratio: 0 / 0 bytes
  Total tile data copy-to-read ratio: 240048000 / 240000000 bytes (100.0%)
  Read compression ratio: 240384213 / 240035239 bytes (1.0x)

Several things have changed, most notably now there was one hit in the tile cache out of the 5,001 tiles accessed. However, we may have expected that 1,001 tiles would hit in the cache, since the two queries overlapped on rows 2000–3000 (inclusive). The reason we do not see this in the statistics is that the default tile cache configuration does not allow many tiles to be cached. Let’s increase the tile cache size to 100MB and repeat the experiment:

C++

Config config;
config["sm.tile_cache_size"] = 100 * 1024 * 1024;  // 100 MB
Context ctx(config);
std::vector<uint32_t> subarray1 = {1, 3000, 1, 12000},
                      subarray2 = {2000, 4000, 1, 12000};
Array array(ctx, array_name, TILEDB_READ);
std::vector<int32_t> values1(array.max_buffer_elements(subarray1)["a"].second),
                     values2(array.max_buffer_elements(subarray2)["a"].second);
Query query1(ctx, array), query2(ctx, array);
query1.set_subarray(subarray1).set_buffer("a", values1);
query2.set_subarray(subarray2).set_buffer("a", values2);

Stats::enable();
query1.submit();
query2.submit();
Stats::dump(stdout);
Stats::disable();

Python

ctx = tiledb.Ctx({'sm.tile_cache_size': 100 * 1024 * 1024})
with tiledb.DenseArray(ctx, array_name, mode='r') as A:
    tiledb.stats_enable()
    data1 = A[1:3001, 1:12001]
    data2 = A[2000:4001, 1:12001]
    tiledb.stats_dump()
    tiledb.stats_disable()

The stats summary now reads:

Reads:
  Read query submits: 2
  Tile cache hit ratio: 752 / 5001  (15.0%)
  Fixed-length tile data copy-to-read ratio: 240048000 / 203952000 bytes (117.7%)
  Var-length tile data copy-to-read ratio: 0 / 0 bytes
  Total tile data copy-to-read ratio: 240048000 / 203952000 bytes (117.7%)
  Read compression ratio: 204336213 / 203987239 bytes (1.0x)

We now have many more hits in the cache. Also notice that the copy-to-read ratio now exceeds 100%, because although the same number of bytes were copied into the query buffers, many of those bytes did not have to be read from disk twice (as they were hits in the cache).

Write example

The write statistics summary is less in-depth compared to the read summary. We will take a look at the example of writing the above 12,000 by 12,000 array with synthetic attribute data when each row is a single tile:

C++

Array array(ctx, array_name, TILEDB_WRITE);
Query query(ctx, array);
std::vector<int32_t> values(12000 * 12000);
for (unsigned i = 0; i < values.size(); i++) {
  values[i] = i;
}
query.set_layout(TILEDB_ROW_MAJOR).set_buffer("a", values);
Stats::enable();
query.submit();
Stats::dump(stdout);
Stats::disable();

Python

ctx = tiledb.Ctx()
# Open the array and write to it.
with tiledb.DenseArray(ctx, array_name, mode='w') as A:
    data = np.arange(12000 * 12000)
    tiledb.stats_enable()
    A[:] = data
    tiledb.stats_dump()
    tiledb.stats_disable()

With attribute a uncompressed as before, this gives the following report in the summary:

Writes:
  Write query submits: 1
  Tiles written: 12000
  Write compression ratio: 576384146 / 576035065 bytes (1.0x)

As expected, because each row was a single tile, writing 12,000 rows causes 12,000 tiles to be written. Because a is uncompressed, the compression ratio is nearly exactly 1.0x (the small amount of difference is due to the array and fragment metadata being compressed independently of the attribute).

If we enable compression on the a attribute when creating the array schema, e.g. bzip2 at its default compression level, we see the change in the report:

Writes:
  Write query submits: 1
  Tiles written: 12000
  Write compression ratio: 576384146 / 52992598 bytes (10.9x)

Because our synthetic array data is very predictable, bzip2 does a good job compressing it, and this is reflected in the reported compression ratio.

Full statistics report

In general, the summary report may be enough to reveal potential sources of large performance flaws. In addition, accompanying every stats dump is a list of all of the individual internal performance counters that TileDB tracks. Each of the counter names is prefixed with the system it measures, e.g. vfs_* counters measure details of the TileDB VFS system, compressor_* measures details of the various compressors, etc. Some of these counters are self-explanatory, and others are intended primarily for TileDB developers to diagnose internal performance issues.

Example full statistics report

===================================== TileDB Statistics Report =======================================

Individual function statistics:
  Function name                                                          # calls       Total time (ns)
  ----------------------------------------------------------------------------------------------------
  compressor_blosc_compress,                                                   0,                   0
  compressor_blosc_decompress,                                                 0,                   0
  compressor_bzip_compress,                                                12000,         84484067674
  compressor_bzip_decompress,                                                  0,                   0
  compressor_dd_compress,                                                      0,                   0
  compressor_dd_decompress,                                                    0,                   0
  compressor_gzip_compress,                                                    6,             3141858
  compressor_gzip_decompress,                                                  0,                   0
  compressor_lz4_compress,                                                     0,                   0
  compressor_lz4_decompress,                                                   0,                   0
  compressor_rle_compress,                                                     0,                   0
  compressor_rle_decompress,                                                   0,                   0
  compressor_zstd_compress,                                                    0,                   0
  compressor_zstd_decompress,                                                  0,                   0
  cache_lru_evict,                                                             0,                   0
  cache_lru_insert,                                                            0,                   0
  cache_lru_read,                                                              0,                   0
  cache_lru_read_partial,                                                      0,                   0
  reader_compute_cell_ranges,                                                  0,                   0
  reader_compute_dense_cell_ranges,                                            0,                   0
  reader_compute_dense_overlapping_tiles_and_cell_ranges,                      0,                   0
  reader_compute_overlapping_coords,                                           0,                   0
  reader_compute_overlapping_tiles,                                            0,                   0
  reader_compute_tile_coordinates,                                             0,                   0
  reader_copy_fixed_cells,                                                     0,                   0
  reader_copy_var_cells,                                                       0,                   0
  reader_dedup_coords,                                                         0,                   0
  reader_dense_read,                                                           0,                   0
  reader_fill_coords,                                                          0,                   0
  reader_init_tile_fragment_dense_cell_range_iters,                            0,                   0
  reader_next_subarray_partition,                                              0,                   0
  reader_read,                                                                 0,                   0
  reader_read_all_tiles,                                                       0,                   0
  reader_sort_coords,                                                          0,                   0
  reader_sparse_read,                                                          0,                   0
  writer_check_coord_dups,                                                     0,                   0
  writer_check_coord_dups_global,                                              0,                   0
  writer_compute_coord_dups,                                                   0,                   0
  writer_compute_coord_dups_global,                                            0,                   0
  writer_compute_coords_metadata,                                              0,                   0
  writer_compute_write_cell_ranges,                                        12000,            20427826
  writer_create_fragment,                                                      1,              320940
  writer_global_write,                                                         0,                   0
  writer_init_global_write_state,                                              0,                   0
  writer_init_tile_dense_cell_range_iters,                                     1,            11207721
  writer_ordered_write,                                                        1,         87014290508
  writer_prepare_full_tiles_fixed,                                             0,                   0
  writer_prepare_full_tiles_var,                                               0,                   0
  writer_prepare_tiles_fixed,                                                  0,                   0
  writer_prepare_tiles_ordered,                                                1,           468057969
  writer_prepare_tiles_var,                                                    0,                   0
  writer_sort_coords,                                                          0,                   0
  writer_unordered_write,                                                      0,                   0
  writer_write,                                                                1,         87014293320
  writer_write_tiles,                                                          1,         86400299458
  sm_array_close,                                                              0,                   0
  sm_array_open,                                                               0,                   0
  sm_read_from_cache,                                                          0,                   0
  sm_write_to_cache,                                                           0,                   0
  sm_query_submit,                                                             1,         87014315821
  tileio_read,                                                                 0,                   0
  tileio_write,                                                            12000,         86387230601
  tileio_compress_tile,                                                    12001,         84596499202
  tileio_compress_one_tile,                                                12001,         84593672539
  tileio_decompress_tile,                                                      0,                   0
  tileio_decompress_one_tile,                                                  0,                   0
  vfs_abs_path,                                                                4,               85652
  vfs_close_file,                                                              2,              424409
  vfs_constructor,                                                             0,                   0
  vfs_create_bucket,                                                           0,                   0
  vfs_create_dir,                                                              1,               76902
  vfs_create_file,                                                             0,                   0
  vfs_destructor,                                                              0,                   0
  vfs_empty_bucket,                                                            0,                   0
  vfs_file_size,                                                               0,                   0
  vfs_filelock_lock,                                                           0,                   0
  vfs_filelock_unlock,                                                         0,                   0
  vfs_init,                                                                    0,                   0
  vfs_is_bucket,                                                               0,                   0
  vfs_is_dir,                                                                  2,               51775
  vfs_is_empty_bucket,                                                         0,                   0
  vfs_is_file,                                                                 0,                   0
  vfs_ls,                                                                      0,                   0
  vfs_move_file,                                                               0,                   0
  vfs_move_dir,                                                                0,                   0
  vfs_open_file,                                                               0,                   0
  vfs_read,                                                                    0,                   0
  vfs_remove_bucket,                                                           0,                   0
  vfs_remove_file,                                                             0,                   0
  vfs_remove_dir,                                                              0,                   0
  vfs_supports_fs,                                                             0,                   0
  vfs_sync,                                                                    0,                   0
  vfs_write,                                                               12002,          1782145962
  vfs_s3_fill_file_buffer,                                                     0,                   0
  vfs_s3_write_multipart,                                                      0,                   0

Individual counter statistics:
  Counter name                                                             Value
  ------------------------------------------------------------------------------
  cache_lru_inserts,                                                           0
  cache_lru_read_hits,                                                         0
  cache_lru_read_misses,                                                       0
  reader_num_attr_tiles_touched,                                               0
  reader_num_fixed_cell_bytes_copied,                                          0
  reader_num_fixed_cell_bytes_read,                                            0
  reader_num_var_cell_bytes_copied,                                            0
  reader_num_var_cell_bytes_read,                                              0
  writer_num_attr_tiles_written,                                           12000
  sm_contexts_created,                                                         0
  sm_query_submit_layout_col_major,                                            0
  sm_query_submit_layout_row_major,                                            1
  sm_query_submit_layout_global_order,                                         0
  sm_query_submit_layout_unordered,                                            0
  sm_query_submit_read,                                                        0
  sm_query_submit_write,                                                       1
  tileio_read_cache_hits,                                                      0
  tileio_read_num_bytes_read,                                                  0
  tileio_read_num_resulting_bytes,                                             0
  tileio_write_num_bytes_written,                                       52992598
  tileio_write_num_input_bytes,                                        576384146
  vfs_read_total_bytes,                                                        0
  vfs_write_total_bytes,                                                52992598
  vfs_read_num_parallelized,                                                   0
  vfs_posix_write_num_parallelized,                                            0
  vfs_win32_write_num_parallelized,                                            0
  vfs_s3_num_parts_written,                                                    0
  vfs_s3_write_num_parallelized,                                               0

The “function statistics” report the number of calls and amount of time in nanoseconds for each instrumented function. It is important to note that the time reported for these counters is aggregated across all threads. For example, if 10 threads invoke vfs_write and each thread’s call takes 100 ns, then the reported time for vfs_write will be 1000 ns, even though the average time was much less.

The “counter statistics” report the values of individual counters. The summary statistics are directly derived from these counter statistics.