Performance

The latency tiers that govern a Tonbo Artifacts mount, and why metadata-heavy agent workloads run an order of magnitude faster on this architecture.

Tonbo Artifacts is tuned for metadata-heavy workloads: directory trees walked by recursive find and grep, many small file opens, frequent attribute lookups, write-edit-reread loops. Most of the wall-clock time in these workloads goes to metadata round-trips, not to moving bytes. The architecture optimizes accordingly.

A representative workload

The pattern this design is tuned for looks like:

A workspace with tens of thousands of files, single-digit GB total
Deep, sparse directories, such as a code repository, a hierarchical document store, or a structured dataset
An AI agent that runs broad recursive searches, opens many small files in sequence, then writes a handful of outputs

This shape is the common case for AI coding and analyst agents. It is dominated by getattr, readdir, and open. Each of these moves a few bytes, but they happen thousands of times per session.
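One rough way to check whether a workspace has this shape is to count the metadata operations that a single recursive walk implies. The sketch below is illustrative only. `count_metadata_ops` is a hypothetical helper, not part of Tonbo Artifacts, and it assumes one readdir per directory visited and one getattr per file, which simplifies what a real FUSE mount sees.

```python
import os
import tempfile

def count_metadata_ops(root):
    """Approximate the metadata operations implied by one recursive walk.

    Assumption: one readdir per directory visited and one getattr per file.
    Real syscall counts depend on the tool and on kernel caching.
    """
    ops = {"readdir": 0, "getattr": 0, "bytes": 0}
    for dirpath, _dirnames, filenames in os.walk(root):
        ops["readdir"] += 1
        for name in filenames:
            info = os.lstat(os.path.join(dirpath, name))
            ops["getattr"] += 1
            ops["bytes"] += info.st_size
    return ops

# Tiny demo tree: a.txt at the root, b.txt and c.txt under sub/
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt"), os.path.join("sub", "c.txt")):
        with open(os.path.join(root, rel), "w") as f:
            f.write("x" * 10)
    demo_ops = count_metadata_ops(root)
```

On a workspace of the shape described above, the operation counts would be expected to be large relative to the total bytes.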

Three latency tiers

Tier	What's resolved from	Typical latency	When it dominates
Hot	The mount client's local cache	sub-millisecond	Re-walking paths the agent has already touched
Warm	The metadata store; chunks already in the local chunk cache	low single-digit milliseconds	First time a path is touched in this mount session
Cold	Chunks fetched from S3 on demand	tens of milliseconds	First read of a file's contents

On its first traversal, a typical agent session is mostly warm. The metadata store answers from memory and the mount client populates its local cache as it goes. Every traversal after that is mostly hot. Cold latency applies only to file content that has not yet been pulled from S3 in this mount session; metadata is never on the cold path.
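The effect of the tiers on a session can be estimated with simple arithmetic. The sketch below assumes illustrative latencies picked from within the ranges in the table (100 µs hot, 2 ms warm, 30 ms cold) and a hypothetical session of 20,000 metadata lookups and 500 file reads. None of these figures are measurements.

```python
# Illustrative per-operation latencies in microseconds. These are assumptions
# chosen from within the ranges in the tier table, not measured values.
TIER_LATENCY_US = {"hot": 100, "warm": 2_000, "cold": 30_000}

def traversal_time_us(op_counts):
    """Total latency for a mapping of tier name -> number of operations."""
    return sum(TIER_LATENCY_US[tier] * count for tier, count in op_counts.items())

# First pass: metadata is answered by the metadata store (warm),
# file contents are fetched from S3 (cold).
first_pass_us = traversal_time_us({"warm": 20_000, "cold": 500})

# Second pass: metadata comes from the mount client's cache (hot),
# file contents come from the local chunk cache (warm).
second_pass_us = traversal_time_us({"hot": 20_000, "warm": 500})

print(f"first pass:  {first_pass_us / 1_000_000:.1f} s")
print(f"second pass: {second_pass_us / 1_000_000:.1f} s")
```

Under these assumptions the first pass costs about 55 seconds of accumulated latency if the operations are issued serially, and the second pass about 3 seconds. The point is the ratio, not the absolute figures.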

Why agent workloads favor this architecture

Agent workloads have two properties that the architecture exploits:

The metadata working set fits in memory. The metadata for tens of thousands of files amounts to megabytes, not gigabytes. The metadata store keeps it resident; the mount client's local cache keeps the actively walked subset even closer to the FUSE syscall.
The same paths are revisited. Agents loop: read-think-edit-read-think-edit. Each loop revisits much of the same directory tree. Aggressive metadata caching turns these revisits into in-process lookups.

The cost of a first metadata touch is a single network round-trip from the mount client to the metadata store, which is the warm tier. After that, the mount client holds the path in its local cache, so the next pass resolves it from the hot tier.
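The lookup order described here can be sketched as a two-level cache. This is a minimal model, not the Tonbo Artifacts implementation: `MetadataStore` and `MountClient` are hypothetical stand-ins, the round-trip is simulated by a counter, and cache invalidation is deliberately left out.

```python
class MetadataStore:
    """Stand-in for the remote metadata store; counts simulated round-trips."""

    def __init__(self, entries):
        self._entries = dict(entries)
        self.round_trips = 0

    def getattr(self, path):
        self.round_trips += 1  # each call models one network round-trip
        return self._entries[path]


class MountClient:
    """Stand-in for a mount client with a local metadata cache."""

    def __init__(self, store):
        self._store = store
        self._local = {}

    def getattr(self, path):
        if path in self._local:
            return self._local[path]       # hot: served in-process
        attrs = self._store.getattr(path)  # warm: one round-trip to the store
        self._local[path] = attrs
        return attrs


store = MetadataStore({
    "/repo": {"kind": "dir"},
    "/repo/a.py": {"kind": "file", "size": 120},
    "/repo/b.py": {"kind": "file", "size": 64},
})
client = MountClient(store)

# Three passes over the same paths, as in a read-think-edit loop.
for _ in range(3):
    for path in ("/repo", "/repo/a.py", "/repo/b.py"):
        client.getattr(path)
```

After the three passes the store has served three round-trips in total, one per path. The remaining six lookups were answered from the client's local cache.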

Compared to per-client metadata caching

Designs that cache metadata only inside each mount client have a different cost profile on deep recursive operations:

Every mount starts with a cold cache. A second mount on the same workspace re-walks the tree from scratch.
A miss on a deep path is a chain of round-trips out of the client back to whatever upstream holds the truth.
Concurrent agents on different VMs cannot share warm state with each other.

Tonbo Artifacts caches metadata at the metadata store itself, not only inside each mount. The store is the source of truth, kept warm by every mount that touches it. A second mount on the same workspace hits a pre-warmed view on its first walk; concurrent agents on different VMs converge on the same cached state.

The gap is largest on operations that walk deep, sparse trees, which is what agents do constantly.
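To see why depth matters, consider a model in which a client with a cold cache needs one upstream round-trip for every path component it has not yet cached. This is an assumption made for illustration; real per-client designs differ in how they resolve paths. `chained_round_trips` is a hypothetical helper.

```python
def chained_round_trips(path, client_cache):
    """Count round-trips when each uncached path component needs its own lookup.

    client_cache is a set of already-resolved path prefixes; it is updated
    in place as components are resolved.
    """
    trips = 0
    prefix = ""
    for part in path.strip("/").split("/"):
        prefix = f"{prefix}/{part}"
        if prefix not in client_cache:
            trips += 1
            client_cache.add(prefix)
    return trips


cold_cache = set()  # a freshly started mount in the per-client model
first = chained_round_trips("/repo/src/core/io/reader.py", cold_cache)
sibling = chained_round_trips("/repo/src/core/io/writer.py", cold_cache)
```

In this model the first lookup costs five round-trips, one per component, and a sibling in the same directory costs one more. A new mount repeats the full chain because its cache starts empty, which is the cost the shared metadata store is designed to avoid.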

What we don't optimize for

The same architectural choices imply trade-offs. The design does not target the following workloads:

Streaming multi-GB sequential reads of a single file. The S3 data path serves these adequately, but a direct S3 client with range-read parallelism will be faster.
High-rate concurrent writers to the same file from multiple mounts. The metadata plane serializes metadata updates; on the data plane, last write wins at the chunk level. Agents rarely hit this; pipelines that fan out writes to one file from many machines do. See Concurrent writes for the exact semantics.

Benchmark methodology

The characteristics above are validated against AI-coding-agent workloads on customer datasets. Detailed benchmark methodology, dataset shapes, and head-to-head numbers against alternative remote filesystems are available on request.