Performance
The latency tiers that govern a Tonbo Artifacts mount, and why metadata-heavy agent workloads run an order of magnitude faster on this architecture.
Tonbo Artifacts is tuned for metadata-heavy workloads: directory
trees walked by recursive find and grep, many small file opens,
frequent attribute lookups, write-edit-reread loops. Most of the
wall-clock time in these workloads is spent on metadata round-trips,
not on moving bytes. The architecture optimizes accordingly.
A representative workload
The pattern this design is tuned for looks like:
- A workspace with tens of thousands of files, single-digit GB total
- Deep, sparse directories, such as a code repository, a hierarchical document store, or a structured dataset
- An AI agent that runs broad recursive searches, opens many small files in sequence, then writes a handful of outputs
This shape is the common case for AI coding and analyst agents. It is
dominated by getattr, readdir, and open. Each of these moves a
few bytes, but they happen thousands of times per session.
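To see this shape on an ordinary local directory, the sketch below walks a tree roughly the way a recursive grep would and tallies the operations involved. It is an approximation under stated assumptions: `walk_and_count` and `make_sample_tree` are illustrative helpers, not part of Tonbo Artifacts, and the tallies count logical operations in the script rather than actual FUSE requests, which depend on kernel and client caching.

```python
import os
import tempfile
from collections import Counter


def make_sample_tree(root, n_dirs=3, files_per_dir=4, size=64):
    """Create a small synthetic workspace (hypothetical layout) under root."""
    for d in range(n_dirs):
        sub = os.path.join(root, f"pkg{d}")
        os.makedirs(sub)
        for i in range(files_per_dir):
            with open(os.path.join(sub, f"f{i}.txt"), "wb") as f:
                f.write(b"x" * size)


def walk_and_count(root):
    """Walk root depth-first, tallying logical metadata operations.

    One "readdir" per directory listed, one "getattr" per entry inspected,
    one "open" per regular file read. Returns (op_counts, bytes_read).
    """
    ops = Counter()
    bytes_read = 0
    stack = [root]
    while stack:
        directory = stack.pop()
        ops["readdir"] += 1
        with os.scandir(directory) as entries:
            for entry in entries:
                ops["getattr"] += 1
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                else:
                    ops["open"] += 1
                    with open(entry.path, "rb") as f:
                        bytes_read += len(f.read())
    return ops, bytes_read


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        make_sample_tree(root)
        ops, nbytes = walk_and_count(root)
        print(dict(ops), "bytes read:", nbytes)
```

Even in this tiny tree, the walk issues 31 operations to read 768 bytes. On a workspace with tens of thousands of files, the operation count grows with the number of entries while each operation still carries little data.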
Three latency tiers
| Tier | What's resolved from | Typical latency | When it dominates |
|---|---|---|---|
| Hot | The mount client's local cache | sub-millisecond | Re-walking paths the agent has already touched |
| Warm | The metadata store; chunks already in the local chunk cache | low single-digit milliseconds | First time a path is touched in this mount session |
| Cold | Chunks fetched from S3 on demand | tens of milliseconds | First read of a file's contents |
On its first traversal, a typical agent session is mostly warm. The metadata store answers from memory and the mount client populates its local cache as it goes. Every traversal after that is mostly hot. Cold latency applies only to file content that has not yet been pulled from S3 in this mount session; metadata is never on the cold path.
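A back-of-the-envelope model makes the difference between a first and a repeat traversal concrete. The per-operation latencies below are assumed values picked from inside the ranges in the table (100 µs hot, 2 ms warm, 30 ms cold), and the operation counts are invented for illustration; none of these are measurements. The model also assumes operations run strictly one after another, which overstates wall-clock time for any client that issues requests in parallel.

```python
# Assumed per-operation latencies in microseconds, chosen from within the
# ranges given in the tier table. Illustrative only, not measured values.
TIER_LATENCY_US = {"hot": 100, "warm": 2_000, "cold": 30_000}


def estimate_us(op_counts):
    """Total latency in microseconds for serial operations grouped by tier."""
    return sum(TIER_LATENCY_US[tier] * count for tier, count in op_counts.items())


# Hypothetical session: 20,000 metadata operations and 500 first-time file reads.
first_pass = {"hot": 0, "warm": 20_000, "cold": 500}
# The same walk repeated: metadata is in the mount client's local cache and
# file contents are already in the local chunk cache.
second_pass = {"hot": 20_000, "warm": 0, "cold": 0}

if __name__ == "__main__":
    for name, counts in (("first pass", first_pass), ("second pass", second_pass)):
        print(f"{name}: {estimate_us(counts) / 1_000_000:.1f} s")
```

Under these assumptions the first pass costs 55 seconds of serial latency and the repeat pass costs 2 seconds. The exact figures matter less than the ratio between tiers, which is what the caching layers are designed to exploit.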
Why agent workloads favor this architecture
Agent workloads have two properties that the architecture exploits:
- The metadata working set fits in memory. The metadata for tens of thousands of files amounts to megabytes, not gigabytes. The metadata store keeps it resident; the mount client's local cache keeps the actively walked subset even closer to the FUSE syscall.
- The same paths are revisited. Agents loop: read-think-edit-read-think-edit. Each loop revisits much of the same directory tree. Aggressive metadata caching turns these revisits into in-process lookups.
A first metadata touch costs a single network round-trip from the mount client to the metadata store, which places it in the warm tier rather than the cold one. After that, the path sits in the mount client's local cache; on the next pass it is hot.
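The lookup order for a path's metadata can be sketched as a two-level cache. This is a toy model, not the Tonbo Artifacts implementation: `MetadataStore`, `MountClient`, and the `round_trips` counter are hypothetical names, the store is a plain in-memory dictionary, and cache invalidation is left out entirely.

```python
class MetadataStore:
    """Hypothetical stand-in for the remote metadata store (held in memory)."""

    def __init__(self, entries):
        self._entries = dict(entries)
        self.round_trips = 0  # counts simulated network round-trips

    def getattr(self, path):
        self.round_trips += 1
        return self._entries[path]


class MountClient:
    """Hypothetical mount client with a local metadata cache."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def getattr(self, path):
        """Return (attrs, tier), where tier names the layer that answered."""
        if path in self._cache:
            return self._cache[path], "hot"  # served in-process
        attrs = self._store.getattr(path)  # one round-trip to the store
        self._cache[path] = attrs
        return attrs, "warm"


if __name__ == "__main__":
    store = MetadataStore({"/repo/src/main.py": {"size": 120}})
    client = MountClient(store)
    print(client.getattr("/repo/src/main.py")[1])  # warm
    print(client.getattr("/repo/src/main.py")[1])  # hot
    print(store.round_trips)  # 1
```

The second lookup never leaves the process, which is why a read-think-edit loop over the same tree pays the round-trip cost only once per path in this model.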
Compared to per-client metadata caching
Designs that cache metadata only inside each mount client have a different cost profile on deep recursive operations:
- Every mount starts with a cold cache. A second mount on the same workspace re-walks the tree from scratch.
- A miss on a deep path is a chain of round-trips out of the client back to whatever upstream holds the truth.
- Concurrent agents on different VMs cannot share warm state with each other.
Tonbo Artifacts caches metadata at the metadata store itself, not only inside each mount. The store is the source of truth, kept warm by every mount that touches it. A second mount on the same workspace hits a pre-warmed view on its first walk; concurrent agents on different VMs converge on the same cached state.
The gap is largest on operations that walk deep, sparse trees, which is what agents do constantly.
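The effect of a second mount can be illustrated with a toy model that counts slow lookups under each design. Everything here is an assumption made for illustration: `Upstream` stands for whatever slow layer a cache miss ultimately reaches, and `SharedStore` and `Mount` are hypothetical classes. The model does not describe Tonbo Artifacts internals; it only counts how often the slow layer is consulted.

```python
class Upstream:
    """Hypothetical slow source of metadata; counts how often it is consulted."""

    def __init__(self, entries):
        self._entries = dict(entries)
        self.slow_lookups = 0

    def load(self, path):
        self.slow_lookups += 1
        return self._entries[path]


class SharedStore:
    """Store-level cache that every mount reads through."""

    def __init__(self, upstream):
        self._upstream = upstream
        self._resident = {}

    def getattr(self, path):
        if path not in self._resident:
            self._resident[path] = self._upstream.load(path)
        return self._resident[path]


class Mount:
    """A mount with a private cache in front of some backend lookup function."""

    def __init__(self, backend_getattr):
        self._get = backend_getattr
        self._cache = {}

    def walk(self, paths):
        for path in paths:
            if path not in self._cache:
                self._cache[path] = self._get(path)


def slow_lookups_per_client_only(entries, paths, mounts):
    """Each mount caches privately and misses go straight to the slow layer."""
    upstream = Upstream(entries)
    for _ in range(mounts):
        Mount(upstream.load).walk(paths)
    return upstream.slow_lookups


def slow_lookups_shared_store(entries, paths, mounts):
    """Each mount caches privately, but misses go to a shared warm store."""
    upstream = Upstream(entries)
    store = SharedStore(upstream)
    for _ in range(mounts):
        Mount(store.getattr).walk(paths)
    return upstream.slow_lookups
```

With five paths and two mounts, the per-client-only model reaches the slow layer ten times, while the shared-store model reaches it five times: the second mount finds every path already resident. The gap widens with each additional mount on the same workspace.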
What we don't optimize for
The same architectural choices imply trade-offs. The following workloads fall outside the design point:
- Streaming multi-GB sequential reads of a single file. The S3 data path serves these adequately, but a direct S3 client with range-read parallelism will be faster.
- High-rate concurrent writers to the same file from multiple mounts. The metadata plane serializes metadata updates; on the data plane, last write wins at the chunk level. Agents rarely hit this; pipelines that fan out writes to one file from many machines do. See Concurrent writes for the exact semantics.
Benchmark methodology
The characteristics above are validated against AI-coding-agent workloads on customer datasets. Detailed benchmark methodology, dataset shapes, and head-to-head numbers against alternative remote filesystems are available on request.