What's in your bucket
What Tonbo Artifacts writes into a BYO bucket, what never appears there, and how to estimate object counts for a security or cost review.
If you brought your own bucket, you probably want to know what we write into it, both to verify that the product does what it says and to support a security review.
What's in the bucket
When you mount a workspace and write to it, what lands in your bucket is immutable chunks. Under the workspace's prefix:
your-bucket/
your-org/my-workspace/
chunks/
0/0/1_0_4194304
0/0/2_0_4194304
0/0/3_0_1234567
0/1/4_0_4194304
0/1/5_0_4194304
...
Path convention: chunks/{slice_id / 1_000_000}/{slice_id / 1_000}/{slice_id}_{block_index}_{block_size}. The two intermediate directories spread chunks for S3 listing performance; they're not semantically meaningful.
File naming:
- slice_id: monotonic integer assigned by the metadata service.
- block_index: position of this block within the slice (0 for the first ≤4 MiB block, 1 for the next, and so on). Small files have a single block, so you'll mostly see 0.
- block_size: actual byte size of this chunk. Default block is 4 MiB (4_194_304); the trailing chunk of a file can be smaller.
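The path convention can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: it assumes the `/` in the convention means integer (floor) division, and the function names `chunk_key` and `parse_chunk_key` are hypothetical.

```python
import re

# Matches the trailing ".../chunks/{a}/{b}/{slice_id}_{block_index}_{block_size}".
_CHUNK_RE = re.compile(r"(?:^|/)chunks/(\d+)/(\d+)/(\d+)_(\d+)_(\d+)$")


def chunk_key(prefix: str, slice_id: int, block_index: int, block_size: int) -> str:
    """Build the object key for one chunk under a workspace prefix.

    Assumption: "/" in the documented convention is integer division.
    """
    return (
        f"{prefix.rstrip('/')}/chunks/"
        f"{slice_id // 1_000_000}/{slice_id // 1_000}/"
        f"{slice_id}_{block_index}_{block_size}"
    )


def parse_chunk_key(key: str) -> tuple[int, int, int]:
    """Return (slice_id, block_index, block_size) parsed from a chunk key."""
    match = _CHUNK_RE.search(key)
    if match is None:
        raise ValueError(f"not a chunk key: {key!r}")
    # The two directory components carry no extra information, so drop them.
    _, _, slice_id, block_index, block_size = map(int, match.groups())
    return slice_id, block_index, block_size
```

Note that the parsed tuple contains only integers: nothing in a key identifies the file the bytes belong to.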
Inspect from your host:
aws s3 ls s3://your-bucket/your-org/my-workspace/chunks/ --recursive --summarize
2026-05-22 18:00:01 4194304 your-org/my-workspace/chunks/0/0/1_0_4194304
2026-05-22 18:00:01 4194304 your-org/my-workspace/chunks/0/0/2_0_4194304
2026-05-22 18:00:01 1234567 your-org/my-workspace/chunks/0/0/3_0_1234567
...
Total Objects: 247
Total Size: 1023421238
You can read the total size and object count from the bucket directly. Each 4 MiB chunk is one object, so a roughly 1 GiB workspace shows ~250 objects.
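If you would rather check the totals yourself than trust the summary footer, the listing can be tallied with a short script. The sketch below assumes the four-column `aws s3 ls --recursive` line format shown above (date, time, size, key); `summarize_listing` is a hypothetical helper name. It also compares each object's listed size with the block_size suffix in its key, which should agree if block_size is the chunk's actual byte size.

```python
def summarize_listing(lines):
    """Tally object count and total bytes from `aws s3 ls --recursive` lines.

    Returns (count, total_bytes, mismatches), where mismatches lists keys
    whose listed size differs from the block_size suffix in the key name.
    """
    count = 0
    total_bytes = 0
    mismatches = []
    for line in lines:
        parts = line.split(maxsplit=3)
        # Skip blank lines, "..." and the "Total Objects"/"Total Size" footer.
        if len(parts) != 4 or not parts[2].isdigit():
            continue
        size, key = int(parts[2]), parts[3]
        count += 1
        total_bytes += size
        # The last "_"-separated field of a chunk key is its block_size.
        if key.rsplit("_", 1)[-1] != str(size):
            mismatches.append(key)
    return count, total_bytes, mismatches
```

One way to feed it is to save the command's output to a file and pass the open file object, since iterating a file yields its lines.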
What's not in the bucket
Everything except raw chunk bytes lives in the metadata service. You won't find any of this in your bucket:
- File and directory names, paths, and tree structure
- mtime, atime, ctime, file size
- POSIX permissions, owner UID/GID
- Symlinks, extended attributes (xattr)
- Workspace name, handle, labels
- Grants (who can mount this workspace)
- Mount sessions (active tokens and their scopes)
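For a security review, one way to check this claim is to confirm that every key under the workspace prefix matches the chunk naming pattern and nothing else. The following is a minimal sketch under that assumption; `unexpected_keys` is a hypothetical helper, and the list of keys is expected to come from your own bucket listing.

```python
import re

# Relative key shape: chunks/{dir}/{dir}/{slice_id}_{block_index}_{block_size}
_CHUNK_REL = re.compile(r"chunks/\d+/\d+/\d+_\d+_\d+")


def unexpected_keys(keys, prefix):
    """Return the keys that are not chunk objects under the workspace prefix."""
    base = prefix.rstrip("/") + "/"
    flagged = []
    for key in keys:
        in_prefix = key.startswith(base)
        # fullmatch ensures no extra path segments or name text slip through.
        if not (in_prefix and _CHUNK_REL.fullmatch(key[len(base):])):
            flagged.append(key)
    return flagged
```

An empty result means every listed key consists only of the prefix and integers, with no file names, paths, or labels embedded in it.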
What follows from this:
- You can't reconstruct a file from the bucket alone. A 4 MiB chunk is anonymous bytes with no association to a file path. The metadata service has to interpret it.
- You can't rename or restructure files by manipulating S3. Renaming foo.txt to bar.txt is a metadata-only operation; the chunks don't move.
- Chunks aren't deduplicated by content. Each slice gets a fresh id, so writing the same bytes twice (even within one file) produces separate chunks.
This is the structural difference from a "files as S3 objects" approach (Archil, s3fs, goofys): those keep bytes readable by any S3-aware tool with nothing else running, at the cost of turning metadata operations (listing a deep tree, statting many files) into S3 API calls.
Estimating bucket size
| Workspace logical size | Object count |
|---|---|
| 100 MB | ~25 |
| 1 GB | ~250 |
| 10 GB | ~2,500 |
| 100 GB | ~25,000 |
| 1 TB | ~250,000 |
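The table follows from dividing the logical size by the 4 MiB default block. For a workspace with many files, a per-file estimate is closer, because each file's trailing chunk is its own object. The sketch below assumes every file was written once in a single pass, that zero-byte files produce no chunks, and that no unreferenced chunks are still awaiting garbage collection; `estimate_objects` is a hypothetical name.

```python
BLOCK_SIZE = 4 * 1024 * 1024  # default block size, 4 MiB


def estimate_objects(file_sizes, block_size=BLOCK_SIZE):
    """Estimate the chunk object count for an iterable of file sizes in bytes.

    Each file contributes ceil(size / block_size) objects; the last one
    may be smaller than block_size.
    """
    # -(-a // b) is integer ceiling division, avoiding float rounding.
    return sum(-(-size // block_size) for size in file_sizes)
```

Under these assumptions a single 1 GiB file gives 256 objects, in line with the table's rounded ~250, while a million 1 KiB files give a million objects despite totalling about 1 GiB.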
There is no cross-workspace dedup: 20 workspaces holding the same data hold 20 copies of the chunks. Within a workspace, the metadata service tracks which slices are still referenced; a chunk becomes eligible for garbage collection once nothing references its slice, after overwrites, deletes, or slice compaction.
Cleanup on workspace delete
- Managed workspaces: deleting the workspace triggers a best-effort delete of every chunk under its prefix. Failures during the bulk delete don't block the workspace from transitioning to deleted; orphan chunks get reaped by a background GC.
- BYO workspaces: deleting the workspace clears metadata-service state but doesn't touch your bucket. Chunks remain under the prefix. To free storage, list the prefix and delete from your bucket yourself:
  aws s3 rm s3://your-bucket/your-org/my-workspace/ --recursive
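If you script the BYO cleanup instead of using the CLI, keep in mind that the S3 DeleteObjects call accepts at most 1,000 keys per request. The sketch below only builds the request payloads; `delete_batches` is a hypothetical helper, and the commented boto3 usage is illustrative and was not run against a real bucket.

```python
from itertools import islice


def delete_batches(keys, batch_size=1000):
    """Yield DeleteObjects payloads of at most batch_size keys each."""
    iterator = iter(keys)
    while batch := list(islice(iterator, batch_size)):
        # "Quiet" asks S3 to report only failures in the response.
        yield {"Objects": [{"Key": key} for key in batch], "Quiet": True}


# Illustrative usage with boto3 (assumes credentials with delete permission):
#   import boto3
#   s3 = boto3.client("s3")
#   for payload in delete_batches(keys_under_prefix):
#       s3.delete_objects(Bucket="your-bucket", Delete=payload)
```

Batching keeps the number of requests low for large workspaces: at the table's ~250,000 objects for 1 TB, that is about 250 delete calls.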
Relation to other concepts
- Workspace: the unit whose contents live under one prefix in the bucket.
- Bring your own bucket: how the bucket gets attached in the first place.
- Concurrent writes: why writes show up as chunks, not files.