Import data

Move existing data into a Tonbo Artifacts workspace from a Mac folder, an S3 bucket, or local disk.

A native artifacts migrate command is on the v1 list; for v0, use the existing battle-tested S3 / rsync tooling. Pick the recipe that matches where your data lives now.

From a Mac folder

The v0 mount client is Linux-only, so Mac → Tonbo Artifacts is a two-hop: stage the data to your bucket first, then read it into the mount from a Linux host.

  1. On the Mac: stage to your Tigris bucket

    Install rclone:

    brew install rclone

    Configure it to talk to Tigris:

    rclone config create tigris s3 \
        provider=Other \
        access_key_id=<tigris-access-key> \
        secret_access_key=<tigris-secret-key> \
        endpoint=https://fly.storage.tigris.dev \
        region=auto

    Upload, with parallel transfers + progress:

    rclone copy /Users/<you>/cases tigris:panta-cases-staging/ \
        --transfers 32 --progress

    For 6 GB / 44k files this typically takes 10–30 minutes depending on uplink bandwidth.

  2. On a Linux host with the workspace mounted: pull from staging

    aws s3 cp s3://panta-cases-staging/ /mnt/work/ \
        --recursive \
        --endpoint-url https://fly.storage.tigris.dev

  3. Validate, then drop the staging bucket

    Pick a representative file and verify cold + warm reads (see Validation below). Once you're satisfied, you can delete the staging bucket; the data lives in your Artifacts workspace's bucket plus Tonbo's metadata service.
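
Before deleting the staging bucket, a structural spot check is worth adding to the read test: compare the file count you staged from the Mac (44k in the example above) against what landed in the mount. A minimal sketch; count_match is a hypothetical helper, not part of any CLI, and the source path is a placeholder:

```shell
# Hypothetical helper: compare file counts between the original source
# tree and the mounted workspace. A mismatch means the import is
# incomplete and the staging bucket should be kept for a retry.
count_match() {
    local src_count dst_count
    src_count=$(find "$1" -type f | wc -l)
    dst_count=$(find "$2" -type f | wc -l)
    if [ "$src_count" -eq "$dst_count" ]; then
        echo "counts match: $src_count files"
    else
        echo "MISMATCH: source=$src_count mount=$dst_count" >&2
        return 1
    fi
}

# Usage, on a host that can see both trees (or compare against a
# count you noted on the Mac before staging):
#   count_match /path/to/original/tree /mnt/work
```

If the original tree is only visible on the Mac, take the source count there with `find ... -type f | wc -l` and compare it to the mount's count by hand.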

From an S3 source

If your data already lives in another S3-compatible bucket (Tigris, R2, AWS S3, MinIO), the import is a single hop:

aws s3 cp s3://<source-bucket>/ /mnt/work/ \
    --recursive \
    --endpoint-url https://<source-endpoint>

The Tonbo workspace's bucket gets the chunks (because /mnt/work is the FUSE mount, writes flow through to your configured bucket). The metadata (inode tree, chunk pointers) lands in Tonbo's Redis.
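
One practical note: aws s3 cp --recursive starts over if interrupted. For large imports, aws s3 sync is a safer spelling of the same hop, because it skips objects that already exist at the destination with matching size and timestamp (same placeholders as above):

```shell
# Re-runnable variant: `sync` skips files already present in the mount,
# so an interrupted import can simply be invoked again until it completes.
aws s3 sync s3://<source-bucket>/ /mnt/work/ \
    --endpoint-url https://<source-endpoint>
```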

From a local disk / NFS / etc.

rsync -avP --info=progress2 /local/source/ /mnt/work/

For large trees of many small files, fan the copy out with GNU parallel:

cd /local/source
find . -type f \
  | parallel -j 32 rsync -aR {} /mnt/work/

(Requires GNU parallel. Running from inside the source tree matters: -aR preserves the relative paths it is given, so feeding it ./-relative paths mirrors the structure under /mnt/work, while absolute paths from find would recreate the full /local/source prefix there. Per-file progress output is dropped because 32 interleaved streams aren't readable.)
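
File counts alone won't catch a truncated write. A stronger check, assuming both trees are reachable from the same host, is to hash each tree and diff the manifests; hash_tree is a hypothetical helper, and hashing tens of thousands of files takes a while:

```shell
# Hypothetical helper: emit a sorted sha256 manifest for a tree, with
# paths relative to the root so manifests from two trees line up.
hash_tree() {
    (cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum)
}

# Usage, after the rsync completes:
#   hash_tree /local/source > /tmp/src.sha256
#   hash_tree /mnt/work     > /tmp/dst.sha256
#   diff /tmp/src.sha256 /tmp/dst.sha256 && echo "trees identical"
```

Any diff output names the exact files whose contents differ between source and mount.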

Validation

Always sanity-check after a bulk import:

# Pick a file that's in your real workload (not synthetic).
TARGET=/mnt/work/<some-real-file>
ls -la "$TARGET"

# Cold read: first time pulls the chunk.
time cat "$TARGET" >/dev/null

# Warm read: should be single-digit ms.
time cat "$TARGET" >/dev/null

After 60 s of idle, confirm zero object-storage errors via the mount's stats file:

sleep 60
grep -E '^(juicefs_fuse_ops_io_errors_EIO|juicefs_object_request_errors|juicefs_staging_blocks)' \
    /mnt/work/.stats

All counters should read 0, or at minimum stop increasing across the idle window.
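
A single file can be unrepresentatively lucky. To widen the read test, a sketch that warm-reads a random sample; sample_read is a hypothetical helper and the 20-file sample size is arbitrary:

```shell
# Hypothetical helper: cat a random sample of files under a root and
# flag any read failures. Run it twice: the first pass is cold, the
# second should be noticeably faster if chunk caching is healthy.
sample_read() {
    local root=$1 n=${2:-20}
    find "$root" -type f | shuf -n "$n" | while IFS= read -r f; do
        cat -- "$f" >/dev/null || echo "read failed: $f" >&2
    done
}

time sample_read /mnt/work 20
```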

What about an artifacts migrate command?

A first-class wrapper that handles all three patterns above with progress, validation, and resume in one CLI is on the v1 list. v0 intentionally leans on the existing tools because they're mature and your Linux host already has them.

If your migration ergonomics are blocking your benchmark or production cutover, ping us. We'll prioritise based on what's actually painful.