--batch-size

Why we added this flag: embedding APIs have rate and token limits; controlling batch size lets you trade throughput for stability.

What it does

Sets the maximum number of documents per embedding request (default: 128).
Works alongside --token-limit to prevent oversized batches.
Smaller batches reduce retry storms when your model registry contains very large documents.
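
To make the interaction between the two limits concrete, here is a minimal sketch of how a batcher might cap each request by both document count and total tokens. It is an illustration only, not idxr's actual implementation: the names `make_batches` and `count_tokens`, the whitespace-based token estimate, and the 8000-token default are all hypothetical. Only the 128-document default comes from the text above.

```python
from typing import Callable, Iterable, Iterator, List


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer (assumption): one token per word."""
    return len(text.split())


def make_batches(
    docs: Iterable[str],
    batch_size: int = 128,      # documented default for --batch-size
    token_limit: int = 8000,    # hypothetical value, chosen for illustration
    tokenizer: Callable[[str], int] = count_tokens,
) -> Iterator[List[str]]:
    """Yield batches holding at most batch_size docs and token_limit tokens.

    A single document larger than token_limit is emitted alone; a real
    tool would more likely truncate or reject it.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    tokens = 0
    for doc in docs:
        n = tokenizer(doc)
        # Close the current batch if adding this doc would break either cap.
        if batch and (len(batch) >= batch_size or tokens + n > token_limit):
            yield batch
            batch, tokens = [], 0
        batch.append(doc)
        tokens += n
    if batch:
        yield batch


if __name__ == "__main__":
    docs = ["a b c", "d e", "f g h i", "j"]
    for b in make_batches(docs, batch_size=2, token_limit=5):
        print(len(b), "docs,", sum(count_tokens(d) for d in b), "tokens")
```

In this sketch, whichever cap is reached first closes the batch, so lowering the batch size shrinks every request even when individual documents are small.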

Typical usage

idxr vectorize index \
  --model "$IDXR_MODEL" \
  --partition-manifest workdir/partitions/manifest.json \
  --partition-out-dir workdir/chroma_partitions \
  --batch-size 300 \
  --truncation-strategy middle_out

Tips

Increase cautiously, and monitor OpenAI rate limits and Chroma ingest timing as you do.
Pair with --parallel-partitions to achieve higher concurrency without breaching per-request limits.
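
As a rough planning aid, the arithmetic behind these tips can be sketched as follows. The helper `plan` is hypothetical, and it assumes each parallel partition worker has at most one embedding request in flight at a time, which the text above does not confirm.

```python
import math


def plan(total_docs: int, batch_size: int, parallel_partitions: int) -> dict:
    """Estimate request count and peak in-flight documents (illustrative only)."""
    if batch_size < 1 or parallel_partitions < 1:
        raise ValueError("batch_size and parallel_partitions must be at least 1")
    return {
        # Total embedding requests needed to cover every document.
        "requests": math.ceil(total_docs / batch_size),
        # Upper bound on documents being embedded at once, assuming one
        # in-flight request per partition worker (assumption).
        "peak_docs_in_flight": batch_size * parallel_partitions,
    }


if __name__ == "__main__":
    print(plan(10_000, batch_size=128, parallel_partitions=4))
    print(plan(10_000, batch_size=300, parallel_partitions=1))
```

Under that assumption, four workers at the default batch size keep more documents in flight than a single worker at 300, while each individual request stays smaller.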
