collections

The "collections" array in workload.json defines the Solr collections to create before the benchmark starts.

Syntax

{
  "collections": [
    {
      "name": "<collection-name>",
      "configset-path": "<path>",
      "num-shards": 1,
      "replication-factor": 1,
      "tlog-replicas": 0,
      "pull-replicas": 0
    }
  ]
}

Fields

Field Type Required Default Description
name string Yes The collection name. Must be a valid Solr collection name.
configset-path string No Path relative to the workload directory pointing to a configset directory. If provided, the configset is uploaded to Solr/ZooKeeper before the collection is created. If omitted, the configset named by configset (or the collection name) must already exist on the server.
num-shards integer No 1 Number of shards for the collection.
replication-factor integer No 1 Number of NRT (near-real-time) replicas per shard. NRT replicas participate in leader elections.
tlog-replicas integer No 0 Number of TLOG replicas per shard. TLOG replicas buffer updates in a transaction log.
pull-replicas integer No 0 Number of Pull replicas per shard. Pull replicas are read-only and receive index segments from the leader.

Example

{
  "collections": [
    {
      "name": "nyc_taxis",
      "configset-path": "configsets/nyc_taxis",
      "num-shards": 2,
      "replication-factor": 1,
      "tlog-replicas": 1,
      "pull-replicas": 0
    }
  ]
}

Notes

  • When configset-path is provided, the directory must contain at minimum schema.xml and solrconfig.xml.
  • For SolrCloud, the configset is uploaded to ZooKeeper before the collection is created.
  • If the collection already exists when the benchmark starts, it is deleted and recreated so that benchmarks are repeatable.
  • See the Apache Solr Reference Guide: Collections API for background.