- Aggregation >
- Aggregation Pipeline >
- Aggregation Pipeline and Sharded Collections
Aggregation Pipeline and Sharded Collections¶
On this page
The aggregation pipeline supports operations on sharded collections. This section describes behaviors specific to the aggregation pipeline and sharded collections.
Behavior¶
Changed in version 3.2.
If the pipeline starts with an exact $match
on a shard key,
the entire pipeline runs on the matching shard only. Previously, the
pipeline would have been split, and the work of merging it would have
to be done on the primary shard.
For aggregation operations that must run on multiple shards, if the
operations do not require running on the database’s primary shard,
these operations will route the results to a random shard to merge the
results to avoid overloading the primary shard for that database. The
$out
stage and the $lookup
stage require
running on the database’s primary shard.
Optimization¶
When splitting the aggregation pipeline into two parts, the pipeline is split to ensure that the shards perform as many stages as possible with consideration for optimization.
To see how the pipeline was split, include the explain
option in the
db.collection.aggregate()
method.
Optimizations are subject to change between releases.