OPTIONS

Aggregation Pipeline Optimization

Aggregation pipeline operations have an optimization phase which attempts to reshape the pipeline for improved performance.

To see how the optimizer transforms a particular aggregation pipeline, include the explain option in the db.collection.aggregate() method.

Optimizations are subject to change between releases.

Projection Optimization

The aggregation pipeline can determine if it requires only a subset of the fields in the documents to obtain the results. If so, the pipeline will only use those required fields, reducing the amount of data passing through the pipeline.

Pipeline Sequence Optimization

$sort + $match Sequence Optimization

When you have a sequence with $sort followed by a $match, the $match moves before the $sort to minimize the number of objects to sort. For example, if the pipeline consists of the following stages:

{ $sort: { age : -1 } },
{ $match: { status: 'A' } }

During the optimization phase, the optimizer transforms the sequence to the following:

{ $match: { status: 'A' } },
{ $sort: { age : -1 } }

$skip + $limit Sequence Optimization

When you have a sequence with $skip followed by a $limit, the $limit moves before the $skip. With the reordering, the $limit value increases by the $skip amount.

For example, if the pipeline consists of the following stages:

{ $skip: 10 },
{ $limit: 5 }

During the optimization phase, the optimizer transforms the sequence to the following:

{ $limit: 15 },
{ $skip: 10 }

This optimization allows for more opportunities for $sort + $limit Coalescence, such as with $sort + $skip + $limit sequences. See $sort + $limit Coalescence for details on the coalescence and $sort + $skip + $limit Sequence for an example.

For aggregation operations on sharded collections, this optimization reduces the results returned from each shard.

$redact + $match Sequence Optimization

When possible, when the pipeline has the $redact stage immediately followed by the $match stage, the aggregation can sometimes add a portion of the $match stage before the $redact stage. If the added $match stage is at the start of a pipeline, the aggregation can use an index as well as query the collection to limit the number of documents that enter the pipeline. See Pipeline Operators and Indexes for more information.

For example, if the pipeline consists of the following stages:

{ $redact: { $cond: { if: { $eq: [ "$level", 5 ] }, then: "$$PRUNE", else: "$$DESCEND" } } },
{ $match: { year: 2014, category: { $ne: "Z" } } }

The optimizer can add the same $match stage before the $redact stage:

{ $match: { year: 2014 } },
{ $redact: { $cond: { if: { $eq: [ "$level", 5 ] }, then: "$$PRUNE", else: "$$DESCEND" } } },
{ $match: { year: 2014, category: { $ne: "Z" } } }

$project + $skip or $limit Sequence Optimization

New in version 3.2.

When you have a sequence with $project followed by either $skip or $limit, the $skip or $limit moves before $project. For example, if the pipeline consists of the following stages:

{ $sort: { age : -1 } },
{ $project: { status: 1, name: 1 } },
{ $limit: 5 }

During the optimization phase, the optimizer transforms the sequence to the following:

{ $sort: { age : -1 } },
{ $limit: 5 }
{ $project: { status: 1, name: 1 } },

This optimization allows for more opportunities for $sort + $limit Coalescence, such as with $sort + $limit sequences. See $sort + $limit Coalescence for details on the coalescence.

Pipeline Coalescence Optimization

When possible, the optimization phase coalesces a pipeline stage into its predecessor. Generally, coalescence occurs after any sequence reordering optimization.

$sort + $limit Coalescence

When a $sort immediately precedes a $limit, the optimizer can coalesce the $limit into the $sort. This allows the sort operation to only maintain the top n results as it progresses, where n is the specified limit, and MongoDB only needs to store n items in memory [1]. See $sort Operator and Memory for more information.

[1]The optimization will still apply when allowDiskUse is true and the n items exceed the aggregation memory limit.

$limit + $limit Coalescence

When a $limit immediately follows another $limit, the two stages can coalesce into a single $limit where the limit amount is the smaller of the two initial limit amounts. For example, a pipeline contains the following sequence:

{ $limit: 100 },
{ $limit: 10 }

Then the second $limit stage can coalesce into the first $limit stage and result in a single $limit stage where the limit amount 10 is the minimum of the two initial limits 100 and 10.

{ $limit: 10 }

$skip + $skip Coalescence

When a $skip immediately follows another $skip, the two stages can coalesce into a single $skip where the skip amount is the sum of the two initial skip amounts. For example, a pipeline contains the following sequence:

{ $skip: 5 },
{ $skip: 2 }

Then the second $skip stage can coalesce into the first $skip stage and result in a single $skip stage where the skip amount 7 is the sum of the two initial limits 5 and 2.

{ $skip: 7 }

$match + $match Coalescence

When a $match immediately follows another $match, the two stages can coalesce into a single $match combining the conditions with an $and. For example, a pipeline contains the following sequence:

{ $match: { year: 2014 } },
{ $match: { status: "A" } }

Then the second $match stage can coalesce into the first $match stage and result in a single $match stage

{ $match: { $and: [ { "year" : 2014 }, { "status" : "A" } ] } }

$lookup + $unwind Coalescence

New in version 3.2.

When a $unwind immediately follows another $lookup, and the $unwind operates on the as field of the $lookup, the optimizer can coalesce the $unwind into the $lookup stage. This avoids creating large intermediate documents.

For example, a pipeline contains the following sequence:

{
  $lookup: {
    from: "otherCollection",
    as: "resultingArray",
    localField: "x",
    foreignField: "y"
  }
},
{ $unwind: "$resultingArray"}

The optimizer can coalesce the $unwind stage into the $lookup stage. If you run the aggregation with explain option, the explain output shows the coalesced stage:

{
  $lookup: {
    from: "otherCollection",
    as: "resultingArray",
    localField: "x",
    foreignField: "y",
    unwinding: { preserveNullAndEmptyArrays: false }
  }
}

Examples

The following examples are some sequences that can take advantage of both sequence reordering and coalescence. Generally, coalescence occurs after any sequence reordering optimization.

$sort + $skip + $limit Sequence

A pipeline contains a sequence of $sort followed by a $skip followed by a $limit:

{ $sort: { age : -1 } },
{ $skip: 10 },
{ $limit: 5 }

First, the optimizer performs the $skip + $limit Sequence Optimization to transforms the sequence to the following:

{ $sort: { age : -1 } },
{ $limit: 15 }
{ $skip: 10 }

The $skip + $limit Sequence Optimization increases the $limit amount with the reordering. See $skip + $limit Sequence Optimization for details.

The reordered sequence now has $sort immediately preceding the $limit, and the pipeline can coalesce the two stages to decrease memory usage during the sort operation. See $sort + $limit Coalescence for more information.

$limit + $skip + $limit + $skip Sequence

A pipeline contains a sequence of alternating $limit and $skip stages:

{ $limit: 100 },
{ $skip: 5 },
{ $limit: 10 },
{ $skip: 2 }

The $skip + $limit Sequence Optimization reverses the position of the { $skip: 5 } and { $limit: 10 } stages and increases the limit amount:

{ $limit: 100 },
{ $limit: 15},
{ $skip: 5 },
{ $skip: 2 }

The optimizer then coalesces the two $limit stages into a single $limit stage and the two $skip stages into a single $skip stage. The resulting sequence is the following:

{ $limit: 15 },
{ $skip: 7 }

See $limit + $limit Coalescence and $skip + $skip Coalescence for details.

See also

explain option in the db.collection.aggregate()

Was this page helpful?

Yes No

Thank you for your feedback!

We're sorry! You can Report a Problem to help us improve this page.