- Reference >
- Operators >
- Aggregation Pipeline Operators >
- Group Accumulator Operators >
- $stdDevSamp (aggregation)
$stdDevSamp (aggregation)¶
On this page
Definition¶
-
$stdDevSamp
¶ New in version 3.2.
Calculates the sample standard deviation of the input values. Use if the values encompass a sample of a population of data from which to generalize about the population.
$stdDevSamp
ignores non-numeric values.If the values represent the entire population of data or you do not wish to generalize about a larger population, use
$stdDevPop
instead.$stdDevSamp
is available in the$group
and$project
stages.When used in the
$group
stage,$stdDevSamp
has the following syntax and returns the sample standard deviation of the specified expression for a group of documents that share the same group by key:{ $stdDevSamp: <expression> }
When used in the
$project
stage,$stdDevSamp
returns the sample standard deviation of the specified expression or list of expressions for each document and has one of two syntaxes:$stdDevSamp
has one specified expression as its operand:{ $stdDevSamp: <expression> }
$stdDevSamp
has a list of specified expressions as its operand:{ $stdDevSamp: [ <expression1>, <expression2> ... ] }
The argument for
$stdDevSamp
can be any expression as long as it resolves to an array. For more information on expressions, see Expressions.
Behavior¶
Non-numeric Values¶
$stdDevSamp
ignores non-numeric values. If all operands for a
sum are non-numeric, $stdDevSamp
returns null
.
Single Value¶
If the sample consists of a single numeric value, $stdDevSamp
returns null
.
Array Operand¶
In the $group
stage, if the expression resolves to an
array, $stdDevSamp
treats the operand as a non-numerical value.
In the $project
stage:
- With a single expression as its operand, if the expression resolves
to an array,
$stdDevSamp
traverses into the array to operate on the numerical elements of the array to return a single value. - With a list of expressions as its operand, if any of the expressions
resolves to an array,
$stdDevSamp
does not traverse into the array but instead treats the array as a non-numerical value.
Example¶
A collection users
contains documents with the following fields:
{_id: 0, username: "user0", age: 20}
{_id: 1, username: "user1", age: 42}
{_id: 2, username: "user2", age: 28}
...
To calculate the standard deviation of a sample of users, following
aggregation operation first uses the $sample
pipeline to
sample 100 users, and then uses $stdDevSamp
calculates the
standard deviation for the sampled users.
db.users.aggregate(
[
{ $sample: { size: 100 } },
{ $group: { _id: null, ageStdDev: { $stdDevSamp: "$age" } } }
]
)
The operation returns a result like the following:
{ "_id" : null, "ageStdDev" : 7.811258386185771 }