Group Operations
As an alternative to using Map-Reduce to perform data aggregation, you can use the group operation, which resembles SQL's GROUP BY query style and may therefore feel more approachable than Map-Reduce. The group operation has some limitations: for example, it is not supported in a sharded environment, and it returns the full result set in a single BSON object, so the result should be small (fewer than 10,000 keys).
Spring provides integration with MongoDB's group operation through methods on MongoOperations that simplify the creation and running of group operations. It can convert the results of the group operation to a POJO and also integrates with Spring's Resource abstraction. This lets you place your JavaScript files on the file system, classpath, HTTP server, or any other Spring Resource implementation and then reference the JavaScript resources via an easy URI-style syntax, e.g. 'classpath:reduce.js'. Externalizing JavaScript code in files is often preferable to embedding it as Java strings in your code. Note that you can still pass JavaScript code as Java strings if you prefer.
Example Usage
To show how group operations work, the following somewhat artificial example is used. For a more realistic example, consult the book 'MongoDB: The Definitive Guide'. A collection named group_test_collection is created with the following rows.
{ "_id" : ObjectId("4ec1d25d41421e2015da64f1"), "x" : 1 }
{ "_id" : ObjectId("4ec1d25d41421e2015da64f2"), "x" : 1 }
{ "_id" : ObjectId("4ec1d25d41421e2015da64f3"), "x" : 2 }
{ "_id" : ObjectId("4ec1d25d41421e2015da64f4"), "x" : 3 }
{ "_id" : ObjectId("4ec1d25d41421e2015da64f5"), "x" : 3 }
{ "_id" : ObjectId("4ec1d25d41421e2015da64f6"), "x" : 3 }
We would like to group by the only field in each row, the x field, and count the number of times each specific value of x occurs. To do this, we need to create an initial document that contains our count variable and a reduce function that increments it each time a row with the matching key is encountered. The Java code to run the group operation is shown below:
GroupByResults<XObject> results = mongoTemplate.group("group_test_collection",
    GroupBy.key("x").initialDocument("{ count: 0 }").reduceFunction("function(doc, prev) { prev.count += 1 }"),
    XObject.class);
The first argument is the name of the collection to run the group operation over. The second is a fluent API that specifies the properties of the group operation via the GroupBy class. In this example, we use only the initialDocument and reduceFunction methods. You can also specify a key function, as well as a finalizer, as part of the fluent API. If you have multiple keys to group by, you can pass in a comma-separated list of keys.
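To make the roles of the key, the initial document, and the reduce function more concrete, the following plain-Java sketch imitates the grouping logic in memory. It is only an illustration of the semantics described above, not how MongoDB or Spring Data implement the operation; the class name GroupSketch and its methods are hypothetical, and documents are modelled as simple maps.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical in-memory illustration; not part of Spring Data MongoDB.
public class GroupSketch {

    // Groups documents by the value of keyField. Each new key starts from a
    // copy of initialDocument, and reduce is applied once per matching document.
    static Map<Object, Map<String, Object>> group(
            List<Map<String, Object>> docs,
            String keyField,
            Map<String, Object> initialDocument,
            BiConsumer<Map<String, Object>, Map<String, Object>> reduce) {

        Map<Object, Map<String, Object>> buckets = new LinkedHashMap<>();
        for (Map<String, Object> doc : docs) {
            Object key = doc.get(keyField);
            Map<String, Object> prev = buckets.computeIfAbsent(key, k -> {
                Map<String, Object> fresh = new LinkedHashMap<>();
                fresh.put(keyField, k);        // the grouping key
                fresh.putAll(initialDocument); // e.g. { count: 0 }
                return fresh;
            });
            reduce.accept(doc, prev);          // e.g. prev.count += 1
        }
        return buckets;
    }

    // Same data as the sample collection above, without the _id fields.
    static Map<Object, Map<String, Object>> example() {
        List<Map<String, Object>> docs = List.of(
                Map.of("x", 1), Map.of("x", 1), Map.of("x", 2),
                Map.of("x", 3), Map.of("x", 3), Map.of("x", 3));
        return group(docs, "x", Map.of("count", 0),
                (doc, prev) -> prev.put("count", (Integer) prev.get("count") + 1));
    }

    public static void main(String[] args) {
        // prints [{x=1, count=2}, {x=2, count=1}, {x=3, count=3}]
        System.out.println(example().values());
    }
}
```

The sketch keeps one accumulator document per distinct key, which is also why the result of a real group operation grows with the number of keys rather than the number of rows.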
The raw result of the group operation is a JSON document that looks like this:
{
"retval" : [ { "x" : 1.0 , "count" : 2.0} ,
{ "x" : 2.0 , "count" : 1.0} ,
{ "x" : 3.0 , "count" : 3.0} ] ,
"count" : 6.0 ,
"keys" : 3 ,
"ok" : 1.0
}
Each document in the "retval" array is mapped onto the type passed as the third argument to the group method, in this case XObject, which is shown below.
public class XObject {

    private float x;
    private float count;

    public float getX() {
        return x;
    }

    public void setX(float x) {
        this.x = x;
    }

    public float getCount() {
        return count;
    }

    public void setCount(float count) {
        this.count = count;
    }

    @Override
    public String toString() {
        return "XObject [x=" + x + " count = " + count + "]";
    }
}
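The mapping from the "retval" entries to typed objects can be pictured with the following plain-Java sketch. It is an assumption-laden illustration, not the converter Spring Data uses: the raw result is modelled as nested maps and lists, and the record XCount is a hypothetical stand-in for XObject.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of mapping "retval" entries to typed objects.
public class RetvalMappingSketch {

    // Stand-in for XObject; the name XCount is an assumption for this sketch.
    record XCount(float x, float count) {}

    // Reads the "retval" list from the raw result and converts each entry.
    static List<XCount> mapRetval(Map<String, Object> rawResult) {
        List<XCount> mapped = new ArrayList<>();
        if (rawResult.get("retval") instanceof List<?> entries) {
            for (Object entry : entries) {
                if (entry instanceof Map<?, ?> fields) {
                    mapped.add(new XCount(
                            ((Number) fields.get("x")).floatValue(),
                            ((Number) fields.get("count")).floatValue()));
                }
            }
        }
        return mapped;
    }

    // Mirrors the raw JSON result shown earlier.
    static Map<String, Object> sampleRawResult() {
        return Map.of(
                "retval", List.of(
                        Map.of("x", 1.0, "count", 2.0),
                        Map.of("x", 2.0, "count", 1.0),
                        Map.of("x", 3.0, "count", 3.0)),
                "count", 6.0,
                "keys", 3,
                "ok", 1.0);
    }

    public static void main(String[] args) {
        mapRetval(sampleRawResult()).forEach(System.out::println);
    }
}
```

Because the raw result reports numbers as floating-point values (for example "count" : 2.0), the target fields in this sketch, like those in XObject, are declared as float.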
You can also obtain the raw result as a Document by calling the getRawResults method on the GroupByResults class.
An additional overload of the group method on MongoOperations lets you specify a Criteria object for selecting a subset of the rows. The example below uses a Criteria object, with some syntactic sugar from static imports, and references key-function and reduce-function JavaScript files via Spring Resource strings.
import static org.springframework.data.mongodb.core.mapreduce.GroupBy.keyFunction;
import static org.springframework.data.mongodb.core.query.Criteria.where;

GroupByResults<XObject> results = mongoTemplate.group(where("x").gt(0),
    "group_test_collection",
    keyFunction("classpath:keyFunction.js").initialDocument("{ count: 0 }").reduceFunction("classpath:groupReduce.js"),
    XObject.class);
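The effect of combining a criteria with a key function can be sketched in plain Java as follows. This is a hypothetical in-memory analogue only: the contents of keyFunction.js and groupReduce.js are not shown in this section, so the sketch assumes a key function that returns the x field and a reduce step that counts rows, and it uses made-up sample data that includes a row with x = 0 so the filter has something to exclude.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical in-memory illustration; not part of Spring Data MongoDB.
public class FilteredGroupSketch {

    // Applies the criteria first, then derives the grouping key with
    // keyFunction and counts the rows that share each key.
    static Map<Object, Integer> countByKey(
            List<Map<String, Object>> docs,
            Predicate<Map<String, Object>> criteria,
            Function<Map<String, Object>, Object> keyFunction) {

        Map<Object, Integer> counts = new LinkedHashMap<>();
        for (Map<String, Object> doc : docs) {
            if (!criteria.test(doc)) {
                continue; // row not selected by the criteria
            }
            counts.merge(keyFunction.apply(doc), 1, Integer::sum);
        }
        return counts;
    }

    // Made-up rows; the x = 0 row is filtered out by the "x > 0" criteria.
    static Map<Object, Integer> example() {
        List<Map<String, Object>> docs = List.of(
                Map.of("x", 0), Map.of("x", 1), Map.of("x", 1), Map.of("x", 3));
        return countByKey(docs,
                doc -> ((Number) doc.get("x")).intValue() > 0, // analogue of where("x").gt(0)
                doc -> doc.get("x"));                          // assumed key function
    }

    public static void main(String[] args) {
        System.out.println(example()); // prints {1=2, 3=1}
    }
}
```

Filtering before grouping keeps excluded rows from ever creating a key, which helps keep the result set small.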