calculated group by fields in mongodb


Tags: java, mongodb, spring-data, aggregation-framework

For this example from the MongoDB documentation, how do I write the query using MongoTemplate?

db.sales.aggregate(
   [
      {
        $group : {
           _id : { month: { $month: "$date" }, day: { $dayOfMonth: "$date" }, year: { $year: "$date" } },
           totalPrice: { $sum: { $multiply: [ "$price", "$quantity" ] } },
           averageQuantity: { $avg: "$quantity" },
           count: { $sum: 1 }
        }
      }
   ]
)


Or in general, how do I group by a calculated field?
    

asked Sep 26, 2015 by vimaldas2005
0 votes
7 views



1 Answer

0 votes

You can actually do something like this with "project" first, but to me it's a little counter-intuitive to require a $project stage beforehand:

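    // Requires:
    // import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;
    Aggregation agg = newAggregation(
        // Stage 1: compute the date parts and the per-document amount
        project("quantity")
            .andExpression("dayOfMonth(date)").as("day")
            .andExpression("month(date)").as("month")
            .andExpression("year(date)").as("year")
            .andExpression("price * quantity").as("totalAmount"),
        // Stage 2: group on the calculated fields
        group(fields().and("day").and("month").and("year"))
            .avg("quantity").as("averageQuantity")
            .sum("totalAmount").as("totalAmount")
            .count().as("count")
    );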

As I said, this is counter-intuitive: you should be able to declare all of this in the $group stage, but the helpers don't seem to work that way. The serialization comes out a bit odd (it wraps the date operator arguments in arrays), but it does seem to work. Still, this is two pipeline stages rather than one.

What is the problem with this? By separating the stages, the "project" portion forces every document in the pipeline to be processed in order to produce the calculated fields, so everything passes through that stage before moving on to the group stage.

The difference in processing time can be clearly seen by running the queries in both forms. With a separate project stage, the query takes three times longer to execute on my hardware than the query where all fields are calculated during the "group" operation.
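
To make concrete what the single-stage form computes, here is a minimal sketch in plain Java with no MongoDB involved. It only illustrates the semantics of grouping on a calculated key in one pass; it says nothing about how the server executes the pipeline. The names `SinglePassGroup`, `Sale`, `DayTotals`, and `groupByDay` are hypothetical, and `Sale` is an assumed stand-in for a document in the sales collection.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SinglePassGroup {
    // Hypothetical stand-in for a document in the "sales" collection.
    static class Sale {
        final LocalDateTime date;
        final double price;
        final int quantity;

        Sale(LocalDateTime date, double price, int quantity) {
            this.date = date;
            this.price = price;
            this.quantity = quantity;
        }
    }

    // Running totals for one group, mirroring the three $group accumulators.
    static class DayTotals {
        double totalPrice;  // like $sum of price * quantity
        long quantitySum;   // numerator for the $avg of quantity
        long count;         // like $sum: 1

        double averageQuantity() {
            return (double) quantitySum / count;
        }
    }

    // One pass: the group key (year, month, day) and the product are
    // calculated per document while accumulating, so no intermediate
    // list of "projected" documents is built first.
    static Map<LocalDate, DayTotals> groupByDay(List<Sale> sales) {
        Map<LocalDate, DayTotals> groups = new TreeMap<>();
        for (Sale s : sales) {
            LocalDate key = s.date.toLocalDate(); // the calculated group key
            DayTotals t = groups.computeIfAbsent(key, k -> new DayTotals());
            t.totalPrice += s.price * s.quantity;
            t.quantitySum += s.quantity;
            t.count++;
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Sale> sales = Arrays.asList(
            new Sale(LocalDateTime.of(2014, 3, 1, 8, 0), 10.0, 2),
            new Sale(LocalDateTime.of(2014, 3, 1, 9, 0), 5.0, 4),
            new Sale(LocalDateTime.of(2014, 3, 15, 9, 0), 20.0, 1));
        groupByDay(sales).forEach((day, t) ->
            System.out.println(day + " totalPrice=" + t.totalPrice
                + " averageQuantity=" + t.averageQuantity()
                + " count=" + t.count));
    }
}
```

The two-stage form corresponds to first mapping every `Sale` to a new object carrying `day`, `month`, `year`, and `totalAmount`, and only then grouping those objects.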

So it seems the only way at present to construct this properly is to build the pipeline object yourself:

    ApplicationContext ctx =
            new AnnotationConfigApplicationContext(SpringMongoConfig.class);
    MongoOperations mongoOperation = (MongoOperations) ctx.getBean("mongoTemplate");

    BasicDBList pipeline = new BasicDBList();
    String[] multiplier = { "$price", "$quantity" };

    pipeline.add(
        new BasicDBObject("$group",
            new BasicDBObject("_id",
                new BasicDBObject("month", new BasicDBObject("$month", "$date"))
                    .append("day", new BasicDBObject("$dayOfMonth", "$date"))
                    .append("year", new BasicDBObject("$year", "$date"))
            )
            .append("totalPrice", new BasicDBObject(
                "$sum", new BasicDBObject(
                    "$multiply", multiplier
                )
            ))
            .append("averageQuantity", new BasicDBObject("$avg", "$quantity"))
            .append("count",new BasicDBObject("$sum",1))
        )
    );

    // "collection" is a placeholder for the collection name ("sales" in the question)
    BasicDBObject aggregation = new BasicDBObject("aggregate","collection")
        .append("pipeline",pipeline);

    System.out.println(aggregation);

    CommandResult commandResult = mongoOperation.executeCommand(aggregation);
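
If the nested constructors are hard to follow, it may help to lay the same `$group` stage out as ordinary ordered maps first. The sketch below uses only the JDK; the class `GroupStageAsMaps` and the helpers `doc` and `groupStage` are hypothetical names. Handing the result to the driver (for example through a `BasicDBObject` constructor that accepts a `Map`) is an assumption you would need to verify against your driver version, so it is not shown.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

class GroupStageAsMaps {
    // Helper: a one-entry ordered map, e.g. { "$month": "$date" }.
    static Map<String, Object> doc(String key, Object value) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put(key, value);
        return m;
    }

    // Builds the same structure as the $group stage from the question.
    static Map<String, Object> groupStage() {
        // Compound _id made of three calculated date parts.
        Map<String, Object> id = new LinkedHashMap<>();
        id.put("month", doc("$month", "$date"));
        id.put("day", doc("$dayOfMonth", "$date"));
        id.put("year", doc("$year", "$date"));

        // The _id plus the three accumulators.
        Map<String, Object> group = new LinkedHashMap<>();
        group.put("_id", id);
        group.put("totalPrice",
            doc("$sum", doc("$multiply", Arrays.asList("$price", "$quantity"))));
        group.put("averageQuantity", doc("$avg", "$quantity"));
        group.put("count", doc("$sum", 1));

        return doc("$group", group);
    }

    public static void main(String[] args) {
        // Prints the nested structure using Map.toString(), not JSON.
        System.out.println(groupStage());
    }
}
```

`LinkedHashMap` is used so that the keys keep their insertion order, which matters for readability when you compare the printed structure with the shell query.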

Or if all of that seems too terse to you, you can always work with the JSON source and parse that. Of course, it has to be valid JSON:

    String json = "[" +
        "{ \"$group\": { "+
            "\"_id\": { " +
                "\"month\": { \"$month\": \"$date\" }, " +
                "\"day\": { \"$dayOfMonth\":\"$date\" }, " +
                "\"year\": { \"$year\": \"$date\" } " +
            "}, " +
            "\"totalPrice\": { \"$sum\": { \"$multiply\": [ \"$price\", \"$quantity\" ] } }, " +
            "\"averageQuantity\": { \"$avg\": \"$quantity\" }, " +
            "\"count\": { \"$sum\": 1 } " +
        "}}" +
    "]";

    BasicDBList pipeline = (BasicDBList)com.mongodb.util.JSON.parse(json);

answered Sep 26, 2015 by devkumargupta
