Organizational Research By

Surprising Reserch Topic

upsert multiple records with mongodb


upsert multiple records with mongodb  using -'ruby,mongodb,mongomapper,upsert'

I'm trying to get MongoDB to upsert multiple records with the following query, ultimately using MongoMapper and the Mongo ruby driver.

db.foo.update({event_id: { $in: [1,2]}}, {$inc: {visit:1}}, true, true)


This works fine if all the records exist, but does not create new records for records that do not exist.  The following command has the desired effect from the shell, but is probably not ideal from the ruby driver.

[1,2].forEach(function(id) {db.foo.update({event_id: id}, {$inc: {visit:1}}, true, true) });


I could loop through each id I want to insert from within ruby, but that would necessitate a trip to the database for each item.  Is there a way to upsert multiple items from the ruby driver with only a single trip to the database?  What's the best practice here?  Using mongomapper and the ruby driver, is there a way to send multiple updates in a single batch, generating something like the following?

db.foo.update({event_id: 1}, {$inc: {visit:1}}, true); db.foo.update({event_id: 2}, {$inc: {visit:1}}, true);


Sample Data:

Desired data after command if two records exist.

{ "_id" : ObjectId("4d6babbac0d8bb8238d02099"), "event_id" : 1, "visit" : 11 }
{ "_id" : ObjectId("4d6baf56c0d8bb8238d0209a"), "event_id" : 2, "visit" : 2 }


Actual data after command if two records exist.

{ "_id" : ObjectId("4d6babbac0d8bb8238d02099"), "event_id" : 1, "visit" : 11 }
{ "_id" : ObjectId("4d6baf56c0d8bb8238d0209a"), "event_id" : 2, "visit" : 2 }


Desired data after command if only the record with event_id 1 exists.

{ "_id" : ObjectId("4d6babbac0d8bb8238d02099"), "event_id" : 1, "visit" : 2 }
{ "_id" : ObjectId("4d6baf56c0d8bb8238d0209a"), "event_id" : 2, "visit" : 1 }


Actual data after command if only the record with event_id 1 exists.

{ "_id" : ObjectId("4d6babbac0d8bb8238d02099"), "event_id" : 1, "visit" : 2 }

    

asked Sep 29, 2015 by badhwar.rohit
0 votes
6 views



Related Hot Questions

3 Answers

0 votes

This - correctly - will not insert any records with event_id 1 or 2 if they do not already exist

db.foo.update({event_id: { $in: [1,2]}}, {$inc: {visit:1}}, true, true)

This is because the objNew part of the query (see http://www.mongodb.org/display/DOCS/Updating#Updating-UpsertswithModifiers) does not have a value for field event_id. As a result, you will need at least X+1 trips to the database, where X is the number of event_ids, to ensure that you insert a record if one does not exist for a particular event_id (the +1 comes from the query above, which increases the visits counter for existing records). To say it in a different way, how does MongoDB know you want to use value 2 for the event_id and not 1? And why not 6?

W.r.t. batch insertion with ruby, I think it is possible as the following link suggests - although I've only used the Java driver: Batch insert/update using Mongoid?

answered Sep 29, 2015 by patilkiran.101
0 votes

What you are after is the Find and Modify command with the upsert option set to true. See the example from the Mongo test suite (same one linked to in the Find and Modify docs) for an example that looks very much like what you describe in your question.

answered Sep 29, 2015 by akhilesh
0 votes

I found a way to do this using the eval operator for server-side code execution. Here is the code snippit:

def batchpush(body, item_opts = {})
    @batch << {
        :body => body,
        :duplicate_key => item_opts[:duplicate_key] || Mongo::Dequeue.generate_duplicate_key(body),
        :priority => item_opts[:priority] || @config[:default_priority]
    }
end

def batchprocess()
    js = %Q|
        function(batch) {
            var nowutc = new Date();
            var ret = [];
            for(i in batch){
                e = batch[i];
                //ret.push(e);
                var query = {
                    'duplicate_key': e.duplicate_key,
                    'complete': false,
                    'locked_at': null
                };
                var object = {
                    '$set': {
                        'body': e.body,
                        'inserted_at': nowutc,
                        'complete': false,
                        'locked_till': null,
                        'completed_at': null,
                        'priority': e.priority,
                        'duplicate_key': e.duplicate_key,
                        'completecount': 0
                    },
                    '$inc': {'count': 1}
                };

                db.#{collection.name}.update(query, object, true);
            }
            return ret;
        }
    |
    cmd = BSON::OrderedHash.new
    cmd['$eval'] = js
    cmd['args'] = [@batch]
    cmd['nolock'] = true
    result = collection.db.command(cmd)
    @batch.clear
    #pp result
end

Multiple items are added with batchpush(), and then batchprocess() is called. The data is sent as an array, and the commands are all executed. This code is used in the MongoDequeue GEM, in this file.

Only one request is made, and all the upserts happen server-side.

answered Sep 29, 2015 by ankitarajoria4

...