Multi-collection, multi-document transactions in MongoDB

I realise that MongoDB, by its very nature, doesn't and probably never will support these kinds of transactions.  However, I have found that I do need to use them in a somewhat limited fashion, so I've come up with the following solution, and I'm wondering: is this the best way of doing it, and can it be improved upon (before I go and implement it in my app)?

Obviously the transaction is controlled via the application (in my case, a Python web app).  For each document in this transaction (in any collection), the following fields are added:

    'lock_status': bool (true = locked, false = unlocked),
    'data_old': dict (the current values of the fields that are being changed),
    'data_new': dict (the values replacing the current ones; should have the same keys as data_old),
    'change_complete': bool (true = the update to this specific document has been applied successfully),
    'transaction_id': ObjectId of the parent transaction
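To make the field list concrete, here is a minimal sketch of how those bookkeeping fields might be built for one document. The helper name `build_lock_fields`, the sample `account` document, and the use of a plain string in place of an ObjectId are all assumptions made for illustration.

```python
def build_lock_fields(document, changes, transaction_id):
    """Return the bookkeeping fields to add to a document before changing it.

    Hypothetical helper: `document` is the current document as a dict,
    `changes` maps field names to their new values.
    """
    return {
        "lock_status": True,
        # Snapshot only the fields that are about to change. A field that
        # does not exist yet is recorded as None in this sketch.
        "data_old": {key: document.get(key) for key in changes},
        "data_new": dict(changes),
        "change_complete": False,
        "transaction_id": transaction_id,
    }


# Example with made-up data; "txn-1" stands in for an ObjectId.
account = {"_id": 1, "owner": "alice", "balance": 100}
fields = build_lock_fields(account, {"balance": 70}, "txn-1")
```

Building `data_old` from the keys of `changes` guarantees that both dicts cover the same fields, which is what the rollback relies on.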

In addition, there is a transaction collection which stores documents detailing each transaction in progress.  They look like:

    '_id': ObjectId,
    'date_added': datetime,
    'status': bool (true = all changes successful, false = in progress),
    'collections': array of collection names involved in the transaction
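A transaction document with that shape could be built roughly as follows. This is only a sketch: the function name `new_transaction_document` is hypothetical, and a random hex string from `uuid` stands in for the ObjectId so the example runs without a database driver.

```python
import uuid
from datetime import datetime, timezone


def new_transaction_document(collections):
    """Build a transaction document in the 'in progress' state."""
    return {
        "_id": uuid.uuid4().hex,  # stand-in for an ObjectId (assumption)
        "date_added": datetime.now(timezone.utc),
        "status": False,  # False = in progress, True = all changes successful
        # Store each collection name once, in a stable order.
        "collections": sorted(set(collections)),
    }


txn = new_transaction_document(["accounts", "orders", "accounts"])
```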

And here's the logic of the process. Hopefully it works in such a way that if it's interrupted, or fails in some other way, it can be rolled back properly.

1: Set up a transaction document

2: For each document that is affected by this transaction:

Set lock_status to true (to 'lock' the document from being modified)
Set data_old and data_new to their old and new values
Set change_complete to false
Set transaction_id to the ObjectId of the transaction document we just made

3: Perform the update. For each document affected:

Replace any affected fields in that document with the data_new values
Set change_complete to true

4: Set the transaction document's status to true (as all data has been modified successfully)

5: For each document affected by the transaction, do some clean up:

Remove data_old and data_new, as they're no longer needed
Set lock_status to false (to unlock the document)

6: Remove the transaction document set up in step 1 (or as suggested, mark it as complete)
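The six steps above can be sketched end to end as follows. To keep the example runnable without a server, plain dicts stand in for the MongoDB collections; the function name `run_transaction`, the layout of `db` and `changes`, and the `uuid` string used instead of an ObjectId are all assumptions. In a real deployment each lock in step 2 would presumably need to be a single conditional update, so that checking and setting lock_status happen atomically.

```python
import uuid
from datetime import datetime, timezone


def run_transaction(db, changes):
    """Apply `changes` following the six steps of the question.

    db:      {collection_name: {doc_id: document_dict}}  (in-memory stand-in)
    changes: {(collection_name, doc_id): {field: new_value}}
    """
    # Step 1: set up the transaction document (status False = in progress).
    txn_id = uuid.uuid4().hex  # stand-in for an ObjectId
    db["transactions"][txn_id] = {
        "_id": txn_id,
        "date_added": datetime.now(timezone.utc),
        "status": False,
        "collections": sorted({name for name, _ in changes}),
    }

    targets = [(db[name][doc_id], new_values)
               for (name, doc_id), new_values in changes.items()]

    # Step 2: lock each affected document and record old and new values.
    for doc, new_values in targets:
        if doc.get("lock_status"):
            # Documents locked earlier in this loop stay locked; the
            # application's recovery logic has to clean them up.
            raise RuntimeError("document is locked by another transaction")
        doc.update(
            lock_status=True,
            data_old={key: doc.get(key) for key in new_values},
            data_new=dict(new_values),
            change_complete=False,
            transaction_id=txn_id,
        )

    # Step 3: perform the update on every document.
    for doc, _ in targets:
        doc.update(doc["data_new"])
        doc["change_complete"] = True

    # Step 4: mark the transaction as successful.
    db["transactions"][txn_id]["status"] = True

    # Step 5: clean up and unlock every document.
    for doc, _ in targets:
        doc.pop("data_old", None)
        doc.pop("data_new", None)
        doc["lock_status"] = False

    # Step 6: remove the transaction document.
    del db["transactions"][txn_id]
    return txn_id
```

Counting the writes in this sketch gives three per affected document (lock, update, unlock) plus three on the transaction document, which is a useful baseline when looking for ways to reduce database traffic.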

I think that logically works in such a way that if it fails at any point, all data can be either rolled back or the transaction can be continued (depending on what you want to do).  Obviously all rollback/recovery/etc. is performed by the application and not the database, by using the transaction documents and the documents in the other collections with that transaction_id.
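As a rough illustration of the application-side recovery, the sketch below inspects a leftover transaction document and either rolls the changes back (status still false) or finishes the clean-up (status already true). It uses plain dicts in place of collections, and the function name `recover`, the string return values, and the sample data are assumptions rather than part of the original design.

```python
def recover(db, txn_id):
    """Roll back or finish an interrupted transaction.

    db: {collection_name: {doc_id: document_dict}}  (in-memory stand-in)
    """
    txn = db["transactions"][txn_id]
    for name in txn["collections"]:
        for doc in db[name].values():
            if doc.get("transaction_id") != txn_id:
                continue  # never locked by this transaction
            if not txn["status"] and "data_old" in doc:
                # The transaction never completed: restore the old values.
                # Fields recorded as None are set to None, not removed.
                doc.update(doc["data_old"])
                doc["change_complete"] = False
            # In both cases, drop the bookkeeping data and unlock.
            doc.pop("data_old", None)
            doc.pop("data_new", None)
            doc["lock_status"] = False
    del db["transactions"][txn_id]
    return "rolled_forward" if txn["status"] else "rolled_back"


# Simulated crash during step 3: document 1 was updated, document 2 was not.
db = {
    "transactions": {
        "t1": {"_id": "t1", "status": False, "collections": ["accounts"]},
    },
    "accounts": {
        1: {"balance": 70, "lock_status": True, "change_complete": True,
            "data_old": {"balance": 100}, "data_new": {"balance": 70},
            "transaction_id": "t1"},
        2: {"balance": 50, "lock_status": True, "change_complete": False,
            "data_old": {"balance": 50}, "data_new": {"balance": 80},
            "transaction_id": "t1"},
    },
}
outcome = recover(db, "t1")
```

Restoring data_old is harmless for a document whose update was never applied, so the same code path covers a failure in step 2 and a failure in step 3.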

Is there any glaring error in this logic that I've missed or overlooked?  Is there a more efficient way of going about it (e.g. less writing/reading from the database)?

asked Sep 30, 2015 by kinnari

1 Answer

answered Sep 30, 2015 by mtabakade