How can I delete duplicates in MongoDB?


Tags: mongodb, indexing, duplicates, duplicate-removal

I have a large collection (~2.7 million documents) in MongoDB, and it contains a lot of duplicates. I tried running ensureIndex({id:1}, {unique:true, dropDups:true}) on the collection. Mongo churns away at it for a while and then fails with the error "too many dups on index build with dropDups=true".

How can I add the index and get rid of the duplicates? Or the other way around, what's the best way to delete some dups so that mongo can successfully build the index?

For bonus points, why is there a limit to the number of dups that can be dropped?
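
One possible way to attack the "delete some dups first" route, offered as a rough sketch rather than a tested solution: walk the collection once, remember every `id` value already seen, and collect the `_id` of each later document that repeats one. The helper name `collectDuplicateIds` and the collection name `mycoll` are made up for illustration, and the sketch assumes the set of distinct `id` values fits in the shell's memory.

```javascript
// Return the _id of every document whose `id` value was already seen,
// so the first document encountered for each `id` is the one that is kept.
// `docs` can be anything with a forEach method: a plain array, or a
// mongo shell cursor.
function collectDuplicateIds(docs) {
  var seen = Object.create(null); // serialized id value -> true
  var toDelete = [];
  docs.forEach(function (doc) {
    // Treat a missing `id` like null, then serialize so that
    // the number 1 and the string "1" stay distinct keys.
    var key = JSON.stringify(doc.id === undefined ? null : doc.id);
    if (seen[key]) {
      toDelete.push(doc._id);
    } else {
      seen[key] = true;
    }
  });
  return toDelete;
}

// Hypothetical mongo shell usage (collection name `mycoll` is an assumption):
//   var ids = collectDuplicateIds(db.mycoll.find({}, { id: 1 }));
//   ids.forEach(function (_id) { db.mycoll.remove({ _id: _id }); });
//   db.mycoll.ensureIndex({ id: 1 }, { unique: true });
```

With the duplicates removed up front, the unique index build no longer needs dropDups at all. Which copy survives depends on the order the cursor returns documents, so add a sort to the find if a particular copy should win.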
    
asked Sep 26, 2015 by thiru
0 votes
48 views


