For bonus points, why is there a limit to the number of dups that can be dropped?
MongoDB is likely doing this to defend itself. If you run dropDups on the wrong field, you could hose the entire dataset and tie up the DB with delete operations (deletes are writes, so they are just as expensive).
How can I add the index and get rid of the duplicates?
So the first question is: why are you creating a unique index on that field in the first place? MongoDB creates a default _id field that is automatically unique and indexed. By default MongoDB populates _id with an ObjectId; however, you can override this with whatever value you like. So if you have a ready set of ID values, you can use those.
If you cannot re-import the values, then copy them to a new collection while changing the _id. You can then drop the old collection and rename the new one. (Note that you will get a bunch of duplicate key errors; make sure your code catches and ignores them.)