In this post, I’m going to insert a bunch of dummy records into a test collection, remove some (accidentally!) and then recover them from the online oplog.
First of all, let’s set up a basic directory structure, then start a stand-alone MongoDB instance for the test (Windows environment):
```
C:\Users\Administrator>mkdir d:\TestRepSet\db d:\TestRepSet\log d:\TestRepSet\backup

C:\Users\Administrator>start mongod --port 27020 --dbpath d:\TestRepSet\db --replSet TestRepSet --oplogSize 128 --logpath d:\TestRepSet\log\TestRepSet.log
```
Next, we connect to the new instance and configure it as a single node replica set (oplog is only used for replica sets, so wouldn’t exist on a single, non-replicated instance):
```
C:\Users\Administrator>mongo --port 27020
MongoDB shell version v3.6.3
connecting to: mongodb://127.0.0.1:27020/
MongoDB server version: 3.6.3

// Configure the replica set
> rsconf = { _id: "TestRepSet", members: [{ _id: 0, host: "mdb1.vbox:27020" }] }
{ "_id" : "TestRepSet", "members" : [ { "_id" : 0, "host" : "mdb1.vbox:27020" } ] }

// Initiate the set
> rs.initiate( rsconf )
{
    "ok" : 1,
    "operationTime" : Timestamp(1523890232, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1523890232, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}

TestRepSet:SECONDARY> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB

TestRepSet:PRIMARY> use test
switched to db test

// Insert 100 dummy records
TestRepSet:PRIMARY> for(var i=0;i<100;i++) {db.dummyNamesCollection.insert({_id:i})}
WriteResult({ "nInserted" : 1 })

// Insert another 100 dummy records, this time with names
TestRepSet:PRIMARY> for(var i=100;i<200;i++) {db.dummyNamesCollection.insert({_id:i, "name" : "Iron Man"})}
WriteResult({ "nInserted" : 1 })

// Check the document count for reference
TestRepSet:PRIMARY> db.dummyNamesCollection.count()
200

// Remove 100 of those 'name' documents...accidentally!
TestRepSet:PRIMARY> db.dummyNamesCollection.remove({"name" : "Iron Man"})
WriteResult({ "nRemoved" : 100 })

TestRepSet:PRIMARY> db.dummyNamesCollection.count()
100
```
Oops! …we didn’t mean to delete those 100 documents, so let’s mine the online oplog for the original insert operations ("op" : "i") against the test.dummyNamesCollection collection:
```
// Switch to local, which is where the oplog collection resides
TestRepSet:PRIMARY> use local
switched to db local

// Limit the fetch to just the first 5 for now
TestRepSet:PRIMARY> db.oplog.rs.find({"ns":"test.dummyNamesCollection","op":"i","o.name" : "Iron Man"}).limit(5)
{ "ts" : Timestamp(1524038245, 63), "t" : NumberLong(1), "h" : NumberLong("-596019791399272412"), "v" : 2, "op" : "i", "ns" : "test.dummyNamesCollection", "ui" : UUID("0e623d9c-722c-41d5-a5e6-83947cc2466e"), "wall" : ISODate("2018-04-18T07:57:25.065Z"), "o" : { "_id" : 100, "name" : "Iron Man" } }
{ "ts" : Timestamp(1524038245, 64), "t" : NumberLong(1), "h" : NumberLong("8668567148886123100"), "v" : 2, "op" : "i", "ns" : "test.dummyNamesCollection", "ui" : UUID("0e623d9c-722c-41d5-a5e6-83947cc2466e"), "wall" : ISODate("2018-04-18T07:57:25.065Z"), "o" : { "_id" : 101, "name" : "Iron Man" } }
{ "ts" : Timestamp(1524038245, 65), "t" : NumberLong(1), "h" : NumberLong("-614151834877236093"), "v" : 2, "op" : "i", "ns" : "test.dummyNamesCollection", "ui" : UUID("0e623d9c-722c-41d5-a5e6-83947cc2466e"), "wall" : ISODate("2018-04-18T07:57:25.065Z"), "o" : { "_id" : 102, "name" : "Iron Man" } }
{ "ts" : Timestamp(1524038245, 66), "t" : NumberLong(1), "h" : NumberLong("-1959048418001192557"), "v" : 2, "op" : "i", "ns" : "test.dummyNamesCollection", "ui" : UUID("0e623d9c-722c-41d5-a5e6-83947cc2466e"), "wall" : ISODate("2018-04-18T07:57:25.065Z"), "o" : { "_id" : 103, "name" : "Iron Man" } }
{ "ts" : Timestamp(1524038245, 67), "t" : NumberLong(1), "h" : NumberLong("-4277427315951537561"), "v" : 2, "op" : "i", "ns" : "test.dummyNamesCollection", "ui" : UUID("0e623d9c-722c-41d5-a5e6-83947cc2466e"), "wall" : ISODate("2018-04-18T07:57:25.065Z"), "o" : { "_id" : 104, "name" : "Iron Man" } }

// Project just the inserted documents
TestRepSet:PRIMARY> db.oplog.rs.find({op : "i", ns : "test.dummyNamesCollection", "o.name" : "Iron Man"}, {"o" : 1}).limit(5)
{ "o" : { "_id" : 100, "name" : "Iron Man" } }
{ "o" : { "_id" : 101, "name" : "Iron Man" } }
{ "o" : { "_id" : 102, "name" : "Iron Man" } }
{ "o" : { "_id" : 103, "name" : "Iron Man" } }
{ "o" : { "_id" : 104, "name" : "Iron Man" } }

// Check all 100 are there with a count
TestRepSet:PRIMARY> db.oplog.rs.find({"ns":"test.dummyNamesCollection","op":"i","o.name" : "Iron Man"}).count()
100
```
We have all 100 insert operations available, so let’s store them in an array and use that for our inserts:
```
TestRepSet:PRIMARY> var deletedDocs = db.oplog.rs.find({op : "i", ns : "test.dummyNamesCollection", "o.name" : "Iron Man"}, {"o" : 1}).toArray()

TestRepSet:PRIMARY> deletedDocs.length
100

// Check how many documents we currently have (remember there were 200 originally)
TestRepSet:PRIMARY> use test
switched to db test

TestRepSet:PRIMARY> db.dummyNamesCollection.count()
100

// Use a for loop to re-insert the documents using the array
TestRepSet:PRIMARY> for (var i = 0; i < deletedDocs.length; i++) { db.dummyNamesCollection.insert({_id : deletedDocs[i].o._id, name : deletedDocs[i].o.name}); }

// Check how many documents we have again
TestRepSet:PRIMARY> db.dummyNamesCollection.count()
200

// Verify that 100 of those are the ones just recovered
TestRepSet:PRIMARY> db.dummyNamesCollection.count({"name" : "Iron Man"})
100
```
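The field-by-field rebuild in the loop works here because we know the document shape. For collections with arbitrary fields, note that each oplog insert entry already carries the complete original document in its `o` field, so a more general approach (a sketch runnable in any JavaScript engine; the sample array below stands in for the real `deletedDocs`) is to strip the oplog wrapper and bulk-insert:

```javascript
// Stand-in for the array returned by the oplog query above
var deletedDocs = [
  { o: { _id: 100, name: "Iron Man" } },
  { o: { _id: 101, name: "Iron Man" } }
];

// Each insert entry holds the whole original document in "o",
// so we can recover it intact rather than rebuilding it field by field
var recovered = deletedDocs.map(function (entry) { return entry.o; });

// In the mongo shell the array can then be inserted in one call:
// db.dummyNamesCollection.insertMany(recovered);
console.log(JSON.stringify(recovered));
```

This also avoids 100 round trips to the server, since `insertMany` sends the whole batch at once.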
All done, we have the records back! 🙂
A few important things to note…
1) If a delayed secondary member is available, and contains the missing/updated data, you can always pull the records from there instead.
Here’s a quick example using mongoexport to read the recovery operations out of the oplog (local.oplog.rs) into a JSON file, followed by a mongoimport of that JSON into the collection (which could be adapted if pulling from another member):
```
mongoexport --port 27020 -d local -c oplog.rs -q '{"ns":"test.dummyNamesCollection","op":"i","o.name" : "Iron Man"}' --out D:\TestRepSet\backup\deletedDocs.json

mongoimport --port 27020 -d test --collection dummyNamesCollection --file D:\TestRepSet\backup\deletedDocs.json
```
2) Oplog queries are full scans. On a busy system this could take some time, and could impact performance whilst the scan takes place.
3) Remember, there is no guarantee the inserts will still exist in the oplog. The oplog is a capped collection (and only exists as part of a replica set), so the oldest records are aged out as space is needed, depending on how busy the environment is and the size of the oplog (the replication window).
It might be worthwhile taking a quick dump of the oplog before proceeding, just in case the operations age out before you get to re-insert them (that would be really unlucky, but entirely possible!). This can be done from any member of the replica set.
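To gauge how much time you have, `rs.printReplicationInfo()` in the shell reports the timestamps of the first and last entries currently in the oplog. The check itself is just a comparison (a sketch in plain JavaScript; the timestamp values below are illustrative, not from a real deployment):

```javascript
// Illustrative epoch-second values: in the shell, take the oldest oplog
// entry time from rs.printReplicationInfo() and the "ts" of the inserts
// you need to recover (from the oplog query)
var oldestOplogEntry = 1524030000;
var insertOpTime = 1524038245;

// The inserts are only recoverable while they remain newer than the
// oldest entry still held in the capped oplog
function stillInWindow(opTime, oldestTs) {
  return opTime >= oldestTs;
}

console.log(stillInWindow(insertOpTime, oldestOplogEntry)
  ? "inserts should still be present - dump the oplog now"
  : "inserts may already have aged out");
```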
4) Depending on the output from the oplog, you may need to filter out duplicates etc. before applying the missing documents/changes – the above is a very basic demonstration.
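For example, if the oplog also contains later re-inserts of the same keys, one quick way to keep a single copy per `_id` before re-inserting (a JavaScript sketch; the sample array is illustrative) is to key the documents by `_id`:

```javascript
// Illustrative recovered documents, including a duplicate _id
var recovered = [
  { _id: 100, name: "Iron Man" },
  { _id: 101, name: "Iron Man" },
  { _id: 100, name: "Iron Man" }
];

// Keep the last occurrence of each _id - later oplog entries are newer
var byId = {};
recovered.forEach(function (doc) { byId[doc._id] = doc; });
var deduped = Object.keys(byId).map(function (k) { return byId[k]; });

console.log(deduped.length); // 2 unique documents
```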
5) Remember, the oplog is an internal system object, and so the format could change between releases.
Hope this helps!