Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

read operation to server failed on database local #123

Closed
dblado opened this issue Sep 12, 2013 · 41 comments
Closed

read operation to server failed on database local #123

dblado opened this issue Sep 12, 2013 · 41 comments
Milestone

Comments

@dblado
Copy link

dblado commented Sep 12, 2013

I've been seeing the following exception in elasticsearch today:

[2013-09-12 22:36:49,823][ERROR][river.mongodb ] [M] [mongodb][vendors_river] Mongo gave an exception
com.mongodb.MongoException$Network: Read operation to server ip-10-181-140-155/10.181.140.155:27017 failed on database local
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:253)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:216)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:288)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:273)
at com.mongodb.DBCursor._check(DBCursor.java:368)
at com.mongodb.DBCursor._hasNext(DBCursor.java:459)
at com.mongodb.DBCursor.hasNext(DBCursor.java:484)
at org.elasticsearch.river.mongodb.MongoDBRiver$Slurper.run(MongoDBRiver.java:1211)
at java.lang.Thread.run(Thread.java:679)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at org.bson.io.Bits.readFully(Bits.java:46)
at org.bson.io.Bits.readFully(Bits.java:33)
at org.bson.io.Bits.readFully(Bits.java:28)
at com.mongodb.Response.(Response.java:40)
at com.mongodb.DBPort.go(DBPort.java:142)
at com.mongodb.DBPort.call(DBPort.java:92)
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:244)
... 8 more

I did add two indexes to my mongo collection last night. In the mongo I see the following query executing (the client is the elasticsearch river):

    {
        "opid" : 3150,
        "active" : true,
        "secs_running" : 40,
        "op" : "query",
        "ns" : "local.oplog.rs",
        "query" : {
            "$query" : {
                "$and" : [
                    {
                        "$or" : [
                            {
                                "ns" : "vendop.vendor"
                            },
                            {
                                "ns" : "vendop.$cmd"
                            }
                        ]
                    },
                    {
                        "ts" : {
                            "$gt" : Timestamp(1378589134, 45)
                        }
                    }
                ]
            },
            "$orderby" : {
                "$natural" : 1
            }
        },
        "client" : "10.159.61.227:38940",
        "desc" : "conn103",
        "threadId" : "0x7f31d9c25700",
        "connectionId" : 103,
        "locks" : {
            "^" : "r",
            "^local" : "R"
        },
        "waitingForLock" : false,
        "numYields" : 171,
        "lockStats" : {
            "timeLockedMicros" : {
                "r" : NumberLong(79863946),
                "w" : NumberLong(0)
            },
            "timeAcquiringMicros" : {
                "r" : NumberLong(40534881),
                "w" : NumberLong(0)
            }
        }
    }
]

}

one of these queries takes a long time to execute:

Thu Sep 12 22:37:21.277 [conn87] query local.oplog.rs query: { $query: { $and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts: { $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } } cursorid:22909236201096610 ntoreturn:0 ntoskip:0 nscanned:18878069 keyUpdates:0 numYields: 3873 locks(micros) r:1835294560 nreturned:92 reslen:15638 932067ms

Thu Sep 12 22:37:21.277 [conn90] query local.oplog.rs query: { $query: { $and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts: { $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } } cursorid:22909235441195695 ntoreturn:0 ntoskip:0 nscanned:18878069 keyUpdates:0 numYields: 3099 locks(micros) r:1480829880 nreturned:92 reslen:15638 751916ms

Thu Sep 12 22:37:21.277 [conn91] query local.oplog.rs query: { $query: { $and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts: { $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } } cursorid:22909237130526342 ntoreturn:0 ntoskip:0 nscanned:18878069 keyUpdates:0 numYields: 2894 locks(micros) r:1362870376 nreturned:92 reslen:15638 691878ms

Thu Sep 12 22:37:21.278 [conn85] query local.oplog.rs query: { $query: { $and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts: { $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } } cursorid:22909235755883542 ntoreturn:0 ntoskip:0 nscanned:18878069 keyUpdates:0 numYields: 4315 locks(micros) r:2072460163 nreturned:92 reslen:15638 1052126ms

I'm on es 0.90.3, river-mongodb 1.7.0, mongodb 2.4.5.

any ideas what is causing the queries from river-mongodb to mongo to take so long to run?

@dblado
Copy link
Author

dblado commented Sep 13, 2013

updated w/ more log info:

The org.elasticsearch.action.NoShardAvailableActionException is also new. Any help would be greatly appreciated.

[2013-09-13 04:06:41,811][INFO ][node ] [Spiral] version[0.90.3], pid[11057], build[5c38d60/2013-08-06T13:18:31Z]
[2013-09-13 04:06:41,812][INFO ][node ] [Spiral] initializing ...
[2013-09-13 04:06:41,873][INFO ][plugins ] [Spiral] loaded [mongodb-river, mapper-attachments, lang-javascript], sites [river-mongodb, head]
[2013-09-13 04:06:44,635][INFO ][node ] [Spiral] initialized
[2013-09-13 04:06:44,635][INFO ][node ] [Spiral] starting ...
[2013-09-13 04:06:44,738][INFO ][transport ] [Spiral] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.159.61.227:9300]}
[2013-09-13 04:06:47,774][INFO ][cluster.service ] [Spiral] new_master [Spiral][1-o0xVXqTw-hIS9zNXLnZA][inet[/10.159.61.227:9300]], reason: zen-disco-join (elected_as_master)
[2013-09-13 04:06:47,810][INFO ][discovery ] [Spiral] elasticsearch/1-o0xVXqTw-hIS9zNXLnZA
[2013-09-13 04:06:47,838][INFO ][http ] [Spiral] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.159.61.227:9200]}
[2013-09-13 04:06:47,839][INFO ][node ] [Spiral] started
[2013-09-13 04:06:47,861][DEBUG][action.get ] [Spiral] [_river][0]: failed to execute [[_river][vendors_river][_meta]: routing [null]]
org.elasticsearch.action.NoShardAvailableActionException: [_river][0] null
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.start(TransportShardSingleOperationAction.java:123)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:72)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:47)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:61)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:179)
at org.elasticsearch.action.get.GetRequestBuilder.doExecute(GetRequestBuilder.java:112)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:85)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:59)
at org.elasticsearch.river.routing.RiversRouter$1.execute(RiversRouter.java:109)
at org.elasticsearch.river.cluster.RiverClusterService$1.run(RiverClusterService.java:103)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)
[2013-09-13 04:06:48,773][INFO ][gateway ] [Spiral] recovered [2] indices into cluster_state
[2013-09-13 04:06:49,122][DEBUG][action.get ] [Spiral] [_river][0]: failed to execute [[_river][vendors_river][_meta]: routing [null]]
org.elasticsearch.action.NoShardAvailableActionException: [_river][0] null
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.start(TransportShardSingleOperationAction.java:123)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:72)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:47)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:61)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:179)
at org.elasticsearch.action.get.GetRequestBuilder.doExecute(GetRequestBuilder.java:112)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:85)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:59)
at org.elasticsearch.river.routing.RiversRouter$1.execute(RiversRouter.java:109)
at org.elasticsearch.river.cluster.RiverClusterService$1.run(RiverClusterService.java:103)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)
[2013-09-13 04:06:49,344][DEBUG][river.mongodb ] [Spiral] [mongodb][vendors_river] Prefix: [[Spiral] [mongodb][vendors_river] ] - name: [river.mongodb]
[2013-09-13 04:06:49,355][INFO ][river.mongodb ] mongoServersSettings: [{port=27017, host=db.prod.vendop.com}]
[2013-09-13 04:06:49,356][INFO ][river.mongodb ] Server: db.prod.vendop.com - 27017
[2013-09-13 04:06:49,359][TRACE][river.mongodb ] mongoOptionsSettings: {initial_timestamp={script_type=js, script=var date = new Date(); date.setSeconds(date.getSeconds() + 5); new java.lang.Long(date.getTime());}, secondary_read_preference=true}
[2013-09-13 04:06:49,730][TRACE][river.mongodb ] initialTimestamp script returned: 1379045214687
[2013-09-13 04:06:49,738][INFO ][river.mongodb ] [Spiral] [mongodb][vendors_river] Using mongodb server(s): host [db.prod.vendop.com], port [27017]
[2013-09-13 04:06:49,739][INFO ][river.mongodb ] [Spiral] [mongodb][vendors_river] MongoDB River Plugin version: [1.7.0]
[2013-09-13 04:06:49,739][INFO ][river.mongodb ] [Spiral] [mongodb][vendors_river] starting mongodb stream. options: secondaryreadpreference [true], drop_collection [false], include_collection [], throttlesize [500], gridfs [false], filter [], db [vendop], collection [vendor], script [null], indexing to [vendors]/[vendop]
[2013-09-13 04:06:49,964][TRACE][river.mongodb ] [Spiral] [mongodb][vendors_river] serverStatus: { "serverUsed" : "ip-10-181-140-155/10.181.140.155:27017" , "host" : "ip-10-181-140-155" , "version" : "2.4.5" , "process" : "mongod" , "pid" : 1719 , "uptime" : 25144.0 , "uptimeMillis" : 25144689 , "uptimeEstimate" : 24948.0 , "localTime" : { "$date" : "2013-09-13T04:06:49.934Z"} , "asserts" : { "regular" : 0 , "warning" : 0 , "msg" : 0 , "user" : 0 , "rollovers" : 0} , "backgroundFlushing" : { "flushes" : 419 , "total_ms" : 366 , "average_ms" : 0.8735083532219571 , "last_ms" : 0 , "last_finished" : { "$date" : "2013-09-13T04:06:45.362Z"}} , "connections" : { "current" : 32 , "available" : 19968 , "totalCreated" : 476} , "cursors" : { "totalOpen" : 34 , "clientCursors_size" : 34 , "timedOut" : 388 , "totalNoTimeout" : 23} , "dur" : { "commits" : 88 , "journaledMB" : 0.0 , "writeToDataFilesMB" : 0.0 , "compression" : 0.0 , "commitsInWriteLock" : 0 , "earlyCommits" : 0 , "timeMs" : { "dt" : 3026 , "prepLogBuffer" : 0 , "writeToJournal" : 0 , "writeToDataFiles" : 0 , "remapPrivateView" : 0}} , "extra_info" : { "note" : "fields vary by platform" , "heap_usage_bytes" : 64792968 , "page_faults" : 258863} , "globalLock" : { "totalTime" : 25144690000 , "lockTime" : 22065038 , "currentQueue" : { "total" : 0 , "readers" : 0 , "writers" : 0} , "activeClients" : { "total" : 23 , "readers" : 23 , "writers" : 0}} , "indexCounters" : { "accesses" : 1050 , "hits" : 1050 , "misses" : 0 , "resets" : 0 , "missRatio" : 0.0} , "locks" : { "." : { "timeLockedMicros" : { "R" : 1720996 , "W" : 22065038} , "timeAcquiringMicros" : { "R" : 2022298 , "W" : 859466476}} , "admin" : { "timeLockedMicros" : { } , "timeAcquiringMicros" : { }} , "local" : { "timeLockedMicros" : { "r" : 820853527560 , "w" : 66890} , "timeAcquiringMicros" : { "r" : 417564575283 , "w" : 894218}} , "vendop" : { "timeLockedMicros" : { "r" : 3367283 , "w" : 1696108} , "timeAcquiringMicros" : { "r" : 1756863 , "w" : 140851}} , "config" : { "timeLockedMicros" : { "r" : 23137 , "w" : 0} , "timeAcquiringMicros" : { "r" : 1159 , "w" : 0}}} , "network" : { "bytesIn" : 807799 , "bytesOut" : 15995589 , "numRequests" : 12997} , "opcounters" : { "insert" : 55 , "query" : 2600 , "update" : 11 , "delete" : 1 , "getmore" : 0 , "command" : 11593} , "opcountersRepl" : { "insert" : 0 , "query" : 0 , "update" : 0 , "delete" : 0 , "getmore" : 0 , "command" : 0} , "recordStats" : { "accessesNotInMemory" : 337 , "pageFaultExceptionsThrown" : 324 , "config" : { "accessesNotInMemory" : 0 , "pageFaultExceptionsThrown" : 0} , "local" : { "accessesNotInMemory" : 1 , "pageFaultExceptionsThrown" : 1} , "vendop" : { "accessesNotInMemory" : 336 , "pageFaultExceptionsThrown" : 323}} , "repl" : { "setName" : "set1" , "ismaster" : true , "secondary" : false , "hosts" : [ "ip-10-181-140-155:27017"] , "primary" : "ip-10-181-140-155:27017" , "me" : "ip-10-181-140-155:27017"} , "writeBacksQueued" : false , "mem" : { "bits" : 64 , "resident" : 7313 , "virtual" : 33443 , "supported" : true , "mapped" : 16409 , "mappedWithJournal" : 32818} , "metrics" : { "document" : { "deleted" : 1 , "inserted" : 55 , "returned" : 42419 , "updated" : 11} , "getLastError" : { "wtime" : { "num" : 0 , "totalMillis" : 0} , "wtimeouts" : 0} , "operation" : { "fastmod" : 3 , "idhack" : 247 , "scanAndOrder" : 17} , "queryExecutor" : { "scanned" : 7532378605} , "record" : { "moves" : 3} , "repl" : { "apply" : { "batches" : { "num" : 0 , "totalMillis" : 0} , "ops" : 0} , "buffer" : { "count" : 0 , "maxSizeBytes" : 268435456 , "sizeBytes" : 0} , "network" : { "bytes" : 0 , "getmores" : { "num" : 0 , "totalMillis" : 0} , "ops" : 0 , "readersCreated" : 3} , "oplog" : { "insert" : { "num" : 68 , "totalMillis" : 11} , "insertBytes" : 13725} , "preload" : { "docs" : { "num" : 0 , "totalMillis" : 0} , "indexes" : { "num" : 0 , "totalMillis" : 0}}} , "ttl" : { "deletedDocuments" : 0 , "passes" : 419}} , "ok" : 1.0}
[2013-09-13 04:06:49,964][TRACE][river.mongodb ] [Spiral] [mongodb][vendors_river] process: mongod
[2013-09-13 04:06:50,005][DEBUG][river.mongodb ] [Spiral] [mongodb][vendors_river] vendop.vendor last timestamp: { "$ts" : 1378589134 , "$inc" : 45}
[2013-09-13 04:06:50,009][DEBUG][river.mongodb ] [Spiral] [mongodb][vendors_river] Using filter: { "$and" : [ { "$or" : [ { "ns" : "vendop.vendor"} , { "ns" : "vendop.$cmd"}]} , { "ts" : { "$gt" : { "$ts" : 1378589134 , "$inc" : 45}}}]}

[2013-09-13 04:07:50,074][ERROR][river.mongodb ] [Spiral] [mongodb][vendors_river] Mongo gave an exception
com.mongodb.MongoException$Network: Read operation to server ip-10-181-140-155/10.181.140.155:27017 failed on database local
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:253)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:216)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:288)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:273)
at com.mongodb.DBCursor._check(DBCursor.java:368)
at com.mongodb.DBCursor._hasNext(DBCursor.java:459)
at com.mongodb.DBCursor.hasNext(DBCursor.java:484)
at org.elasticsearch.river.mongodb.MongoDBRiver$Slurper.run(MongoDBRiver.java:1211)
at java.lang.Thread.run(Thread.java:679)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at org.bson.io.Bits.readFully(Bits.java:46)
at org.bson.io.Bits.readFully(Bits.java:33)
at org.bson.io.Bits.readFully(Bits.java:28)
at com.mongodb.Response.(Response.java:40)
at com.mongodb.DBPort.go(DBPort.java:142)
at com.mongodb.DBPort.call(DBPort.java:92)
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:244)
... 8 more
[2013-09-13 04:07:50,077][DEBUG][river.mongodb ] [Spiral] [mongodb][vendors_river] vendop.vendor last timestamp: { "$ts" : 1378589134 , "$inc" : 45}
[2013-09-13 04:07:50,078][DEBUG][river.mongodb ] [Spiral] [mongodb][vendors_river] Using filter: { "$and" : [ { "$or" : [ { "ns" : "vendop.vendor"} , { "ns" : "vendop.$cmd"}]} , { "ts" : { "$gt" : { "$ts" : 1378589134 , "$inc" : 45}}}]}
[2013-09-13 04:08:50,138][ERROR][river.mongodb ] [Spiral] [mongodb][vendors_river] Mongo gave an exception
com.mongodb.MongoException$Network: Read operation to server ip-10-181-140-155/10.181.140.155:27017 failed on database local
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:253)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:216)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:288)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:273)
at com.mongodb.DBCursor._check(DBCursor.java:368)
at com.mongodb.DBCursor._hasNext(DBCursor.java:459)
at com.mongodb.DBCursor.hasNext(DBCursor.java:484)
at org.elasticsearch.river.mongodb.MongoDBRiver$Slurper.run(MongoDBRiver.java:1211)
at java.lang.Thread.run(Thread.java:679)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at org.bson.io.Bits.readFully(Bits.java:46)
at org.bson.io.Bits.readFully(Bits.java:33)
at org.bson.io.Bits.readFully(Bits.java:28)
at com.mongodb.Response.(Response.java:40)
at com.mongodb.DBPort.go(DBPort.java:142)
at com.mongodb.DBPort.call(DBPort.java:92)
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:244)
... 8 more
[2013-09-13 04:08:50,141][DEBUG][river.mongodb ] [Spiral] [mongodb][vendors_river] vendop.vendor last timestamp: { "$ts" : 1378589134 , "$inc" : 45}
[2013-09-13 04:08:50,142][DEBUG][river.mongodb ] [Spiral] [mongodb][vendors_river] Using filter: { "$and" : [ { "$or" : [ { "ns" : "vendop.vendor"} , { "ns" : "vendop.$cmd"}]} , { "ts" : { "$gt" : { "$ts" : 1378589134 , "$inc" : 45}}}]}

@richardwilly98
Copy link
Owner

How big is "oplog.rs" collection?
The river is using a tailable cursor to query this capped collection so that should be fast.
Regarding the second exception "org.elasticsearch.action.NoShardAvailableActionException" it is a new known issue with ES 0.90.3. It does not seem to affect the river but I am currently looking at it.

@dblado
Copy link
Author

dblado commented Sep 13, 2013

the oplog.rs collection is 9553 (I believe):

set1:PRIMARY> db.printReplicationInfo()
configured oplog size: 9553.9705078125MB
log length start to end: 2219103secs (616.42hrs)
oplog first event time: Sun Aug 18 2013 21:05:59 GMT+0000 (UTC)
oplog last event time: Fri Sep 13 2013 13:31:02 GMT+0000 (UTC)
now: Fri Sep 13 2013 15:58:46 GMT+0000 (UTC)
set1:PRIMARY> db.getReplicationInfo()
{
"logSizeMB" : 9553.9705078125,
"usedMB" : 8992.25,
"timeDiff" : 2219103,
"timeDiffHours" : 616.42,
"tFirst" : "Sun Aug 18 2013 21:05:59 GMT+0000 (UTC)",
"tLast" : "Fri Sep 13 2013 13:31:02 GMT+0000 (UTC)",
"now" : "Fri Sep 13 2013 15:58:58 GMT+0000 (UTC)"
}

@richardwilly98
Copy link
Owner

@dblado I was actually asking the number of rows in "oplog.rs" (sorry for the confusion).

The way the river monitors this collection is the recommended way by Mongo [1] to access mapped collection using a tailable cursor.

I have the feeling something is not correctly setup in your MongoDB instance.

[1] - http://docs.mongodb.org/manual/tutorial/create-tailable-cursor/

@dblado
Copy link
Author

dblado commented Sep 13, 2013

no worries -- db.oplog.rs.count() shows 18878152 records

On Fri, Sep 13, 2013 at 9:16 AM, Richard Louapre
[email protected]:

@dblado https://github.com/dblado I was actually asking the number of
rows in "oplog.rs" (sorry for the confusion).

The way the river monitors this collection is the recommended way by Mongo
[1] to access mapped collection using a tailable cursor.

I have the feeling something is not correctly setup in your MongoDB
instance.

[1] - http://docs.mongodb.org/manual/tutorial/create-tailable-cursor/


Reply to this email directly or view it on GitHubhttps://github.com//issues/123#issuecomment-24405936
.

@richardwilly98
Copy link
Owner

Hi,

There are actually hardcoded values for "connect timeout" and "socket timeout" in the current implementation... That seems to be the issue has it takes more than 15 seconds in your scenario.

    // TODO: MongoClientOptions should be configurable
    private Mongo getMongoClient() {
        if (mongo == null) {
            Builder builder = MongoClientOptions.builder()
                    .autoConnectRetry(true).connectTimeout(15000)
                    .socketKeepAlive(true).socketTimeout(60000);
            if (definition.isMongoUseSSL()) {
                builder.socketFactory(getSSLSocketFactory());
            }

            MongoClientOptions mco = builder.build();
            mongo = new MongoClient(definition.getMongoServers(), mco);
        }
        return mongo;
    }

I will make a change to be able to specify these parameters in the river settings.

@dblado
Copy link
Author

dblado commented Sep 13, 2013

Yea but I don't the indexing through the river to be taking so darn long --
I want as near real time as possible. I have the db and es on 2 m1.large
instances in EC2 -- could my problem be undersized hardware or nodes?

On Fri, Sep 13, 2013 at 9:37 AM, Richard Louapre
[email protected]:

Hi,

There are actually hardcoded values for "connect timeout" and "socket
timeout" in the current implementation... That seems to be the issue has it
takes more than 15 seconds in your scenario.

// TODO: MongoClientOptions should be configurable
private Mongo getMongoClient() {
    if (mongo == null) {
        Builder builder = MongoClientOptions.builder()
                .autoConnectRetry(true).connectTimeout(15000)
                .socketKeepAlive(true).socketTimeout(60000);
        if (definition.isMongoUseSSL()) {
            builder.socketFactory(getSSLSocketFactory());
        }

        MongoClientOptions mco = builder.build();
        mongo = new MongoClient(definition.getMongoServers(), mco);
    }
    return mongo;
}

I will make a change to be able to specify these parameters in the river
settings.


Reply to this email directly or view it on GitHubhttps://github.com//issues/123#issuecomment-24407363
.

@dblado
Copy link
Author

dblado commented Sep 13, 2013

the queries that the river sends to mongo are taking 20+ minutes to execute. the load on both m1.large instances are near 0 since i'm the only one using it at the moment. there might be a few other hits to the db and es from the app but not much

@richardwilly98
Copy link
Owner

Can you do an .explain() on the query to see if it can be optimized or indexes can be created?
I will have a new SNAPSHOT ready for testing soon (if you are interested).

@dblado
Copy link
Author

dblado commented Sep 13, 2013

not sure how to do the explain...I thought I could do:

set1:PRIMARY> db.oplog.rs.find({ $query: { $and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts: { $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } }).explain()
Fri Sep 13 17:24:15.723 JavaScript execution failed: SyntaxError: Unexpected number

I just copied one of the offending queries from the mongo logfile

@dblado
Copy link
Author

dblado commented Sep 13, 2013

ya, send me a jar file w/ the timeout hardcode increased (or customizable) and I'll give it a try -- maybe once the river receives the response to this specific query my problem will go away. Although I did set an initial timestamp on the river to force to ignore old oplog entries

@richardwilly98
Copy link
Owner

Here is the snapshot [1]. Please give it a try.

No need to change the river settings.

Just stop ES replace the jar files and restart.

[1] - https://dl.dropboxusercontent.com/u/64847502/elasticsearch-river-mongodb-1.7.1-SNAPSHOT.zip

@dblado
Copy link
Author

dblado commented Sep 13, 2013

still seeing the read timed out exception:

[2013-09-13 17:31:45,243][INFO ][node ] [Jimmy Woo] version[0.90.3], pid[13379], build[5c38d60/2013-08-06T13:18:31Z]
[2013-09-13 17:31:45,244][INFO ][node ] [Jimmy Woo] initializing ...
[2013-09-13 17:31:45,306][INFO ][plugins ] [Jimmy Woo] loaded [mongodb-river, mapper-attachments, lang-javascript], sites [river-mongodb, head]
[2013-09-13 17:31:48,056][INFO ][node ] [Jimmy Woo] initialized
[2013-09-13 17:31:48,056][INFO ][node ] [Jimmy Woo] starting ...
[2013-09-13 17:31:48,157][INFO ][transport ] [Jimmy Woo] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.159.61.227:9300]}
[2013-09-13 17:31:51,193][INFO ][cluster.service ] [Jimmy Woo] new_master [Jimmy Woo][rLYcLpO8QtyDJ1JaGrRvIg][inet[/10.159.61.227:9300]], reason: zen-disco-join (elected_as_master)
[2013-09-13 17:31:51,229][INFO ][discovery ] [Jimmy Woo] elasticsearch/rLYcLpO8QtyDJ1JaGrRvIg
[2013-09-13 17:31:51,246][INFO ][http ] [Jimmy Woo] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.159.61.227:9200]}
[2013-09-13 17:31:51,246][INFO ][node ] [Jimmy Woo] started
[2013-09-13 17:31:51,281][DEBUG][action.get ] [Jimmy Woo] [_river][0]: failed to execute [[_river][vendors_river][_meta]: routing [null]]
org.elasticsearch.action.NoShardAvailableActionException: [_river][0] null
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.start(TransportShardSingleOperationAction.java:123)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:72)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:47)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:61)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:179)
at org.elasticsearch.action.get.GetRequestBuilder.doExecute(GetRequestBuilder.java:112)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:85)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:59)
at org.elasticsearch.river.routing.RiversRouter$1.execute(RiversRouter.java:109)
at org.elasticsearch.river.cluster.RiverClusterService$1.run(RiverClusterService.java:103)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)
[2013-09-13 17:31:52,232][INFO ][gateway ] [Jimmy Woo] recovered [2] indices into cluster_state
[2013-09-13 17:31:52,396][DEBUG][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] Prefix: [[Jimmy Woo] [mongodb][vendors_river] ] - name: [river.mongodb]
[2013-09-13 17:31:52,406][INFO ][river.mongodb ] mongoServersSettings: [{port=27017, host=db.prod.vendop.com}]
[2013-09-13 17:31:52,406][INFO ][river.mongodb ] Server: db.prod.vendop.com - 27017
[2013-09-13 17:31:52,568][TRACE][river.mongodb ] mongoOptionsSettings: {initial_timestamp={script_type=js, script=var date = new Date(); date.setSeconds(date.getSeconds() + 5); new java.lang.Long(date.getTime());}, secondary_read_preference=true}
[2013-09-13 17:31:52,827][TRACE][river.mongodb ] initialTimestamp script returned: 1379093517797
[2013-09-13 17:31:52,835][INFO ][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] Using mongodb server(s): host [db.prod.vendop.com], port [27017]
[2013-09-13 17:31:52,836][INFO ][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] MongoDB River Plugin version: [1.7.0]
[2013-09-13 17:31:52,836][INFO ][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] starting mongodb stream. options: secondaryreadpreference [true], drop_collection [false], include_collection [], throttlesize [500], gridfs [false], filter [], db [vendop], collection [vendor], script [null], indexing to [vendors]/[vendop]
[2013-09-13 17:31:53,286][TRACE][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] serverStatus: { "serverUsed" : "ip-10-181-140-155/10.181.140.155:27017" , "host" : "ip-10-181-140-155" , "version" : "2.4.5" , "process" : "mongod" , "pid" : 4808 , "uptime" : 41183.0 , "uptimeMillis" : 41182854 , "uptimeEstimate" : 40862.0 , "localTime" : { "$date" : "2013-09-13T17:31:53.170Z"} , "asserts" : { "regular" : 0 , "warning" : 0 , "msg" : 0 , "user" : 0 , "rollovers" : 0} , "backgroundFlushing" : { "flushes" : 686 , "total_ms" : 136 , "average_ms" : 0.19825072886297376 , "last_ms" : 0 , "last_finished" : { "$date" : "2013-09-13T17:31:30.401Z"}} , "connections" : { "current" : 31 , "available" : 19969 , "totalCreated" : 699} , "cursors" : { "totalOpen" : 34 , "clientCursors_size" : 34 , "timedOut" : 652 , "totalNoTimeout" : 20} , "dur" : { "commits" : 88 , "journaledMB" : 0.0 , "writeToDataFilesMB" : 0.0 , "compression" : 0.0 , "commitsInWriteLock" : 0 , "earlyCommits" : 0 , "timeMs" : { "dt" : 3033 , "prepLogBuffer" : 0 , "writeToJournal" : 0 , "writeToDataFiles" : 0 , "remapPrivateView" : 0}} , "extra_info" : { "note" : "fields vary by platform" , "heap_usage_bytes" : 64683304 , "page_faults" : 372578} , "globalLock" : { "totalTime" : 41182884000 , "lockTime" : 22903163 , "currentQueue" : { "total" : 0 , "readers" : 0 , "writers" : 0} , "activeClients" : { "total" : 20 , "readers" : 20 , "writers" : 0}} , "indexCounters" : { "accesses" : 318 , "hits" : 318 , "misses" : 0 , "resets" : 0 , "missRatio" : 0.0} , "locks" : { "." : { "timeLockedMicros" : { "R" : 2835632 , "W" : 22903163} , "timeAcquiringMicros" : { "R" : 3015339 , "W" : 1469808177}} , "admin" : { "timeLockedMicros" : { } , "timeAcquiringMicros" : { }} , "local" : { "timeLockedMicros" : { "r" : 1429902579128 , "w" : 58855} , "timeAcquiringMicros" : { "r" : 727932420060 , "w" : 227207}} , "vendop" : { "timeLockedMicros" : { "r" : 2435005 , "w" : 1234970} , "timeAcquiringMicros" : { "r" : 828842 , "w" : 55977}} , "test" : { "timeLockedMicros" : { "r" : 18719 , "w" : 0} , "timeAcquiringMicros" : { "r" : 90 , "w" : 0}}} , "network" : { "bytesIn" : 1105334 , "bytesOut" : 10207728 , "numRequests" : 18142} , "opcounters" : { "insert" : 9 , "query" : 2973 , "update" : 5 , "delete" : 1 , "getmore" : 0 , "command" : 16927} , "opcountersRepl" : { "insert" : 0 , "query" : 0 , "update" : 0 , "delete" : 0 , "getmore" : 0 , "command" : 0} , "recordStats" : { "accessesNotInMemory" : 102 , "pageFaultExceptionsThrown" : 93 , "local" : { "accessesNotInMemory" : 1 , "pageFaultExceptionsThrown" : 1} , "test" : { "accessesNotInMemory" : 0 , "pageFaultExceptionsThrown" : 0} , "vendop" : { "accessesNotInMemory" : 101 , "pageFaultExceptionsThrown" : 92}} , "repl" : { "setName" : "set1" , "ismaster" : true , "secondary" : false , "hosts" : [ "ip-10-181-140-155:27017"] , "primary" : "ip-10-181-140-155:27017" , "me" : "ip-10-181-140-155:27017"} , "writeBacksQueued" : false , "mem" : { "bits" : 64 , "resident" : 7283 , "virtual" : 33602 , "supported" : true , "mapped" : 16489 , "mappedWithJournal" : 32978} , "metrics" : { "document" : { "deleted" : 1 , "inserted" : 9 , "returned" : 72010 , "updated" : 5} , "getLastError" : { "wtime" : { "num" : 0 , "totalMillis" : 0} , "wtimeouts" : 0} , "operation" : { "fastmod" : 1 , "idhack" : 82 , "scanAndOrder" : 6} , "queryExecutor" : { "scanned" : 12572869515} , "record" : { "moves" : 3} , "repl" : { "apply" : { "batches" : { "num" : 0 , "totalMillis" : 0} , "ops" : 0} , "buffer" : { "count" : 0 , "maxSizeBytes" : 268435456 , "sizeBytes" : 0} , "network" : { "bytes" : 0 , "getmores" : { "num" : 0 , "totalMillis" : 0} , "ops" : 0 , "readersCreated" : 3} , "oplog" : { "insert" : { "num" : 14 , "totalMillis" : 5} , "insertBytes" : 4388} , "preload" : { "docs" : { "num" : 0 , "totalMillis" : 0} , "indexes" : { "num" : 0 , "totalMillis" : 0}}} , "ttl" : { "deletedDocuments" : 0 , "passes" : 686}} , "ok" : 1.0}
[2013-09-13 17:31:53,286][TRACE][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] process: mongod
[2013-09-13 17:31:53,346][DEBUG][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] vendop.vendor last timestamp: { "$ts" : 1378589134 , "$inc" : 45}
[2013-09-13 17:31:53,350][DEBUG][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] Using filter: { "$and" : [ { "$or" : [ { "ns" : "vendop.vendor"} , { "ns" : "vendop.$cmd"}]} , { "ts" : { "$gt" : { "$ts" : 1378589134 , "$inc" : 45}}}]}

[2013-09-13 17:32:53,383][ERROR][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] Mongo gave an exception
com.mongodb.MongoException$Network: Read operation to server ip-10-181-140-155/10.181.140.155:27017 failed on database local
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:253)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:216)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:288)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:273)
at com.mongodb.DBCursor._check(DBCursor.java:368)
at com.mongodb.DBCursor._hasNext(DBCursor.java:459)
at com.mongodb.DBCursor.hasNext(DBCursor.java:484)
at org.elasticsearch.river.mongodb.MongoDBRiver$Slurper.run(MongoDBRiver.java:1211)
at java.lang.Thread.run(Thread.java:679)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at org.bson.io.Bits.readFully(Bits.java:46)
at org.bson.io.Bits.readFully(Bits.java:33)
at org.bson.io.Bits.readFully(Bits.java:28)
at com.mongodb.Response.(Response.java:40)
at com.mongodb.DBPort.go(DBPort.java:142)
at com.mongodb.DBPort.call(DBPort.java:92)
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:244)
... 8 more
[2013-09-13 17:32:53,386][DEBUG][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] vendop.vendor last timestamp: { "$ts" : 1378589134 , "$inc" : 45}
[2013-09-13 17:32:53,387][DEBUG][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] Using filter: { "$and" : [ { "$or" : [ { "ns" : "vendop.vendor"} , { "ns" : "vendop.$cmd"}]} , { "ts" : { "$gt" : { "$ts" : 1378589134 , "$inc" : 45}}}]}

@dblado
Copy link
Author

dblado commented Sep 13, 2013

this explain seems to be doing something:
set1:PRIMARY> db.oplog.rs.find({ "$and" : [ { "$or" : [ { "ns" : "vendop.vendor"} , { "ns" : "vendop.$cmd"}]} , { "ts" : { "$gt" : { "$ts" : 1378589134 , "$inc" : 45}}}]}).explain()

it's been running a couple minutes already but I guess that's normal because the explain() still runs the query normally just outputs additional data??

@richardwilly98
Copy link
Owner

I don't believe you are have correctly the SNAPHOT version. The log file should show:
MongoDB River Plugin version: [1.7.1-SNAPSHOT] instead of MongoDB River Plugin version: [1.7.0]

Did you replace elasticsearch-river-mongodb-1.7.0.jar by elasticsearch-river-mongodb-1.7.1-SNAPSHOT.jar in $ES_HOME/plugins/river-mongodb?

@dblado
Copy link
Author

dblado commented Sep 13, 2013

my bad -- I didn't delete 1.7.0 just copied in the SNAPSHOT -- running again

On Fri, Sep 13, 2013 at 11:09 AM, Richard Louapre
[email protected]:

I don't believe you are have correctly the SNAPHOT version. The log file
should show:
MongoDB River Plugin version: [1.7.1-SNAPSHOT] instead of MongoDB River
Plugin version: [1.7.0]

Did you replace elasticsearch-river-mongodb-1.7.0.jar by
elasticsearch-river-mongodb-1.7.1-SNAPSHOT.jar in
$ES_HOME/plugins/river-mongodb?


Reply to this email directly or view it on GitHubhttps://github.com//issues/123#issuecomment-24413525
.

@dblado
Copy link
Author

dblado commented Sep 13, 2013

in the startup of ES I'm seeing:

[2013-09-13 18:14:26,106][TRACE][river.mongodb ] mongoOptionsSettings: {initial_timestamp={script_type=js, script=var date = new Date(); date.setSeconds(date.getSeconds() + 5); new java.lang.Long(date.getTime());}, secondary_read_preference=true}
[2013-09-13 18:14:26,514][TRACE][river.mongodb ] initialTimestamp script returned: 1379096071448

but the query to mongo doesn't seem to respect the initial timestamp:
[2013-09-13 18:14:26,997][DEBUG][river.mongodb ] [Topspin] [mongodb][vendors_river] Using filter: { "$and" : [ { "$or" : [ { "ns" : "vendop.vendor"} , { "ns" : "vendop.$cmd"}]} , { "ts" : { "$gt" : { "$ts" : 1378589134 , "$inc" : 45}}}]}

@dblado
Copy link
Author

dblado commented Sep 13, 2013

here is the explain:

set1:PRIMARY> db.oplog.rs.find({ "$and" : [ { "$or" : [ { "ns" : "vendop.vendor"} , { "ns" : "vendop.$cmd"}]} , { "ts" : { "$gt" : { "$ts" : 1378589134 , "$inc" : 45}}}]}).explain()

{
"cursor" : "ForwardCappedCursor",
"isMultiKey" : false,
"n" : 0,
"nscannedObjects" : 18878164,
"nscanned" : 18878164,
"nscannedObjectsAllPlans" : 18878164,
"nscannedAllPlans" : 18878164,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 3989,
"nChunkSkips" : 0,
"millis" : 930870,
"indexBounds" : {

},
"server" : "ip-10-181-140-155:27017"

}

@dblado
Copy link
Author

dblado commented Sep 13, 2013

wow, just had LOTS of activity in the ES log:
seems like everything updated now in the index
http://pastebin.com/SyVfbrwY

@dblado
Copy link
Author

dblado commented Sep 13, 2013

things appear to be coming in live now -- so the last_ts in the river was GMT: Sat, 07 Sep 2013 21:25:34 GMT. According to my log files my issue has actually been happening for a while I just never noticed until yesterday. I'm guessing river queries were taking longer and longer each time because it was trying to play catchup??

@richardwilly98
Copy link
Owner

The initial sync can definitely take a long time but hopefully it should be fast once all data are in sync.
Is it what you are seeing?

Regarding the initial timestamp can you please provide your river settings so I can take a look?

@dblado
Copy link
Author

dblado commented Sep 13, 2013

here is my river setting:

{
"index": {
"name": "vendors"
},
"_enabled": true,
"_name": "vendors_river",
"mongodb": {
"db": "vendop",
"servers": [
{
"port": 27017,
"host": "db.prod.vendop.com"
}
],
"collection": "vendor",
"options": {
"initial_timestamp": {
"script_type": "js",
"script": "var date = new Date(); date.setSeconds(date.getSeconds() + 5); new java.lang.Long(date.getTime());"
},
"secondary_read_preference": true
}
},
"type": "mongodb"
}

@dblado
Copy link
Author

dblado commented Sep 13, 2013

as soon as it caught the river was near live but it's slowing down -- response time from mongo is 400000+ ms

@richardwilly98
Copy link
Owner

Did you try to register the river multiple times? Did you set "initial_timestamp" in one of your last version of the river settings?

"initial_timestamp" is used only if _last_ts does not exist.

So you want to use this parameter make sure the previous river settings have been deleted first:

  • curl -XDELETE localhost:9200/_river/vendors_river

@richardwilly98
Copy link
Owner

-- response time from mongo is 400000+ ms

Can you tell which query is slowing down?

@dblado
Copy link
Author

dblado commented Sep 13, 2013

ahh ok, will remember to delete the river and not use multiple versions to
modify the timestamp -- didn't see that in the docs.

seems all the queries form the river are slow still:

Fri Sep 13 18:20:51.854 [conn744] query local.oplog.rs query: { $query: {
$and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts:
{ $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } }
cursorid:188027466574947012 ntoreturn:0 ntoskip:0 nscanned:18878150
keyUpdates:0 numYields: 2320 locks(micros) r:1053631876 nreturned:101
reslen:17333 536445ms
Fri Sep 13 18:20:51.854 [conn744] end connection 10.159.61.227:39294 (17
connections now open)
Fri Sep 13 18:20:51.854 [conn741] query local.oplog.rs query: { $query: {
$and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts:
{ $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } }
cursorid:188027467292149267 ntoreturn:0 ntoskip:0 nscanned:18878150
keyUpdates:0 numYields: 2965 locks(micros) r:1408357829 nreturned:101
reslen:17333 716624ms
Fri Sep 13 18:20:51.854 [conn741] end connection 10.159.61.227:39291 (16
connections now open)
Fri Sep 13 18:20:51.858 [conn737] query local.oplog.rs query: { query: {
$and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts:
{ $gt: { $ts: 1378589134.0, $inc: 45.0 } } } ] }, $explain: true }
ntoreturn:0 ntoskip:0 nscanned:18878164 keyUpdates:0 numYields: 3989
locks(micros) r:1831009723 nreturned:1 reslen:401 930874ms
Fri Sep 13 18:20:54.785 [conn754] query local.oplog.rs query: { $query: {
$and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts:
{ $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } }
cursorid:188039966367486858 ntoreturn:0 ntoskip:0 nscanned:18878150
keyUpdates:0 numYields: 1492 locks(micros) r:762280729 nreturned:101
reslen:17333 387781ms
Fri Sep 13 18:20:54.785 [conn746] query local.oplog.rs query: { $query: {
$and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts:
{ $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } }
cursorid:188039965764332883 ntoreturn:0 ntoskip:0 nscanned:18878150
keyUpdates:0 numYields: 1576 locks(micros) r:824228690 nreturned:101
reslen:17333 419325ms

looks like those are all for the same timestamp

On Fri, Sep 13, 2013 at 12:15 PM, Richard Louapre
[email protected]:

-- response time from mongo is 400000+ ms

Can you tell which query is slowing down?


Reply to this email directly or view it on GitHubhttps://github.com//issues/123#issuecomment-24417964
.

@richardwilly98
Copy link
Owner

It looks like there might be an undocumented QUERYOPTION_OPLOGREPLAY option [1] to make query on oplog.rs more efficient.

I will test it and let you know.

Thanks,
Richard.

[1] - https://jira.mongodb.org/browse/JAVA-771

@richardwilly98
Copy link
Owner

Can you please try this version [1]?

QUERYOPTION_OPLOGREPLAY has been implemented.

[1] - https://dl.dropboxusercontent.com/u/64847502/elasticsearch-river-mongodb-1.7.1-SNAPSHOT.zip

@dblado
Copy link
Author

dblado commented Sep 14, 2013

everything seems to have stabilized w/ the most recent change to use QUERYOPTION_OPLOGREPLAY. WIll continue monitoring and let you know if I have other issues.

Thanks so much for the quick turnaround.

@richardwilly98
Copy link
Owner

@dblado could you please provide the response time of the query using QUERYOPTION_OPLOGREPLAY?

@dblado
Copy link
Author

dblado commented Sep 15, 2013

What is odd is that the oplog is now empty...even when I make changes to the db...so nothing is being indexed??

@dblado
Copy link
Author

dblado commented Sep 15, 2013

oh nevermind, I was looking at localhost :)

here are log entries:

Sun Sep 15 21:13:42.290 [conn3] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:4325292602388 ntoreturn:0 keyUpdates:0 locks(micros) r:114 nreturned:0 reslen:20 5010ms
Sun Sep 15 21:13:47.302 [conn3] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:4325292602388 ntoreturn:0 keyUpdates:0 locks(micros) r:93 nreturned:0 reslen:20 5011ms
Sun Sep 15 21:13:52.314 [conn3] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:4325292602388 ntoreturn:0 keyUpdates:0 locks(micros) r:124 nreturned:0 reslen:20 5011ms
Sun Sep 15 21:13:57.325 [conn3] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:4325292602388 ntoreturn:0 keyUpdates:0 locks(micros) r:103 nreturned:0 reslen:20 5010ms
Sun Sep 15 21:14:02.337 [conn3] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:4325292602388 ntoreturn:0 keyUpdates:0 locks(micros) r:137 nreturned:0 reslen:20 5011ms
Sun Sep 15 21:14:07.349 [conn3] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:4325292602388 ntoreturn:0 keyUpdates:0 locks(micros) r:135 nreturned:0 reslen:20 5011ms
Sun Sep 15 21:14:12.361 [conn3] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:4325292602388 ntoreturn:0 keyUpdates:0 locks(micros) r:161 nreturned:0 reslen:20 5011ms
Sun Sep 15 21:14:17.373 [conn3] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:4325292602388 ntoreturn:0 keyUpdates:0 locks(micros) r:106 nreturned:0 reslen:20 5010ms
Sun Sep 15 21:16:33.228 [conn12] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:489838434203944 ntoreturn:0 keyUpdates:0 locks(micros) r:114 nreturned:0 reslen:20 5010ms
Sun Sep 15 21:16:38.240 [conn12] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:489838434203944 ntoreturn:0 keyUpdates:0 locks(micros) r:111 nreturned:0 reslen:20 5010ms
Sun Sep 15 21:16:43.252 [conn12] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:489838434203944 ntoreturn:0 keyUpdates:0 locks(micros) r:122 nreturned:0 reslen:20 5011ms
Sun Sep 15 21:16:48.263 [conn12] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:489838434203944 ntoreturn:0 keyUpdates:0 locks(micros) r:104 nreturned:0 reslen:20 5010ms

@dblado
Copy link
Author

dblado commented Sep 16, 2013

each query is still taking about 5 seconds -- considering that there are no updated records in the db in the last few hours that seems a bit high to me

@richardwilly98
Copy link
Owner

Could you please provide the response time for these 3 queries (sorry I don't have a MongoDB instance with a huge oplog.rs):

  • db.oplog.rs.find({ $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } })
  • db.oplog.rs.find({ $query: { ts: { $gt: Timestamp 1379278843000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } })
  • db.oplog.rs.find({ $query: { ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } })

@dblado
Copy link
Author

dblado commented Sep 16, 2013

Had to modify them to:

db.oplog.rs.find({ $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp(1379278843,1) } }, $orderby: { $natural: 1 } })

db.oplog.rs.find({ $query: { ts: { $gt: Timestamp(1379278843,1) }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } })

db.oplog.rs.find({ $query: { ts: { $gt: Timestamp(1379278843,1) } }, $orderby: { $natural: 1 } })

waiting for the results

@dblado
Copy link
Author

dblado commented Sep 16, 2013

results:

Mon Sep 16 16:50:40.649 [conn15] query local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1.0 } } ntoreturn:0 ntoskip:0 nscanned:18878319 keyUpdates:0 numYields: 919 locks(micros) r:613458050 nreturned:0 reslen:20 308179ms

Mon Sep 16 16:52:50.190 [conn17] query local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1.0 } } ntoreturn:0 ntoskip:0 nscanned:18878319 keyUpdates:0 numYields: 1062 locks(micros) r:701058589 nreturned:0 reslen:20 351803ms

Mon Sep 16 16:52:50.190 [conn16] query local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379278843000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1.0 } } ntoreturn:0 ntoskip:0 nscanned:18878319 keyUpdates:0 numYields: 1164 locks(micros) r:770496332 nreturned:0 reslen:20 386413ms

@dblado
Copy link
Author

dblado commented Sep 16, 2013

the river is producing these queries right now @ 5seconds per query:

Mon Sep 16 17:03:09.425 [conn8] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:39500982229712 ntoreturn:0 keyUpdates:0 locks(micros) r:108 nreturned:0 reslen:20 5010ms
Mon Sep 16 17:03:12.546 [conn7] run command admin.$cmd { ismaster: 1 }
Mon Sep 16 17:03:12.546 [conn7] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Mon Sep 16 17:03:12.546 [conn3] run command admin.$cmd { ismaster: 1 }
Mon Sep 16 17:03:12.547 [conn3] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Mon Sep 16 17:03:14.437 [conn8] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:39500982229712 ntoreturn:0 keyUpdates:0 locks(micros) r:157 nreturned:0 reslen:20 5011ms
Mon Sep 16 17:03:17.547 [conn7] run command admin.$cmd { ismaster: 1 }
Mon Sep 16 17:03:17.547 [conn7] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Mon Sep 16 17:03:17.547 [conn3] run command admin.$cmd { ismaster: 1 }
Mon Sep 16 17:03:17.548 [conn3] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Mon Sep 16 17:03:19.448 [conn8] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:39500982229712 ntoreturn:0 keyUpdates:0 locks(micros) r:92 nreturned:0 reslen:20 5010ms

@richardwilly98
Copy link
Owner

5 seconds seems still a lot to me.

I have changed the order if the index filter in oplog.rs: ts comes first.

Can you please give a try to this version [1]?

[1] - https://dl.dropboxusercontent.com/u/64847502/elasticsearch-river-mongodb-1.7.1-SNAPSHOT.zip

@dblado
Copy link
Author

dblado commented Sep 18, 2013

yea 5 seconds is still a lot of time.

new jar didn't change it

Wed Sep 18 21:11:05.118 [conn68] getmore local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379529386000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } } cursorid:1102542858630512437 ntoreturn:0 keyUpdates:0 locks(micros) r:188 nreturned:0 reslen:20 5011ms
Wed Sep 18 21:11:09.905 [conn63] run command admin.$cmd { ismaster: 1 }
Wed Sep 18 21:11:09.905 [conn63] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Wed Sep 18 21:11:09.986 [conn67] run command admin.$cmd { ismaster: 1 }
Wed Sep 18 21:11:09.986 [conn67] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Wed Sep 18 21:11:10.130 [conn68] getmore local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379529386000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } } cursorid:1102542858630512437 ntoreturn:0 keyUpdates:0 locks(micros) r:187 nreturned:0 reslen:20 5011ms
Wed Sep 18 21:11:14.908 [conn63] run command admin.$cmd { ismaster: 1 }
Wed Sep 18 21:11:14.908 [conn63] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Wed Sep 18 21:11:14.988 [conn67] run command admin.$cmd { ismaster: 1 }
Wed Sep 18 21:11:14.988 [conn67] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Wed Sep 18 21:11:15.142 [conn68] getmore local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379529386000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } } cursorid:1102542858630512437 ntoreturn:0 keyUpdates:0 locks(micros) r:182 nreturned:0 reslen:20 5011ms
Wed Sep 18 21:11:19.910 [conn63] run command admin.$cmd { ismaster: 1 }
Wed Sep 18 21:11:19.910 [conn63] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Wed Sep 18 21:11:19.989 [conn67] run command admin.$cmd { ismaster: 1 }
Wed Sep 18 21:11:19.989 [conn67] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Wed Sep 18 21:11:20.154 [conn68] getmore local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379529386000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } } cursorid:1102542858630512437 ntoreturn:0 keyUpdates:0 locks(micros) r:187 nreturned:0 reslen:20 5011ms
Wed Sep 18 21:11:22.439 [DataFileSync] flushing mmaps took 0ms for 17 files
Wed Sep 18 21:11:24.913 [conn63] run command admin.$cmd { ismaster: 1 }
Wed Sep 18 21:11:24.913 [conn63] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Wed Sep 18 21:11:24.990 [conn67] run command admin.$cmd { ismaster: 1 }
Wed Sep 18 21:11:24.990 [conn67] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Wed Sep 18 21:11:25.166 [conn68] getmore local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379529386000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } } cursorid:1102542858630512437 ntoreturn:0 keyUpdates:0 locks(micros) r:195 nreturned:0 reslen:20 5011ms

@richardwilly98
Copy link
Owner

@dblado I have just create a new post in MongoDB user group [1].

[1] - https://groups.google.com/forum/#!topic/mongodb-user/E7BSv624nBg

@richardwilly98
Copy link
Owner

@dblado just a quick note according to Asya Kamsky the log should be interpreted as:
Wed Sep 18 21:11:25.166 [conn68] getmore local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379529386000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } } cursorid:1102542858630512437 ntoreturn:0 keyUpdates:0 locks(micros) r:195 nreturned:0 reslen:20 5011ms

r:195 -> it took 195 microseconds, the rest of the time was spent waiting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants