Is it possible to index only documents whose oplog transactions occurred after a certain datetime? #102
Comments
Hi,

In the current release there is no way to do that by configuration. The last timestamp is stored in the _river index. For example:

curl -XGET localhost:9200/_river/river76/mydb76.mycollec76?pretty=true
{
  "_index" : "_river",
  "_type" : "river76",
  "_id" : "mydb76.mycollec76",
  "_version" : 1,
  "exists" : true,
  "_source" : {"mongodb":{"_last_ts":"{ \"$ts\" : 1373913931 , \"$inc\" : 1}"}}
}

So you could probably set this value before creating the river settings. I could also probably add a parameter for the initial timestamp to the river options.

Thanks,
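As a sketch of that approach (reusing the river name and collection from the GET example above, with a _last_ts value you would pick from your own oplog), the value could be written before the river settings are created:

curl -XPUT 'localhost:9200/_river/river76/mydb76.mycollec76' -d '{
  "mongodb": {
    "_last_ts": "{ \"$ts\" : 1373913931 , \"$inc\" : 1 }"
  }
}'

Note that _last_ts is a string containing the serialized BSONTimestamp, so the inner quotes stay escaped.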
Hi,

Thank you for the quick response. To give you a quick idea of my setup: I am using Elasticsearch with the MongoDB river from a .NET application, and for all operations against Elasticsearch I am using PlainElastic.Net as the .NET client. When I perform the step you mentioned for setting the timestamp, I get a JsonParserError because of the way the .NET client executes the command, but when I use the curl command it works. The only way I could implement this in my application would be to have it available in the options. Could you please add the last_timestamp parameter to the options?

Thanks,
Aditya
Hi,

I will include this feature in the next release.

Thanks,
Hi,

It was a mistake on my end regarding setting _last_ts: I was not building the JSON in the proper format. In my oplog the number of inserted documents for a collection called "queryreadyproducts" is 811499. I set _last_ts from an operation that happened towards the end of the collection, but I am still running into problems.

Please go through the steps and let me know where I am making a mistake. Here is the log of steps I followed from the beginning:

PUT http://localhost:9200/queryreadyproducts
PUT http://localhost:9200/queryreadyproducts/queryreadyproduct/_mapping
PUT http://192.168.100.34:9200/_river/queryreadyproducts/brandviewdata.queryreadyproducts?pretty=true
PUT http://localhost:9200/_river/queryreadyproducts/brandviewdata.queryreadyproducts

Thanks,
Aditya A
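For reference, a river _meta document for this plugin typically carries the database, collection, and target index together, roughly as sketched below. The names are taken from the URLs above, but the exact body sent in these steps is not shown in the thread, so treat this as an assumption rather than a reproduction:

curl -XPUT 'http://localhost:9200/_river/queryreadyproducts/_meta' -d '{
  "type": "mongodb",
  "mongodb": {
    "db": "brandviewdata",
    "collection": "queryreadyproducts"
  },
  "index": {
    "name": "queryreadyproducts",
    "type": "queryreadyproduct"
  }
}'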
Hi,

Please try:

curl -XPUT http://192.168.100.34:9200/_river/queryreadyproducts/brandviewdata.queryreadyproducts -d '{
  "mongodb": {
    "_last_ts": "{ \"$ts\": 1373647861, \"$inc\": 1 }"
  }
}'

_last_ts is a BSONTimestamp. It seems that a BSONTimestamp object serialized to JSON should have this format.
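The $ts component is the time part of the BSONTimestamp, i.e. the Unix-epoch second of the oplog entry, so a value for a chosen datetime can be worked out with ordinary tools. For illustration (GNU date, using the timestamp from the earlier GET example):

date -u -d @1373913931                  # -> Mon Jul 15 18:45:31 UTC 2013
date -u -d '2013-07-15 18:45:31' +%s    # -> 1373913931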
Example in JavaScript (the river will start processing from a last timestamp of now + 5 seconds):

"options": {
  "initial_timestamp": {
    "script_type": "js",
    "script": "var date = new Date(); date.setSeconds(date.getSeconds() + 5); new java.lang.Long(date.getTime());"
  }
},
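Put together, a river created with this option might look roughly like the following. This is a sketch only: the river, database, collection, and index names are placeholders, and nesting the options block inside the mongodb settings is an assumption based on the plugin's usual settings layout rather than something shown in this thread.

curl -XPUT 'localhost:9200/_river/myriver/_meta' -d '{
  "type": "mongodb",
  "mongodb": {
    "db": "mydb",
    "collection": "mycollection",
    "options": {
      "initial_timestamp": {
        "script_type": "js",
        "script": "var date = new Date(); date.setSeconds(date.getSeconds() + 5); new java.lang.Long(date.getTime());"
      }
    }
  },
  "index": {
    "name": "myindex",
    "type": "mydocument"
  }
}'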
It's a very useful feature! Thank you for implementing it! Could you document it as well?
The wiki has been updated [1].
Hi,

When indexing, the river gets all the oplog entries for the MongoDB collection and indexes them. I am wondering whether there is a way to query the oplog based on its 'ts' field, so that only the operations filtered on 'ts' are indexed or used to update the index.

Thanks,
Aditya
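For picking a starting point, the 'ts' values can also be read straight from the oplog itself. A sketch that fetches the most recent oplog entry for a namespace, assuming a replica set with the default local.oplog.rs collection and the namespace used earlier in this thread:

mongo local --eval 'db.oplog.rs.find({ ns: "brandviewdata.queryreadyproducts" }).sort({ $natural: -1 }).limit(1).forEach(printjson)'

The entry's ts field carries the epoch second and the increment, which map to the $ts and $inc values used above.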