A utility that tails the oplog of a sharded, replicated MongoDB cluster as a data source and populates a given Elasticsearch cluster from it.
The application can be run as a Docker container using the following command:
docker run -v /PATH/TO/CONFIG/app_conf.json:/app_conf.json --name Mongo-ES-DataSync -itd harshitandro/mongo-es-datasync:TAGNAME
- /PATH/TO/CONFIG/app_conf.json should be the path to the application config file, written in the format described below.
- Docker image tags can be found in the project's Docker Hub repo.
The config file has the following format:
{
  "application": {
    "lastTimestampToResume": 0,
    "logLevel": "INFO"
  },
  "elasticsearch": {
    "elasticURL": "127.0.0.1:9200",
    "batchProcessingSize": 1000
  },
  "db": {
    "mongo": {
      "dbsToMonitor": [
        "test"
      ],
      "queryRouterAddr": "127.0.0.1:27007",
      "auth": {
        "username": "",
        "password": "",
        "source": ""
      }
    }
  }
}
- application.lastTimestampToResume determines the MongoDB oplog timestamp from which the application should resume sync to ES.
- elasticsearch.elasticURL is the URL of your ES cluster.
- elasticsearch.batchProcessingSize is the number of docs to process at once in ES.
- db.mongo.dbsToMonitor is a list of dbs in your MongoDB which are to be synced to ES.
- db.mongo.queryRouterAddr is the address of your MongoDB's Query Router.
- [OPTIONAL] db.mongo.auth defines the authentication credentials for the given MongoDB.
  - db.mongo.auth.username defines the authentication username.
  - db.mongo.auth.password defines the authentication password.
  - db.mongo.auth.source defines the authentication database of the given username.
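Before mounting the config file into the container, it can help to check that it parses and contains the documented keys. The following is a minimal Python sketch of such a check. It is not part of this project: the names `REQUIRED_KEYS`, `find_missing_keys`, and `load_config` are hypothetical, and treating every non-auth key as required is an assumption based on the format shown above.

```python
import json

# Assumption: every key documented above except the optional db.mongo.auth
# block is treated as required. The application itself may be more lenient.
REQUIRED_KEYS = [
    ("application", "lastTimestampToResume"),
    ("application", "logLevel"),
    ("elasticsearch", "elasticURL"),
    ("elasticsearch", "batchProcessingSize"),
    ("db", "mongo", "dbsToMonitor"),
    ("db", "mongo", "queryRouterAddr"),
]


def find_missing_keys(config):
    """Return the dotted paths of required keys that are absent from config."""
    missing = []
    for path in REQUIRED_KEYS:
        node = config
        for key in path:
            if not isinstance(node, dict) or key not in node:
                missing.append(".".join(path))
                break
            node = node[key]
    return missing


def load_config(path):
    """Parse the JSON config file and raise ValueError if keys are missing."""
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    missing = find_missing_keys(config)
    if missing:
        raise ValueError("missing config keys: " + ", ".join(missing))
    return config
```

Calling `load_config("/PATH/TO/CONFIG/app_conf.json")` returns the parsed dictionary, or raises an error that names each missing key by its dotted path.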
The following features are currently available:
- Oplog tailing from a sharded, replicated MongoDB cluster.
- Auto resume from the last operation state in case of application restart.
- Auto reconnect in case of connectivity failure from either MongoDB or Elasticsearch.
- Batch processing for Elasticsearch & MongoDB.
- Authentication for MongoDB connections.
- Authentication for Elasticsearch connections.
Compatibility:
- Sharded, replicated MongoDB cluster, version 3.6 & above
- Elasticsearch version 7.x
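To illustrate how oplog filtering, resuming, and batching fit together, here is a rough Python sketch. It is not this project's implementation. It assumes oplog entries are plain dictionaries with the standard `ts`, `op`, `ns`, and `o` fields, uses integers in place of BSON timestamps, handles only inserts (`i`) and deletes (`d`), and derives the index name from the namespace, which is a hypothetical convention. All function names are illustrative.

```python
import json


def is_monitored(entry, dbs_to_monitor, last_ts):
    """True if the entry is newer than last_ts and belongs to a monitored db."""
    db_name = entry["ns"].split(".", 1)[0]  # ns has the form "db.collection"
    # Assumption: entries at exactly last_ts were already synced.
    return entry["ts"] > last_ts and db_name in dbs_to_monitor


def to_bulk_lines(entry):
    """Translate one insert or delete oplog entry into bulk API lines."""
    # Hypothetical index naming: Elasticsearch index names must be lowercase.
    index = entry["ns"].replace(".", "_").lower()
    doc = dict(entry["o"])
    doc_id = str(doc.pop("_id"))
    if entry["op"] == "i":
        action = {"index": {"_index": index, "_id": doc_id}}
        return [json.dumps(action), json.dumps(doc)]
    if entry["op"] == "d":
        return [json.dumps({"delete": {"_index": index, "_id": doc_id}})]
    return []  # updates and other operations are out of scope for this sketch


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_bulk_bodies(entries, dbs_to_monitor, last_ts, batch_size):
    """Return one newline-delimited bulk request body per batch of entries."""
    kept = [e for e in entries if is_monitored(e, dbs_to_monitor, last_ts)]
    bodies = []
    for batch in batches(kept, batch_size):
        lines = [line for entry in batch for line in to_bulk_lines(entry)]
        if lines:
            bodies.append("\n".join(lines) + "\n")  # bulk bodies end with a newline
    return bodies
```

In this sketch `dbs_to_monitor`, `last_ts`, and `batch_size` correspond to `db.mongo.dbsToMonitor`, `application.lastTimestampToResume`, and `elasticsearch.batchProcessingSize`. Each returned body could be sent to the Elasticsearch `_bulk` endpoint, and the largest `ts` in a successfully sent batch could then be stored as the next resume point.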