ElasticSearch Plugin
Store full account and object data into an indexed elasticsearch database.
There are two main problems this plugin tries to solve:
- The huge amount of RAM needed to run a full node with all the account history.
- Fast search inside operation fields by querying the ES database directly.
Elasticsearch was selected for the following main reasons:
- Open source.
- Fast.
- Index oriented.
- Easy to install and start using.
- Data can be sent from C++ using curl.
- Possibility of scalable, decentralized data nodes.
The elasticsearch plugin, when active, is connected to each block the node receives. Operations are extracted with logic similar to the classic account_history_plugin, but are sent to the ES database instead of being stored internally. All fields of each operation are indexed for fast search.
The es-objects plugin, when active, is connected to the object types specified in the config (limit order objects, asset objects, etc.).
Both plugins work in a similar way: data is collected in a plugin-internal database until a configurable amount of it is available, then it is sent to ES as a _bulk operation. The _bulk batches need to be big while replaying but much smaller once the node is in sync, so real-time data can be displayed to end users.
Optimal numbers for speed/performance depend on hardware; default values are provided.
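For orientation, each operation becomes two lines of the _bulk payload the plugin sends: an action line and a document line. A minimal, abbreviated sketch (the values are illustrative and real documents carry many more fields):
{ "create": { "_index": "bitshares-2018-08", "_type": "data", "_id": "2.9.401234567" } }
{ "account_history": { "account": "1.2.282" }, "operation_history": { "trx_in_block": 0 }, "block_data": { "block_num": 19421114, "block_time": "2018-08-02T00:00:00" } }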
It is strongly recommended to use SSD disks in your node if you are trying to synchronize the BitShares mainnet; it will make the task a lot faster. Even so, synchronizing the mainnet can take a few days.
You need 1T of disk space to be safe for a while, and 32G or more of RAM is recommended.
After elasticsearch is installed, increase the heap size depending on your RAM:
$ vi config/jvm.options
..
# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space
-Xms12g
-Xmx12g
...
You need to have bitshares-core and its dependencies installed (https://github.com/bitshares/bitshares-core#getting-started).
On Ubuntu 18.04 all the dependencies for the elasticsearch database are installed by default. Just get the latest (or desired) version at:
https://www.elastic.co/downloads/elasticsearch
$ tar xvzf elasticsearch-7.4.0-linux-x86_64.tar.gz
$ cd elasticsearch-7.4.0/
$ ./bin/elasticsearch
ES will listen on 127.0.0.1:9200. Try http://127.0.0.1:9200/ in your browser and you should see some info about the database if the service started correctly.
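The same check can be done from the command line (the exact fields in the response depend on the ES version):
$ curl -s http://127.0.0.1:9200/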
You can keep the database running as a service: the program has a --daemonize option, or it can run inside screen, or any other option that suits you (see the sketch below).
Please note ES does not run as root; create a normal user account and proceed as that user afterwards:
adduser elastic
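A minimal sketch of the two options mentioned above, assuming elasticsearch was unpacked in the elastic user's home directory:
su - elastic
cd elasticsearch-7.4.0
screen -dmS elasticsearch ./bin/elasticsearch   # detached screen session
# or, alternatively, run it in the background directly:
./bin/elasticsearch --daemonize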
Clone the bitshares repo and install bitshares:
git clone https://github.com/bitshares/bitshares-core
cd bitshares-core
git checkout -t origin/develop
git submodule update --init --recursive
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo .
make
Start the node with the elasticsearch plugins enabled and default options:
./programs/witness_node/witness_node --plugins "elasticsearch es-objects"
The ES plugin has the following parameters, passed on the command line:
- elasticsearch-node-url - The URL of elasticsearch - default: http://localhost:9200/
- elasticsearch-bulk-replay - The number of lines (ops * 2) to send to the database while replaying - default: 10000
- elasticsearch-bulk-sync - The number of lines (ops * 2) to send to the database when synchronized - default: 100
- elasticsearch-visitor - Use the visitor to index additional data inside the op - default: false
- elasticsearch-basic-auth - Send auth data in the form "username:password" - default: no auth ""
- elasticsearch-index-prefix - A prefix for your indexes - default: "bitshares-"
The ES plugin is not active by default; we need to start it with the plugins parameter. An example of starting a node with the ES plugin in its simplest form, with all the default options, is:
programs/witness_node/witness_node --plugins "witness elasticsearch"
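To override the defaults, the options listed above can be passed on the command line as well. A sketch (the same keys can normally also be placed in the node's config.ini):
./programs/witness_node/witness_node --plugins "witness elasticsearch es-objects" \
  --elasticsearch-node-url "http://localhost:9200/" \
  --elasticsearch-bulk-replay 10000 \
  --elasticsearch-bulk-sync 100 \
  --elasticsearch-index-prefix "bitshares-"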
Note: the elasticsearch plugin and the account_history plugin cannot run at the same time.
A few minutes after the node starts, the first batch of 5000 ops will be inserted into the database. If you are on a desktop Linux you may want to install https://github.com/mobz/elasticsearch-head (only works with elasticsearch 5) and look at the database from the web browser to make sure it is working. This is optional.
If you only have the command line available you can query the database directly through curl as:
root@NC-PH-1346-07:~/bitshares/elastic/bitshares-core# curl -X GET 'http://localhost:9200/bitshares-*/data/_count?pretty=true' -H 'Content-Type: application/json' -d '
{
"query" : {
"bool" : { "must" : [{"match_all": {}}] }
}
}
'
{
"count" : 360000,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
}
}
root@NC-PH-1346-07:~/bitshares/elastic/bitshares-core#
360000 records have been inserted into ES at this point of the replay, which means it is working.
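To monitor replay progress you can also ask for the most recently indexed block. A sketch, assuming block_data.block_num is mapped as a numeric field:
curl -X GET 'http://localhost:9200/bitshares-*/data/_search?pretty=true' -H 'Content-Type: application/json' -d '
{
  "size": 1,
  "_source": ["block_data.block_num", "block_data.block_time"],
  "sort": [ { "block_data.block_num": { "order": "desc" } } ]
}
'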
Important: Replay with the ES plugin will always be slower than the "save to RAM" account_history_plugin, so expect to wait considerably longer than usual to get in sync. With the recommended hardware the synchronization can take around 30 hours.
A synchronized node will look like this (screen capture 02/08/2018):
root@NC-PH-1346-07:~# curl -X GET 'http://localhost:9200/bitshares-*/data/_count?pretty=true' -H 'Content-Type: application/json' -d '
{
"query" : {
"bool" : { "must" : [{"match_all": {}}] }
}
}
'
{
"count" : 391390823,
"_shards" : {
"total" : 175,
"successful" : 175,
"skipped" : 0,
"failed" : 0
}
}
root@NC-PH-1346-07:~#
Important: As of 2018-08-02 we have reports that more than 250G of disk space is needed to store all the history and its logs. Please make sure you have enough disk space before synchronizing.
The plugin creates monthly indexes in the ES database. Index names look like bitshares-2016-05 (the elasticsearch-index-prefix followed by year and month) and contain all the operations made within that month.
List your indexes as:
NC-PH-1346-07:~# curl -X GET 'http://localhost:9200/_cat/indices'
yellow open bitshares-2018-02 voS1uchzSxqxqkiaKNrEYg 5 1 18984984 0 10.8gb 10.8gb
yellow open bitshares-2018-06 D6wyX58lRyG3QOflmPwJZw 5 1 28514130 0 15.6gb 15.6gb
yellow open bitshares-2017-10 73xRTA-fSTm479H4kOENuw 5 1 9326346 0 5.2gb 5.2gb
yellow open bitshares-2016-08 -MMp3VGGRZqG2YL1LQunbg 5 1 551835 0 270.1mb 270.1mb
yellow open bitshares-2016-07 Ao56gO9LQ-asMhX50rbcCg 5 1 609087 0 303.2mb 303.2mb
yellow open bitshares-2018-05 9xuof-PiRQWburpW8ZXHVg 5 1 29665610 0 17.3gb 17.3gb
yellow open bitshares-2017-01 SpfwEzGcSoy9Hd6c6fzv2g 5 1 1197124 0 659mb 659mb
yellow open bitshares-2017-12 tF5af4OvTLqcx3IYUJSQig 5 1 13244366 0 7.5gb 7.5gb
yellow open bitshares-2016-03 yy91IvyATOCEoFHjgDbalg 5 1 597461 0 297.4mb 297.4mb
yellow open bitshares-2015-12 z-ZAZqHsQL2EDNpf3_ghGA 5 1 349985 0 151.3mb 151.3mb
yellow open bitshares-2017-07 OOr_xW4STsCm3sev1xtTRQ 5 1 17890903 0 9.6gb 9.6gb
yellow open bitshares-2016-04 jt9q50ADQuylV4l25zGAaw 5 1 413798 0 205.6mb 205.6mb
yellow open bitshares-2016-11 mWz7DpjSQyqJ_rL8gtMqWw 5 1 495556 0 260mb 260mb
yellow open bitshares-2016-12 2qht_wrXTUmNqDvczpHYzw 5 1 917034 0 506.6mb 506.6mb
yellow open bitshares-2016-10 vAMb0kW6Stqz6CNbuu7PEQ 5 1 416570 0 208.8mb 208.8mb
yellow open bitshares-2015-11 ETNFuF3sTPe-gTSzX3bdIg 5 1 301079 0 131.9mb 131.9mb
yellow open bitshares-2017-08 73Q2Asw-Rf228oQLoSCLGw 5 1 9916248 0 5.6gb 5.6gb
yellow open bitshares-2016-05 3c95AvKcQk2puBwVt_HIqQ 5 1 498493 0 246mb 246mb
yellow open bitshares-2017-02 lsiiz7PmS2q9_P2BQpNkNQ 5 1 1104282 0 586.7mb 586.7mb
yellow open bitshares-2017-11 4pqwIRdWSwSe5198YNz-Nw 5 1 14107174 0 8gb 8gb
yellow open bitshares-2018-07 fdmfXLqSTESODyLI_7cjXg 5 1 133879948 0 51.3gb 51.3gb
yellow open bitshares-2016-06 Is11IdcnT8mfBPpoLUjJyw 5 1 656358 0 330.3mb 330.3mb
yellow open bitshares-2018-04 MEA8fCsgSbOVXa0Z05cfsA 5 1 20940461 0 11.9gb 11.9gb
yellow open bitshares-2018-03 fMjxhFwHSP-6ewrl0Ns6ZQ 5 1 20335546 0 12gb 12gb
yellow open bitshares-2017-09 o-b2Bf3LR0-J1kUiv4FpHA 5 1 11075939 0 6.3gb 6.3gb
yellow open bitshares-2018-01 jw9rYlmTSvuLC1hHcYyU4Q 5 1 19396703 0 11.2gb 11.2gb
yellow open bitshares-2018-08 EDRxQvxhQJe3Vam_FxZMWg 5 1 8038498 0 3gb 3gb
yellow open bitshares-2016-09 fo2AL0y7T_q_HtXEYCv35Q 5 1 409164 0 203.4mb 203.4mb
yellow open bitshares-2016-01 3sjjs-4oQMm5HG-vUTuyoA 5 1 372772 0 168.7mb 168.7mb
yellow open bitshares-2017-03 ZxjWksRyTaGstm6T2Kxl9A 5 1 2167788 0 1.1gb 1.1gb
yellow open bitshares-2016-02 toWbFwI-RB2wEGrR8873rQ 5 1 468174 0 222.7mb 222.7mb
yellow open bitshares-2017-05 IEZQ-rtmQU2kKNcRb58Egg 5 1 10278394 0 5.6gb 5.6gb
yellow open bitshares-2017-04 S1h2eBGiS3quNJU7CqPR7Q 5 1 3316120 0 1.8gb 1.8gb
yellow open bitshares-2017-06 0HYkECRbSwGDrmDFof8nqA 5 1 10795239 0 6gb 6gb
yellow open bitshares-2015-10 XyKOlrTWSK6vQgdXm8SAtQ 5 1 161004 0 84.5mb 84.5mb
root@NC-PH-1346-07:~#
If you don't see any indexes here then something is wrong with the bitshares-core node setup with the elasticsearch plugin.
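A quick way to verify that ES itself is healthy, independently of the plugin, is the cluster health endpoint:
curl -X GET 'http://localhost:9200/_cluster/health?pretty=true'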
By default, data indexes will be created with the default elasticsearch settings. The node owner can tweak the settings for all the bitshares-* indexes before any data is added.
An example of a good index configuration is as follows:
todo
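In the meantime, a minimal sketch of pre-setting index settings through a legacy index template; the values shown are illustrative assumptions, not recommendations:
curl -X PUT 'http://localhost:9200/_template/bitshares' -H 'Content-Type: application/json' -d '
{
  "index_patterns": ["bitshares-*"],
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 0,
    "refresh_interval": "30s"
  }
}
'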
After your node is in sync you are in possession of a full node without the RAM issues. A synchronized witness_node with ES will be using less than 10 gigs of RAM (pmap output truncated to its total line):
root@NC-PH-1346-07:~# pmap 2183
total 8604280K
What client-side apps can do with this new data is limited only by the client developer's imagination, but let's check some real-world examples to see the benefits of this new feature.
References:
https://github.com/bitshares/bitshares-core/issues/358
https://github.com/bitshares/bitshares-core/issues/413
https://github.com/bitshares/bitshares-core/pull/405
https://github.com/bitshares/bitshares-core/pull/379
https://github.com/bitshares/bitshares-core/pull/430
https://github.com/bitshares/bitshares-ui/issues/68
Getting the operations of an account within a time window is one of the features that has been requested constantly. It can be easily queried with the ES plugin by calling the _search endpoint:
curl -X GET 'http://localhost:9200/bitshares-*/data/_search?pretty=true' -H 'Content-Type: application/json' -d '
{
"query" : {
"bool" : { "must" : [{"term": { "account_history.account.keyword": "1.2.282"}}, {"range": {"block_data.block_time": {"gte": "2015-10-26T00:00:00", "lte": "2015-10-29T23:59:59"}}}] }
}
}
'
Note: Responses are removed from the samples to save space in this document. If you are following along, you can check the responses on your own node.
Operations of an account filtered by block number (https://github.com/bitshares/bitshares-core/issues/61):
curl -X GET 'http://localhost:9200/bitshares-*/data/_search?pretty=true' -H 'Content-Type: application/json' -d '
{
"query" : {
"bool" : { "must" : [{"term": { "account_history.account.keyword": "1.2.356589"}}, {"range": {"block_data.block_num": {"gte": "17824289", "lte": "17824290"}
}}] }
}
}
'
Refs: https://github.com/bitshares/bitshares-core/pull/373
The get_transaction_id call can be done as:
curl -X GET 'http://localhost:9200/bitshares-*/data/_search?pretty=true' -H 'Content-Type: application/json' -d '
{
"query" : {
"bool" : { "must" : [{"term": { "block_data.block_num": 19421114}},{"term": { "operation_history.trx_in_block": 0}}] }
}
}
'
The above will return all the ops inside the trx; if you only need the trx_id field you can add _source and return just the fields you need:
curl -X GET 'http://localhost:9200/bitshares-*/data/_search?pretty=true' -H 'Content-Type: application/json' -d '
{
"_source": ["block_data.trx_id"],
"query" : {
"bool" : { "must" : [{"term": { "block_data.block_num": 19421114}},{"term": { "operation_history.trx_in_block": 0}}] }
}
}
'
The get_transaction_from_id call is very easy:
curl -X GET 'http://localhost:9200/bitshares-*/data/_search?pretty=true' -H 'Content-Type: application/json' -d '
{
"query" : {
"bool" : { "must" : [{"term": { "block_data.trx_id": "6f2d5064637391089127aa9feb36e2092347466c"}}] }
}
}
'
The reader will need to learn more about elasticsearch and the Lucene query language in order to make more complex queries.
Everything needed can be found at https://www.elastic.co/guide/en/elasticsearch/reference/6.2/index.html
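As one example of a more complex query, a terms aggregation can count operations per type. A sketch, assuming the numeric operation_type field the plugin stores on each document:
curl -X GET 'http://localhost:9200/bitshares-*/data/_search?pretty=true' -H 'Content-Type: application/json' -d '
{
  "size": 0,
  "aggs": {
    "ops_by_type": { "terms": { "field": "operation_type" } }
  }
}
'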
From the same team as elasticsearch there is a front end named kibana (https://www.elastic.co/products/kibana). It is very easy to install and can do pretty good stuff, like producing very detailed stats of the blockchain network.
Just as an example, it would be easy to index the asset of trading operations by extending their visitor code. Point 3 of https://github.com/bitshares/bitshares-core/issues/358 requests the trading pair; this can be solved by indexing the asset of the trading ops as mentioned.
Remember that ES already has all the needed info in the op text field of the operation_history object. A client can get all the ops of an account, loop through them and convert the op string into JSON, filtering by the asset or any other field needed (see the sketch below). There is no need to index everything, but it is possible.
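A minimal client-side sketch of that idea, assuming jq is installed; the account and asset ids are illustrative and only the first page of hits is returned by default:
curl -s -X GET 'http://localhost:9200/bitshares-*/data/_search' -H 'Content-Type: application/json' -d '
{
  "_source": ["operation_history.op"],
  "query": { "term": { "account_history.account.keyword": "1.2.282" } }
}
' | jq '.hits.hits[]._source.operation_history.op | fromjson | select(.[1].amount?.asset_id? == "1.3.0")'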
Because we use op_type = create on each bulk line we send to the database, and because we use a unique ID (the ath id, 2.9.X), the plugin will never index an operation twice. If the node is in a replay, the plugin will only start adding to the database when it finds a new record, never before.
It is not recommended to expose the elasticsearch API fully to the internet. Instead, applications should connect to a wrapper for the data:
https://github.com/oxarbitrage/bitshares-es-wrapper
The elasticsearch database will listen on localhost and the wrapper, on the same machine, will expose a reduced set of API calls to the internet.