Update doc about listener's protocols

manticoresoftware · Aug 18, 2020 · b6beb3b
1 parent f08dc78
commit b6beb3b

Showing 5 changed files with 69 additions and 17 deletions.
diff --git a/manual/Connecting_to_the_server/HTTP.md b/manual/Connecting_to_the_server/HTTP.md
@@ -3,19 +3,21 @@
 You can connect to Manticore Search using the HTTP protocol.
 
 ## Configuration
-**The HTTP protocol is by default available on port 9308**. 
+**By default, the HTTP protocol is available on ports 9308 and 9312**. It shares these ports with the binary API by means of the multiprotocol feature.
 
 In the searchd section of the configuration file the HTTP port can be defined with directive `listen` like this:
 
 ```ini
 searchd {
 ...
-   listen = 127.0.0.1:9308:http
+   listen = 127.0.0.1:9308
+   listen = 127.0.0.1:9312:http
 ...
 }
 ```
 
-There are no special requirements and any HTTP client can be used to connect to Manticore. 
+Both lines are valid and mean the same thing (apart from the port number). The `http` suffix is kept only for backward compatibility; both lines in the example above
+ define listeners that serve all of the API, HTTP and HTTPS protocols. There are no special requirements and any HTTP client can be used to connect to Manticore.
 
 All endpoints respond with `application/json` content type. Most endpoints use JSON payload for requests, however there are some exceptions that use NDJSON or simple URL encoded payload.
 
@@ -29,22 +31,32 @@ A separate HTTP interface can be used to perform 'VIP' connections. A connection
 ```ini
 searchd {
 ...
-   listen = 127.0.0.1:9308:http
-   listen = 127.0.0.1:9318:http_vip
+   listen = 127.0.0.1:9308
+   listen = 127.0.0.1:9318:_vip
 ...
 }
 ``` 
-The HTTP protocol also supports [SSL encryption](Security/SSL.md). In this case the connection type used will be `https`:
+The HTTP protocol also supports [SSL encryption](Security/SSL.md). HTTPS may be used on the same port as HTTP: the daemon determines the protocol from the first
+few bytes received from the client and behaves accordingly. However, since HTTPS is about security, you can harden a listener by using the dedicated connection type `https`.
 
 ```ini
 searchd {
 ...
-   listen = 127.0.0.1:9308:http
+   listen = 127.0.0.1:9308
    listen = 127.0.0.1:9443:https
 ...
 }
 ``` 
 
+Here you can connect to both ports using HTTPS. Port 9308 also accepts plain HTTP, and can be specified as a remote agent endpoint in distributed indexes.
+Connecting to 9308 over HTTP just works. Connecting over HTTPS also just works, provided a valid key/certificate is set in the config.
+If no valid key/certificate is provided but the client tries to connect via HTTPS, the connection silently falls back to unsecured HTTP.
+Port 9443, by contrast, is strictly bound to HTTPS. If a client connects to it using HTTP, the daemon answers with code 403.
+If a client connects with HTTPS but the daemon can't serve it for any reason (most probably because it has no valid key/certificate), it answers
+with error code 526. In no case does this port silently fall back to unencrypted HTTP.
+
+Apart from SSL encryption, there is no difference between HTTP and HTTPS.
+
 ## Connecting with cURL
 Performing a quick search is as easy as:
 

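Note on the HTTP.md change above: the statement that the daemon determines the protocol from the first few bytes can be illustrated with a small sketch. This is not the daemon's actual code; the function name `sniff_protocol` is hypothetical, and the byte patterns are assumptions (a TLS handshake record prefix, an ASCII HTTP method, and a 4-byte big-endian API handshake with value 1).

```python
def sniff_protocol(first_bytes: bytes) -> str:
    """Guess the protocol from the first bytes a client sends.

    Hypothetical illustration only; the daemon's real detection logic
    is not shown in this commit.
    """
    # Assumption: a TLS session opens with a handshake record,
    # content type 0x16 followed by major version byte 0x03.
    if first_bytes[:2] == b"\x16\x03":
        return "https"
    # Assumption: plain HTTP opens with an ASCII method name and a space.
    methods = (b"GET ", b"POST ", b"PUT ", b"DELETE ", b"HEAD ", b"OPTIONS ")
    if first_bytes.startswith(methods):
        return "http"
    # Assumption: a binary API client opens with a 4-byte big-endian
    # handshake carrying the value 1.
    if first_bytes[:4] == (1).to_bytes(4, "big"):
        return "api"
    return "unknown"
```

Such detection is only possible when the client speaks first, which is presumably why the MySQL protocol keeps its own port.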
diff --git a/manual/Connecting_to_the_server/MySQL_protocol.md b/manual/Connecting_to_the_server/MySQL_protocol.md
@@ -45,7 +45,7 @@ mysql -P9306 -h0
 
 ## Secured MySQL connection
 
-The MySQL protocol supports [SSL encryption](Security/SSL.md). The secured connections  can be made on the same `mysql` listening port.
+The MySQL protocol supports [SSL encryption](Security/SSL.md). The secured connections can be made on the same `mysql` listening port.
 
 ## Compressed MySQL connection
 

diff --git a/manual/References.md b/manual/References.md
@@ -386,7 +386,7 @@ To be put to section `searchd {}` in configuration file:
   * [listen_tfo](Creating_an_index/Creating_a_distributed_index/Remote_indexes.md#listen_tfo) - Allows TCP_FASTOPEN flag for all listeners
   * [log](Server_settings/Searchd.md#log) - Path to Manticore server log file
   * [max_batch_queries](Server_settings/Searchd.md#max_batch_queries) - Limits the amount of queries per batch
-  * [max_connections](Server_settings/Searchd.md#max_connections) - Maximum amount of worker threads
+  * [max_connections](Server_settings/Searchd.md#max_connections) - Maximum number of active connections
   * [max_filters](Server_settings/Searchd.md#max_filters) - Maximum allowed per-query filter count
   * [max_filter_values](Server_settings/Searchd.md#max_filter_values) - Maximum allowed per-filter values count
   * [max_open_files](Server_settings/Searchd.md#max_open_files) - Maximum num of files which allowed to be opened by server

diff --git a/manual/Security/SSL.md b/manual/Security/SSL.md
@@ -69,4 +69,26 @@ When done you can verify the key and certificate files were generated correctly:
 
 ```bash
 openssl verify -CAfile ca-cert.pem server-cert.pem
-```
+```
+
+## Secured behavior
+
+When the SSL config is present and valid, so that the SSL library in use can recognize it and establish a secured connection layer,
+the following becomes available:
+
+ - you can connect to the multiprotocol port with HTTPS and run queries. Both the query and the answer will be sent with SSL encryption.
+ - you can connect to a dedicated `https` port with HTTPS and run queries. The connection will be secured. (An attempt to connect to this port via plain HTTP will be rejected with error code 403.)
+ - you can connect to the mysql port with a MySQL client using a secured connection. The session will be secured. Note that the `mysql` CLI program tries to use SSL by
+   default, so an ordinary connection to a daemon with a valid SSL config will most probably be secured. You can check this by running the 'status' command in the CLI.
+
+When the SSL config is NOT valid for any reason, which the daemon detects by the fact that a secured connection can't be established (besides an invalid config, there may be other causes, such as being unable to load an appropriate SSL library at all because it is incompatible, absent, etc.), the following will not work, or will work in a non-secured way:
+
+- you can connect to the multiprotocol port with HTTPS, but the connection will silently fall back to unencrypted HTTP.
+- you can't connect to a dedicated `https` port at all. The connection will be rejected with error code 526.
+- a connection to the mysql port from a MySQL client will not be offered SSL. So, if the client demands it, the connection will fail. If not, it will use a plain mysql41 or compressed connection.
+
+NOTE!
+
+- binary API connections (such as connections from old clients, or inter-daemon master-agent connections) are never secured.
+- replication is managed by a third-party provider, which has to be set up separately. However, the SST stage of replication in the current implementation is performed over a binary API connection and is therefore never secured.
+- despite this 'never secured' statement, you can still use any external proxy (for example, legacy tunneling with ssh) to provide encryption.
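Note on the SSL.md change above: the listener behavior it describes can be summarized as a small decision function. This is only a sketch of the documented outcomes, not daemon code; the name `connection_outcome` and its string arguments are hypothetical.

```python
def connection_outcome(listener: str, client_scheme: str, ssl_ok: bool) -> str:
    """Describe what happens to a connection, per the documented behavior.

    listener:      "multiprotocol" (no suffix, or `http`) or "https"
    client_scheme: "http" or "https"
    ssl_ok:        True if the daemon has a usable SSL library and a valid key/cert
    """
    if listener == "multiprotocol":
        if client_scheme == "https" and ssl_ok:
            return "encrypted"
        # Plain HTTP, or HTTPS without working SSL (silent fallback).
        return "unencrypted"
    if listener == "https":
        if client_scheme == "http":
            # The docs state 403 for plain HTTP without mentioning SSL state.
            return "rejected 403"
        return "encrypted" if ssl_ok else "rejected 526"
    raise ValueError(f"unknown listener type: {listener!r}")
```

The point to take away is that only the dedicated `https` listener guarantees that traffic is never sent unencrypted.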
diff --git a/manual/Server_settings/Searchd.md b/manual/Server_settings/Searchd.md
@@ -355,11 +355,11 @@ You can also specify a protocol handler (listener) to be used for connections on
   - other Manticore agents (i.e. a remote distributed index)
   - clients via HTTP and HTTPS
   This is a default setting and mostly you need to specify another `listen` only for connecting via MySQL protocol and for replication.
-* `mysql` - MySQL protocol for connections from MySQL clients. More details on MySQL protocol support can be found in [mysql_protocol_support_and_sphinxql](Connecting_to_the_server/HTTP.md#SQL-over-HTTP) section.
+* `mysql` - MySQL protocol for connections from MySQL clients. More details on MySQL protocol support can be found in [mysql_protocol_support_and_sphinxql](Connecting_to_the_server/HTTP.md#SQL-over-HTTP) section. In short: you can connect using the mysql41 and compressed protocols. If SSL is available (i.e. the library is present and the config is valid), you can also use mysql41 and compressed connections over SSL.
 * `replication` - replication protocol, used for nodes communication. More details can be found in [replication](Creating_a_cluster/Setting_up_replication/Setting_up_replication.md) section.
-* `http` - HTTP protocol. Use it to allow only http connections.
+* `http` - same as **Not specified**. Manticore will accept connections at this port from remote agents and clients via HTTP and HTTPS.
 * `https` - HTTPS protocol. It uses OpenSSL library to encrypt HTTP traffic. More details can be found in [SSL](Security/SSL.md) section. Use it to allow only http connections.
-* `sphinx` - Binary protocol. Use it to allow only connections from remote Manticore agents or clients based on binary protocol.
+* `sphinx` - legacy binary protocol. Use it to serve connections from remote sphinxSE clients.
 
 Adding a `_vip` suffix to a protocol (for instance `mysql_vip` or `http_vip`) makes all connections to that port bypass the thread pool and always forcibly create a new dedicated thread. That's useful for managing in case of a severe overload when the server would either stall or not let you connect via a regular port.
 
@@ -379,13 +379,30 @@ listen = localhost:9306:mysql
 listen = 127.0.0.1:9308:http
 listen = 192.168.0.1:9320-9328:replication
 listen = 127.0.0.1:9443:https
+listen = 127.0.0.1:9312:sphinx
 ```
 <!-- end -->
 
-There can be multiple listen directives, `searchd` will listen for client connections on all specified ports and sockets.  If no `listen` directives are found then the server will listen on ports **9308** and **9312** for connections from remote agents and non-mysql based clients and on port **9306** for MySQL connections.
+There can be multiple `listen` directives; `searchd` will listen for client connections on all specified ports and sockets. The default config shipped in the package defines listeners on ports **9308** and **9312** for connections from remote agents and non-MySQL clients, and on port **9306** for MySQL connections. If no `listen` directives are found, the server will listen on port **9312** for connections from remote agents and non-MySQL clients, and on port **9306** for MySQL connections.
 
 Unix-domain sockets are not supported on Windows.
 
+The `sphinx` listener is a workaround for SphinxSE clients, since they can't work with the common multiprotocol listeners. Such a listener will also work with any other Sphinx API client, but since we have made backward-compatible improvements to the Sphinx API, consider a multiprotocol listener the better choice for them (see below).
+
+#### A few words about the Sphinx API protocol
+
+The legacy Sphinx protocol has two phases: the handshake exchange and the data flow. The handshake consists of a 4-byte packet from the client and a 4-byte packet from the daemon, with a single purpose: the client verifies that the remote side is a real Sphinx daemon, and the daemon verifies that the remote side is a real Sphinx client. The main flow is quite simple: both sides declare their handshakes, and each checks the other's. Exchanging such short packets implies using the special TCP_NODELAY flag, which switches off Nagle's algorithm and declares that the TCP connection will be a dialogue of small packets.
+However, it is not strictly defined who speaks first in this negotiation. Historically, all clients built on the API, which is still available in the 'api' folder of the sources, send their handshake first. Also, all clients are implemented to send the handshake, then read 4 bytes from the daemon, then send the request and read the daemon's answer.
+When we improved the Sphinx protocol in a backward-compatible way, we considered the following:
+1. Usually master-agent communication is established from a known client to a known host on a known port. So it is quite unlikely that the endpoint will provide a wrong handshake, and we may implicitly assume that both sides are valid and really speak the Sphinx protocol.
+2. From this assumption we may 'glue' the handshake to the real request and send them in one packet. If the backend is a legacy Sphinx daemon, it will just read this glued packet as 4 bytes of handshake followed by the request body. Since both come in one packet, the backend socket saves one RTT, and the frontend buffer still works the usual way.
+3. Continuing the assumption: since the 'query' packet is quite small, and the handshake even smaller, let's send both in the initial 'SYN' packet using the modern TFO (TCP Fast Open) technique. That is, we connect to the remote node with the glued handshake + body packet. The daemon accepts the connection and immediately has both the handshake and the body in the socket buffer, as they came in the very first TCP 'SYN' packet. That eliminates another RTT.
+4. Finally, teach the daemon to accept this improvement. On the application side, it implies NOT using TCP_NODELAY. On the system side, it implies ensuring that accepting TFO is enabled on the daemon side and sending TFO is enabled on the client side. On modern systems client TFO is already enabled by default, so you only have to tune server TFO for everything to work.
+
+All these improvements, made without actually changing the protocol itself, eliminate 1.5 RTT of TCP exchange from the connection. If the query and the answer each fit in a single TCP packet, this decreases the whole binary API session from 3.5 RTT to 2 RTT, which makes the network negotiation nearly twice as fast (3.5 / 2 = 1.75).
+
+So, all our improvements revolve around the initially undefined question of 'who speaks first'. If the client speaks first, we can apply all these optimizations and effectively process connect + handshake + query in a single TFO packet. Moreover, we can look at the beginning of the received packet and determine the actual protocol. That is why you can connect to one and the same port with any of API/HTTP/HTTPS. If the daemon has to speak first, all these optimizations are impossible, and so is multiprotocol. That is why we have a dedicated port for mysql41 and have not unified it with all the other protocols on the same port. As it happens, among all the clients there is one written on the assumption that the daemon sends its handshake first, which rules out all the improvements described; it is pure legacy. That client is the SphinxSE plugin for MySQL/MariaDB. So, specially for this single client, we dedicated the `sphinx` protocol definition to work in the most legacy way. Namely: both sides activate TCP_NODELAY and exchange small packets. The daemon sends its handshake on connect, then the client sends its own, and then everything works the usual way. That is not optimal, but it just works. If you use SphinxSE to connect to the daemon, you have to dedicate a listener with the `sphinx` protocol stated explicitly. For other clients, avoid this protocol flavour; it is slower. If you have other legacy Sphinx API clients, first check whether they are able to work with a non-dedicated multiprotocol port. For master-agent links, using a non-dedicated (multiprotocol) port and enabling client and server TFO works well and will definitely make the network backend faster.
+
 ### listen_tfo
 
 This setting allows TCP_FASTOPEN flag for all listeners. By default it is managed by system, but may be explicitly switched off by setting to '0'.
@@ -823,7 +840,8 @@ query_log_mode  = 666
 ### max_connections
 
 <!-- example max_connections -->
-Maximum number of simultaneous client connections. Unlimited by default.
+Maximum number of simultaneous client connections. Unlimited by default. The limit usually becomes noticeable only when using some
+kind of persistent connections, such as `mysql` CLI sessions or persistent connections from remote distributed indexes.
 When the limit is exceeded you can still connect to the server using [the VIP connection](Connecting_to_the_server/MySQL_protocol.md#VIP-connection)
 
 <!-- request Example -->
@@ -1052,7 +1070,8 @@ shutdown_timeout = 1m # wait for up to 60 seconds
 
 ### shutdown_token
 
-SHA1 hash of the password which is necessary to invoke 'shutdown' command from VIP Manticore SQL connection. Without it [debug](Reporting_bugs.md#DEBUG) 'shutdown' subcommand will never cause server's stop.
+SHA1 hash of the password which is necessary to invoke 'shutdown' command from VIP Manticore SQL connection. Without it [debug](Reporting_bugs.md#DEBUG) 'shutdown' subcommand will never cause server's stop. Note that such simple hashing should not be considered strong protection, as we don't use
+a salted hash or any kind of modern hash function. It is merely a foolproofing measure for housekeeping daemons on a local network.
 
 
 ### snippets_file_prefix
@@ -1135,7 +1154,6 @@ Path to the SSL Certificate Authority (CA) certificate file (aka root certificat
 
 Server uses the CA file to verify the signature on the certificate. The file must be in PEM format.
 
-
 <!-- intro -->
 ##### Example: