Releases: bft-smart/library
BFT-SMaRt v2.0
Latest version of the BFT-SMaRt library (v2.0). Includes source code, binary, javadoc and runscripts. The changes in this version are listed below.
New features:
- Added an option to generate disk writing overhead in the ThroughputLatency microbenchmark.
- Added an option to generate a request signature and verify it in the servers in the ThroughputLatency microbenchmark.
- Parallelized consensus proof generation.
- Defined signature as default Consensus proof type.
- Fetched the security provider from system.config to generate RSA keys.
- Added support for ECDSA.
- Added the system.numrepliers parameter in system.config to configure the number of repliers used to send responses.
- Added a mechanism to accumulate requests in batch instead of starting a new consensus instance as soon as the previous one finishes.
- Added the Bouncy Castle provider.
- Added TLS support to the communication system layer. This includes new configuration parameters in system.config and a directory to store keys.
- Added a fairness mechanism in client requests selection to be proposed in consensus.
- Added a benchmarking tool that makes it easy to execute distributed tests, such as measuring throughput and latency.
- Implemented the ORDERED_HASHED request type, which allows a client to send an ordered request and receive a full response from a single server and hashes from the remaining servers.
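The ORDERED_HASHED idea can be illustrated with a minimal client-side sketch. This is not the library's implementation: the class name, the method names, the `required` threshold and the use of SHA-256 are all assumptions made for illustration. The client accepts the single full response only if enough of the hashes returned by the other servers match it.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of BFT-SMaRt: checks one full reply against reply hashes.
final class HashedReplyCheck {

    // SHA-256 is an assumption; the hash algorithm configured in the library may differ.
    static byte[] hash(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns true when at least `required` received hashes equal the hash of the full reply.
    static boolean accept(byte[] fullReply, List<byte[]> receivedHashes, int required) {
        byte[] expected = hash(fullReply);
        int matching = 0;
        for (byte[] h : receivedHashes) {
            if (MessageDigest.isEqual(expected, h)) {
                matching++;
            }
        }
        return matching >= required;
    }

    public static void main(String[] args) {
        byte[] reply = "value=42".getBytes(StandardCharsets.UTF_8);
        byte[] forged = "forged".getBytes(StandardCharsets.UTF_8);
        List<byte[]> hashes = Arrays.asList(hash(reply), hash(reply), hash(forged));
        System.out.println("accepted: " + accept(reply, hashes, 2));
    }
}
```

Sending hashes instead of full replies trades a small amount of client-side hashing for less network traffic when responses are large.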
Code modifications:
- Added new debug messages in several classes.
- Implemented shuffling mechanism in the replica-to-replica communication layer to prevent the replica with the lowest ID/index from always being the last one receiving messages.
- Added an optimization to the generation of consensus proofs consisting of speculatively creating the ACCEPT message upon the reception of the PROPOSE message.
- RSAKeyLoader will now store public keys in memory instead of always reading them from disk.
- Merged interface StateManager with class BaseStateManager, creating class StateManager.
- Renamed packages: bftsmart.statemanagement.strategy -> bftsmart.statemanagement.standard; bftsmart.statemanagement.strategy.durability -> bftsmart.statemanagement.durability; bftsmart.tom.server.defaultservices.durability -> bftsmart.tom.server.durability.
- Moved the responsibility of creating client responses from ServiceReplica to Executable interfaces.
- Removed hmac and mac from code and respective options from system.config.
- Migrated the build tool to Gradle.
- Added methods to pause and resume DeliveryThread.
- Added integration tests.
- Added a new throughput and latency benchmark.
- Moved invoked ordered timeout setting to system.config.
- Configured TLS to use cipher as default.
- Implemented byte limits for client requests.
- Simplified Map demo.
- Improved logger configurations.
- Removed the use of BigInteger during the computation of hashcode in TimestampValuePair.
- Changed ThroughputLatencyClient to print server response when it is unexpected.
- Refactored service proxy code.
- Improved Counter demo.
- Load public key of new processes added through reconfiguration.
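The shuffling mechanism mentioned above for the replica-to-replica communication layer can be pictured with a short sketch. It is an illustration only, with hypothetical class and method names, and does not reproduce the library's code: the sender shuffles a copy of the target list before each send, so no replica is systematically served last.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch, not BFT-SMaRt code: randomizes the order in which targets are served.
final class ShuffledTargets {

    // Returns a shuffled copy so the caller's list is left untouched.
    static List<Integer> sendOrder(List<Integer> replicaIds, Random rng) {
        List<Integer> order = new ArrayList<>(replicaIds);
        Collections.shuffle(order, rng);
        return order;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            ids.add(i);
        }
        // Each call may print a different permutation of [0, 1, 2, 3].
        System.out.println(sendOrder(ids, new Random()));
    }
}
```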
Bug fixes:
- Fixed quorum calculation during the state transfer executed during replica initialization.
- Fixed vulnerability in LCManager.hasValidProof(...) that would not compare the values of the ACCEPT messages with the decision if the consensus proof was comprised of signatures.
- Fixed bug in StateManager that would keep the system from executing requests if it was comprised of a single replica.
- Fixed vulnerability that would cause the system to block if a client issued a malformed/invalid reconfiguration request.
- Fixed race condition in reconfiguration that would occur when batch execution was slower than consensus processing.
- Fixed bug in durability coordinator which would result in a BindException being thrown while trying to restart the group.
- Fixed bug in durability coordinator that would cause the protocol to try to start a consensus with an id that was already used.
- Fixed bug where "Ready to process operations" was not always printed.
- Fixed binding issue related to correctly querying the loopback address.
- Fixed control flow to avoid leader change.
- Fixed bug that prevents the use of negative sequence numbers.
- Fixed bug that would occur when starting a new session while there are pending requests.
- Fixed a synchronization bug that would result in a NullPointerException in NettyClientServerCommunicationSystemServerSide.
- Fixed bug in ClientsManager that would, in rare cases, cause an unnecessary leader change.
- Fixed bug that would cause a replica to get stuck if the leader receives enough accept messages before it processes its own proposal.
- Fixed bug that would occur when a client sends an unsigned request when it was supposed to sign it.
- Fixed race condition related to reconfiguration in DeliveryThread.
- Fixed bug that would allow faulty clients to prevent other clients from receiving replies.
- Fixed bug that would prevent a faulty replica from being fully recovered.
- Fixed a bug in DefaultVMServices where addServer function needs four args instead of three.
- Stop accepting values which had not been proposed.
- Fixed crash in AsyncLatencyClient with intervals <= 0.
- Fixed bug to stop deciding on values where proposal doesn't match accepted values.
- Fixed target selection in NettyClientServerCommunicationSystemClientSide.
- Fixed vulnerability in read-only requests optimization that compromised liveness.
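Several of the fixes above (the LCManager.hasValidProof(...) fix and the fixes about not deciding on values that do not match the proposal) come down to comparing the values carried by ACCEPT messages with the decided value. A minimal sketch of that kind of check follows. It is an assumption-laden illustration, not the library's code: the class, the method and the `quorum` parameter are hypothetical, and signature verification is left out.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the LCManager code: only ACCEPT values equal to the decision count.
final class ProofCheck {

    // Returns true when at least `quorum` ACCEPT values are byte-for-byte equal to the decision.
    static boolean matchesDecision(List<byte[]> acceptValues, byte[] decision, int quorum) {
        int matching = 0;
        for (byte[] value : acceptValues) {
            if (Arrays.equals(value, decision)) {
                matching++;
            }
        }
        return matching >= quorum;
    }

    public static void main(String[] args) {
        byte[] decision = {1, 2, 3};
        byte[] other = {9, 9, 9};
        List<byte[]> accepts = Arrays.asList(decision.clone(), decision.clone(), other);
        System.out.println("valid for quorum 3: " + matchesDecision(accepts, decision, 3));
    }
}
```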
BFT-SMaRt v1.2
Latest version of the BFT-SMaRt library (v1.2). Includes source code, binary, javadoc and runscripts. This release includes mostly bug fixes and some new minor features. Since it is the most stable codebase so far, it is no longer considered beta.
New features
- Implemented the ServiceReplica.kill() and ServiceReplica.restart() methods. ServiceReplica.kill() stops the service execution at a replica. It will shut down all threads, stop the requests' timer, and drop all enqueued requests, thus letting the ServiceReplica object be garbage-collected. From the perspective of the rest of the system, this is equivalent to a simple crash fault. ServiceReplica.restart() simply cleans the object state and reboots execution. From the perspective of the rest of the system, this is equivalent to a crash followed by a recovery.
- Added option "system.communication.defaultkeys" to use the same key pair across all processes. This is meant to be used on experiments and benchmarks, so that deployment is more straightforward, without the need to manage keys.
- Added option "system.communication.bindaddress" that allows the replica to fetch its local IP address on its own when binding Netty's server bootstrap (instead of relying on the IP address present in config/hosts.config). This way it is possible to avoid editing config/hosts.config at each replica when running inside Docker or when deploying in Amazon EC2 with elastic IPs.
- Added option "system.numnettyworkers" to specify the number of Netty worker threads created at each replica.
- Added option "system.samebatchsize" to force all replicas to receive the same number of requests per batch (not to be confused with the batch used for the PROPOSE message from the ordering protocol).
- Replaced the library's proprietary logger class with SLF4J with the logback implementation. Also replaced all System.out.println and ex.printStackTrace calls with adequate SLF4J invocations.
- Included xml configuration file for logback at the ./config directory.
- It is now possible to supply the library with a custom key loader that overrides the default RSA key loading mechanism via the new interface KeyLoader (which can be passed to the ServiceReplica and ServiceProxy constructors). This is useful for applications that also need to use and manage the same structure of keys as the library and/or to use different public key algorithms/providers.
- Algorithms for hmac, secret keys, signature and hashing are now configurable at the config/system.config file. It is now possible to indicate the specific security provider to use for each algorithm.
Code modifications
- A leader election is now automatically triggered if replicas receive an invalid PROPOSE message from the current leader.
- Modified Netty's client communication system to share a single EventLoop across multiples channels (instead of creating an event loop per channel). This was done to conserve system resources.
- Removed constructors that allowed replicas to join the group, as well as the command to make them leave. Only the VMServices process is supposed to have the authority to manage the group, so these functionalities were superfluous and misleading.
- AsyncLatencyClient now supports the same parameters as ThroughputLatencyClient.
- Removed the "dos" parameter from ThroughputLatencyClient, since AsyncLatencyClient can be used for that purpose instead.
- Added new parameter to appExecuteBatch method in DefaultRecoverable that indicates if the command arrived directly from the total order algorithm or if it is being applied by the state transfer protocol.
- Removed RandomDemo and LatencyClient/server demos from the bftsmart.demo package.
- Removed redundant FIFOExecutable interface.
- Modified RequestVerifier interface to receive the entire TOMMessage instead of just the payload.
- Implemented parallel signature verification when a full PROPOSE message arrives at the replicas.
- The algorithms for hmac, secret keys, signature and hashing are now the same across all parts of the code. The new defaults are, respectively: HmacSHA512, PBKDF2WithHmacSHA1, SHA512withRSA, SHA-512.
- Created a completely new implementation and interface for the BFTMap demo available at the package bftsmart.demo.map.
- Implemented simple flow control at the client side to prevent the virtual machine from exhausting the heap space if asynchronous clients aggressively send requests to the servers.
- Generated new default RSA keys with 2048 bit length (available in the ./config/keys directory).
- Library now compiles for Java 1.8.
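The four default algorithm names listed above (HmacSHA512, PBKDF2WithHmacSHA1, SHA512withRSA, SHA-512) are standard JCA names, so they can be exercised directly with the JDK. The sketch below only shows what each name selects. It is not the library's code, and the class name, salt, iteration count and derived key length are illustrative assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.Signature;
import javax.crypto.Mac;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demo class: exercises the default algorithm names through plain JCA calls.
final class DefaultAlgorithms {

    static byte[] digest(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-512").digest(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Derives a secret key with PBKDF2WithHmacSHA1, then computes an HmacSHA512 tag.
    // The iteration count (1000) and key length (256 bits) are arbitrary example values.
    static byte[] hmac(char[] password, byte[] salt, byte[] data) {
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
            byte[] key = factory.generateSecret(new PBEKeySpec(password, salt, 1000, 256)).getEncoded();
            Mac mac = Mac.getInstance("HmacSHA512");
            mac.init(new SecretKeySpec(key, "HmacSHA512"));
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Signs and verifies with SHA512withRSA using a freshly generated 2048-bit key pair.
    static boolean signAndVerify(byte[] data) {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            KeyPair pair = generator.generateKeyPair();
            Signature signer = Signature.getInstance("SHA512withRSA");
            signer.initSign(pair.getPrivate());
            signer.update(data);
            byte[] signature = signer.sign();
            Signature verifier = Signature.getInstance("SHA512withRSA");
            verifier.initVerify(pair.getPublic());
            verifier.update(data);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] msg = "request".getBytes(StandardCharsets.UTF_8);
        byte[] salt = {1, 2, 3, 4, 5, 6, 7, 8};
        System.out.println("digest bytes: " + digest(msg).length);
        System.out.println("hmac bytes: " + hmac("secret".toCharArray(), salt, msg).length);
        System.out.println("signature verifies: " + signAndVerify(msg));
    }
}
```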
Bug fixes
- Fixed bug in the state transfer that would happen when the system had only a single replica.
- Fixed bug in the "system.numrepliers" parameter that would default to a single replier instead of to the default Netty communication system
- Fixed bug in the AsynchServiceProxy class that would cause a race condition between the client and the servers. The client would store its sequence number after sending its requests, but if the servers responded quickly enough for the client to parse the replies before storing the sequence number, the messages would be discarded.
- Fixed bug related with unreleased Netty thread resources at the client side.
- Fixed bug in the state transfer protocol that would trigger after adding a replica to the view.
- Fixed bug on the leader change protocol that would occur if at least one of the messages in a consensus proof was invalid.
- Fixed memory leak at the netty communication system that prevented file descriptors held by clients from being released, even if the Netty channels were explicitly closed.
- Fixed bug in the state log of the DefaultRecoverable classes that would occur during de-serialization if the batch of operations contained in each entry had commands too large.
- Fixed bug in the reconfiguration protocol that would occur when a replica was removed from the group and then added back to it, which would prevent the replica from correctly resuming execution.
- Fixed null pointer exception on the leader change protocol that would happen if the leader crashed before any consensus message was exchanged among replicas.
- Fixed bug on one of ServiceReplica constructors that would not create the default Replier object if a null pointer was passed.
- Fixed bug on ServiceReplica where it would not invoke a custom replier when using the batchexecutor interface.
- Fixed null pointer exception in the default replier object that would occur during a reconfiguration.
- Fixed bug that would make the currentView file always be created in the same local directory regardless of the path that is passed as an argument to the ServiceReplica constructors.
- Fixed bug in ServiceProxy that could result in threads being stuck while invoking the invoke method a second time.
- Fixed mistake when evaluating the time elapsed since a request was received for the leader change protocol (time units were in nanoseconds but evaluated as milliseconds).
- Fixed bug in the leader change protocol that occurred if the leader crashed and the timeout task triggered without any requests having actually expired, which would make the system block.
- AsyncServiceProxy now supports updates to the view that come from the replicas.
- Fixed bug in the state transfer protocol that would miscalculate the number of matching values for the current leader and regency.
- Fixed race condition in Netty's writeAndFlush() method in the client/server communication system, which would cause messages to include wrong MACs while disseminating them across multiple targets.
- Fixed vulnerability that would enable a malicious leader to perform replay attacks.
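The time-unit mistake fixed above (nanoseconds evaluated as milliseconds) is easy to reproduce in isolation. The sketch below shows one explicit way to convert before comparing. The class and method names are hypothetical and are not taken from the library.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: compares a nanosecond-based elapsed time with a millisecond timeout.
final class RequestAge {

    // Converts the elapsed nanoseconds to milliseconds before comparing with the timeout.
    static boolean isExpired(long receivedAtNanos, long nowNanos, long timeoutMillis) {
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(nowNanos - receivedAtNanos);
        return elapsedMillis >= timeoutMillis;
    }

    public static void main(String[] args) {
        long receivedAt = System.nanoTime();
        // Treating the raw nanosecond difference as milliseconds would make almost
        // every request look expired; the explicit conversion avoids that.
        System.out.println("expired: " + isExpired(receivedAt, System.nanoTime(), 2000));
    }
}
```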
BFT-SMaRt 1.1 Beta
Latest version of the BFT-SMaRt library (v1.1 beta). Includes source code, binary, javadoc and runscripts.
This version does not provide any new features in relation to the previous one (v1.0 beta), but it does include a significant amount of bug fixes, changes in the code, and a few modifications to the replication protocol.
Protocol alterations:
- After sending a STOP message, each replica will now periodically retransmit it. This is necessary for cases where a replica that recovered from a failure does not return to the system in time to receive enough STOP messages from the other replicas, in which case the synchronization phase might not complete.
- Under CFT mode, a replica now updates its timestamp/value pair immediately after receiving a (valid) PROPOSE message (or at the end of the synchronization phase). This must be done because the original consensus algorithm requires a quorum of WRITE messages before updating this pair, but CFT mode bypasses the WRITE phase completely. Since in CFT mode replicas are expected to fail only by crashing, this does not break the correctness of the protocol.
- Replicas will now stop executing consensus instances only after collecting 2f+1 STOP messages. This was done to avoid a corner case where a system with a single client would block, which can happen if:
  - There is only one client sending requests;
  - One replica is crashed;
  - One of the three correct replicas times out before being able to order the request (assuming f = 1, n = 4).
  This would not be a problem if the library did not support read-only invocations, since clients would then need only f+1 replies from replicas (in accordance with the specification of the Mod-SMaRt protocol). But with read-only invocations, clients need to wait for a Byzantine quorum of replies.
- Standard state transfer now randomly selects a replica to ask for the full state. Implemented to deal with a corner case where a leader change may not ever finish if:
  - The new leader is late and needs to ask for a state transfer;
  - The timeout for requests is shorter than the state transfer timeout.
- The state transfer is now required to send a proof for the last decided consensus, so that a recovered replica can obtain a CertifiedDecision object. This is necessary to ensure that any recovered replica can send its proof for its last consensus if the synchronization phase is triggered immediately after a recovered replica finishes installing the state.
  Furthermore, replicas that are asked for the state should now check if they indeed have a proof for the requested state up to the specified consensus instance. If they do not, they should reply in the same way as if they did not have the requested state. However, a proof is never needed in CFT mode.
- Lastly, there is a small, yet important correction to the Mod-SMaRt protocol: the content of the requests will now be validated before being stored and marked as pending requests. This is done to prevent malicious clients from forcing all correct replicas to propose invalid requests. If all correct replicas proposed invalid requests once they became leaders, the consensus instance would never decide anything, since all correct replicas refuse to send WRITE messages for invalid content. However, it is not necessary to perform any such verification under CFT mode.
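The thresholds used in the protocol alterations above (2f+1 STOP messages, f+1 replies, and the f = 1, n = 4 scenario) follow from simple arithmetic. The sketch below spells it out. The class and method names are hypothetical, and it assumes the usual BFT sizing of n >= 3f+1 replicas.

```java
// Hypothetical sketch: quorum arithmetic for a BFT group, assuming n >= 3f + 1.
final class QuorumSizes {

    // Largest number of Byzantine faults tolerated by n replicas.
    static int maxFaults(int n) {
        return (n - 1) / 3;
    }

    // Size of a Byzantine quorum, e.g. the number of STOP messages to collect.
    static int byzantineQuorum(int f) {
        return 2 * f + 1;
    }

    // Smallest set guaranteed to contain at least one correct replica.
    static int weakQuorum(int f) {
        return f + 1;
    }

    public static void main(String[] args) {
        int n = 4;
        int f = maxFaults(n);
        System.out.println("n=" + n + " f=" + f
                + " 2f+1=" + byzantineQuorum(f)
                + " f+1=" + weakQuorum(f));
    }
}
```

With n = 4 and one replica crashed, exactly 2f+1 = 3 correct replicas remain, so a Byzantine quorum can only be formed if all three keep participating.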
Code modifications:
- Added method 'appExecuteUnordered(...)' to 'DefaultRecoverable', 'DefaultSingleRecoverable' and 'DurabilityCoordinator'. All demos now implement this method instead of 'executeUnordered(...)' from the 'Executable' interface;
- Transferred a large portion of the code from 'TOMLayer' to a new class 'Synchronizer'. This was done because the TOMLayer class already had more code related to the synchronization phase than to the normal case (approximately 2/3 of TOMLayer's code was dedicated to the synchronization phase);
- The cryptographic proof for an ACCEPT message is now done within a dedicated method;
- Removed a few legacy attributes from the classic state transfer protocol that were no longer necessary;
- Removed a legacy attribute from the reconfiguration protocol that was no longer necessary;
- Removed a legacy parameter from the 'decided(..)' method of the 'Consensus' class;
- Renamed class 'Round' to 'Epoch';
- Renamed class 'Consensus' to 'Decision';
- Renamed class 'Execution' to 'Consensus';
- Renamed class 'PaxosMessage' to 'ConsensusMessage';
- Renamed class 'LastEidData' to 'CertifiedDecision';
- Renamed methods and variables across all code from 'EID' (Execution ID) to 'CID' (Consensus ID);
- Re-distributed the classes from all sub-packages from 'bftsmart.consensus' and 'bftsmart.tom.core', which resulted in removing 2 sub-packages that were rendered empty ('bftsmart.consensus.executionmanager' and 'bftsmart.tom.core.timer');
- The nonces generated within each consensus instance are now only generated upon usage of the method 'getNonces()' of a 'MessageContext' object. The original seed and number of nonces are now the only information exchanged among replicas (it is all that is needed to obtain the nonces);
- 'MessageContext' objects now hold all information of the original 'TOMMessage' object and are also able to re-create the original object with the method 'recreateTOMMessage(...)';
- 'MessageContext' objects now hold the cryptographic proof for the consensus instance with which they are associated;
- Method 'noOp' from 'Recoverable' now provides the complete 'MessageContext' object associated with the consensus instance where it was triggered;
- Deleted class 'LeaderModule' and moved the few methods that were still being used to the 'ExecutionManager' class;
- Deleted classes 'Proof', 'CounterState' and 'ReceivedMessage', since they were no longer being used in any part of the code;
- Organized 'import's and fixed indentation in some classes;
- Added 'override' annotations across all the code.
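The lazy nonce generation described above (only the seed and the number of nonces are exchanged, and the nonces are derived on demand) can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the class is hypothetical, `count` is interpreted as a number of nonce bytes, and `java.util.Random` stands in for whatever generator the library uses.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch: nonces are derived from (seed, count) on first use and then cached.
final class LazyNonces {
    private final long seed;
    private final int count;
    private byte[] nonces; // stays null until getNonces() is first called

    LazyNonces(long seed, int count) {
        this.seed = seed;
        this.count = count;
    }

    // Any replica holding the same seed and count derives the same bytes.
    synchronized byte[] getNonces() {
        if (nonces == null) {
            nonces = new byte[count];
            new Random(seed).nextBytes(nonces);
        }
        return nonces.clone();
    }

    public static void main(String[] args) {
        LazyNonces leader = new LazyNonces(42L, 8);
        LazyNonces follower = new LazyNonces(42L, 8);
        System.out.println("same nonces: " + Arrays.equals(leader.getNonces(), follower.getNonces()));
    }
}
```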
Bug fixes:
- Setting 'useMAC' parameter to '0' will no longer throw any exception during execution;
- Fixed a bug related with nonce generation (the leader replica was not keeping this information);
- Fixed bug in initialization, which would make replicas always select replica 0 as the leader (regardless of whether it was part of the group or not);
- Fixed issue on 'DefaultSingleRecoverable' class that would make all consensus messages go to the out-of-context set;
- Recovered replicas can now correctly calculate a quorum of replies associated with the state transfer;
- Fixed a bug that happened in the absence of clients issuing requests: if a crashed replica X finished recovering and then another replica Y crashed and later asked for the latest state, replica X would send the wrong consensus ID;
- Leader change was sending a wrong message type in CFT mode (was sending a WRITE message instead of an ACCEPT);
- Replicas will now send their STOPDATA message even if they do not hold a proof for their last executed consensus;
- Fixed implementation of predicate 'sound', which was waiting for more than 'n-f' STOPDATA messages (instead of waiting for at least 'n-f');
- Fixed the timestamps associated with each consensus. They were being incremented at the SYNC message, but this must be done earlier (after receiving 2f+1 STOP messages);
- PROPOSE message is now validated in relation to the replica's current epoch (which must be 0 in order for the PROPOSE to be accepted);
- Fixed a bug that would send a timestamp/value pair with timestamp equal to -1;
- Implemented an out-of-context mechanism for the synchronization phase of the replication protocol (such mechanism only existed for the normal phase);
- Made sure that upon a leader change, the protocol will use a new Epoch object with the latest timestamp, and include such timestamp in the upcoming WRITE/ACCEPT messages;
- Made sure client requests relayed within STOP messages were processed in accordance with the Mod-SMaRt protocol;
- Made sure the synchronization phase now installs the consensus proof received in the SYNC message;
- Made sure STOP messages are exchanged also in CFT mode;
- Fixed concurrency issue related to a consensus WRITESET;
- Fixed a bug on synchronization phase which updated the replica's WRITESET before properly installing the new ETS;
- The synchronization phase now updates the consensus' ETS of delayed replicas (using the value of the current regency);
- Fixed memory leak in 'ExecutionManager' class;
- Fixed script 'smartrun.sh'.
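The fix to the 'sound' predicate listed above is an off-by-one in a threshold comparison: waiting for more than n-f STOPDATA messages instead of at least n-f. The sketch below contrasts the two comparisons. The class and method names are hypothetical and the rest of the predicate is omitted.

```java
// Hypothetical sketch: strict versus non-strict comparison against the n - f threshold.
final class StopDataQuorum {

    // Overly strict form: requires strictly more than n - f messages.
    static boolean enoughStrict(int received, int n, int f) {
        return received > n - f;
    }

    // Corrected form: at least n - f messages are sufficient.
    static boolean enough(int received, int n, int f) {
        return received >= n - f;
    }

    public static void main(String[] args) {
        int n = 4, f = 1, received = 3;
        System.out.println("strict: " + enoughStrict(received, n, f)
                + ", corrected: " + enough(received, n, f));
    }
}
```

With n = 4 and f = 1, the strict comparison needs messages from all four replicas, so a single crashed replica would be enough to keep the condition from ever holding.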
Miscellaneous:
- Added interface where developers can enforce the 'external validity' property of VP-Consensus;
- Extended 'Recoverable' interface to allow delivery of requests via the 'op(...)' method;
- 'DurabilityCoordinator' now supports the 'noOp(...)' method;
- DefaultRecoverable and DefaultSingleRecoverable (finally) store and send the MessageContext objects associated with the ordered commands during checkpoints and state transfer, respectively;
- Added debug messages for the synchronization phase of the replication protocol;
- Removed a redundant lock from the communication system;
- Moved locks of the 'run_lc_protocol()' method to a more precise part of the code (in class RequestsTimer);
- Updated 'ShutdownHookThread' to properly display the state of a replica that is shut down;
- It is now possible to specify the key size in 'RSAPairGenerator';
- Method 'computeHash(...)' from 'TOMUtil' class is now thread-safe.
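For the thread-safety note on 'computeHash(...)' above: `java.security.MessageDigest` instances keep internal state and are not safe to share between threads. The sketch below shows one common way to make a hashing helper thread-safe, using a per-thread digest. It is not necessarily how 'TOMUtil' does it, and the class name and the choice of SHA-256 are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical sketch: a hashing helper that never shares a MessageDigest across threads.
final class SafeHash {

    // One MessageDigest per thread; SHA-256 is an assumed algorithm.
    private static final ThreadLocal<MessageDigest> DIGEST = ThreadLocal.withInitial(() -> {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    });

    static byte[] computeHash(byte[] data) {
        // digest(byte[]) processes the input and resets the instance for reuse.
        return DIGEST.get().digest(data);
    }

    public static void main(String[] args) {
        byte[] data = "state".getBytes(StandardCharsets.UTF_8);
        System.out.println("stable: " + Arrays.equals(computeHash(data), computeHash(data)));
    }
}
```

Creating a fresh `MessageDigest` on every call or synchronizing around a shared instance would also work; the per-thread variant avoids both lock contention and repeated provider lookups.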
BFT-SMaRt 1.0 Beta
BFT-SMaRt library v1.0 beta. This release contains the codebase exported from GoogleCode upon migrating to GitHub. It includes the source code, binary files, javadoc and runscripts. No longer the most stable version available.