Gusli (G3+ User Space Access Library) is a C++ library encapsulating IO (Read/Write) to any local block device (or a file). It has a client library (.so) for IO submitting apps and server library (.so) for execution of the io's via backend
- Both client and server processes run on the same machine in user space. No Linux Kernel components involved.
- Zero copy kernel bypass io, using shared memory between processes
- 1 client connects to multiple servers. Each server represents a block device.
- Supported servers: SPDK based apps with standard bdevs, Any other process based on server library examples.
- IO (read/Write) can be blocking or asynchronous. Async-io can use callback functions and/or be polled for completion and/or canceled.
- The reason for separation of client and server is to avoid compile/link dependency between io submitting app and block device backend. Example: Backend might be spdk block device but client application does not have to be spdk application.
- Client library ensures io is valid & properly throttled before issuing it to the server
Additional Documentation and description: Here
make all
- For production
make clean all BUILD_RELEASE=1 BUILD_FOR_UNITEST=0 VERBOSE=1 ALLOW_USE_URING=0; ls /usr/lib/libg*; ls /usr/include/gus*;
- executables of unit-tests will be created in project root dir (same as makefile)
- use the command above to build and install production libraries into /usr (/usr/lib, /usr/include)
- make file does installation of binaries, see its output:
- Deliverables: .so/.hpp files pair for server side and .so/.hpp files pair for client
- More info: type
make help
To see more info of how to compile - Compilation: Both g++ and clang++ are supported. clang example:
make USE_CLANG=1 all
- Mandatory: None. Plain C++ code (libc) code.
- Third party components are not auto downloaded/installed. Please do the following:
OS | Install | Remarks |
---|---|---|
Ubuntu | apt install g++ make cmake pkg-config |
✅ |
Fedora | dnf install gcc-c++ cmake pkg-config |
🚧 Not tested |
- Optional:
- For IO's with large scatter gathers (> 64 ranges), uring API to local block devices may help. So consider:
sudo apt install liburing-dev
- For developers: useful utils:
sudo apt install -y build-essential gdb meld ncdu tree valgrind
- For IO's with large scatter gathers (> 64 ranges), uring API to local block devices may help. So consider:
- You can see the actual dependencies of this dynamic library by doing
ldd /usr/lib/libgusli_*.so
Bash script which allows you to monitor the io buffers, stacks, threads, etc of both client and server processes, as well as force kill them.
source gusli/80scripts/service.sh
;GUSLI;
To get help about about monitoring client and server
- Client: write-read-verify io
- function which connects to Gusli server issues some basic ios to it
class unitest_io
shows how io can be launched in many modes (with/without async callback, with/without polling, blocking mode), with verification.
- Server: A few server classes like ram based block device, always-fail-io, spdk-bdev, etc
- Testing executables:
- SPDK Both: Executable which forks itself, runs spdk server in 1 process, client basic io tests in another process spdk_app
./unitest/run
for running various Gusli unit-tests
- Gusli clients connect to servers via a config file. Example of config code can be found in a function
client_simple_test_of_server()
- See unit-test directory to learn more how to use the library and how it is tests.
- How to link? See examples above
- First decide what does your app need. Serve io (like user space block device) or Issue io (then you are a client).
- Use the supplied examples above to start from sample server/client application and extend as needed.
- Application can be both server and client, this works but not recommended. See examples of self forking app.
- If your app is a issuing io to standard linux block devices, then you can use only Gusli client, without server.
- If your app is spdk based block device do the following steps: Download spdk, and install it. Compile Gusli example server app which loads Gusli server. Change spdk configuration file to match your block device
- If your application is for production, please make sure you have a monitoring service/script which coordinates client/server apps
- Generates proper configuration which block devices should be visible to client via servers
- Restart server, client after crash
- Note: Gusli is just 2 libraries, not services.
- We strongly suggest to use dynamic linking to be able to enjoy latest Gusli features without recompiling your executable.
- Static linking is generally used for debugging and unit-testing only.
- You can use static linking (server/client) *.a files or dynamic linking *.so files.
- Integrating client with NIXL, see install script
- Integrating server with SPDK, see install script
⚠️ Gusli team assisted with the above integrations for testing purposes. It is not responsible for it and integration can break if any of the above open source projects introduce breaking changes.
The current Maintainers Group for the project consists of:
Maintainer | GitHub ID | Affiliation | |
---|---|---|---|
Daniel Herman Shmulyan | danielhe | NVIDIA | [email protected] |
- Jira Plan, accessible only internally in NVIDIA
- Todos in the code are marked with
nvTODO()
macro
Important
Carefully read the below known issues
- Proper client server disconnect-reconnect while io is in air. Examples:
- Unmap/Remap mem buffers destroys io buffers! Think how to solve it
- Client disconnect while Server backend returning completions - don't free io and executor
- Properly handling ghost remaining ios after cancel or disconnect
- This harms server upgradability. Server restarting while IO is in air will not resubmit the ios automatically.
- Proper Design for IO canceling.
- For now IO canceling is implemented as cancel and blocking wait for all after effects of io to finish. This can lead to unbounded wait. We need to better understand how users will use the cancel and design appropriately
- Possible change: Don't block on canceled io but block on stop_all_ios() or unregister memory which those ios use.
- Proper Design for IO done(). I introduced this destructor to force the user to actually think how they use the io api concurrently as io may have multithreaded access (completion callback, polling thread, io cancel thread, io destructor call). Explicit call to done() requires the user to understand when his io is terminated. But this is not mandatory and can be taken care internally in the io implementation via atomic refcount + block the ~destructor(). This however has its own costs, races and requires a careful planning
- Dynamic addition of servers not supported. When starting Gusli library a full configuration of servers is given. Only subset of servers can be actually connected, but a new server which is not in the config, cannot be accessed by the client.
- Timeouts not support yet.
- User can monitor the IO and call cancel, but this does not guarantee io finish, much like stuck io to a local disk.
- Control path operations (open/close bdev, etc) have no timeout and cannot be canceled. They are typically fast.
- IO throttling. Done by the client on the amount of in air IO requests min(client_limit, server_limit). But does not take into account the size of the request nor how many ranges it has.
- Run Correctness tests (pass the following tests):
./unitest/run ci
; - Verify io performance did not suffer > 1.5[Miops]. Typically 2[M]+ for single core server/client
- Verify no breaking changes introduces in external library API's
- Tag the version. Example:
git tag -a v0.04 -m "changes a, b, fix c"; git push --tags; git ls-remote --tags origin
- Update the change log
- Set up project integrations, Create a new merge request
- Automatically close issues from merge requests
- Enable merge request approvals
- SetUp CI/CD pipeline, Set automerge after approval and pipeline succeeds
- Analyze code for known vulnerabilities with Static Application Security Testing
- Deploy to Kubernetes, instead of just local makefile