Copyright (C) 2016-2023 The Open Library Foundation
ClassPath.java and ClassPathTest.java are also Copyright (C) 2012 The Guava Authors.
This software is distributed under the terms of the Apache License, Version 2.0. See the file "LICENSE" for more information.
- Introduction
- Upgrading
- Overview
- The basics
- Get started with a sample working module
- Command-line options
- Environment Variables
- Read and write database instances setup
- Local development server
- Creating a new module
- RestVerticle
- Adding an init() implementation
- Adding code to run periodically
- Adding a hook to run immediately after verticle deployment
- Adding a shutdown hook
- Implementing uploads
- Implement chunked bulk download
- PostgreSQL integration
- CQL (Contextual Query Language)
- CQL2PgJSON: CQL to PostgreSQL JSON converter
- CQL2PgJSON: Usage
- CQL: Field names
- CQL: Relations
- CQL: Modifiers
- CQL: Matching, comparing and sorting numbers
- CQL: Matching id and foreign key fields
- CQL: Matching full text
- CQL: Matching all records
- CQL: Matching undefined or empty values
- CQL: Matching array elements
- CQL: @-relation modifiers for array searches
- CQL2PgJSON: Multi Field Index
- CQL2PgJSON: Foreign key cross table index queries
- CQL2PgJSON: Foreign key tableAlias and targetTableAlias
- CQL2PgJSON: Exceptions
- CQL2PgJSON: Unit tests
- Tenant API
- RAMLs API
- JSON Schemas API
- Query Syntax
- Estimated totalRecords
- Metadata
- Optimistic Locking
- Facet Support
- JSON Schema fields
- Overriding RAML (traits) / query parameters
- Boxed types
- Messages
- Documentation of the APIs
- Logging
- Monitoring
- Instrumentation
- Overriding Out of The Box RMB APIs
- Client Generator
- A Little More on Validation
- Advanced Features
- Additional Tools
- Some REST examples
- Additional information
This documentation includes information about the Raml-Module-Builder (RMB) framework and examples of how to use it.
The goal of the project is to abstract away as much boilerplate functionality as possible and allow a developer to focus on implementing business functions. In other words: simplify the process of developing a micro service module. The framework is RAML driven, meaning a developer / analyst declares APIs that the 'to be developed' module is to expose (via RAML files) and declares the objects to be used and exposed by the APIs (via JSON schemas). Once the schemas and RAML files are in place, the framework generates code and offers a number of tools to help implement the module. Note that this framework is both a build and a run-time library.
The framework consists of a number of tools:
-
domain-models-api-interfaces
-- project exposes tools that receive as input these RAML files and these JSON schemas, and generates java POJOs and java interfaces. -
domain-models-api-aspects
-- project exposes tools that enforce strict adherence to the RAML declaration to any API call by exposing validation functionality.- for example: a RAML file may indicate that a specific parameter is mandatory or that a query parameter value must be a specific regex pattern. The aspects project handles this type of validation for developers so that it does not need to be re-developed over and over. More on validation below.
-
domain-models-runtime
-- project exposes a run-time library which should be used to run a module. It is Vert.x based. When a developer implements the interfaces generated by the interfaces project, the run-time library should be included in the developed project and run. The run-time library will automatically map URLs to the correct implemented function so that developers only need to implement APIs, and so all the wiring, validation, parameter / header / body parsing, logging (every request is logged in an apache like format) is handled by the framework. Its goal is to abstract away all boilerplate functionality and allow a module implementation to focus on implementing business functions.-
The runtime framework also exposes hooks that allow developers to implement one-time jobs, scheduled tasks, etc.
-
Provides tooling (Postgres client, etc.) for developers to use while developing their module.
-
Runtime library runs a Vert.x verticle.
-
-
rules
-- Basic Drools functionality allowing module developers to create validation rules via*.drl
files for objects (JSON schemas).
See separate upgrading notes how to upgrade an RMB based module to a new RMB version.
Follow the Introduction section above to generally understand the RMB framework. Review the separate Okapi Guide and Reference. Scan the Basics section below for a high-level overview of RMB. Then follow the Get started with a sample working module section which demonstrates an already constructed example. When that is understood, then move on to the section Creating a new module to get your project started.
Note that actually building this RAML Module Builder framework is not required. (Some of the images below are out-of-date.) The already published RMB artifacts will be incorporated into your project from the repository.
Build the raml-module-builder
project to generate the needed jars, then add them to your project's pom.xml
.
Call a Maven exec plugin with a class from the interfaces jar to generate POJOs and interfaces within your project.
See the call to com.sling.rest.tooks.GenerateRunner
in the circulation project's pom.xml
for an example.
The following RAML snippet will be used for this example:
/bibs/{bibId}:
type:
post:
example:
!include examples/bid.sample
schema:
bib.schema
Additionally, the following query parameters and header requirements will be used:
headers:
Authorization:
description: |
Used to send a valid JWT token.
example:
Bearer Hc8KNK7LAJPasAwX9pIbN7yeTwSCAq
required: true
queryParameters:
lang:
description: |
Requested language. Optional. [lang=en]
type: string
required: false
default: en
pattern: "[a-zA-Z]{2}"
From this snippet, an interface (BibInterface.java
) is generated based on the paths (each path+verb pair generates a method). This is documented with the examples from your RAML.
Additionally, an object (Bib.java
) is generated based on the JSON Schema provided. An example of this may be found in the A Little More on Validation section.
The following is an example of the interface method signature that would be generated:
void postBibs(
@HeaderParam("Authorization")
@NotNull
String authorization,
@QueryParam("lang")
@DefaultValue("en")
@Pattern(regexp = "[a-zA-Z]{2}")
String lang,
Bib entity
) throws Exception;
An example of the generated constraints is shown above.
- When implementing the interfaces, you must add the @Validate annotation to enforce the annotated constraints declared by the interface.
- Note that a Bib entity was passed as a parameter. The runtime framework transforms the JSON passed in the body into the correct POJO.
-
Add Maven repositories for FOLIO. There must be a section for regular dependencies (
<repositories>
) and a section for plugins (<pluginRepositories>
). -
Add the
domain-models-maven-plugin
. This will generate the POJOs and interfaces based on the RAML files. Usedomain-models-maven-plugin
as a plugin in the<plugins>
pom.xml section only. Please do not add it to the<dependencies>
section, it is based on old libraries with security vulnerabilities. The vulnerabilities do not affect us when runningdomain-models-maven-plugin
as a plugin but they may affect a module if those old dependencies are included as explicit dependencies. -
Add the
aspectj-maven-plugin
. This is required if you would like the runtime framework to validate all URLs. -
Add the
maven-shade-plugin
, indicating the main class to run asRestLauncher
and main verticle asRestVerticle
. This will create a runnable jar with the runtime'sRestVerticle
serving as the main class. -
Add the
maven-resources-plugin
. This will copy your RAML files to the /apidocs directory where they will be made visible online (html view) by the runtime framework.
These are further explained below.
Do mvn clean install
... and run :)
The runtime framework will route URLs in your RAML to the correct method
implementation. It will validate (if @Validate
was used), log, and expose
various tools.
Notice that no web server was configured or even referenced in the implementing module - this is all handled by the runtime framework.
Some sample projects:
- https://github.com/folio-org/mod-configuration
- https://github.com/folio-org/mod-notes
and other modules (not all do use the RMB).
The mod-organizations-storage is a full example which uses the RMB. Clone it, and then investigate:
$ git clone --recursive https://github.com/folio-org/mod-organizations-storage.git
$ cd mod-organizations-storage
$ mvn clean install
-
Its RAMLs and JSON schemas can be found in the
ramls
directory. These are also displayed as local API documentation. -
Open the pom.xml file - notice the jars in the
dependencies
section as well as theplugins
section. Theramls
directory is declared in the pom.xml and passed to the interface and POJO generating tool via a maven exec plugin. The tool generates source files into thetarget/generated-sources/raml-jaxrs
directory. The generated interfaces are implemented within the project in theorg.folio.rest.impl
package. -
Investigate the
src/main/java/org/folio/rest/impl/OrganizationsAPI.java
class. Notice that there is a function representing each endpoint that is declared in the RAML file. The appropriate parameters (as described in the RAML) are passed as parameters to these functions so that no parameter parsing is needed by the developer. Notice that the class contains all the code for the entire module. All handling of URLs, validations, objects, etc. is all either in the RMB jars, or generated for this module by the RMB at build time. -
IMPORTANT NOTE: Every interface implementation - by any module - must reside in package
org.folio.rest.impl
. This is the package that is scanned at runtime by the runtime framework, to find the needed runtime implementations of the generated interfaces.
Now run the module in standalone mode:
$ java -jar target/mod-organizations-storage-fat.jar
Now send some requests using 'curl' or 'httpie'
At this stage there is not much that can be queried, so stop that quick demonstration now. After explaining general command-line options, etc. we will get your local development server running and populated with test data.
-
-Dhttp.port=8080
Optional -- defaults to 8081 -
-Ddebug_log_package=*
Optional -- Set log level to debug for all packages. Or useorg.folio.rest.*
for all classes within a specific package, ororg.folio.rest.RestVerticle
for a specific class. -
db_connection=[path]
Optional -- path to a JSON config file with connection parameters to a PostgreSQL DB-
for example Postgres:
{"host":"localhost", "port":5432, "maxPoolSize":50, "username":"postgres","password":"mysecretpassword", "database":"postgres", "charset":"windows-1252", "queryTimeout" : 10000}
-
path defaults to /postgres-conf.json
-
tries to read a file at the path if the path is absolute
-
if file not found or path is relative tries to read a class path resource with that path
-
-
Other module-specific arguments can be passed via the command line in the format key=value. These will be accessible to implementing modules via
RestVerticle.MODULE_SPECIFIC_ARGS
map. -
Optional JVM arguments can be passed before the
-jar
argument, e.g.-XX:+HeapDumpOnOutOfMemoryError
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-Xloggc:C:\Git\circulation\gc.log
RMB implementing modules expect a set of environment variables to be passed in at module startup. The environment variables expected by RMB modules are:
- DB_HOST
- DB_PORT
- DB_USERNAME
- DB_PASSWORD
- DB_DATABASE
- DB_HOST_READER
- DB_PORT_READER
- DB_SERVER_PEM
- DB_QUERYTIMEOUT
- DB_CHARSET
- DB_MAXPOOLSIZE
- DB_MAXSHAREDPOOLSIZE
- DB_CONNECTIONRELEASEDELAY
- DB_RECONNECTATTEMPTS
- DB_RECONNECTINTERVAL
- DB_EXPLAIN_QUERY_THRESHOLD
- DB_ALLOW_SUPPRESS_OPTIMISTIC_LOCKING
- TESTCONTAINERS_POSTGRES_IMAGE
The first five are mandatory, the others are optional.
Environment variables with periods/dots in their names are deprecated in RMB because a period is not POSIX compliant and therefore some shells, notably, the BusyBox /bin/sh included in Alpine Linux, strip them (reference: warning in OpenJDK docs).
See the Vert.x Async PostgreSQL Client Configuration documentation for the details.
The environment variable DB_MAXPOOLSIZE
sets the maximum number of concurrent connections for a tenant that one module instance opens. They are only opened if needed. If all connections for a tenant are in use further requests for that tenant will wait until one connection becomes free. Other tenants and other instances of a module are unaffected. The default is 4.
The environment variable DB_MAXSHAREDPOOLSIZE
sets the maximum number of concurrent connections that one module instance opens. They are only opened if needed. If all connections are in use further requests will wait until one connection becomes free. This way one tenant may block other tenants. If the variable is set DB_MAXPOOLSIZE
is ignored and all connections are shared across tenants.
Use DB_SERVER_PEM
(or server_pem
in the JSON config) to set SSL/TLS certificate(s) in PEM format to validate the PostgreSQL server certificate, this can be the server certificate, the root CA certificate, or the chain of the intermediate CA and the CA certificate. Defaults to none allowing unencrypted connection only. If set requires a TLSv1.3 connection and a valid server certificate, and rejects unencrypted connections.
The environment variable DB_CONNECTIONRELEASEDELAY
sets the delay in milliseconds after which an idle connection is closed. A connection becomes idle if the query ends, it is not idle if it is waiting for a response. Use 0 to keep idle connections open forever. RMB's default is one minute (60000 ms).
DB_RECONNECTATTEMPTS
and DB_RECONNECTINTERVAL
set the maximum number of retries after a connect to the database fails, and how many milliseconds to wait before the next reconnect. Reconnecting is disabled by default.
The environment variable DB_EXPLAIN_QUERY_THRESHOLD
is not observed by
Postgres itself, but is a value - in milliseconds - that triggers query
execution analysis. If a single query exceeds this threshold, it will be
analyzed by using EXPLAIN ANALYZE
. Note that this in turn further adds
time to the query, so this should only be executed for slow queries that
needs further attention. The analysis can effectively be turned off by setting
it to a high value (e.g. 300000 ~ 5 minutes). Like the DB-environment
variables, this pertains per RMB-module (process). The default
value of DB_EXPLAIN_QUERY_THRESHOLD
is 1000 (1 second).
The EXPLAIN ANALYZE - is only performed for PostgresClient.get, PostgresClient.select and PostgresClient.join. Not for methods such as PostgresClient.getById or PostgresClient.streamGet.
The environment variable DB_QUERYTIMEOUT
sets the number of milliseconds after which RMB sends a
cancel request
to a running PostgreSQL query. 0 disables this timeout and is the default.
To take effect an RMB based module must get it via PostgresClient.getConnectionConfig().getInteger("queryTimeout")
and pass it to the RMB method that starts the connection, transaction or query.
The environment variables DB_HOST_READER
and DB_PORT_READER
are for the synchronously replicated read and write database instances setup.
The environment variables DB_HOST_ASYNC_READER
and DB_PORT_ASYNC_READER
are for the asynchronously replicated read and write database instances setup.
DB_ALLOW_SUPPRESS_OPTIMISTIC_LOCKING
is a timestamp in the format 2022-12-31T23:59:59Z
. Setting it disables optimistic locking when sending a record that contains "_version":-1
before that time, after that time "_version":-1
is rejected. This applies only to tables with failOnConflictUnlessSuppressed
, see below. The timestamp ensures that disabling this option cannot be forgotten. Suppressing optimistic locking is known to lead to data loss in some cases, don't use in production, you have been warned!
TESTCONTAINERS_POSTGRES_IMAGE
changes the PostgreSQL container image name used at build time for testing; it is not used at runtime.
See the Environment Variables section of the Okapi Guide for more information on how to deploy environment variables to RMB modules via Okapi.
A PostgreSQL instance (the write instance) can be replicated for horizontal scaling (scale out). Each replica is a read-only standby instance.
RMB supports separating read and write requests. By default the write instance configured with DB_HOST
and DB_PORT
environment variables is used for reading as well, but optionally a read instance (or a load balancer for multiple read instances) can be configured by setting its host and port using the DB_HOST_READER
and DB_PORT_READER
environment variables. If either of these reader variables are not set then it will default to use the writer instance.
RMB supports both synchronous standby servers and asynchronously replicated standby servers (the default)[https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION]. When using synchronous replication, write instance must list the DB_HOST_READER
instance(s) in its synchronous_standby_names
configuration. This ensures ACID for both write and read instance.
PostgreSQL's default asynchronous replication is supported by RMB by configuring DB_HOST_ASYNC_READER
and DB_PORT_ASYNC_READER
. Asynchronous replication is eventually consistent and suitable for read-only applications like reporting, analytics, and data warehousing. To use the async read host in queries, get an instance of PostgresClient
using PostgresClientWithAsyncReadConn.getInstance(...)
. If no async read host is configured, it falls back to the sync read host if configured, otherwise it uses the write host.
AWS RDS does not support synchronous replication. For AWS it is recommended to only use DB_HOST
and DB_HOST_ASYNC_READER
in a given deployment.
APIs using the async client should provide a warning in the API documentation that the API uses stale data (for performance reasons).
To get going quickly with running a local instance of Okapi, adding a tenant and some test data, and deploying some modules, see Running a local FOLIO system.
Create the new project using the normal layout of files, and basic POM file.
Add the /ramls
directory, the area for the RAML, schemas, and examples files.
For a maven subproject the directory may be at the parent project only.
To get a quick start, copy the "ramls" directory and POM file from mod-notify. (At Step 6 below, these will be replaced to suit your project's needs.)
Adjust the POM file to match your project, e.g. artifactID, version, etc.
<repositories>
<repository>
<id>folio-nexus</id>
<name>FOLIO Maven repository</name>
<url>https://repository.folio.org/repository/maven-folio</url>
</repository>
</repositories>
<pluginRepositories>
<pluginRepository>
<id>folio-nexus</id>
<name>FOLIO Maven repository</name>
<url>https://repository.folio.org/repository/maven-folio</url>
</pluginRepository>
</pluginRepositories>
<dependencies>
<dependency>
<groupId>org.folio</groupId>
<artifactId>domain-models-runtime</artifactId>
<version>32.2.0</version>
</dependency>
<dependency>
<groupId>org.folio</groupId>
<artifactId>postgres-testing</artifactId>
<version>32.2.0</version>
<scope>test</scope>
</dependency>
...
...
</dependencies>
(postgres-testing is available in version 32.2.0 and later)
Four plugins need to be declared in the POM file:
-
The
exec-maven-plugin
which will generate the POJOs and interfaces based on the RAML files. -
The
aspectj-maven-plugin
which will pre-compile your code with validation aspects provided by the framework - remember the@Validate
annotation. The validation supplied by the framework verifies that headers are passed correctly, parameters are of the correct type and contain the correct content as indicated by the RAML file. -
The
maven-shade-plugin
which will generate a fat-jar runnable jar. The important thing to notice is the main class that will be run when running your module. Notice theMain-class
andMain-Verticle
in the shade plugin configuration. -
The
maven-resources-plugin
which will copy the RAML files into a directory under/apidocs
so that the runtime framework can pick it up and display html documentation based on the RAML files.
Add ramlfiles_path
property indicating the location of the RAML directory.
<properties>
<ramlfiles_path>${basedir}/ramls</ramlfiles_path>
</properties>
Compare the POM with other FOLIO RMB-based modules.
Do mvn clean install
This should:
-
Create java interfaces for each added RAML file.
-
Each interface will contain functions to be implemented (each function represents an API endpoint declared in the RAML).
-
The parameters within each function interface will be annotated with validation annotations that were declared in the RAML. So, if a trait was marked as mandatory, it will be marked as @NOT_NULL. This is not something that needs to be handled by the implementer. This is handled by the framework, which handles validation.
-
POJOs -- The JSON schemas will be generated into java objects.
-
All generated code can be found in the
org.folio.rest.jaxrs
package in thetarget/generated-sources/raml-jaxrs/
directory.
Implement the interfaces associated with the RAML files you created. An
interface is generated for every root endpoint in the RAML files.
For example an
org.folio.rest.jaxrs.resource.Ebooks
interface will be generated.
Note that the org.folio.rest.jaxrs.resource
will be the package for every
generated interface.
The implementations must go into the org.folio.rest.impl
package because RMB's
RestVerticle
scans this package for a class that implements the required interface. The class can
have any name.
RMB then uses reflection to invoke the constructor and the method.
See mod-notify's org.folio.rest.impl package for example implementations.
It is beneficial at this stage to take some time to design and prepare the RAML files for the project. Investigate the other FOLIO modules for guidance. The mod-notify is an exemplar.
Remove the temporary copy of the "ramls" directory from Step 1, and replace with your own.
The end-points that share the same first path segment must go into the same .raml file
because the first path segment determines the name of the resource .java interface, for example
/foo/bar
and /foo/baz
should go into foo.raml, and foo.raml will generate
org/folio/rest/jaxrs/resource/Foo.java
. However, you may
overrule the convention for the resource interface name
by implementing a GeneratorExtension.
Add the shared suite of RAML utility files as the "raml-util" directory inside your "ramls" directory:
git submodule add https://github.com/folio-org/raml ramls/raml-util
The "raml1.0" branch is the current and default branch.
Create JSON schemas indicating the objects exposed by the module.
Use the description
field alongside the type
field to explain the content and
usage and to add documentation.
Use "javaType": "org.folio.rest.jaxrs.model.MyEntity"
to set the class name of the
generated Java type to prevent a duplicate name where the second class overwrites the first class
or to avoid a misleading name. Otherwise the field name in the .json schema file is used as class name.
See jsonschema2pojo Reference for JSON schema details.
The GenerateRunner automatically dereferences the schema files and places them into the
target/classes/ramls/
directory. It scans the ${basedir}/ramls/
directory including
subdirectories, if not found then ${basedir}/../ramls/
supporting maven submodules with
common ramls directory.
The documentation of HTTP response codes is in HttpStatus.java
Use the collection/collection-item pattern provided by the collection resource type explained in the RAML 200 tutorial.
The RMB does do some validation of RAML files at compile-time. There are some useful tools to assist with command-line validation, and some can be integrated with text editors, e.g. raml-cop.
See the guide to Use raml-cop to assess RAML, schema, and examples and the Primer for RAML and JSON Schema quick-start document.
RAML-aware text editors are very helpful, such as api-workbench for Atom.
Remember that the POM configuration enables viewing your RAML and interacting with your application via the local API documentation.
RestVerticle is implemented using the reactor pattern and Vert.x HttpServer and must be started on the Vert.x event loop. Don't start it on a Vert.x worker context, this may fail because of unsupported concurrency.
Use VertxOptions
setMaxEventLoopExecuteTime
and
setWarningExceptionTime
to adjust warnings like
Thread vertx-eventloop-thread-3 has been blocked for 20458 ms
and fix these violations of the Golden Rule
that make the module unresponsive. For details see Running blocking code.
It is possible to add custom code that will run once before the application is deployed
(e.g. to init a DB, create a cache, create static variables, etc.) by implementing
the InitAPIs
interface. You must implement the
init(Vertx vertx, Context context, Handler<AsyncResult<Boolean>> resultHandler)
. Only one implementation per module is supported.
Currently the implementation should sit in the
org.folio.rest.impl
package in the implementing project. The implementation
will run during verticle deployment. The verticle will not complete deployment
until the init() completes. The init() function can do anything basically, but
it must call back the Handler. For example:
public class InitAPIs implements InitAPI {
public void init(Vertx vertx, Context context, Handler<AsyncResult<Boolean>> resultHandler){
try {
sayHello();
resultHandler.handle(io.vertx.core.Future.succeededFuture(true));
} catch (Exception e) {
e.printStackTrace();
resultHandler.handle(io.vertx.core.Future.failedFuture(e.getMessage()));
}
}
}
The init may also modify HTTP server options by using RestVerticle.getHttpServerOptions
.
This API can be used if your module instance needs to perform ongoing tasks. Consider using Okapi's timer facility if the task to be performed is "per-tenant" and only needs to be executed on one instance and not all instances of the module. If the task performs operations on persistent storage, that is typically a sign that it should be using the Okapi timer facility (not what is presented in this section).
It is possible to add custom code that will run periodically. For example, to ongoingly check status of something in the system and act upon that.
Need to implement the PeriodicAPI interface:
public interface PeriodicAPI {
/** this implementation should return the delay in which to run the function */
public long runEvery();
/** this is the implementation that will be run every runEvery() milliseconds*/
public void run(Vertx vertx, Context context);
}
For example:
public class PeriodicAPIImpl implements PeriodicAPI {
@Override
public long runEvery() {
return 45000;
}
@Override
public void run(Vertx vertx, Context context) {
try {
InitAPIs.amIMaster(vertx, context, v-> {
if(v.failed()){
//TODO - what should be done here?
}
});
} catch (Exception e) {
e.printStackTrace();
}
}
There can be multiple implementations of the periodic hook, all will be called by the RMB framework.
It is possible to add custom code that will be run immediately after the verticle running the module is deployed.
public interface PostDeployVerticle {
/** this implementation will be run immediately after the verticle is initially deployed. Failure does not stop
* deployment success. The implementing function MUST call the resultHandler to pass back
* control to the verticle, like so: resultHandler.handle(io.vertx.core.Future.succeededFuture(true));
* if not, this function will hang the verticle during deployment */
public void init(Vertx vertx, Context context, Handler<AsyncResult<Boolean>> resultHandler);
}
An implementation example:
public class InitConfigService implements PostDeployVerticle {
@Override
public void init(Vertx vertx, Context context, Handler<AsyncResult<Boolean>> handler) {
System.out.println("Getting secret key to decode DB password.");
/** hard code the secret key for now - in production env - change this to read from a secure place */
String secretKey = "b2+S+X4F/NFys/0jMaEG1A";
int port = context.config().getInteger("http.port");
AdminClient ac = new AdminClient("http://localhost:" + port, null);
ac.postSetAESKey(secretKey, reply -> {
if(reply.statusCode() == 204){
handler.handle(io.vertx.core.Future.succeededFuture(true));
}
else{
handler.handle(io.vertx.core.Future.failedFuture(reply.statusCode() + ", " + reply.statusMessage()));
}
});
}
}
It is possible to add custom code that will run just before the verticle is undeployed and the JVM stopped. This will occur on graceful shutdowns, but can not be guaranteed to run if the JVM is forcefully shutdown.
The interface to implement:
public interface ShutdownAPI {
public void shutdown(Vertx vertx, Context context, Handler<AsyncResult<Void>> handler);
}
An implementation example:
public class ShutdownImpl implements ShutdownAPI {
@Override
public void shutdown(Vertx vertx, Context context, Handler<AsyncResult<Void>> handler) {
try {
AuditLogger.getInstance().publish(new LogRecord(Level.INFO, "closing audit logger"));
AuditLogger.getInstance().close();
handler.handle(io.vertx.core.Future.succeededFuture());
}
catch (Exception e) {
e.printStackTrace();
handler.handle(io.vertx.core.Future.failedFuture(e.getMessage()));
}
}
}
Note that when implementing the generated interfaces it is possible to add a constructor to the implementing class. This constructor will be called for every API call. This is another way you can implement custom code that will run per request.
The RMB allows for content to be streamed to a specific implemented interface. For example, to upload a large file without having to save it all in memory:
- Mark the function to handle the upload with the
org.folio.rest.annotations.Stream
annotation@Stream
. - Declare the RAML as receiving
application/octet-stream
.
Example RAML:
/uploadOctet:
description: Uploads a file
post:
description: |
Uploads a file
body:
application/octet-stream:
The RMB will then call the function every time a chunk of data is received.
This means that a new Object is instantiated by the RMB for each chunk of
data, and the function of that object is called with the partial data included in a java.io.InputStream
object.
For each invocation, RMB adds header streamed_id
which will be unique
for the current stream. For the last invocation, header complete
is supplied
to indicate "end-of-stream".
As of RMB 23.12.0 and later, if an HTTP client prematurely closes the upload
before complete, the handler will be called with streamed_abort
.
RMB supports bulk downloads of chunks using CQL ordered by primary key id (since version 25).
- 1st CQL query:
cql.allRecords=1 sortBy id
- 2nd CQL query:
id > [last id from 1st CQL query] sortBy id
- 3rd CQL query:
id > [last id from 2nd CQL query] sortBy id
- ...
The chunk size is set using the API's limit parameter, for example limit=10000
for chunks of 10000 records each.
The PostgreSQL connection parameters locations are searched in this order:
- org.folio.rest.tools.utils.Envs.setEnv
- DB_* environment variables
- Configuration file, defaults to
resources/postgres-conf.json
but can be set via command-line options
With PostgresClient.setPostgresTester
, testing may be performed against
an instance implementing postgresTester
interface, for example
TestContainers Postgres Module.
If database configuration is already provided, this call is ignored and testing
is performed against the database instance given by configuration.
The runtime framework exposes a PostgreSQL async client which offers CRUD operations in an ORM type fashion. https://github.com/folio-org/raml-module-builder/blob/master/domain-models-runtime/src/main/java/org/folio/rest/persist/PostgresClient.java
Important Note: The PostgreSQL client currently implemented assumes JSONB tables in PostgreSQL. This is not mandatory and developers can work with regular PostgreSQL tables but will need to implement their own data access layer.
Important Note: For performance reasons the Postgres client will return accurate counts for result sets with less than 50,000 results. Queries with over 50,000 results will return an estimated count.
The PostgresClient expects tables in the following format:
create table <schema>.<table_name> (
id UUID PRIMARY KEY,
jsonb JSONB NOT NULL
);
This means that all the fields in the JSON schema (representing the JSON object) are the "jsonb" (column) in the Postgres table.
The minimum PostgreSQL server version for RMB 35.3 is 16.0.0. RMB fails a POST /_/tenant call if the actual version is lower. You may set a higher or lower requirement using server_version_num format:
public class TenantRefAPI extends TenantAPI {
@Validate
@Override
public void postTenant(TenantAttributes tenantAttributes, Map<String, String> headers,
Handler<AsyncResult<Response>> handler, Context context) {
// https://www.postgresql.org/docs/current/runtime-config-preset.html#GUC-SERVER-VERSION-NUM
context.putLocal("postgres_min_version_num", "150000");
context.putLocal("postgres_min_version", "15.0"); // human readable, for error message only
super.postTenant(tenantAttributes, headers, handler, context);
}
}
As indicated, the PostgresClient is jsonb oriented. If there is a need to store data in binary form, this can be done in the following manner (only id based upsert is currently supported):
byte[] data = ......;
JsonArray jsonArray = new JsonArray().add(data);
client.upsert(TABLE_NAME, id, jsonArray, false, replyHandler -> {
.....
});
As previously mentioned, the Postgres Client supplied by the RMB looks for a file called postgres-conf.json
. However, leaving a file which contains the DB password to a superuser in plain text on the server is not a good idea. It is possible to encrypt the password in the file. The encryption should be an AES encryption (symmetric block cipher). This encryption is done with a secret key.
Meaning: password in plain text + secret key = encrypted password
The RMB comes with an AES class that supports generating secret keys, encrypting and decrypting them, https://github.com/folio-org/raml-module-builder/blob/master/domain-models-runtime/src/main/java/org/folio/rest/security/AES.java
Note that the use of this class is optional.
To work with an encrypted password the RMB exposes an API that can be used to set the secret key (stored only in memory). When creating the DB connection the RMB will check to see if the secret key has been set. If the secret key has been set, the RMB will decrypt the password with the secret key, and use the decrypted password to connect to the DB. Otherwise it will assume an un-encrypted password, and will connect using that password as-is.
A module can also set the secret key via the static method AES.setSecretKey(mykey)
The needed steps are:
- Generate a key
- Encrypt a password
- Include that password in the config file
- Either call
AES.setSecretKey(mykey)
or theadmin/set_AES_key
API (to load the secret key into memory)
A good way for a module to set the secret key is by using the post deployment hook interface in the RMB.
public class InitConfigService implements PostDeployVerticle {
@Override
public void init(Vertx vertx, Context context, Handler<AsyncResult<Boolean>> handler) {
System.out.println("Getting secret key to decode DB password.");
//** hard code the secret key - in production env - read from a secure place *//
String secretKey = "b2%2BS%2BX4F/NFys/0jMaEG1A";
int port = context.config().getInteger("http.port");
AdminClient ac = new AdminClient("http://localhost:" + port, null);
ac.postSetAESKey(secretKey, reply -> {
if(reply.statusCode() == 204){
handler.handle(io.vertx.core.Future.succeededFuture(true));
}
else{
handler.handle(io.vertx.core.Future.failedFuture(reply.statusCode() + ", " + reply.statusMessage()));
}
});
handler.handle(io.vertx.core.Future.succeededFuture(true));
}
}
A foreignKeys
entry in schema.json of the Tenant API automatically creates the following columns, triggers and indexes for the foreign key.
This additional column is needed because PostgreSQL does not directly support a foreign key constraint (referential integrity) of a field inside the JSONB.
Example, similar to the SQL produced by a foreignKeys
entry:
CREATE TABLE item (
id UUID PRIMARY KEY,
jsonb JSONB NOT NULL,
permanentLoanTypeId UUID REFERENCES loan_type,
temporaryLoanTypeId UUID REFERENCES loan_type
);
CREATE OR REPLACE FUNCTION update_item_references()
RETURNS TRIGGER AS $$
BEGIN
NEW.permanentLoanTypeId = NEW.jsonb->>'permanentLoanTypeId';
NEW.temporaryLoanTypeId = NEW.jsonb->>'temporaryLoanTypeId';
RETURN NEW;
END;
$$ language 'plpgsql';
CREATE TRIGGER update_item_references
BEFORE INSERT OR UPDATE ON item
FOR EACH ROW EXECUTE PROCEDURE update_item_references();
CREATE INDEX IF NOT EXISTS ON item (permanentLoanTypeId);
CREATE INDEX IF NOT EXISTS ON item (temporaryLoanTypeId);
CQL2PgJSON automatically uses this extracted column and its index whenever the foreign key is used.
The overhead of this trigger and foreign key constraint reduces the number of UPDATE transactions per second on this table by about 10% (when tested against an external stand alone Postgres database). See https://github.com/folio-org/raml-module-builder/blob/master/domain-models-runtime/src/test/java/org/folio/rest/persist/ForeignKeyPerformanceIT.java for the performance test. Doing the foreign key check manually by sending additional SELECT queries takes much more time than 10%.
See also foreign key CQL support.
Further CQL information.
The source code is at ./cql2pgjson and ./cql2pgjson-cli
Invoke like this:
// users.user_data is a JSONB field in the users table.
CQL2PgJSON cql2pgJson = new CQL2PgJSON("users.user_data");
String cql = "name=Miller";
String where = cql2pgJson.cql2pgJson(cql);
String sql = "select * from users where " + where;
// select * from users
// where CAST(users.user_data->'name' AS text)
// ~ '(^|[[:punct:]]|[[:space:]])Miller($|[[:punct:]]|[[:space:]])'
Or use toSql(String cql)
to get the ORDER BY
clause separately:
CQL2PgJSON cql2pgJson = new CQL2PgJSON("users.user_data");
String cql = "name=Miller";
SqlSelect sqlSelect = cql2pgJson.toSql(cql);
String sql = "select * from users where " + sqlSelect.getWhere()
+ " order by " + sqlSelect.getOrderBy();
Setting server choice indexes is possible, the next example searches name=Miller or email=Miller
:
CQL2PgJSON cql2pgJson = new CQL2PgJSON("users.user_data", Arrays.asList("name", "email"));
String cql = "Miller";
String where = cql2pgJson.cql2pgJson(cql);
String sql = "select * from users where " + where;
Searching across multiple JSONB fields works like this. The first json field specified in the constructor will be applied to any query arguments that aren't prefixed with the appropriate field name:
// Instantiation
CQL2PgJSON cql2pgJson = new CQL2PgJSON(Arrays.asList("users.user_data","users.group_data"));
// Query processing
where = cql2pgJson.cql2pgJson( "users.user_data.name=Miller" );
where = cql2pgJson.cql2pgJson( "users.group_data.name==Students" );
where = cql2pgJson.cql2pgJson( "name=Miller" ); // implies users.user_data
The field names (keys in JSON) are case sensitive. This is against the CQL specification of index names.
Only these relations have been implemented yet:
=
(this is==
for number matching andadj
for a string matching. Examples 1:height =/number 3.4
Example 2:title = Potter
)==
(field match, for examplebarcode == 883746123
or field prefix matchtitle == "Harry Pott*"
matching "Harry Potter and the chamber of secrets" but not "Science of Harry Potter";==/number
matches any form: 3.4 = 3.400 = 0.34e1)all
(each word of the query string exists somewhere,title all "Potter Harry"
matches "Harry X. Potter")any
(any word of the query string exists somewhere,title any "Potter Foo"
matches "Harry Potter")adj
(substring phrase match: all words of the query string exist consecutively in that order, there may be any whitespace and punctuation in between,title adj "Harry Potter"
matches "The Harry - . - Potter Story")>
>=
<
<=
<>
(comparison for both strings and numbers)
Note to mask the CQL special characters by prepending a backslash: * ? ^ " \
Example using StringUtil:
String query = "username==" + StringUtil.cqlEncode(username);
String url = "https://example.com/users?query=" + PercentCodec.encode(query);
Use quotes if the search string contains a space, for example title = "Harry Potter"
.
To speed up the ==
field matching use a b-tree "index"
or "uniqueIndex"
.
To speed up the word matching use a "fullTextIndex"
.
The "caseSensitive"
and "removeAccents"
configuration of the supporting index in schema.json is
used for all string matching operators; "fullTextIndex"
is always "caseSensitive": false
.
Refer to further explanation of CQL string matching for the "Exact match operator" and "Word match operators".
Matching modifiers: Only masked
is implemented, not unmasked
, regexp
,
honorWhitespace
, substring
.
Word begin and word end in JSON is only detected at whitespace and punctuation characters from the ASCII charset, not from other Unicode charsets.
Add the /number modifier to enable number matching, comparing and sorting, for example age ==/number 18
,
age >=/number 21
and sortBy age/number
.
3.4, 3.400, and 0.34e1 match each other when using ==/number
, and 2 is smaller than 19
(in contrast to string comparison where "2" > "19").
This requires that the value has been stored as a JSONB number ({"age": 19}
)
and not as a JSONB string ({"age": "19"}
).
The id field and any foreign key field declared in schema.json is a UUID field and is not searched in the JSONB but in an extracted proper database table field. An index is automatically created for such a field, do not add an index entry in schema.json.
=
, ==
, <>
, >
, >=
, <
, and <=
relations are supported for comparison with a valid UUID and use the index.
=
, ==
, and <>
relations allow *
for right truncation with index support.
=
is interpreted as ==
for the id field and foreign key fields declared in schema.json; for other fields
that contain a UUID =
results in a word full text match. Therefore =
should be avoided and ==
been used for all UUID fields.
Modifiers are forbidden.
See PostgreSQL's tsvector full text parser documentation how word splitting works when using a full text index. Some notable consequences:
CQL field adj "bar"
matches bar
, bar-baz
, foo-bar-baz
.
CQL field adj "bar baz"
matches bar baz
, bar-baz
, foo-bar-baz
, foo-bar baz
, bar-baz-foo
.
CQL field adj "bar-baz"
matches bar-baz
, but neither bar baz
nor foo-bar-baz
nor foo-bar baz
nor bar-baz-foo
.
CQL field adj "123 456"
matches 123 456
, but not 123-456
.
CQL field adj "123-456"
matches 123-456
, but not 123 456
.
foo/bar/baz
is a single word, while foo//bar//baz
, foo///bar///baz
, foo////bar////baz
, etc.
are split into the three words foo
, /bar
, and /baz
(always reduced to a single slash).
A search matching all records in the target index can be executed with a
cql.allRecords=1
(CQL standard) or a id=*
(RMB specific) query. They can be used alone or as part of
a more complex query, for example
cql.allRecords=1 NOT name=Smith sortBy name/sort.ascending
cql.allRecords=1 NOT name=Smith
matches all records where name does not contain Smith as a word or where name is not defined.name="" NOT name=Smith
matches all records where name is defined but does not contain Smith as a word.- For performance reasons, searching for
*
in any fulltext field will match all records as well, including records where that field does not exist.
A relation does not match if the value on the left-hand side is undefined. (but see the fulltext
*
case above).
A negation (using NOT) of a relation matches if the value on the left-hand side is
not defined or if it is defined but doesn't match.
name=""
matches all records where name is defined.cql.allRecords=1 NOT name=""
matches all records where name is not defined.name==""
matches all records where name is defined and empty.cql.allRecords=1 NOT name==""
matches all records where name is defined and not empty or where name is not defined.name="" NOT name==""
matches all records where name is defined and not empty.
For matching the elements of an array use these queries (assuming that lang is either an array or not defined, and assuming an array element value does not contain double quotes):
lang == "[]"
for matching records where lang is defined and an empty arraycql.allRecords=1 NOT lang <> "[]"
for matching records where lang is not defined or an empty arraylang == "*\"en\"*"
for matching records where lang is defined and contains the value encql.allRecords=1 NOT lang == "*\"en\"*"
for matching records where lang does not contain the value en (including records where lang is not defined)lang = "" NOT lang == "*\"en\"*"
for matching records where lang is defined and and does not contain the value enlang = ""
for matching records where lang is definedcql.allRecords=1 NOT lang = ""
for matching records where lang is not defined
RMB 26 or later supports array searches with relation modifiers, that are particular suited for structures like:
"property" : [
{
"type1" : "value1",
"type2" : "value2",
"subfield": "value"
},
...
]
An example of this kind of structure is contributors
(property) from
mod-inventory-storage . contributorTypeId
is the type of contributor
(type1).
With CQL you can limit searches to property
with regular match in
subfield
, with type1=value1 with
property =/@type1=value1 value
Observe that the relation modifier is preceded with the @-character to avoid clash with other CQL relation modifiers.
The type1, type2 and subfield must all be defined in schema.json, because the JSON schema is not known. And also because relation modifiers are unfortunately lower-cased by cqljava. To match value1 against the property contents of type1, full-text match is used.
Multiple relation modifiers with value are ANDed together. So
property =/@type1=value1/@type2=value2 value
will only give a hit if both type1 has value1 AND type2 has value2.
It is also possible to specify relation modifiers without value. This essentially is a way to override what subfield to search. In this case the right hand side term is matched. Multiple relation modifiers are OR'ed together. For example:
property =/@type1 value
And to match any of the sub properties type1, type2, you could use:
property =/@type1/@type2 value
In schema.json two new properties, arraySubfield
and arrayModifiers
,
specifies the subfield and the list of modifiers respectively.
This can be applied to ginIndex
or fullTextIndex
.
schema.json example:
{
"fieldName": "property",
"tOps": "ADD",
"caseSensitive": false,
"removeAccents": true,
"arraySubfield": "subfield",
"arrayModifiers": ["type1", "type2"]
}
For the identifiers example we could define things in schema.json with:
{
"fieldName": "identifiers",
"tOps": "ADD",
"arraySubfield": "value",
"arrayModifiers": ["identifierTypeId"]
}
This will allow you to perform searches, such as:
identifiers = /@identifierTypeId=7e591197-f335-4afb-bc6d-a6d76ca3bace 6316800312
CQL2PGjson allows generating and querying indexes that contain multiple columns. The index json object now has support for the following properties:
-
sqlExpression
Allows the user to explicitly define the expression they wish to use in the index"fieldName": "address", "sqlExpression": "concat_space_sql(jsonb->>'city', jsonb->>'state')",
-
multiFieldNames
This is a comma-separated list of json fields that are to be concatenated together via concat_ws with a space character. example:"fieldName": "address", "multiFieldNames": "city,state",
These 2 examples are equivalent and would be queried by using the fieldName such as:
address = Boston MA
These fields are optional but mutually exclusive, you only need one of them.
CQL2PgJSON supports cross table joins via subquery based on foreign keys. This allows arbitrary depth relationships in both child-to-parent (many-to-one) and parent-to-child (one-to-many) direction.
Example relationship: item → holdings_record → instance
Join conditions of this example:
- item.holdingsRecordId = holdings_record.id
- holdings_record.instanceId = instance.id
The field in the child table points to the primary key id
field of the parent table; the parent table is also called the target table.
- Precede the index you want to search, with the table name in camelCase, e.g.
instance.title = "bee"
. - There is no change with child table fields. Use them in the regular way without table name prefix.
- The
foreignKey
entry in schema.json automatically creates an index on the foreign key field. - For fast queries, declare an index on any other searched field like
title
in the schema.json file. - For a multi-table join, use
targetPath
instead offieldName
and put the list of field names into thetargetPath
array. ThetargetPath
array must be in child-to-parent direction (many-to-one), e.g. item → holdings_record → instance, queries are possible in both directions. - When querying in parent → child direction each sub-expression may refer to a different child/grandchild/...:
Searching for instances using
item.status.name=="Missing" and item.effectiveLocationId="fcd64ce1-6995-48f0-840e-89ffa2288371"
will look for an instance that has at least one item that is missing and this or some other item of the instance has the effective location. Use a child-to-parent query against the item endpoint if both conditions should apply to the same item record. - Use
= *
to check whether a join record exists. This runs a cross index join with no further restriction, e.g.instance.id = *
. - The sortBy clause doesn't support foreign table fields. Use the API endpoint of the records with the field you want to sort on.
- The schema for the above example:
{
"tables": [
{
"tableName": "instance",
"index": [
{
"fieldName": "title",
"tOps": "ADD",
"caseSensitive": false,
"removeAccents": true
}
]
},
{
"tableName": "holdings_record",
"foreignKeys": [
{
"fieldName": "instanceId",
"targetTable": "instance",
"targetTableAlias": "instance",
"tableAlias": "holdingsRecord",
"tOps": "ADD"
}
]
},
{
"tableName": "item",
"foreignKeys": [
{
"fieldName": "holdingsRecordId",
"targetTable": "holdings_record",
"targetTableAlias": "holdingsRecord",
"tableAlias": "item",
"tOps": "ADD"
},
{
"targetPath": ["holdingsRecordId", "instanceId"],
"targetTable": "instance",
"targetTableAlias": "instance",
"tableAlias": "item"
}
]
}
]
}
The property targetTableAlias
enables that parent table name in CQL queries against the current child table.
The property tableAlias
enables that child table name in CQL queries against the target/parent table.
If any of these two properties is missing, then that respective foreign key join syntax is disabled.
The name may be different from the table name (tableName
, targetTable
). One use case is to change to camelCase, e.g.
"targetTable": "holdings_record"
and "targetTableAlias": "holdingsRecord"
. Another use case is
to resolve ambiguity when two foreign keys point to the same target table, example:
{
"tableName": "item",
"foreignKeys": [
{
"fieldName": "permanentLoanTypeId",
"tableAlias": "itemWithPermanentLoanType",
"targetTable": "loan_type",
"targetTableAlias": "loanType",
"tOps": "ADD"
},
{
"fieldName": "temporaryLoanTypeId",
"tableAlias": "itemWithTemporaryLoanType",
"targetTable": "loan_type",
"targetTableAlias": "temporaryLoanType",
"tOps": "ADD"
}
]
}
Running CQL loanType.name == "Can circulate"
against the item endpoint returns all items where the item's permanentLoanTypeId points to a loan_type where the loan_type's name equals "Can circulate".
Running CQL temporaryLoanType.name == "Can circulate"
against the item endpoint returns all items where the item's temporaryLoanTypeId points to a loan_type where the loan_type's name equals "Can circulate".
Running CQL itemWithPermanentLoanType.status == "In transit"
against the loan_type endpoint returns all loan_types where there exists an item that has this loan_type as a permanentLoanType and where the item's status equals "In transit".
Running CQL itemWithTemporaryLoanType.status == "In transit"
against the loan_type endpoint returns all loan_types where there exists an item that has this loan_type as a temporaryLoanType and where the item's status equals "In transit".
All locally produced Exceptions are derived from a single parent so they can be caught collectively
or individually. Methods that load a JSON data object model pass in the identity of the model as a
resource file name, and may also throw a native java.io.IOException
.
CQL2PgJSONException
├── FieldException
├── SchemaException
├── ServerChoiceIndexesException
├── CQLFeatureUnsupportedException
└── QueryValidationException
└── QueryAmbiguousException
To run the unit tests in your IDE, the Unicode input files must have been produced by running maven. In Eclipse you may use "Run as ... Maven Build" for doing so.
The Postgres Client support in the RMB is schema specific, meaning that it expects every tenant to be represented by its own schema. The Tenant API is asynchronous as of RMB 32 and later. The tenant job is initiated by a POST and may be inspected with GET and optionally be cleaned up with DELETE.
The RAML defining the API:
https://github.com/folio-org/raml/blob/tenant_v2_0/ramls/tenant.raml
By default RMB includes an implementation of the Tenant API which assumes Postgres being present. Implementation in TenantAPI.java file. You might want to extend/override this because:
- You want to not call it at all (your module is not using Postgres).
- You want to provide further Tenant control, such as loading reference and/or sample data.
Since RMB 32, the postTenant that is provided is working in
an asynchronous fashion. The handler provided will be called
very early in the process and before schema.json
is fully processed.
For this reason doing anything in the handler is problematic as the
database is not fully populated with data. Furthermore, if there are
any errors that could occur in your post handler, it seems natural to
throw an error, but the underlying RMB code does not know about that, and
thus, continues its operation - including managing a job in the background.
Here's an example of incorrect usage. DO NOT DO THIS:
super.postTenant(entiry, headers, ar -> {
LiquibaseUtil.initializeSchemaForTenant(vertx, tenantId);
if (someerror) {
handler.handle(Future.succeededFuture(
PostTenantResponse.respond400TextPlain(...)));
}
}, context);
The LiquibaseUtil.initializeSchemaForTenant
assumes that the database is configured. The handler is called when the module
is invoked always (on purge, disable as well). Throwing error is problematic
because postTenant has already done something. And, finally, the status
code of ar
is not even inspected.
For these reasons, in most cases, it is discouraged to call
super.postTenant
. Instead extend the loadData
method (see below).
If your module - based on RMB - does not use a storage then there is not
a problem - you must not call super.postTenant
and you can just
implement an easy postTenant implementation in sync fashion -
returning 204 for OK/No Content.
Extend the loadData
method, to load sample/reference data for a module
or do any other initialiation of a module - such as doing LiquibaseUtil
work. The loadData is called upon create/upgrade.
@Validate
@Override
Future<Integer> loadData(TenantAttributes attributes, String tenantId,
Map<String, String> headers, Context vertxContext) {
return super.loadData(attributes, tenantId, headers, vertxContext)
.compose(superRecordsLoaded -> {
// load n records or any other initialization
return Future.succeededFuture(superRecordsLoaded + n);
});
}
The Integer here is the number of records loaded - just add 0 if no sample/ref is loaded.
There is no right way to load data, but consider that data load will be both happening for first time tenant usage of the module and during an upgrade process. Your data loading should be idempotent. If files are stored as resources and as JSON files, you can use the TenantLoading utility.
import org.folio.rest.tools.utils.TenantLoading;
@Validate
@Override
Future<Integer> loadData(TenantAttributes attributes, String tenantId,
Map<String, String> headers, Context vertxContext) {
return super.loadData(attributes, tenantId, headers, vertxContext)
.compose(recordsLoaded -> new TenantLoading()
// two sets of reference data files
// resources ref-data/data1 and ref-data/data2 .. loaded to
// okapi-url/instances and okapi-url/items respectively
.withKey("loadReference").withLead("ref-data")
.withIdContent()
.add("data1", "instances")
.add("data2", "items")
.perform(attributes, headers, vertxContext, recordsLoaded));
}
If data is already in resources, then fine. If not, then copy it with maven-resource-plugin.
For example, to copy reference-data
to ref-data
in resources:
<execution>
<id>copy-reference-data</id>
<phase>process-resources</phase>
<goals>
<goal>copy-resources</goal>
</goals>
<configuration>
<outputDirectory>${basedir}/target/classes/ref-data</outputDirectory>
<resources>
<resource>
<directory>${basedir}/reference-data</directory>
<filtering>true</filtering>
</resource>
</resources>
</configuration>
</execution>
For unit testing method TenantAPI.postTenantSync
may be used
to initialize the tenant. This method is a straight API call and is
not called via HTTP. Use it before/after using our own provided
interfaces in org.folio.rest.impl
. Example:
// create
TenantAPI tenantAPI = new TenantAPI();
TenantAttributes tenantAttributes = new TenantAttributes();
tenantAttributes.setModuleTo("mod-2.0.0");
tenantAPI.postTenantSync(tenantAttributes, okapiHeaders, context.asyncAssertSuccess(result ->
assertThat(result.getStatus(), is(204))
), vertx.getOrCreateContext());
A HTTP based alternatives is TenantInit.exec
with a similar interface. Example:
TenantClient client = new TenantClient(....);
TenantAttributes tenantAttributes = new TenantAttributes().withModuleTo("mod-2.0.0");
TenantInit.exec(client, tenantAttributes, 60000).onComplete(context.asyncAssertSuccess());
The Postgres based Tenant API implementation will look for a file at /resources/templates/db_scripts/
called schema.json
The file contains an array of tables and views to create for a tenant on registration (tenant api post)
An example can be found here:
The top level properties in schema.json (some of which are optional) are scripts
, tables
, views
and exactCount
.
Entries in the json file to be aware of:
For each table in tables
property:
tableName
- name of the table that will be generated - this is the table that should be referenced from the code. Maximum length is 49 characters.generateId
- No longer supported. This functionality is not stable in Pgpool-II see https://www.pgpool.net/docs/latest/en/html/restrictions.html. The solution is to generate a UUID in java in the same manner as https://github.com/folio-org/raml-module-builder/blob/v23.11.0/domain-models-runtime/src/main/java/org/folio/rest/persist/PgUtil.java#L358fromModuleVersion
- this optional field indicates the version in which the table was created / updated in. When a tenant update is requested - only versions older than the indicated version will generate the declared table. This ensures that if a module upgrades from an older version, the needed tables will be generated for it, however, subsequent upgrades from versions equal or later than the version indicated for the table will not re-generate the table.- Note that this is enforced for all tables, views, indexes, FK, triggers, etc. (via the
IF NOT EXISTS
sql Postgres statement)
- Note that this is enforced for all tables, views, indexes, FK, triggers, etc. (via the
mode
- should be used only to indicatedelete
withMetadata
- will generate the needed triggers to populate the metadata section in the json on update / insertlikeIndex
- indicate which fields in the json will be queried using theLIKE
. Needed for fields that will be faceted on.fieldName
the field name in the json for which to create the index, maximum name length is 49 characters.- the
tOps
indicates the table operation - ADD means to create this index, DELETE indicates this index should be removed - the
caseSensitive
allows you to create case insensitive indexes (boolean true / false), if you have a string field that may have different casings and you want the value to be unique no matter the case. Defaults to false. removeAccents
- normalize accents or leave accented chars as is. Defaults to true.- the
whereClause
allows you to create partial indexes, for example: "whereClause": "WHERE (jsonb->>'enabled')::boolean = true" stringType
- defaults to true - if this is set to false than the assumption is that the field is not of type text therefore ignoring the removeAccents and caseSensitive parameters.arrayModifiers
- specifies array relation modifiers supported for some index. The modifiers must exactly match the name of the property in the JSON object within the array.arraySubfield
- is the key of the object that is used for the primary term when array relation modifiers are in use. This is typically also defined whenarrayModifiers
are also defined.multiFieldNames
- see CQL2PgJSON: Multi Field Index abovesqlExpression
- see CQL2PgJSON: Multi Field Index abovesqlExpressionQuery
- How to wrap the query string $ before matching against the index expression. This overwrites anycaseSensitive
andremoveAccents
wrapping. Example that manually implements case insensitive matching:"sqlExpression": "lower(jsonb->>'name')", "sqlExpressionQuery": "lower($)"
. Example for plain field, note the required braces:"sqlExpression": "(jsonb->>'name')", "sqlExpressionQuery": "$"
.- Do not manually add an index for an
id
field or a foreign key field. They get indexed automatically.
ginIndex
- generate an inverted index on the JSON using thegin_trgm_ops
extension. Allows for left and right truncation LIKE queries and regex queries to run in an optimal manner (similar to a simple search engine). Note that the generated index is large and does not support the full field match (SQL=
operator and CQL==
operator without wildcards). See thelikeIndex
for available options.uniqueIndex
- create a unique index on a field in the JSON- the
tOps
indicates the table operation - ADD means to create this index, DELETE indicates this index should be removed - the
whereClause
allows you to create partial indexes, for example: "whereClause": "WHERE (jsonb->>'enabled')::boolean = true" - Adding a record will fail if the b-tree index byte size limit of 2712 is exceeded. Consider enforcing a length limit on the field, for example by adding a 600 character limit (multi-byte characters) to the RAML specification.
- See additional options in the likeIndex section above
- the
index
- create a btree index on a field in the JSON- the
tOps
indicates the table operation - ADD means to create this index, DELETE indicates this index should be removed - the
whereClause
allows you to create partial indexes, for example: "whereClause": "WHERE (jsonb->>'enabled')::boolean = true" - See additional options in the likeIndex section above
- The expression is wrapped into left(..., 600) to prevent exceeding the b-tree byte size limit of 2712. Special case: sqlExpression is not wrapped.
- the
fullTextIndex
- create a full text index using the tsvector features of postgres.removeAccents
can be used, the defaultcaseSensitive: false
cannot be changed because tsvector always converts to lower case.- See CQL: Matching full text to learn how word splitting works.
- The
tOps
is optional (like for all indexes), and defaults to ADDing the index. whereClause
andstringType
work as forlikeIndex
above.
withAuditing
- Creates an auditing table and a trigger that populates the audit table with the history of the table record whenever an insert, update, or delete occurs."withAuditing": true
for enabled,false
or undefined for disabled.auditingTableName
The name of the audit table.auditingFieldName
The field (JSON property) in the audit record that contains the copy of the original record."withAuditing": true
automatically creates the auditing table; an entry of the audit table in the "tables" section of schema.json is optional, for example to create indexes.- The
auditingSnippet
section allows some customizations to the auditing function with custom SQL in the declare section and the body (for either insert / update / delete). - The audit table jsonb column has three fields:
$auditingFieldName
contains the original record (jsonb from the original table),id
contains a new unique id,operation
containsI
,U
,D
for insert, update, delete, andcreatedDate
contains the time when the audit record was created.
foreignKeys
- adds / removes foreign keys (trigger populating data in a column based on a field in the JSON and creating a FK constraint)customSnippetPath
- a relative path to a file with custom SQL commands for this specific tabledeleteFields
/addFields
- delete (or add with a default value), a field at the specified path for all JSON entries in the tablepopulateJsonWithId
- This schema.json entry and the disable option is no longer supported. The primary key is always copied intojsonb->'id'
on each insert and update.pkColumnName
- No longer supported. The name of the primary key column is alwaysid
and is copied intojsonb->'id'
in each insert and update. The method PostgresClient.setIdField(String) no longer exists.withOptimisticLocking
-off
(default),logOnConflict
,failOnConflictUnlessSuppressed
, orfailOnConflict
, for details see Optimistic Locking section below
The views section is a bit more self explanatory, as it indicates a viewName and the two tables (and a column per table) to join by. In addition to that, you can indicate the join type between the two tables. For example:
"views": [
{
"viewName": "items_mt_view",
"join": [
{
"table": {
"tableName": "item",
"joinOnField": "materialTypeId"
},
"joinTable": {
"tableName": "material_type",
"joinOnField": "id",
"jsonFieldAlias": "mt_jsonb"
}
}
]
}
]
Behind the scenes this will produce the following statement which will be run as part of the schema creation:
CREATE OR REPLACE VIEW ${tenantid}_${module_name}.items_mt_view AS
SELECT u.id, u.jsonb as jsonb, g.jsonb as mt_jsonb
FROM ${tenantid}_${module_name}.item u
JOIN ${tenantid}_${module_name}.material_type g
ON lower(f_unaccent(g.jsonb->>'id')) = lower(f_unaccent(u.jsonb->>'materialTypeId'))
Notice the lower(f_unaccent(
functions, currently, by default, all string fields will be wrapped in these functions (will change in the future).
A three table join would look something like this:
{
"viewName": "instance_holding_item_view",
"join": [
{
"table": {
"tableName": "instance",
"joinOnField": "id"
},
"joinTable": {
"tableName": "holdings_record",
"joinOnField": "instanceId",
"jsonFieldAlias": "ho_jsonb"
}
},
{
"table": {
"tableName": "holdings_record",
"joinOnField": "id",
"jsonFieldAlias": "ho2_jsonb"
},
"joinTable": {
"tableName": "item",
"joinOnField": "holdingsRecordId",
"jsonFieldAlias": "it_jsonb"
}
}
]
}
The script section allows a module to run custom SQLs before table / view creation/updates and after all tables/views have been created/updated.
The fields in the script section include:
run
- eitherbefore
orafter
the tables / views are generatedsnippet
- the SQL to run. There is no variable replacement. UsesnippetPath
if variable replacement or line breaks for readability are needed.snippetPath
- relative path to a file with SQL script to run. IfsnippetPath
is set thensnippet
field will be ignored.fromModuleVersion
- same asfromModuleVersion
for table. If no fromModuleVersion is provided the SQL runs on each upgrade. This is reasonable if the SQL runs fast and is idempotent, for exampleCREATE OR REPLACE FUNCTION
and you want to avoid the error-prone fromModuleVersion value maintenance needed when changing the function. The SQL also runs on the initial install of a tenant.
A snippetPath
SQL file (but not a snippet
SQL statement) can make use of
FreeMarker template engine that runs and evaluates on runtime. For examples, see RMB's
db_scripts directory.
RMB provides these variables:
${myuniversity}
tenant id, for examplediku
${mymodule}
module name, for examplemod-inventory-storage
${mode}
eitherCREATE
,UPDATE
, orDELETE
${version}
the previous module version on update (for examplemod-inventory-storage-18.0.0
or18.0.0
), or0.0
if not available (not provided by Okapi or inCREATE
mode).${newVersion}
the new version that currently gets installed (CREATE
mode) or is upgraded to (UPDATE
mode), for examplemod-inventory-storage-18.1.0
or18.1.0
.${rmbVersion}
the RMB version that the module currently uses, for example29.3.2
.${exactCount}
theexactCount
number from schema.json or the default (1000).
In addition the tables
, views
and scripts
sections of schema.json are available.
Maven's pom.xml file may contain <filtering>true</filtering>
to replace variables at build time,
it also uses the ${}
syntax. Disable maven's filtering for the SQL script files or exclude them from filtering,
otherwise it replaces for example ${version}
by a wrong value (the current module version from pom.xml).
The exactCount section is optional and the value of the property is a simple integer with a default value of 1000 for the estimated totalRecords hit count calculation.
The tables / views will be generated in the schema named tenantid_modulename
The x-okapi-tenant header passed in to the API call will be used to get the tenant id. The value used for the module name is the artifactId found in the pom.xml (the parent artifactId is used if one is found).
When upgrading a module via the Tenant API, an index is deleted if either "tOps": "DELETE"
is set or the complete index entry is removed. Note that indexes are the only elements where removing the entry in schema.json
removes them from the database.
"tOps" example:
"index": [
{
"fieldName": "title",
"tOps": "DELETE",
"caseSensitive": false,
"removeAccents": true
}
]
RMB removes a table and all related SQL functions on the next module upgrade if the table entry contains "mode": "DELETE"
:
{
"tableName": "holdings_record",
"mode": "DELETE"
}
Posting a new tenant must include a body. The body should contain a JSON conforming to the tenantAttributes schema.
To enable a new module indicate the version module in module_to
and omit module_from
.
To upgrade a module indicate the existing version module in module_from
and the new version module in module_to
.
To disable a module indicate the existing version module in module_from
and omit module_to
.
The body may also hold a parameters
property to specify per-tenant
actions/info to be done during tenant creation/update.
As of now (this may change in the future), securing a tenant's connection to the database via an encrypted password can be accomplished in the following way:
- Set the secret key (as described in the Securing DB Configuration file section)
The PASSWORD will be replaced with the following: encrypt(tenant id with secret key) = new tenant's password The new tenant's password will replace the default PASSWORD value (which is the tenantid_modulename) The RMB Postgres client will use the secret key and the passed in tenant id to calculate the tenant's password when DB connections are needed for that tenant. Note that if you use the tenant API and set the secret key - the decrypting of the password will be done by the Postgres Client for each tenant connection.
The RMB comes with a TenantClient to facilitate calling the API via URL. To post a tenant via the client:
TenantClient tClient = null;
tClient = new TenantClient("http://localhost:" + port, "mytenantid", "sometoken");
tClient.post( response -> {
response.bodyHandler( body -> {
System.out.println(body.toString());
async.complete();
});
});
When this API is called RMB will basically drop the schema for the tenant (CASCADE) as well as drop the user
Some Postgres Client examples
Examples:
Saving a POJO within a transaction:
PoLine poline = new PoLine();
...
postgresClient.withTrans(conn -> {
return conn.getByIdForUpdate(SOME_OTHER_TABLE, id)
.compose(x -> ...)
.compose(x -> conn.save(TABLE_NAME_POLINE, poline)
.compose(x -> ...)
}).compose(res -> ...
Querying for similar POJOs in the DB (with or without additional criteria):
Criterion c = new Criterion(new Criteria().addField("id").setJSONB(false).setOperation("=").setVal(entryId));
postgresClient.get(TABLE_NAME_POLINE, PoLine.class, c)
.compose(x -> ...
The RAMLs API is a multiple interface which affords RMB modules to expose their RAML files in a machine readable way. To enable the interface the module must add the following to the provides array of its module descriptor:
{
"id": "_ramls",
"version": "1.0",
"interfaceType" : "multiple",
"handlers" : [
{
"methods" : [ "GET" ],
"pathPattern" : "/_/ramls",
"permissionsRequired" : [ ]
}
]
}
The interface has a single GET endpoint with an optional query parameter path. Without the path query parameter the response will be an application/json array of the available RAMLs. This will be the immediate RAMLs the module provides. If the query parameter path is provided it will return the RAML at the path if exists. The RAML will have HTTP resolvable references. These references are either to JSON Schemas or RAMLs the module provides or shared JSON Schemas and RAMLs. The shared JSON Schemas and RAMLs are included in each module via a git submodule under the path raml_util
. These paths are resolvable using the path query parameter.
The RAML defining the API:
The JSON Schemas API is a multiple interface which affords RMB modules to expose their JSON Schema files in a machine readable way. To enable the interface the module must add the following to the provides array of its module descriptor:
{
"id": "_jsonSchemas",
"version": "1.0",
"interfaceType" : "multiple",
"handlers" : [
{
"methods" : [ "GET" ],
"pathPattern" : "/_/jsonSchemas",
"permissionsRequired" : [ ]
}
]
}
The interface has a single GET endpoint with an optional query parameter path.
Without the path query parameter the response will be an "application/json" array of the available JSON Schemas. By default this will be JSON Schemas that are stored in the root of ramls directory of the module. Returned list of schemas can be customized in modules pom.xml file.
The ramls
subdirectory is the default to search for schema files. Add <ramlDirs>
to the domain-models-maven-plugin configuration in pom.xml to change it, for example:
<configuration>
<ramlDirs>
<ramlDir>ramls</ramlDir>
<ramlDir>ramls/raml-util/ramls</ramlDir>
</ramlDirs>
</configuration>
If the query parameter path is provided it will return the JSON Schema at the path if exists. The JSON Schema will have HTTP resolvable references. These references are either to JSON Schemas or RAMLs the module provides or shared JSON Schemas and RAMLs. The shared JSON Schemas and RAMLs are included in each module via a git submodule under the path raml_util
. These paths are resolvable using the path query parameter.
The RAML defining the API:
The RMB can receive parameters of different types. Modules can declare a query parameter and receive it as a string parameter in the generated API functions.
The RMB exposes an easy way to query, using CQL (Contextual Query Language). This enables a seamless integration from the query parameters to a prepared "where" clause to query with.
//create object on table.field
CQL2PgJSON cql2pgJson = new CQL2PgJSON("tablename.jsonb");
//cql wrapper based on table.field and the cql query
CQLWrapper cql = new CQLWrapper(cql2pgJson, query);
//query the db with the cql wrapper object
PostgresClient.getInstance(context.owner(), tenantId).get(CONFIG_COLLECTION, Config.class,
cql, true,
The CQLWrapper can also get an offset and limit:
new CQLWrapper(cql2pgJson, query).setLimit(new Limit(limit)).setOffset(new Offset(offset));
A CQL querying example:
http://localhost:<port>/configurations/entries?query=scope.institution_id=aaa%20sortBy%20enabled
RMB adds a totalRecords
field to result sets. For limit
= 0 it contains the exact value, otherwise an estimation how many records were matching if paging parameters were set to offset
= 0 and limit
= unlimited. For totalRecords
= none the calculation and the field are suppressed.
Examples (before percent encoding):
/instance-storage/instances?query=title=garden&limit=0 /users?query=barcode=="1234"&totalRecords=none
For estimation it uses this algorithm:
- Run "EXPLAIN SELECT" to get an estimation from PostgreSQL.
- If this is greater than 4*1000 return it and stop.
- Run "SELECT COUNT(*) FROM query LIMIT 1000".
- If this is less than 1000 return it (this is the exact number) and stop.
- If the result from 1. is less than 1000 return 1000 and stop.
- Return result from 1. (this is a value between 1000 and 4*1000).
Step 2. contains the factor 4 that should be adjusted when there is empirical data for a better number. Step 3. may take long for full text queries with many hits, therefore we need steps 1.-2.
1000 is configurable, see the Post Tenant API how to set exactCount
in schema.json.
RMB adjusts totalRecords
when the number of returned records in the current result set and the paging parameters offset
and limit
proves it wrong:
- If no record is returned and
totalRecords > offset
then adjusttotalRecords = offset
. - If the number of records returned equals
limit
andtotalRecords < offset + limit
then adjusttotalRecords = offset + limit
. - If the number of records is at least one but less than
limit
then adjusttotalRecords = offset +
number of records returned (this is the exact number).
Note that clients should continue on the next page when totalRecords = offset + limit
because there may be more records.
This is the exact count guarantee:
For limit
= 0 an exact count is returned without any records. Otherwise:
If a result set has a totalRecords
value that is less than 1000 then it is the exact count; if it is 1000 or more it may be an estimate.
If the exact count is less than 1000 then totalRecords
almost always contains the exact count; only when PostgreSQL estimates it to be more than 4*1000 then it contains that overestimation.
Replace 1000 by exactCount
if configured differently.
Performance warning: Depending on query and database indexes using limit
= 0 may take long, may put load on the database and should run asynchronously in the front-end. If there isn't a supporting database index it requires a full table scan. In all cases PostgreSQL needs to check the visibility map, see Slow counting and Index-only_scans.
RMB is aware of the metadata.schema. When a request (POST / PUT / PATCH) comes into an RMB module, RMB will check if the passed-in JSON's schema declares a reference to the metadata schema. If so, RMB will populate the JSON with a metadata section with the current user and the current time. RMB will set both update and create values to the same date/time and to the same user, as accepting this information from the request may be unreliable. The module should persist the creation date and the created by values after the initial POST. For an example of this using SQL triggers see metadata.ftl. Add withMetadata to the schema.json to create that trigger.
RMB supports optimistic locking. By default it is disabled. Module developers can enable it by adding attribute withOptimisticLocking
to the table definition in schema.json. The available options are listed below. Unless off
is specified, a database trigger will be created to auto populate the _version
field in json on insert. On jsonb column update a database trigger will check that the _version
field is the same in the incoming record and in the database, and will then increment the _version
field before updating the database record. Note, _version
field has to be defined in json to make this work. PgUtil reports a version conflict error as 409 HTTP response code if 409 is defined in RAML, otherwise, it will fall back to use 400 or 500 HTTP response code. The client should never increment the _version
property; only the trigger should increment it to ensure that the check and the increment are in the same transaction to avoid any race condition.
off
Optimistic Locking is disabled. The trigger is removed if it existed before.failOnConflict
Optimistic Locking is enabled. Version conflict will fail the transaction with a customized SQL error code 23F09.failOnConflictUnlessSuppressed
The same asfailOnConflict
but setting_version
to-1
is handled as if the current_version
were provided. This must be enabled with theDB_ALLOW_SUPPRESS_OPTIMISTIC_LOCKING
enviroment variable, see above. To have one API with this suppression and one without for the same database table add code to the latter that removes the_version
property if it is-1
. Suppressing optimistic locking is known to lead to data loss in some cases, don't use in production, you have been warned!logOnConflict
Optimistic Locking is enabled but the conflicting update succeeds, the version conflict info is logged with a customized SQL error code 23F09. Consult Vertx logging and PostgreSQL Errors and Messages iflogOnConflict
doesn't log.
Upsert fails when using INSERT ... ON CONFLICT (id) DO UPDATE
because the INSERT
trigger overwrites the _version
property that the UPDATE
trigger uses to detect optimistic locking conflicts. Instead use RMB's upsert
plpgsql function or implement upsert using a transaction and UPDATE ...; IF NOT FOUND THEN INSERT ...
Sample log entries:
01:14:07 [] [] [] [] WARN ? Backend notice: severity='NOTICE', code='23F09', message='Ignoring optimistic locking conflict while overwriting changed record 57db089f-18e4-7815-55d5-4cc6607e9059: Stored _version is 2, _version of request is "1"' ...
01:15:20 [] [] [] [] ERROR PostgresClient saveBatch size=2 { "message": "version conflict", "severity": "ERROR", "code": "23F09", "where": "PL/pgSQL function raise_409() line 1 at RAISE", "file": "pl_exec.c", "line": "3337", "routine": "exec_stmt_raise" }
Use mod-inventory-storage instance
table as an example, do following to enable optimistic locking
- in db
schema.json
, addwithOptimisticLocking
attribute forinstance
table definition
{
"tableName": "instance",
"withOptimisticLocking": "logOnConflict",
...
}
- in
instance.json
, add_version
field
{
"$schema": "http://json-schema.org/draft-04/schema#",
"description": "An instance record",
"type": "object",
"properties": {
"_version": {
"type": "integer",
"description": "Record version for optimistic locking"
}
...
}
- update raml file to define 409 response code. For example in
instance-storage.raml
add below for/{instanceId}
API
put:
responses:
409:
description: "Conflict"
body:
text/plain:
example: "Optimistic locking version has changed"
When enabling the withOptimisticLocking
property on a table definition in schema.json
an optional _version
property MUST be added to the related entity definition in JSON Schema.
When changing the value of the withOptimisticLocking
property, there MUST be a corresponding change to the API version:
- when changing from
off
tologOnConflict
– the API version change should be minor (e.g 1.1 to 1.2) - when changing from
off
orlogOnConflict
tofailOnConflict
– the API version change should be major (e.g 1.1 to 2.0) - when changing from
failOnConflict
tooff
orlogConflict
– the API version change should be major (e.g 2.0 to 3.0)
RMB also allows easy faceting of result sets. The grouping / faceting is done in the database. To add faceting to your API.
- Add the faceting RAML trait to your RAML and reference it from the endpoint (using the is:[])
- facet query parameter format:
facets=a.b.c
orfacets=a.b.c:10
(they are repeating). For example?facets=active&facets=personal.lastName
- facet query parameter format:
- Add the resultInfo.schema to your RAML and reference it within your collection schemas. For example:
"type": "object",
"properties": {
"items": {
"id": "items",
"type": "array",
"items": {
"type": "object",
"$ref" : "item.json"
}
},
"resultInfo": {
"type": "object",
"$ref": "raml-util/schemas/resultInfo.schema"
}
}
- When building your module, an additional parameter will be added to the generated interfaces of the faceted endpoints.
List<String> facets
. You can simply convert this list into a List of Facet objects using the RMB tool as follows:List<FacetField> facetList = FacetManager.convertFacetStrings2FacetFields(facets, "jsonb");
and pass thefacetList
returned to thepostgresClient
'sget()
methods.
You can set the amount of results to facet on by calling (defaults to 10,000) FacetManager.setCalculateOnFirst(20000);
Note that higher numbers will potentially affect performance.
- Faceting on array fields can be done in the following manner:
personal.secondaryAddress[].postalCode
personal.secondaryAddress[].address[].postalCode
NOTE: Creating an index on potential facet fields may be required so that performance is not greatly hindered
It is possible to indicate that a field in the JSON is a readonly field when declaring the schema. "readonly": true
. From example:
"resultInfo": {
"$ref": "raml-util/schemas/resultInfo.schema",
"readonly" : true
}
A readonly
field is not allowed to be passed in as part of the request. A request that contains data for a field that was declared as readonly
will have its read-only fields removed from the passed in data by RMB (the data will be passed into the implementing functions without the read-only fields)
This is part of a framework exposed by RMB which allows creating a field and associating a validation constraint on that field.
To add a custom field, add a system property (in the configuration) to the plugin definition (in the pom.xml
) running the <mainClass>org.folio.rest.tools.GenerateRunner</mainClass>
for example:
<systemProperty>
<key>jsonschema.customfield</key>
<value>{"fieldname" : "readonly" , "fieldvalue": true , "annotation" : "javax.validation.constraints.Null"}</value>
</systemProperty>
the jsonschema.customfield
key can contain multiple JSON values (delimited by a ;
). Each JSON indicates a field name + a field value to match against - and a validation annotation to apply. So, getting back to the readonly field, the example above indicates that a field in the JSON schema that has been tagged with the readonly
field can not contain data when passed in as part of the request.
A list of available annotations:
https://docs.oracle.com/javaee/7/api/javax/validation/constraints/package-summary.html
To customize generation of java classes, add a system property to plugin definition running <mainClass>org.folio.rest.tools.GenerateRunner</mainClass>
.
Properties that start with jsonschema2pojo.config
will be passed to underlying library that generates java classes.
Incomplete list of available properties:
- jsonschema2pojo.config.includeHashcodeAndEquals - adds hashCode and equals methods
- jsonschema2pojo.config.includeToString - adds toString method
- jsonschema2pojo.config.serializable - makes classes serializable
For more available properties see: https://joelittlejohn.github.io/jsonschema2pojo/site/1.0.0/generate-mojo.html https://github.com/mulesoft-labs/raml-for-jax-rs/blob/master/raml-to-jaxrs/jaxrs-code-generator/src/main/java/org/raml/jaxrs/generator/RamlToJaxRSGenerationConfig.java
A module may require slight changes to existing RAML traits.
For example, a limit
trait may be defined in the following manner:
limit:
description: Limit the number of elements returned in the response
type: integer
required: false
example: 10
default: 10
minimum: 1
maximum: 2147483647
However, a module may not want to allow such a high maximum as this may cause a crash.
A module can create a raml_overrides.json
file and place it in the /resources/overrides/
directory.
The file is defined in the schema:
domain-models-maven-plugin/src/main/resources/overrides/raml_overrides.schema
.
Note that DEFAULTVALUE
only allows string values. SIZE
requires a range ex. "15, 20"
. REQUIRED
does not accept a "value"
, meaning an optional parameter can become required but not vice versa.
Example: domain-models-maven-plugin/src/main/resources/overrides/raml_overrides.json
RAML parameter types can be boxed primitives using a plugin. i.e. boolean
to be a Boolean
.
Here is an example of a boxed boolean.
In order to use plugins you must use a library for annotationTypes
:
uses:
ramltojaxrs: raml-util/library/ramltojaxrs.raml
This can be found in the raml repository. ramltojaxrs.raml
Then can apply plugin to a type:
type: boolean
(ramltojaxrs.types):
plugins:
- name: core.box
Caveat
Currently RMB only supports Boolean and Integer Boxed type query parameters. Both Boolean and Integer will become null if parameter not present. To add additional type support, see RestVerticle.java.
The runtime framework comes with a set of messages it prints out to the logs /
sends back as error responses to incorrect API calls. These messages are
language-specific. In order to add your own message files, place the files in
your project under the /resources/messages
directory.
Note that the format of the file names should be either:
[name]_[lang_2_letters].properties
(e.g.:APIMessages_de.properties
)[lang_2_letters]_messages.prop
(e.g.:en_messages.prop
)
For example:
In the circulation project, the messages file can be found at /circulation/src/main/resources/en_messages.prop
with the following content:
20002=Operation can not be calculated on a Null Amount
20003=Unable to pay fine, amount is larger than owed
20004=The item {0} is not renewable
20005=Loan period must be greater than 1, period entered: {0}
The circulation project exposes these messages as enums for easier usage in the code:
package org.folio.utils;
import org.folio.rest.tools.messages.MessageEnum;
public enum CircMessageConsts implements MessageEnum {
OperationOnNullAmount("20002"),
FinePaidTooMuch("20003"),
NonRenewable("20004"),
LoanPeriodError("20005");
private String code;
private CircMessageConsts(String code){
this.code = code;
}
public String getCode(){
return code;
}
}
Usage:
private final Messages messages = Messages.getInstance();
messages.getMessage(lang, CircMessageConsts.OperationOnNullAmount);
Note: parameters can also be passed when relevant. The raml-module-builder runtime also exposes generic error message enums which can be found at /domain-models-runtime/src/main/java/org/folio/rest/tools/messages/MessageConsts.java
The runtime framework includes a web application which exposes RAMLs in a view-friendly HTML format. This uses api-console (Powered by MuleSoft for RAML Copyright (c) 2013 MuleSoft, Inc.)
The maven-resources-plugin
plugin described earlier
copies the RAML files into the correct directory in your project, so that the
runtime framework can access it and show local API documentation.
So for example, when running the sample working module then its API documentation is at:
http://localhost:8081/apidocs/index.html?raml=raml/configuration/config.raml
If instead your new module is running on the default port, then its API documentation is at:
http://localhost:8081/apidocs/index.html?raml=raml/my-project.raml
and remember to specify the "X-Okapi-Tenant: diku" header.
The RMB also automatically provides other documentation, such as the "Admin API":
http://localhost:8081/apidocs/index.html?raml=raml/admin.raml
All current API documentation is also available at dev.folio.org/doc/api
RMB uses log4j2 for logging. Bundled with the domain-models-runtime (jar) there are two log4j configurations:
log4j2.properties
console/line based logger and it is the defaultlog4j2-json.properties
JSON structured logging
You can choose the JSON structured logging by using setting:
java -Dlog4j.configurationFile=log4j2-json.properties -jar ...
A module that wants to generate log4J2 logs in a different format can create a log4j2.properties
file in the /resources directory.
The log levels can also be changed via the /admin
API provided by the framework. For example:
Get log level of all classes:
(GET) http://localhost:8081/admin/loglevel
Change log level of all classes to INFO:
(PUT) http://localhost:8081/admin/loglevel?level=INFO
A java_package
parameter can also be passed to change the log level of a specific package. For example:
http://localhost:8081/admin/loglevel?level=INFO&java_package=org.folio.rest.persist.PostgresClient
http://localhost:8081/admin/loglevel?level=INFO&java_package=org.folio.rest.persist
The runtime framework via the /admin
API exposes (as previously mentioned) some APIs to help monitor the service (setting log levels, DB information).
Some are listed below (and see the full set):
/admin/jstack
-- Stack traces of all threads in the JVM to help find slower and bottleneck methods./admin/memory
-- A jstat type of reply indicating memory usage within the JVM on a per pool basis (survivor, old gen, new gen, metadata, etc.) with usage percentages./admin/slow_queries
-- Queries taking longer than X seconds./admin/cache_hit_rates
-- Cache hit rates in Postgres./admin/table_index_usage
-- Index usage per table./admin/postgres_table_size
-- Disk space used per table./admin/postgres_table_access_stats
-- Information about how tables are being accessed./admin/postgres_load
-- Load information in Postgres./admin/postgres_active_sessions
-- Active sessions in Postgres./admin/health
-- Returns status code 200 as long as service is up.
RMB shares the same instrumentation code with Okapi. Please see
Okapi instrumentation.
Change okapi-core/target/okapi-core-fat.jar dev
in the example to RMB based module jar name and parameters.
It is possible to over ride APIs that the RMB provides with custom implementations.
For example:
To override the /health
API to return a relevant business logic health check for a specific module do the following:
extend
the AdminAPI class that comes with the RMB framework -public class CustomHealthCheck extends AdminAPI
and over ride thegetAdminHealth
function. The RMB will route the URL endpoint associated with the function to the custom module's implementation.
Example:
public class CustomHealthCheck extends AdminAPI {
@Override
public void getAdminHealth(Map<String, String> okapiHeaders,
Handler<AsyncResult<Response>> asyncResultHandler, Context vertxContext) throws Exception {
super.getAdminHealth(okapiHeaders, res -> {
System.out.println(" --- this is an over ride of the health API by the config module "+res.result().getStatus());
asyncResultHandler.handle(io.vertx.core.Future.succeededFuture(GetAdminHealthResponse.withOK()));
}, vertxContext);
}
@Override
public void getAdminModuleStats(Map<String, String> okapiHeaders,
Handler<AsyncResult<Response>> asyncResultHandler, Context vertxContext) throws Exception {
super.getAdminModuleStats(okapiHeaders, res -> {
JsonObject o = new JsonObject(res.result().getEntity().toString());
System.out.println(" --- this is an over ride of the Module Stats API by the config module ");
asyncResultHandler.handle(io.vertx.core.Future.succeededFuture(GetAdminModuleStatsResponse.
withPlainOK( o.encodePrettily() )));
}, vertxContext);
}
}
The framework can generate a Client class for every RAML file with a function for every API endpoint in the RAML.
To generate a client API from your RAML use the generateClients
option of
the domain-models-maven-plugin in your pom.xml as well as depend on the
artifact that includes interfaces.
<plugin>
<groupId>org.folio</groupId>
<artifactId>domain-models-maven-plugin</artifactId>
<version>${raml-module-builder-version}</version>
<dependencies>
<dependency>
<groupId>org.folio</groupId>
<artifactId>mod-my-server</artifactId>
<version>${project.parent.version}</version>
</dependency>
</dependencies>
<executions>
<execution>
<id>generate_interfaces</id>
<goals>
<goal>java</goal>
</goals>
<configuration>
<generateClients>true</generateClients>
</configuration>
</execution>
</executions>
</plugin>
For the monitoring APIs exposed by the runtime framework, changing the log level via the client would look like this:
AdminClient aClient = new AdminClient("http://localhost:" + 8083, "myuniversityId", "sometoken");
aClient.putLoglevel(Level.INFO, "org.folio", apiResponse -> {
System.out.println(apiResponse.statusCode());
});
Requesting a stack trace would look like this:
AdminClient aClient = new AdminClient("http://localhost:" + 8083, "myuniversityId", "sometoken");
aClient.getJstack( trace -> {
trace.bodyHandler( content -> {
System.out.println(content);
});
});
Query parameters and headers, as declared in the RAML, are used to generate the parameter annotations.
An example of this is found above in the Generated Files section.
For object validation, the provided JSON schema is used as the basis for constraint annotations.
For example, a Vendor object defined like this:
"type": "object",
"properties": {
"vendors": {
"type": "array",
"items": {
"type": "object",
"$ref": "vendor"
}
}.
"total_records": {
"type": "integer"
}
},
"required": [
"vendors",
"total_records"
]
Will result in a class with generated fields and annotations as follows:
public class Vendors {
@Valid
@NotNull
private List<Vendor> vendors = new ArrayList<Vendor>();
@NotNull
private Integer totalRecords;
...
}
org.folio.rest.persist.PgUtil
has default implementations that should be used if possible.
Example: mod-organizations-storage OrganizationsAPI.java
You need to add @Validate
to each implementing method so that RMB validates the input against the schema,
for example required properties, no additional properties, property type, or regexp pattern.
This example shows how to use advanced features that go beyond the default PgUtil implementation.
@Validate
@Override
public void postConfigurationsEntries(
Config entity, RoutingContext routingContext, Map<String, String>okapiHeaders,
Handler<AsyncResult<Response>> asyncResultHandler, Context context) {
try {
log.debug("sending... postConfigurationsTables");
defaultToEnabled(entity);
String tenantId = TenantTool.tenantId(okapiHeaders);
PostgresClient.getInstance(context.owner(), tenantId)
.save(CONFIG_TABLE, entity.getId(), entity)
.onSuccess(ret -> {
entity.setId(ret);
asyncResultHandler.handle(Future.succeededFuture(
PostConfigurationsEntriesResponse.respond201WithApplicationJson(entity,
PostConfigurationsEntriesResponse.headersFor201().withLocation(LOCATION_PREFIX + ret))));
}).onFailure(e -> {
log.error(e.getMessage(), e);
if (isNotUniqueModuleConfigAndCode(e)) {
asyncResultHandler.handle(Future.succeededFuture(
PostConfigurationsEntriesResponse
.respond422WithApplicationJson(uniqueModuleConfigAndCodeError(entity))));
return;
}
ValidationHelper.handleError(e, asyncResultHandler);
});
} catch (Exception e) {
log.error(e.getMessage(), e);
ValidationHelper.handleError(e, asyncResultHandler);
}
}
RMB handles all routing, so this is abstracted from the developer. However, there are cases where third party functionality may need access to routing information. Once again, this is not to be used for routing, but in order to pass in routing information to a third party (one such example is the pac4j vertx saml client). RMB allows a developer to receive the Vertx RoutingContext object as a parameter to a generated function by indicating the endpoint represented by the function in the pom.xml (uses a comma delimiter for multiple paths).
<properties>
<generate_routing_context>/rmbtests/test</generate_routing_context>
</properties>
At runtime RMB will serialize/deserialize the received JSON in the request body of PUT, PATCH and POST requests into a POJO and pass this on to an implementing function, as well as the POJO returned by the implementing function into JSON. The ObjectMapperTool provides the serializer and de-serializer. Note that JsonObject.mapFrom
is incompatible, use ObjectMapperTool.valueAsString
instead.
A module can implement its own version of the de-serializer, for example, the below will register a de-serializer that will tell RMB to set a User to not active if the expiration date has passed. This will be run when a User JSON is passed in as part of a request
ObjectMapperTool.registerDeserializer(User.class, new UserDeserializer());
public class UserDeserializer extends JsonDeserializer<User> {
@Override
public User deserialize(JsonParser parser, DeserializationContext context) throws IOException, JsonProcessingException {
ObjectMapper mapper = ObjectMapperTool.getDefaultMapper();
ObjectCodec objectCodec = parser.getCodec();
JsonNode node = objectCodec.readTree(parser);
User user = mapper.treeToValue(node, User.class);
Optional<Date> expirationDate = Optional.ofNullable(user.getExpirationDate());
if (expirationDate.isPresent()) {
Date now = new Date();
if (now.compareTo(expirationDate.get()) > 0) {
user.setActive(false);
}
}
return user;
}
}
Making async calls to the PostgresClient requires handling failures of different kinds. RMB exposes a tool that can handle the basic error cases, and return them as a 422 validation error status falling back to a 500 error status when the error is not one of the standard DB errors.
Usage with Future:
...
.onFailure(exception -> ValidationHelper.handleError(exception, asyncResultHandler))
.onSuccess(result -> ...);
Usage with AsyncResult:
if (asyncResult.failed()) {
ValidationHelper.handleError(asyncResult.cause(), asyncResultHandler);
return;
}
RMB will return a response to the client as follows:
- invalid UUID - 422 status
- duplicate key violation - 422 status
- Foreign key violation - 422 status
- tenant does not exist / auth error to db - 401 status
- Various CQL errors - 422 status
- Anything else will fall back to a 500 status error
RMB will not cross check the raml to see that these statuses have been defined for the endpoint. This is the developer's responsibility.
Have these in the headers - currently not validated hence not mandatory:
Accept: application/json,text/plain
Content-Type: application/json;
// POST http://localhost:8080/patrons/56dbe25ea12958478cec42ba/fines
{
"fine_amount": 10,
"fine_outstanding": 0,
"fine_date": 1413879432,
"fine_pay_in_full": true,
"fine_pay_in_partial": false,
"fine_note": "aaaaaa",
"item_id": "56dbe160a129584dc8de7973",
"fine_forgiven": {
"user": "the cool librarian",
"amount": "none"
},
"patron_id": "56dbe25ea12958478cec42ba"
}
GET http://localhost:8080/patrons/56dbe25ea12958478cec42ba/fines
GET http://localhost:8080/patrons/56dbe25ea12958478cec42ba
GET http://localhost:8080/patrons
DELETE http://localhost:8080/patrons/56dbe791a129584a506fb41a
// POST http://localhost:8080/patrons
{
"status": "ACTIVE",
"patron_name": "Smith,John",
"patron_barcode": "00007888",
"patron_local_id": "abcdefd",
"contact_info": {
"patron_address_local": {
"line1": "Main Street 1",
"line2": "Nice building near the corner",
"city": "London",
"state_province": "",
"postal_code": "",
"address_note": "",
"start_date": "2013-12-26Z"
},
"patron_address_home": {
"line1": "Main Street 1",
"line2": "Nice building near the corner",
"city": "London",
"state_province": "",
"postal_code": "",
"address_note": "",
"start_date": "2013-12-26Z"
},
"patron_address_work": {
"line1": "Main Street 1",
"line2": "Nice building near the corner",
"city": "London",
"state_province": "",
"postal_code": "",
"address_note": "",
"start_date": "2013-12-26Z"
},
"patron_email": "[email protected]",
"patron_email_alternative": "[email protected]",
"patron_phone_cell": "123456789",
"patron_phone_home": "123456789",
"patron_phone_work": "123456789",
"patron_primary_contact_info": "patron_email"
},
"total_loans": 50,
"total_fines": "100$",
"total_fines_paid": "0$",
"patron_code": {
"value": "CH",
"description": "Child"
}
}
Other RMB documentation (e.g. DB schema migration, Upgrading notes).
Other modules.
Dedicated FOLIO Slack channel #raml-module-builder.
See project RMB at the FOLIO issue tracker.
Other FOLIO Developer documentation is at dev.folio.org