The official Python client library for Databento.
Key features include:
- Fast, lightweight access to both live and historical data from multiple markets.
- Multiple schemas such as MBO, MBP, top of book, OHLCV, last sale, and more.
- Fully normalized, i.e. identical message schemas for both live and historical data, across multiple asset classes.
- Provides mappings between different symbology systems, including smart symbology for futures rollovers.
- Point-in-time instrument definitions, free of look-ahead bias and retroactive adjustments.
- Reads and stores market data in an extremely efficient file format using Databento Binary Encoding.
- Event-driven market replay, including at high-frequency order book granularity.
- Support for batch download of flat files.
- Support for pandas, CSV, and JSON.
The best place to begin is with our Getting started guide.
You can find our full client API reference in the Historical Reference and Live Reference sections of our documentation. See also the Examples section for various tutorials and code samples.
The library is fully compatible with the latest Anaconda distribution and supports Python 3.9 and above.
The minimum dependencies, as found in the pyproject.toml, are also listed below:
- python = "^3.9"
- aiohttp = "^3.8.3"
- databento-dbn = "0.23.1"
- numpy = ">=1.23.5"
- pandas = ">=1.5.3"
- pip-system-certs = ">=4.0" (Windows only)
- pyarrow = ">=13.0.0"
- requests = ">=2.25.1"
- zstandard = ">=0.21.0"
To install the latest stable version of the package from PyPI:
pip install -U databento
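After installing, it can be useful to confirm what actually ended up in the environment. The following is a minimal sketch using only the standard library; the helper names `installed_versions` and `python_is_supported` are hypothetical and not part of the databento package. It reports installed versions but does not compare them against the minimums listed above.

```python
import sys
from importlib import metadata

# Distribution names taken from the dependency list above.
PACKAGES = [
    "databento",
    "aiohttp",
    "databento-dbn",
    "numpy",
    "pandas",
    "pyarrow",
    "requests",
    "zstandard",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def python_is_supported(version_info=sys.version_info):
    """Return True if the interpreter is Python 3.9 or newer."""
    return tuple(version_info[:2]) >= (3, 9)


if __name__ == "__main__":
    print("Python 3.9+:", python_is_supported())
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

A `None` entry means the distribution is not installed in the current environment.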
The library needs to be configured with an API key from your account. Sign up for free and you will automatically receive a set of API keys to start with. Each API key is a 32-character string starting with db-, which can be found on the API Keys page of your Databento user portal.
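The key format described above (32 characters, starting with db-) allows a cheap sanity check before any request is made. The helper below is a hypothetical sketch, not part of the databento package. It checks only the length and prefix, since the allowed character set is not stated here, and it cannot tell whether a key is actually valid.

```python
def looks_like_api_key(key):
    """Return True if `key` has the documented shape: 32 characters, 'db-' prefix."""
    return isinstance(key, str) and len(key) == 32 and key.startswith("db-")


if __name__ == "__main__":
    # The placeholder used in the examples does not pass the check.
    print(looks_like_api_key("YOUR_API_KEY"))
```

A check like this catches a forgotten placeholder or a truncated copy-paste early. Only the Databento service can confirm that a key is genuine.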
A simple Databento application looks like this:
import databento as db

client = db.Historical('YOUR_API_KEY')
data = client.timeseries.get_range(
    dataset='GLBX.MDP3',
    symbols='ES.FUT',
    stype_in='parent',
    start='2022-06-10T14:30',
    end='2022-06-10T14:40',
)

data.replay(callback=print)  # market replay, with `print` as event handler
Replace YOUR_API_KEY with an actual API key, then run this program.
This uses .replay() to access the entire block of data and dispatch each data event to an event handler. You can also use .to_df() or .to_ndarray() to cast the data into a pandas DataFrame or NumPy ndarray:
df = data.to_df() # to DataFrame
array = data.to_ndarray() # to ndarray
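The event handler given to .replay() does not have to be `print`. Any callable that accepts a single record can be used. As a minimal sketch, the hypothetical `RecordCounter` class below (not part of the databento package) tallies events by their Python type name. The demonstration feeds it plain stand-in objects so that it runs without an API key.

```python
from collections import Counter


class RecordCounter:
    """Callable event handler that counts records by their type name."""

    def __init__(self):
        self.counts = Counter()

    def __call__(self, record):
        # Each dispatched event arrives here as a single argument.
        self.counts[type(record).__name__] += 1


if __name__ == "__main__":
    counter = RecordCounter()
    # Stand-in objects; with real data this would be data.replay(callback=counter).
    for record in [1, "a", 2]:
        counter(record)
    print(dict(counter.counts))
```

With the `data` object from the application above, the equivalent call would be `data.replay(callback=counter)`, after which `counter.counts` holds one entry per record type seen.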
Note that the simple application above passes the API key as a parameter, which is not recommended for production applications. Instead, you can leave out this parameter and supply your API key via the DATABENTO_API_KEY environment variable:
import databento as db
# Pass as parameter
client = db.Historical('YOUR_API_KEY')
# Or, pass as `DATABENTO_API_KEY` environment variable
client = db.Historical()
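When relying on the environment variable, a missing value is easier to diagnose if it is detected before the client is constructed. The sketch below uses only the standard library. The helper `require_api_key` is hypothetical and not part of the databento package, and stripping surrounding whitespace is an assumption made for this illustration.

```python
import os

ENV_VAR = "DATABENTO_API_KEY"


def require_api_key(environ=os.environ):
    """Return the API key from the environment, or raise a clear error if unset."""
    key = environ.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"{ENV_VAR} is not set; export it before starting the application."
        )
    return key


if __name__ == "__main__":
    try:
        require_api_key()
        print(f"{ENV_VAR} is set")
    except RuntimeError as exc:
        print(exc)
```

Calling `require_api_key()` at start-up, before `db.Historical()`, makes the application fail fast with an explicit message and keeps the key out of source code.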
Distributed under the Apache 2.0 License.