databento-python

The official Python client library for Databento.

Key features include:

  • Fast, lightweight access to both live and historical data from multiple markets.
  • Multiple schemas such as MBO, MBP, top of book, OHLCV, last sale, and more.
  • Fully normalized, i.e. identical message schemas for both live and historical data, across multiple asset classes.
  • Mappings between different symbology systems, including smart symbology for futures rollovers.
  • Point-in-time instrument definitions, free of look-ahead bias and retroactive adjustments.
  • Reads and stores market data in an extremely efficient file format using Databento Binary Encoding.
  • Event-driven market replay, including at high-frequency order book granularity.
  • Support for batch download of flat files.
  • Support for pandas, CSV, and JSON.

Documentation

The best place to begin is with our Getting started guide.

You can find our full client API reference in the Historical Reference and Live Reference sections of our documentation. See also the Examples section for various tutorials and code samples.

Requirements

The library is fully compatible with the latest Anaconda distribution for Python 3.9 and above. The minimum dependencies, as found in the pyproject.toml, are listed below:

  • python = "^3.9"
  • aiohttp = "^3.8.3"
  • databento-dbn = "0.23.1"
  • numpy = ">=1.23.5"
  • pandas = ">=1.5.3"
  • pip-system-certs = ">=4.0" (Windows only)
  • pyarrow = ">=13.0.0"
  • requests = ">=2.25.1"
  • zstandard = ">=0.21.0"
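
To see which of these dependencies are present in the current environment, a minimal sketch using only the standard library might look like the following. The `REQUIRED` list and the `installed_versions` helper are hypothetical names introduced for illustration; the sketch only reports installed versions and does not compare them against the minimums above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the dependency list above
# (pip-system-certs is omitted because it is Windows only).
REQUIRED = [
    "aiohttp",
    "databento-dbn",
    "numpy",
    "pandas",
    "pyarrow",
    "requests",
    "zstandard",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(REQUIRED).items():
        print(f"{name}: {ver or 'not installed'}")
```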

Installation

To install the latest stable version of the package from PyPI:

pip install -U databento

Usage

The library needs to be configured with an API key from your account. Sign up for free and you will automatically receive a set of API keys to start with. Each API key is a 32-character string starting with db-, which can be found on the API Keys page of your Databento user portal.
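
As a quick sanity check before making any requests, the key format described above can be tested locally. This is only a sketch: `looks_like_api_key` is a hypothetical helper, not part of the library, and it checks the length and prefix only. It assumes the 32 characters include the db- prefix, and it cannot tell whether a key is actually valid for your account.

```python
def looks_like_api_key(key):
    """Return True if `key` matches the documented shape of an API key.

    Assumption: the 32-character length includes the "db-" prefix.
    """
    return isinstance(key, str) and len(key) == 32 and key.startswith("db-")


# The placeholder used in the examples is rejected
print(looks_like_api_key("YOUR_API_KEY"))

# A made-up string with the right prefix and length is accepted
print(looks_like_api_key("db-" + "x" * 29))
```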

A simple Databento application looks like this:

import databento as db

client = db.Historical('YOUR_API_KEY')
data = client.timeseries.get_range(
    dataset='GLBX.MDP3',
    symbols='ES.FUT',
    stype_in='parent',
    start='2022-06-10T14:30',
    end='2022-06-10T14:40',
)

data.replay(callback=print)  # market replay, with `print` as event handler

Replace YOUR_API_KEY with an actual API key, then run this program.

This uses .replay() to access the entire block of data and dispatch each data event to an event handler. You can also use .to_df() or .to_ndarray() to cast the data into a pandas DataFrame or NumPy ndarray:

df = data.to_df()  # to DataFrame
array = data.to_ndarray()  # to ndarray
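
The `print` handler passed to .replay() can be swapped for any callable that accepts a single record. The sketch below shows one possible handler that tallies records by their Python type name. `RecordTally` is a hypothetical class written for illustration, and the loop at the end feeds it stand-in objects so the example runs without an API key.

```python
from collections import Counter


class RecordTally:
    """Event handler that counts the records it receives by type name."""

    def __init__(self):
        self.counts = Counter()

    def __call__(self, record):
        # Called once per data event during replay
        self.counts[type(record).__name__] += 1


tally = RecordTally()

# With the `data` object from the example above (requires an API key):
# data.replay(callback=tally)

# Stand-in objects, used here only so the sketch runs offline
for fake_record in (1, 2.0, 3):
    tally(fake_record)

print(tally.counts)
```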

Note that the API key was passed as a parameter in the example above, which is not recommended for production applications. Instead, you can leave out this parameter and supply your API key via the DATABENTO_API_KEY environment variable:

import databento as db

# Pass as parameter
client = db.Historical('YOUR_API_KEY')

# Or, pass as `DATABENTO_API_KEY` environment variable
client = db.Historical()
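
If you want your application to fail early with a clear message when the variable is missing, you could resolve the key yourself before constructing the client. This is a sketch under stated assumptions: `resolve_api_key` is a hypothetical helper, and it does not claim to reproduce the lookup the library performs internally.

```python
import os


def resolve_api_key(explicit_key=None):
    """Return an explicit key if given, else the DATABENTO_API_KEY variable."""
    if explicit_key:
        return explicit_key
    key = os.environ.get("DATABENTO_API_KEY")
    if not key:
        raise RuntimeError(
            "No API key found: set the DATABENTO_API_KEY environment variable"
        )
    return key


# Usage (hypothetical):
# import databento as db
# client = db.Historical(resolve_api_key())
```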

License

Distributed under the Apache 2.0 License.