Merged
Changes from 2 commits
29 commits
41bb295
:wastebasket: Remove all files from docs/wiki
pavithraes Mar 13, 2024
be67c62
:sparkles: Add initial restructured files and folders
pavithraes Mar 13, 2024
f0d021b
:broom: Lint and format docs
pavithraes Mar 13, 2024
5a4a42e
:memo: Split How to Write Adpaters
pavithraes Mar 18, 2024
1c920bb
:memo: Move graph_info from how-to to api-reference
pavithraes Mar 18, 2024
b4f4774
:memo: Remove release notes from sidebar
pavithraes Mar 18, 2024
9573bd8
:broom: Fix Local-Development-Setup file extension
pavithraes Mar 18, 2024
d9c434a
:broom: Fix typo in TOC
pavithraes Mar 18, 2024
f919430
Apply suggestions from code review
pavithraes Mar 20, 2024
229c5ff
:memo: update toc in stats node api ref
pavithraes Mar 20, 2024
29e209b
:repeat: Merge main branch
pavithraes Mar 20, 2024
27d2fbe
:memo: Update Home.md
pavithraes Mar 20, 2024
4a78dd2
Rename a page, add some copy, fix code formatting
melissawm Mar 27, 2024
5d4d865
Rename Glossary file and fix reference
melissawm Apr 1, 2024
4ffc649
Fix reference to docs folder in footer
melissawm Apr 1, 2024
8ba9c7e
Fix CSP capitalization and formatting for code display.
melissawm Apr 1, 2024
c770bc0
Add installation instructions with conda for Linux and Mac
melissawm Apr 1, 2024
12827ae
Change typing notation for consistency, add note about notation
melissawm Apr 1, 2024
64e0a65
Merge 'main' into pavithraes/docs-restructure
pavithraes Apr 3, 2024
7573d10
Add more terms to the Glossary
pavithraes Apr 3, 2024
9a14d23
Fix linting errors
pavithraes Apr 3, 2024
599fff5
Update historical buffers examples
robambalu Apr 5, 2024
5be0393
placate the mdformat gds
robambalu Apr 5, 2024
205db20
:sparkles: Updates from code review
pavithraes Apr 10, 2024
ed4c93b
Remove Caching from sidebar
pavithraes Apr 10, 2024
105681b
Update headings for base nodes and adapaters API references
pavithraes Apr 10, 2024
5a5758d
Update the roadmap
pavithraes Apr 10, 2024
1c444cf
Fix wording in roadmap
pavithraes Apr 10, 2024
7894f05
Merge branch 'main' into pavithraes/docs-restructure
pavithraes Apr 10, 2024
13 changes: 8 additions & 5 deletions docs/wiki/Home.md
Original file line number Diff line number Diff line change
@@ -19,15 +19,18 @@ CSP (Composable Stream Processing) is a library for high-performance real-time e

## Get Started

- Tutorials: Go through the introductory tutorials to learn the basics of CSP
- Examples: Check out various features and use cases
- [Install `csp`](Installation) and [write your first CSP program](First-Steps)
- Learn more about [nodes](CSP-Node), [graphs](CSP-Graph), and [execution modes](Execution-Modes)
- Learn to extend CSP with [adapters](Adapters)
<!-- - Check out the [examples](Examples) for various CSP features and use cases -->

Tip: Find relevant docs with GitHub’s search function, use `repo:Point72/csp type:wiki <search terms>` to search the documentation Wiki Pages.
> \[!TIP\]
> Find relevant docs with GitHub’s search function, use `repo:Point72/csp type:wiki <search terms>` to search the documentation Wiki Pages.

## Community

- Developer guide: Learn to build and develop CSP locally
- Roadmap: Read what’s the development team is excited about
- [Contribute](Contribute) to CSP and help improve the project
- Read about future plans in the [project roadmap](Roadmap)

## License

2 changes: 1 addition & 1 deletion docs/wiki/_Sidebar.md
@@ -19,7 +19,7 @@ Notes for editors:

- [CSP Node](CSP-Node)
- [CSP Graph](CSP-Graph)
- [Historical Data](Historical-Data)
- [Historical Buffers](Historical-Buffers)
- [Execution Modes](Execution-Modes)
- [Adapters](Adapters)
- [Caching](Caching)
83 changes: 39 additions & 44 deletions docs/wiki/concepts/CSP-Node.md
@@ -10,7 +10,7 @@
## Anatomy of a `csp.node`

The heart of a calculation graph is its `csp.node`s, which run the computations.
`csp.node` methods can take any number of scalar and timeseries arguments, and can return 0 → N timeseries outputs.
`csp.node` methods can take any number of scalar and timeseries arguments, and can return 0 → N timeseries outputs.
Timeseries inputs/outputs should be thought of as the edges that connect components of the graph.
These "edges" can tick whenever they have a new value.
Every tick is associated with a value and the time of the tick.
@@ -52,7 +52,7 @@ def demo_node(n: int, xs: ts[float], ys: ts[float]) -> ts[float]: # 2

Let's review line by line:

1\) Every csp node must start with the **`@csp.node`** decorator
1\) Every csp node must start with the **`@csp.node`** decorator

2\) `csp` nodes are fully typed and type-checking is strictly enforced.
All arguments must be typed, as well as all outputs.
@@ -75,7 +75,7 @@ Note that variables declared in state will live across invocations of the method
9\) An example declaration and initialization of state variable `s_sum`.
It is good practice to name state variables prefixed with `s_`, which is the convention in the `csp` codebase.

11\) **`with csp.start()`**: an optional block to execute code at the start of the engine.
11\) **`with csp.start()`**: an optional block to execute code at the start of the engine.
Generally this is used to set up initial timers, to set input timeseries properties such as buffer sizes, or to make inputs passive.

14-15) **`csp.set_buffering_policy`**: nodes can request a certain amount of history be kept on the incoming time series, this can be denoted in number of ticks or in time.
@@ -92,8 +92,8 @@ Note that `schedule_alarm` can be called multiple times on the same alarm to sch
19\) **`with csp.stop()`** is an optional block that can be called when the engine is done running.

22\) All nodes will have `if` conditions to react to different inputs.
**`csp.ticked()`** takes any number of inputs and returns true if **any** of the inputs ticked.
**`csp.valid`** similar takes any number of inputs however it only returns true if **all** inputs are valid.
**`csp.ticked()`** takes any number of inputs and returns true if **any** of the inputs ticked.
**`csp.valid`** similarly takes any number of inputs; however, it only returns true if **all** inputs are valid.
Valid means that an input has had at least one tick and so it has a "current value".

23\) One of the benefits of `csp` is that you always have easy access to the latest value of all inputs.
@@ -105,7 +105,7 @@ Valid means that an input has had at least one tick and so it has a "current val

## Basket inputs

In addition to single time-series inputs, a node can also accept a **basket** of time series as an argument.
In addition to single time-series inputs, a node can also accept a **basket** of time series as an argument.
A basket is essentially a collection of timeseries which can be passed in as a single argument.
Baskets can either be list baskets or dict baskets.
Individual timeseries in a basket can tick independently, and they can be looked at and reacted to individually or as a collection.
@@ -147,18 +147,19 @@ The convention is the same as passing multiple inputs to `csp.ticked`, `csp.tick
- **`validvalues`**: iterator of values of all valid inputs
- **`validkeys`**: iterator of keys of all valid inputs
- **`validitems`**: iterator of (key,value) tuples of valid inputs
- **`keys`**: list of keys on the basket (**dictionary baskets only** )
- **`keys`**: list of keys on the basket (**dictionary baskets only** )

10-11) This demonstrates the ability to access an individual element of a
basket and react to it as well as access its current value

## **Node Outputs**

Nodes can return any number of outputs (including no outputs, in which case it is considered an "output" or sink node,
see [Graph Pruning](https://github.com/Point72/csp/wiki/0.-Introduction#graph-pruning)).
see [Graph Pruning](https://github.com/Point72/csp/wiki/0.-Introduction#graph-pruning)).
Nodes with single outputs can return the output as an unnamed output.
Nodes returning multiple outputs must have them be named.
When a node is called at graph building time, if its is a single unnamed node the return variable is an edge representing the output which can be passed into other nodes.
When a node is called at graph building time, if it is a single unnamed node the return variable is an edge representing the output which can be passed into other nodes.
An output timeseries cannot be ticked more than once in a given node invocation.
If the outputs are named, the return value is an object with the outputs available as attributes.
For example (examples below demonstrate various ways to output the data as well)

@@ -202,44 +203,38 @@ Similarly to inputs, a node can also produce a basket of timeseries as an output
For example:

```python
class MyStruct(csp.Struct):  # 1
    symbol: str  # 2
    index: int  # 3
    value: float  # 4
# 5
@csp.node  # 6
def demo_basket_output_node(  # 7
    in_: ts[MyStruct],  # 8
    symbols: [str],  # 9
    num_symbols: int  # 10
) -> csp.Outputs(  # 11
    dict_basket=csp.OutputBasket(  # 12
        Dict[str, ts[float]],  # 13
        shape="symbols",  # 14
    ),  # 15
    list_basket=csp.OutputBasket(  # 16
        List[ts[float]],  # 17
        shape="num_symbols"  # 18
    ),  # 19
):  # 20
# 21
    if csp.ticked(in_):  # 22
        # output to dict basket  # 23
        csp.output(dict_basket[in_.symbol], in_.value)
        # alternate output syntax, can output multiple keys at once
        # csp.output(dict_basket={in_.symbol: in_.value})
        # output to list basket
        csp.output(list_basket[in_.index], in_.value)
        # alternate output syntax, can output multiple keys at once
        # csp.output(list_basket={in_.index: in_.value})
class MyStruct(csp.Struct):  # 1
    symbol: str  # 2
    index: int  # 3
    value: float  # 4
# 5
@csp.node  # 6
def demo_basket_output_node(  # 7
    in_: ts[MyStruct],  # 8
    symbols: [str],  # 9
    num_symbols: int  # 10
) -> csp.Outputs(  # 11
    dict_basket=csp.OutputBasket(Dict[str, ts[float]], shape="symbols"),  # 15
    list_basket=csp.OutputBasket(List[ts[float]], shape="num_symbols"),  # 16
):  # 17
# 18
    if csp.ticked(in_):  # 19
        # output to dict basket  # 20
        csp.output(dict_basket[in_.symbol], in_.value)  # 21
        # alternate output syntax, can output multiple keys at once  # 22
        # csp.output(dict_basket={in_.symbol: in_.value})  # 23
        # output to list basket  # 24
        csp.output(list_basket[in_.index], in_.value)  # 25
        # alternate output syntax, can output multiple keys at once  # 26
        # csp.output(list_basket={in_.index: in_.value})  # 27
```

11-20) Note the output declaration syntax.
11-17) Note the output declaration syntax.
A basket output can be either named or unnamed (both examples here are named), and its shape can be specified two ways.
The `shape` parameter is used with a scalar value that defines the shape of the basket, or the name of the scalar argument (a dict basket expects shape to be a list of keys. lists basket expects `shape` to be an `int`).
The `shape` parameter is used with a scalar value that defines the shape of the basket, or the name of the scalar argument (a dict basket expects `shape` to be a list of keys; a list basket expects `shape` to be an `int`).
`shape_of` is used to take the shape of an input basket and apply it to the output basket.

23+) There are several choices for output syntax.
20+) There are several choices for output syntax.
The following work for both list and dict baskets:

- `csp.output(basket={key: value, key2: value2, ...})`
@@ -272,5 +267,5 @@ def const(value: '~T') -> ts['T']:
`sample` takes a timeseries of type `'T'` as an input, and returns a timeseries of type `'T'`.
This allows us to pass in a `ts[int]` for example, and get a `ts[int]` as an output, or `ts[bool]` → `ts[bool]`

`const` takes value as an *instance* of type `T`, and returns a timeseries of type `T`.
So we can call `const(5)` and get a `ts[int]` output, or `const('hello!')` and get a `ts[str]` output, etc...
`const` takes value as an *instance* of type `T`, and returns a timeseries of type `T`.
So we can call `const(5)` and get a `ts[int]` output, or `const('hello!')` and get a `ts[str]` output, etc...
Original file line number Diff line number Diff line change
@@ -6,10 +6,10 @@

## Historical Buffers

`csp` provides access to historical input data as well.
By default only the last value of an input is kept in memory, however one can request history to be kept on an input either by number of ticks or by time using **csp.set_buffering_policy.**
`csp` can provide access to historical input data as well.
By default, only the last value of an input is kept in memory; however, one can request history to be kept on an input either by number of ticks or by time using **csp.set_buffering_policy**.

The methods **csp.value_at**, **csp.time_at** and **csp.item_at** can be used to retrieve historical input values.
The methods **csp.value_at**, **csp.time_at** and **csp.item_at** can be used to retrieve historical input values.
Each node should call **csp.set_buffering_policy** to make sure that its inputs are configured to store sufficiently long history for correct implementation.
For example, let's assume that we have a stream of data and we want to create equally sized buckets from the data.
A possible implementation of such a node would be:
@@ -29,7 +29,7 @@ In this example, we use **`csp.set_buffering_policy(input, tick_count=bin_size)`
Note that an input can be shared by multiple nodes; if multiple nodes provide size requirements, the buffer size is resolved to the maximum size needed to support all requests.

Alternatively, **`csp.set_buffering_policy`** supports a **`timedelta`** parameter **`tick_history`** instead of **`tick_count`**.
If **`tick_history`** is provided, the buffer will scale dynamically to ensure that any period of length **`tick_history`** will fit into the history buffer.
If **`tick_history`** is provided, the buffer will scale dynamically to ensure that any period of length **`tick_history`** will fit into the history buffer.
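
To make the idea of a time-based buffer concrete, here is a minimal stand-alone sketch in plain Python. It is only a toy model of the behavior described above, not `csp`'s implementation; the class name `TimeWindowBuffer` and its methods are hypothetical and exist only for this illustration.

```python
from collections import deque
from datetime import datetime, timedelta


class TimeWindowBuffer:
    """Toy model of a time-based history buffer (NOT csp's implementation)."""

    def __init__(self, tick_history: timedelta):
        self.tick_history = tick_history
        self._ticks = deque()  # (time, value) pairs, oldest first

    def push(self, time: datetime, value) -> None:
        self._ticks.append((time, value))
        # Drop ticks older than the window that ends at the newest tick.
        cutoff = time - self.tick_history
        while self._ticks[0][0] < cutoff:
            self._ticks.popleft()

    def __len__(self) -> int:
        return len(self._ticks)


start = datetime(2024, 1, 1)
buf = TimeWindowBuffer(tick_history=timedelta(seconds=5))

# Slow stream: one tick per second for 10 seconds.
for i in range(10):
    buf.push(start + timedelta(seconds=i), i)
slow_len = len(buf)

# Fast burst: ten ticks per second for the next 5 seconds.
burst_start = start + timedelta(seconds=10)
for i in range(50):
    buf.push(burst_start + timedelta(milliseconds=100 * i), i)
fast_len = len(buf)
```

The number of retained ticks grows with the tick rate while the covered time span stays fixed, which is the practical difference from a `tick_count`-based policy.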

To identify when there are enough samples to construct a bin we use **`csp.num_ticks(input) % bin_size == 0`**.
The function **`csp.num_ticks`** returns the total number of ticks for a given time series.
@@ -38,21 +38,21 @@ NOTE: The actual size of the history buffer is usually less than **`csp.num_tick
The past values in this example are accessed using **`csp.value_at`**.
The various historical access methods take the same arguments and return the value, time and tuple of `(time,value)` respectively:

- **`csp.value_at`**`(ts, index_or_time, duplicate_policy=DuplicatePolicy.LAST_VALUE, default=UNSET)`: returns **value** at requested `index_or_time`
- **`csp.time_at`**`(ts, index_or_time, duplicate_policy=DuplicatePolicy.LAST_VALUE, default=UNSET)`: returns **datetime** at requested `index_or_time`
- **`csp.item_at`**`(ts, index_or_time, duplicate_policy=DuplicatePolicy.LAST_VALUE, default=UNSET)`: returns tuple of `(datetime,value)` at requested `index_or_time`
- **`csp.value_at`**`(ts, index_or_time, duplicate_policy=DuplicatePolicy.LAST_VALUE, default=UNSET)`: returns **value** of the timeseries at requested `index_or_time`
- **`csp.time_at`**`(ts, index_or_time, duplicate_policy=DuplicatePolicy.LAST_VALUE, default=UNSET)`: returns **datetime** of the timeseries at requested `index_or_time`
- **`csp.item_at`**`(ts, index_or_time, duplicate_policy=DuplicatePolicy.LAST_VALUE, default=UNSET)`: returns tuple of `(datetime,value)` of the timeseries at requested `index_or_time`
- **`ts`**: the name of the input
- **`index_or_time`**:
- If providing an **index**, this represents how many ticks back to retrieve **and should be \<= 0**.
0 indicates the current value, -1 is the previous value, etc.
- If providing **time** one can either provide a datetime for absolute time, or a timedelta for how far back to access.
- If providing **time** one can either provide a datetime for absolute time, or a timedelta for how far back to access.
**NOTE** that timedelta must be negative to represent time in the past.
- **`duplicate_policy`**: when requesting history by datetime or timedelta, it's possible that there could be multiple values that match the given time.
**`duplicate_policy`** can be provided to control the behavior of what to return in this case.
The default policy is to return the LAST_VALUE that exists at the given time.
- **`default`**: value to be returned if the requested time is out of the history bounds (if default is not provided and a request is out of bounds an exception will be raised).
- **`default`**: value to be returned if the requested time is out of the history bounds (if default is not provided and a request is out of bounds an exception will be raised).
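
The indexing rules above can be sketched with a small stand-alone model in plain Python. This is not the `csp` API: the function `value_at` below, the `UNSET` sentinel, and the list-of-pairs `history` are hypothetical stand-ins, and the time lookup assumes that the value "at" a time is the last tick at or before that time.

```python
from datetime import datetime, timedelta

UNSET = object()  # hypothetical sentinel standing in for csp's UNSET


def value_at(history, index_or_time, default=UNSET):
    """Look up a value in `history`, a list of (time, value) pairs, oldest first."""
    now = history[-1][0]  # time of the most recent tick
    if isinstance(index_or_time, int):
        if index_or_time > 0:
            raise ValueError("index must be <= 0")
        pos = len(history) - 1 + index_or_time  # 0 -> newest, -1 -> previous
        if pos >= 0:
            return history[pos][1]
    else:
        if isinstance(index_or_time, timedelta):
            if index_or_time > timedelta(0):
                raise ValueError("timedelta must be negative or zero")
            target = now + index_or_time  # relative: how far back from now
        else:
            target = index_or_time  # absolute datetime
        # Assumption: take the last tick at or before the target time.
        matches = [value for time, value in history if time <= target]
        if matches:
            return matches[-1]
    if default is UNSET:
        raise IndexError("requested history is out of bounds")
    return default


start = datetime(2024, 1, 1)
history = [(start + timedelta(seconds=i), 10 * i) for i in range(4)]

current = value_at(history, 0)
previous = value_at(history, -1)
earlier = value_at(history, timedelta(seconds=-1.5))
missing = value_at(history, -10, default=-1)
```

Here `current` is the newest value, `previous` is one tick back, `earlier` resolves a negative `timedelta` to the value in effect 1.5 seconds before the latest tick, and `missing` falls back to the supplied default because the request is out of bounds.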

To illustrate the usage of history access using the **timedelta** indexing, consider a possible implementation of a function that sums up samples taken every second for each periods of **n_seconds** of the input time series.
To illustrate the usage of history access using the **timedelta** indexing, consider a possible implementation of a function that sums up samples taken every second for each period of **n_seconds** of the input time series.
If the value ticks slower than every second then this implementation could sample the same value more than once (this is just an illustration; it's NOT recommended to use such an implementation in a real application, as it could be implemented more efficiently):

```python
@@ -79,7 +79,7 @@ def sample_sum(n_seconds: int, input: ts[int], default_sample_value: int = 0) ->

## Historical Range Access

In similar fashion, the methods **`csp.values_at`**, **`csp.times_at`** and **`csp.items_at`** can be used to retrieve a range of historical input values as numpy arrays.
In similar fashion, the methods **`csp.values_at`**, **`csp.times_at`** and **`csp.items_at`** can be used to retrieve a range of historical input values as numpy arrays.
The bin generator example above can be accomplished more efficiently with range access:

```python
@@ -93,7 +93,7 @@ def data_bin_generator(bin_size: int, input: ts['T']) -> ts[['T']]:
return csp.values_at(input, -bin_size + 1, 0).tolist()
```

The past values in this example are accessed using **`csp.values_at`**.
The past values in this example are accessed using **`csp.values_at`**.
The various historical access methods take the same arguments and return the values, times, and a tuple of `(times, values)` respectively:

- **`csp.values_at`**`(ts, start_index_or_time, end_index_or_time, start_index_policy=TimeIndexPolicy.INCLUSIVE, end_index_policy=TimeIndexPolicy.INCLUSIVE)`:
@@ -104,13 +104,13 @@ The various historical access methods take the same arguments and return the val
returns a tuple of (times, values) numpy arrays
- **`ts`** - the name of the input
- **`start_index_or_time`**:
- If providing an **index**, this represents how many ticks back to retrieve **and should be \<= 0**.
- If providing an **index**, this represents how many ticks back to retrieve **and should be \<= 0**.
0 indicates the current value, -1 is the previous value, etc.
- If providing  **time** one can either provide a datetime for absolute time, or a timedelta for how far back to access.
**NOTE that timedelta must be negative** to represent time in the past..
- If **None** is provided, the range will begin "from the beginning" - i.e., the oldest tick in the buffer.
- If providing **time** one can either provide a datetime for absolute time, or a timedelta for how far back to access.
**NOTE that timedelta must be negative** to represent time in the past.
- If **None** is provided, the range will begin "from the beginning" - i.e., the oldest tick in the buffer.
- **`end_index_or_time`**: same as `start_index_or_time`
- If **None** is provided, the range will go "until the end" - i.e., the newest tick in the buffer.
- If **None** is provided, the range will go "until the end" - i.e., the newest tick in the buffer.
- **`start_index_policy`**: only for use with datetime/timedelta as the start and end parameters.
- **TimeIndexPolicy.INCLUSIVE**: if there is a tick exactly at the requested time, include it
- **TimeIndexPolicy.EXCLUSIVE**: if there is a tick exactly at the requested time, exclude it
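
The effect of the inclusive and exclusive policies on a time range can be sketched in plain Python with `bisect`. This is a stand-alone illustration rather than `csp` code: the helper `values_between` and its boolean flags are hypothetical, standing in for the `TimeIndexPolicy` values described above.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta


def values_between(times, values, start, end,
                   start_inclusive=True, end_inclusive=True):
    """Return the values whose tick times fall in the requested range.

    `times` must be sorted ascending and aligned with `values`.
    """
    # Inclusive start keeps a tick exactly at `start`; exclusive skips it.
    lo = bisect_left(times, start) if start_inclusive else bisect_right(times, start)
    # Inclusive end keeps a tick exactly at `end`; exclusive skips it.
    hi = bisect_right(times, end) if end_inclusive else bisect_left(times, end)
    return values[lo:hi]


t0 = datetime(2024, 1, 1)
times = [t0 + timedelta(seconds=i) for i in range(5)]
values = [0, 1, 2, 3, 4]

start, end = t0 + timedelta(seconds=1), t0 + timedelta(seconds=3)
inclusive = values_between(times, values, start, end)
exclusive = values_between(times, values, start, end,
                           start_inclusive=False, end_inclusive=False)
```

With ticks exactly at both boundaries, the inclusive range keeps them and the exclusive range drops them; the policies only matter when a tick lands exactly on a requested time.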