Point72 · robambalu · Apr 10, 2024 · Mar 13, 2024 · Mar 13, 2024 · Mar 13, 2024
@@ -19,15 +19,18 @@ CSP (Composable Stream Processing) is a library for high-performance real-time e
 
 ## Get Started
 
-- Tutorials: Go through the introductory tutorials to learn the basics of CSP
-- Examples: Check out various features and use cases
+- [Install `csp`](Installation) and [write your first CSP program](First-Steps)
+- Learn more about [nodes](CSP-Node), [graphs](CSP-Graph), and [execution modes](Execution-Modes)
+- Learn to extend CSP with [adapters](Adapters)
+<!-- - Check out the [examples](Examples) for various CSP features and use cases -->
 
-Tip: Find relevant docs with GitHub’s search function, use `repo:Point72/csp type:wiki <search terms>` to search the documentation Wiki Pages.
+> \[!TIP\]
+> Find relevant docs with GitHub’s search function: use `repo:Point72/csp type:wiki <search terms>` to search the documentation wiki pages.
 
 ## Community
 
-- Developer guide: Learn to build and develop CSP locally
-- Roadmap: Read what’s the development team is excited about
+- [Contribute](Contribute) to CSP and help improve the project
+- Read about future plans in the [project roadmap](Roadmap)
 
 ## License
 

@@ -19,7 +19,7 @@ Notes for editors:
 
 - [CSP Node](CSP-Node)
 - [CSP Graph](CSP-Graph)
-- [Historical Data](Historical-Data)
+- [Historical Buffers](Historical-Buffers)
 - [Execution Modes](Execution-Modes)
 - [Adapters](Adapters)
 - [Caching](Caching)

@@ -10,7 +10,7 @@
 ## Anatomy of a `csp.node`
 
 At the heart of a calculation graph are the `csp.node`s that run the computations.
-`csp.node` methods can take any number of scalar and timeseries arguments, and can return 0 → N timeseries outputs.
+`csp.node` methods can take any number of scalar and timeseries arguments, and can return 0 → N timeseries outputs.
 Timeseries inputs/outputs should be thought of as the edges that connect components of the graph.
 These "edges" can tick whenever they have a new value.
 Every tick is associated with a value and the time of the tick.
@@ -52,7 +52,7 @@ def demo_node(n: int, xs: ts[float], ys: ts[float]) -> ts[float]:        # 2
 
 Let's review it line by line:
 
-1\) Every csp node must start with the **`@csp.node`** decorator
+1\) Every `csp` node must start with the **`@csp.node`** decorator.
 
 2\) `csp` nodes are fully typed and type-checking is strictly enforced.
 All arguments must be typed, as well as all outputs.
@@ -75,7 +75,7 @@ Note that variables declared in state will live across invocations of the method
 9\) An example declaration and initialization of state variable `s_sum`.
 It is good practice to name state variables prefixed with `s_`, which is the convention in the `csp` codebase.
 
-11\) **`with csp.start()`**: an optional block to execute code at the start of the engine.
+11\) **`with csp.start()`**: an optional block to execute code at the start of the engine.
 Generally this is used to set up initial timers, to set input timeseries properties such as buffer sizes, or to make inputs passive.
 
 14-15) **`csp.set_buffering_policy`**: nodes can request that a certain amount of history be kept on the incoming time series; this can be specified either as a number of ticks or as a length of time.
@@ -92,8 +92,8 @@ Note that `schedule_alarm` can be called multiple times on the same alarm to sch
 19\) **`with csp.stop()`** is an optional block that can be called when the engine is done running.
 
 22\) All nodes will have `if` conditions to react to different inputs.
-**`csp.ticked()`** takes any number of inputs and returns true if **any** of the inputs ticked.
-**`csp.valid`** similar takes any number of inputs however it only returns true if **all** inputs are valid.
+**`csp.ticked()`** takes any number of inputs and returns true if **any** of the inputs ticked.
+**`csp.valid`** similarly takes any number of inputs; however, it only returns true if **all** inputs are valid.
 Valid means that an input has had at least one tick and so it has a "current value".
 
 23\) One of the benefits of `csp` is that you always have easy access to the latest value of all inputs.
@@ -105,7 +105,7 @@ Valid means that an input has had at least one tick and so it has a "current val
 
 ## Basket inputs
 
-In addition to single time-series inputs, a node can also accept a **basket** of time series as an argument.
+In addition to single time-series inputs, a node can also accept a **basket** of time series as an argument.
 A basket is essentially a collection of timeseries which can be passed in as a single argument.
 Baskets can either be list baskets or dict baskets.
 Individual timeseries in a basket can tick independently, and they can be looked at and reacted to individually or as a collection.
@@ -147,18 +147,19 @@ The convention is the same as passing multiple inputs to `csp.ticked`, `csp.tick
 - **`validvalues`**: iterator of values of all valid inputs
 - **`validkeys`**: iterator of keys of all valid inputs
 - **`validitems`**: iterator of (key,value) tuples of valid inputs
-- **`keys`**: list of keys on the basket (**dictionary baskets only** )
+- **`keys`**: list of keys on the basket (**dictionary baskets only**)
 
 10-11) This demonstrates the ability to access an individual element of a
 basket and react to it as well as access its current value
 
 ## **Node Outputs**
 
 Nodes can return any number of outputs (including no outputs, in which case it is considered an "output" or sink node,
-see [Graph Pruning](https://github.com/Point72/csp/wiki/0.-Introduction#graph-pruning)).
+see [Graph Pruning](https://github.com/Point72/csp/wiki/0.-Introduction#graph-pruning)).
 Nodes with single outputs can return the output as an unnamed output.
 Nodes returning multiple outputs must have them be named.
-When a node is called at graph building time, if its is a single unnamed node the return variable is an edge representing the output which can be passed into other nodes.
+When a node is called at graph building time, if it has a single unnamed output, the return value is an edge representing that output, which can be passed into other nodes.
+An output timeseries cannot be ticked more than once in a given node invocation.
 If the outputs are named, the return value is an object with the outputs available as attributes.
 For example (the examples below also demonstrate various ways to output the data):
 
@@ -202,44 +203,38 @@ Similarly to inputs, a node can also produce a basket of timeseries as an output
 For example:
 
 ```python
-class MyStruct(csp.Struct):            # 1
-    symbol: str                        # 2
-    index: int                         # 3
-    value: float                       # 4
-                                       # 5
-@csp.node                              # 6
-def demo_basket_output_node(           # 7
-    in_: ts[MyStruct],                 # 8
-    symbols: [str],                    # 9
-    num_symbols: int                   # 10
-) -> csp.Outputs(                      # 11
-    dict_basket=csp.OutputBasket(      # 12
-        Dict[str, ts[float]],          # 13
-        shape="symbols",               # 14
-    ),                                 # 15
-    list_basket=csp.OutputBasket(      # 16
-        List[ts[float]],               # 17
-        shape="num_symbols"            # 18
-    ),                                 # 19
-):                                     # 20
-                                       # 21
-    if csp.ticked(in_):                # 22
-        # output to dict basket        # 23
-        csp.output(dict_basket[in_.symbol], in_.value)
-        # alternate output syntax, can output multiple keys at once
-        # csp.output(dict_basket={in_.symbol: in_.value})
-        # output to list basket
-        csp.output(list_basket[in_.index], in_.value)
-        # alternate output syntax, can output multiple keys at once
-        # csp.output(list_basket={in_.index: in_.value})
+class MyStruct(csp.Struct):                                               # 1
+    symbol: str                                                           # 2
+    index: int                                                            # 3
+    value: float                                                          # 4
+                                                                          # 5
+@csp.node                                                                 # 6
+def demo_basket_output_node(                                              # 7
+    in_: ts[MyStruct],                                                    # 8
+    symbols: [str],                                                       # 9
+    num_symbols: int                                                      # 10
+) -> csp.Outputs(                                                         # 11
+    dict_basket=csp.OutputBasket(Dict[str, ts[float]], shape="symbols"),  # 12
+    list_basket=csp.OutputBasket(List[ts[float]], shape="num_symbols"),   # 13
+):                                                                        # 14
+                                                                          # 15
+    if csp.ticked(in_):                                                   # 16
+        # output to dict basket                                           # 17
+        csp.output(dict_basket[in_.symbol], in_.value)                    # 18
+        # alternate output syntax, can output multiple keys at once       # 19
+        # csp.output(dict_basket={in_.symbol: in_.value})                 # 20
+        # output to list basket                                           # 21
+        csp.output(list_basket[in_.index], in_.value)                     # 22
+        # alternate output syntax, can output multiple keys at once       # 23
+        # csp.output(list_basket={in_.index: in_.value})                  # 24
 ```
 
-11-20) Note the output declaration syntax.
+11-14) Note the output declaration syntax.
 A basket output can be either named or unnamed (both examples here are named), and its shape can be specified two ways.
-The `shape` parameter is used with a scalar value that defines the shape of the basket, or the name of the scalar argument (a dict basket expects shape to be a list of keys. lists basket expects `shape` to be an `int`).
+The `shape` parameter is used with a scalar value that defines the shape of the basket, or with the name of a scalar argument (a dict basket expects `shape` to be a list of keys; a list basket expects `shape` to be an `int`).
 `shape_of` is used to take the shape of an input basket and apply it to the output basket.
 
-23+) There are several choices for output syntax.
+17+) There are several choices for output syntax.
 The following work for both list and dict baskets:
 
 - `csp.output(basket={key: value, key2: value2, ...})`
@@ -272,5 +267,5 @@ def const(value: '~T') -> ts['T']:
 `sample` takes a timeseries of type `'T'` as an input, and returns a timeseries of type `'T'`.
 This allows us to pass in a `ts[int]` for example, and get a `ts[int]` as an output, or `ts[bool]` → `ts[bool]`
 
-`const` takes value as an *instance* of type `T`, and returns a timeseries of type `T`.
-So we can call `const(5)` and get a `ts[int]` output, or `const('hello!')` and get a `ts[str]` output, etc...
+`const` takes value as an *instance* of type `T`, and returns a timeseries of type `T`.
+So we can call `const(5)` and get a `ts[int]` output, or `const('hello!')` and get a `ts[str]` output, etc.
@@ -6,10 +6,10 @@
 
 ## Historical Buffers
 
-`csp` provides access to historical input data as well.
-By default only the last value of an input is kept in memory, however one can request history to be kept on an input either by number of ticks or by time using **csp.set_buffering_policy.**
+`csp` can provide access to historical input data as well.
+By default, only the last value of an input is kept in memory; however, one can request that history be kept on an input, either by number of ticks or by time, using **csp.set_buffering_policy**.
 
-The methods **csp.value_at**, **csp.time_at** and **csp.item_at** can be used to retrieve historical input values.
+The methods **csp.value_at**, **csp.time_at** and **csp.item_at** can be used to retrieve historical input values.
 Each node should call **csp.set_buffering_policy** to make sure that its inputs are configured to store a history long enough for the node to work correctly.
 For example, let's assume that we have a stream of data and we want to create equally sized buckets from the data.
 A possible implementation of such a node would be:
@@ -29,7 +29,7 @@ In this example, we use **`csp.set_buffering_policy(input, tick_count=bin_size)`
 Note that an input can be shared by multiple nodes; if multiple nodes provide size requirements, the buffer size is resolved to the maximum size needed to support all requests.
 
 Alternatively, **`csp.set_buffering_policy`** supports a **`timedelta`** parameter **`tick_history`** instead of **`tick_count`**.
-If **`tick_history`** is provided, the buffer will scale dynamically to ensure that any period of length **`tick_history`** will fit into the history buffer.
+If **`tick_history`** is provided, the buffer will scale dynamically to ensure that any period of length **`tick_history`** will fit into the history buffer.
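
The difference from a fixed tick count can be pictured with a small plain-Python model. This is only a sketch: `TimeWindowBuffer` is a hypothetical class that does not use `csp` and makes no claim about how the engine stores history internally. It assumes that ticks older than the newest tick minus `tick_history` are no longer needed, so the number of stored ticks grows and shrinks with the tick rate.

```python
from collections import deque
from datetime import datetime, timedelta


class TimeWindowBuffer:
    """Hypothetical model of an input buffered by a time window (not csp's internals)."""

    def __init__(self, tick_history: timedelta):
        self.tick_history = tick_history
        self._ticks = deque()  # (time, value) pairs, oldest first

    def tick(self, time: datetime, value):
        self._ticks.append((time, value))
        cutoff = time - self.tick_history
        # Drop ticks that fell out of the window; the length is not fixed in advance.
        while self._ticks[0][0] < cutoff:
            self._ticks.popleft()

    def __len__(self):
        return len(self._ticks)


start = datetime(2024, 1, 1)
buf = TimeWindowBuffer(tick_history=timedelta(seconds=5))
sizes = []
for seconds in [0, 1, 2, 3, 10, 10.5, 11, 20]:
    buf.tick(start + timedelta(seconds=seconds), seconds)
    sizes.append(len(buf))
print(sizes)  # buffer length after each tick
```

In this model the buffer holds four ticks while data arrives once per second, then shrinks to a single tick after a long gap.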
 
 To identify when there are enough samples to construct a bin we use **`csp.num_ticks(input) % bin_size == 0`**.
 The function **`csp.num_ticks`** returns the total number of ticks for a given time series.
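
The modulo check can be illustrated without `csp` at all. In the sketch below, `TickCountBuffer` and `bin_stream` are hypothetical names: the buffer keeps only the newest `tick_count` values but counts every tick it has ever seen, and a bin is emitted whenever that running count is a multiple of `bin_size`.

```python
from collections import deque


class TickCountBuffer:
    """Hypothetical stand-in for a csp input with tick_count buffering."""

    def __init__(self, tick_count: int):
        self._values = deque(maxlen=tick_count)  # only the newest values are kept
        self.num_ticks = 0                       # total ticks ever seen

    def tick(self, value):
        self._values.append(value)
        self.num_ticks += 1

    def value_at(self, index: int):
        """index <= 0: 0 is the current value, -1 the previous one, etc."""
        if index > 0 or -index >= len(self._values):
            raise IndexError("requested tick is outside the history buffer")
        return self._values[len(self._values) - 1 + index]


def bin_stream(values, bin_size: int):
    """Yield one list per full bin, oldest value first."""
    buf = TickCountBuffer(tick_count=bin_size)
    for v in values:
        buf.tick(v)
        if buf.num_ticks % bin_size == 0:
            yield [buf.value_at(-i) for i in range(bin_size - 1, -1, -1)]


bins = list(bin_stream(range(1, 8), bin_size=3))
print(bins)  # the seventh value never completes a bin
```

The total tick count keeps growing while the stored history stays capped at `bin_size` entries, which is why the count, not the buffer length, drives the check.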
@@ -38,21 +38,21 @@ NOTE: The actual size of the history buffer is usually less than **`csp.num_tick
 The past values in this example are accessed using **`csp.value_at`**.
 The various historical access methods take the same arguments and return the value, time and tuple of `(time,value)` respectively:
 
-- **`csp.value_at`**`(ts, index_or_time, duplicate_policy=DuplicatePolicy.LAST_VALUE, default=UNSET)`: returns **value** at requested `index_or_time`
-- **`csp.time_at`**`(ts, index_or_time, duplicate_policy=DuplicatePolicy.LAST_VALUE, default=UNSET)`: returns **datetime** at requested `index_or_time`
-- **`csp.item_at`**`(ts, index_or_time, duplicate_policy=DuplicatePolicy.LAST_VALUE, default=UNSET)`: returns tuple of `(datetime,value)` at requested `index_or_time`
+- **`csp.value_at`**`(ts, index_or_time, duplicate_policy=DuplicatePolicy.LAST_VALUE, default=UNSET)`: returns **value** of the timeseries at requested `index_or_time`
+- **`csp.time_at`**`(ts, index_or_time, duplicate_policy=DuplicatePolicy.LAST_VALUE, default=UNSET)`: returns **datetime** of the timeseries at requested `index_or_time`
+- **`csp.item_at`**`(ts, index_or_time, duplicate_policy=DuplicatePolicy.LAST_VALUE, default=UNSET)`: returns tuple of `(datetime,value)` of the timeseries at requested `index_or_time`
   - **`ts`**: the name of the input
   - **`index_or_time`**:
     - If providing an **index**, this represents how many ticks back to retrieve **and should be \<= 0**.
       0 indicates the current value, -1 is the previous value, etc.
-    - If providing **time** one can either provide a datetime for absolute time, or a timedelta for how far back to access.
+    - If providing **time** one can either provide a datetime for absolute time, or a timedelta for how far back to access.
       **NOTE** that timedelta must be negative to represent time in the past.
   - **`duplicate_policy`**: when requesting history by datetime or timedelta, it's possible that there could be multiple values that match the given time.
     **`duplicate_policy`** can be provided to control the behavior of what to return in this case.
     The default policy is to return the LAST_VALUE that exists at the given time.
-  - **`default`**: value to be returned if the requested time is out of the history bounds (if default is not provided and a request is out of bounds an exception will be raised).
+  - **`default`**: value to be returned if the requested time is out of the history bounds (if default is not provided and a request is out of bounds an exception will be raised).
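
The argument conventions above can be mimicked with a toy lookup over a plain list of ticks. This is a sketch, not the `csp` implementation: the `value_at` function below is a hypothetical stand-in, `_UNSET` imitates the `UNSET` sentinel, and the code assumes that a time-based request returns the last tick at or before the requested time (which also yields the last of several ticks sharing one timestamp).

```python
from bisect import bisect_right
from datetime import datetime, timedelta

_UNSET = object()  # sentinel standing in for csp's UNSET


def value_at(history, now, index_or_time, default=_UNSET):
    """Toy lookup over `history`, a time-ordered list of (datetime, value) ticks.

    Assumption: time lookups return the last tick at or before the requested time.
    """
    if isinstance(index_or_time, int):
        if index_or_time > 0:
            raise ValueError("index must be <= 0")
        pos = len(history) - 1 + index_or_time
    else:
        if isinstance(index_or_time, timedelta):
            if index_or_time > timedelta(0):
                raise ValueError("timedelta must not point into the future")
            index_or_time = now + index_or_time  # relative -> absolute time
        times = [t for t, _ in history]
        pos = bisect_right(times, index_or_time) - 1  # last tick with time <= target
    if pos < 0:
        if default is _UNSET:
            raise IndexError("request is outside the history bounds")
        return default
    return history[pos][1]


start = datetime(2024, 1, 1)
history = [(start + timedelta(seconds=s), v) for s, v in [(0, 10), (1, 11), (1, 12), (3, 13)]]
now = history[-1][0]

print(value_at(history, now, 0))                      # current value
print(value_at(history, now, -1))                     # previous tick
print(value_at(history, now, timedelta(seconds=-2)))  # as of two seconds ago
print(value_at(history, now, -10, default=None))      # out of bounds, default returned
```

Without a `default`, the last call would raise instead of returning a value, mirroring the out-of-bounds behavior described for the real methods.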
 
-To illustrate the usage of history access using the **timedelta** indexing, consider a possible implementation of a function that sums up samples taken every second for each periods of **n_seconds** of the input time series.
+To illustrate the usage of history access using **timedelta** indexing, consider a possible implementation of a function that sums up samples taken every second for each period of **n_seconds** of the input time series.
 If the value ticks slower than every second then this implementation could sample the same value more than once (this is just an illustration; using such an implementation in a real application is NOT recommended, as it could be implemented more efficiently):
 
 ```python
@@ -79,7 +79,7 @@ def sample_sum(n_seconds: int, input: ts[int], default_sample_value: int = 0) ->
 
 ## Historical Range Access
 
-In similar fashion, the methods **`csp.values_at`**, **`csp.times_at`** and **`csp.items_at`** can be used to retrieve a range of historical input values as numpy arrays.
+In similar fashion, the methods **`csp.values_at`**, **`csp.times_at`** and **`csp.items_at`** can be used to retrieve a range of historical input values as numpy arrays.
 The bin generator example above can be accomplished more efficiently with range access:
 
 ```python
@@ -93,7 +93,7 @@ def data_bin_generator(bin_size: int, input: ts['T']) -> ts[['T']]:
         return csp.values_at(input, -bin_size + 1, 0).tolist()
 ```
 
-The past values in this example are accessed using **`csp.values_at`**.
+The past values in this example are accessed using **`csp.values_at`**.
 The various historical range access methods take the same arguments and return the values, times and tuple of `(times, values)` respectively:
 
 - **`csp.values_at`**`(ts, start_index_or_time, end_index_or_time, start_index_policy=TimeIndexPolicy.INCLUSIVE, end_index_policy=TimeIndexPolicy.INCLUSIVE)`:
@@ -104,13 +104,13 @@ The various historical access methods take the same arguments and return the val
   returns a tuple of (times, values) numpy arrays
   - **`ts`** - the name of the input
   - **`start_index_or_time`**:
-    - If providing an **index**, this represents how many ticks back to retrieve **and should be \<= 0**.
+    - If providing an **index**, this represents how many ticks back to retrieve **and should be \<= 0**.
       0 indicates the current value, -1 is the previous value, etc.
-    - If providing  **time** one can either provide a datetime for absolute time, or a timedelta for how far back to access.
-      **NOTE that timedelta must be negative** to represent time in the past..
-    - If **None** is provided, the range will begin "from the beginning" - i.e., the oldest tick in the buffer.
+    - If providing **time**, one can either provide a datetime for absolute time, or a timedelta for how far back to access.
+      **NOTE that timedelta must be negative** to represent time in the past.
+    - If **None** is provided, the range will begin "from the beginning" - i.e., the oldest tick in the buffer.
   - **`end_index_or_time`**: same as `start_index_or_time`
-    - If **None** is provided, the range will go "until the end" - i.e., the newest tick in the buffer.
+    - If **None** is provided, the range will go "until the end" - i.e., the newest tick in the buffer.
   - **`start_index_policy`**: only for use with datetime/timedelta as the start and end parameters.
     - **`TimeIndexPolicy.INCLUSIVE`**: if there is a tick exactly at the requested time, include it
     - **`TimeIndexPolicy.EXCLUSIVE`**: if there is a tick exactly at the requested time, exclude it