|
1 |
| -# SIP 008 Clarity Parsing and Analysis Cost Assessment |
2 |
| - |
3 |
| -## Preamble |
4 |
| - |
5 |
| -Title: Clarity Parsing and Analysis Cost Assessment |
6 |
| - |
7 |
| -Author: Aaron Blankstein <[email protected]> |
8 |
| - |
9 |
| -Status: Draft |
10 |
| - |
11 |
| -Type: Standard |
12 |
| - |
13 |
| -Created: 03/05/2020 |
14 |
| - |
15 |
| -License: BSD 2-Clause |
16 |
| - |
17 |
| -# Abstract |
18 |
| - |
19 |
| -This document describes the measured costs and asymptotic costs |
20 |
| -assessed for parsing Clarity code into an abstract syntax tree (AST) |
21 |
| -and the static analysis of that Clarity code (type-checking and |
22 |
| -read-only enforcement). This will not specify the _constants_ |
23 |
| -associated with those asymptotic cost functions. Those constants will |
24 |
| -necessarily be measured via benchmark harnesses and regression |
25 |
| -analyses. |
26 |
| - |
27 |
| -# Measurements for Execution Cost |
28 |
| - |
29 |
| -The cost of analyzing Clarity code is measured using the same 5 categories |
30 |
| -described in SIP-006 for the measurement of execution costs: |
31 |
| - |
32 |
| -1. Runtime cost: captures the number of cycles that a single |
33 |
| - processor would require to process the Clarity block. This is a |
34 |
| - _unitless_ metric, so it will not correspond directly to cycles, |
35 |
| - but rather is meant to provide a basis for comparison between |
36 |
| - different Clarity code blocks. |
37 |
| -2. Data write count: captures the number of independent writes |
38 |
| - performed on the underlying data store (see SIP-004). |
39 |
| -3. Data read count: captures the number of independent reads |
40 |
| - performed on the underlying data store. |
41 |
| -4. Data write length: the number of bytes written to the underlying |
42 |
| - data store. |
43 |
| -5. Data read length: the number of bytes read from the underlying |
44 |
| - data store. |
45 |
| - |
46 |
| -Importantly, these costs are used to set a _block limit_ for each |
47 |
| -block. When it comes to selecting transactions for inclusion in a |
48 |
| -block, miners are free to make their own choices based on transaction |
49 |
| -fees; however, blocks may not exceed the _block limit_. If they do so, |
50 |
| -the block is considered invalid by the network --- none of the block's |
51 |
| -transactions will be materialized and the leader forfeits all rewards |
52 |
| -from the block. |
53 |
| - |
54 |
| -Costs for static analysis are assessed during the _type check_ pass. |
55 |
| -The read-only and trait-checking passes perform work which is strictly |
56 |
| -less than the work performed during type checking, and therefore, the |
57 |
| -cost assessment can safely fold any costs that would be incurred during |
58 |
| -those passes into the type checking pass. |
59 |
| - |
60 |
| -# Common Analysis Metrics and Costs |
61 |
| - |
62 |
| -## AST Parsing |
63 |
| - |
64 |
| -The Clarity parser has a runtime that is linear with respect to the Clarity |
65 |
| -program length. |
66 |
| - |
67 |
| -``` |
68 |
| -a*X+b |
69 |
| -``` |
70 |
| - |
71 |
| -where a and b are constants, and |
72 |
| - |
73 |
| -X := the program length in bytes |
74 |
| - |
75 |
| -## Dependency cycle detection |
76 |
| - |
77 |
| -Clarity performs cycle detection for intra-contract dependencies (e.g., |
78 |
| -functions that depend on one another). This detection is linear in the |
79 |
| -number of dependency edges in the smart contract: |
80 |
| - |
81 |
| -``` |
82 |
| -a*X+b |
83 |
| -``` |
84 |
| - |
85 |
| -where a and b are constants, and |
86 |
| -X := the total number of dependency edges in the smart contract |
87 |
| - |
88 |
| -Dependency edges are created anytime a top-level definition refers |
89 |
| -to another top-level definition. |
90 |
| - |
91 |
| -## Type signature size |
92 |
| - |
93 |
| -Types in Clarity may be described using type signatures. For example, |
94 |
| -`(tuple (a int) (b int))` describes a tuple with two keys `a` and `b` |
95 |
| -of type `int`. These type descriptions are used by the Clarity analysis |
96 |
| -passes to check the type correctness of Clarity code. Clarity type signatures |
97 |
| -have varying size, e.g., the signature `int` is smaller than the signature for a |
98 |
| -list of integers. |
99 |
| - |
100 |
| -The signature size of a Clarity type is defined as follows: |
101 |
| - |
102 |
| -``` |
103 |
| -type_signature_size(x) := |
104 |
| - if x = |
105 |
| - int => 1 |
106 |
| - uint => 1 |
107 |
| - bool => 1 |
108 |
| - principal => 1 |
109 |
| - buffer => 2 |
110 |
| - optional => 1 + type_signature_size(entry_type) |
111 |
| - response => 1 + type_signature_size(ok_type) + type_signature_size(err_type) |
112 |
| - list => 2 + type_signature_size(entry_type) |
113 |
| - tuple => 1 + 2*(count(entries)) |
114 |
| - + sum(type_signature_size for each entry) |
115 |
| - + sum(len(key_name) for each entry) |
116 |
| -``` |
117 |
| - |
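The recursive definition above can be sketched in Python. This is a minimal illustration, not the Clarity implementation: the `TypeSig` class and its field names are hypothetical, and key-name length is assumed to be counted in characters.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TypeSig:
    """Hypothetical stand-in for a Clarity type signature (illustration only)."""
    kind: str
    entry: Optional["TypeSig"] = None   # element type for optional / list
    ok: Optional["TypeSig"] = None      # ok type for response
    err: Optional["TypeSig"] = None     # err type for response
    fields: Dict[str, "TypeSig"] = field(default_factory=dict)  # tuple entries


# Sizes of the non-recursive types, taken from the definition above.
ATOM_SIZES = {"int": 1, "uint": 1, "bool": 1, "principal": 1, "buffer": 2}


def type_signature_size(t: TypeSig) -> int:
    if t.kind in ATOM_SIZES:
        return ATOM_SIZES[t.kind]
    if t.kind == "optional":
        return 1 + type_signature_size(t.entry)
    if t.kind == "response":
        return 1 + type_signature_size(t.ok) + type_signature_size(t.err)
    if t.kind == "list":
        return 2 + type_signature_size(t.entry)
    if t.kind == "tuple":
        return (
            1
            + 2 * len(t.fields)
            + sum(type_signature_size(v) for v in t.fields.values())
            + sum(len(name) for name in t.fields)
        )
    raise ValueError(f"unknown type kind: {t.kind}")


INT = TypeSig("int")
# (tuple (a int) (b int)): 1 + 2*2 + (1 + 1) + (1 + 1) = 9
pair = TypeSig("tuple", fields={"a": INT, "b": INT})
print(type_signature_size(pair))                        # 9
print(type_signature_size(TypeSig("list", entry=INT)))  # 3
```

Under this definition the example tuple from the text has size 9, while a list of integers has size 3, so tuple size is dominated by the number and names of its keys.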
118 |
| -## Type annotation |
119 |
| - |
120 |
| -Each node in a Clarity contract's AST is annotated with the type value |
121 |
| -for that node during the type checking analysis pass. |
122 |
| - |
123 |
| -The runtime cost of type annotation is: |
124 |
| - |
125 |
| -``` |
126 |
| -a + b*X |
127 |
| -``` |
128 |
| - |
129 |
| -where a and b are constants, and X is the type signature size of the |
130 |
| -type being annotated. |
131 |
| - |
132 |
| -## Variable lookup |
133 |
| - |
134 |
| -Looking up variables during static analysis incurs a non-constant cost -- the stack |
135 |
| -depth _and_ the length of the variable name affect this cost. However, |
136 |
| -variable names in Clarity have bounded length -- 128 characters. Therefore, |
137 |
| -the cost assessed for variable lookups may safely be constant with respect |
138 |
| -to name length. |
139 |
| - |
140 |
| -The stack depth affects the lookup cost because the variable must be |
141 |
| -checked for in each context on the stack. |
142 |
| - |
143 |
| -Cost Function: |
144 |
| - |
145 |
| -``` |
146 |
| -a*X+b*Y+c |
147 |
| -``` |
148 |
| - |
149 |
| -where a, b, and c are constants, |
150 |
| -X := stack depth |
151 |
| -Y := the type size of the looked up variable |
152 |
| - |
153 |
| -## Function Lookup |
154 |
| - |
155 |
| -Looking up a function incurs a constant cost with respect |
156 |
| -to name length (for the same reason as variable lookup). However, |
157 |
| -because functions may only be defined in the top-level contract |
158 |
| -context, stack depth does not affect function lookup. |
159 |
| - |
160 |
| -Cost Function: |
161 |
| - |
162 |
| -``` |
163 |
| -a*X + b |
164 |
| -``` |
165 |
| - |
166 |
| -where a and b are constants, |
167 |
| -X := the sum of the type sizes for the function signature (each argument's type size, as well |
168 |
| - as the function's return type) |
169 |
| - |
170 |
| -## Name Binding |
171 |
| - |
172 |
| -The cost of binding a name in Clarity -- in either a local or the contract |
173 |
| -context -- is _constant_ with respect to the length of the name, but linear in |
174 |
| -the size of the type signature. |
175 |
| - |
176 |
| -``` |
177 |
| -binding_cost = a + b*X |
178 |
| -``` |
179 |
| - |
180 |
| -where a and b are constants, and |
181 |
| -X := the size of the bound type signature |
182 |
| - |
183 |
| -## Type check cost |
184 |
| - |
185 |
| -The cost of a static type check is _linear_ in the size of the type signature: |
186 |
| - |
187 |
| -``` |
188 |
| -type_check_cost(expected, actual) := |
189 |
| - a + b*X |
190 |
| -``` |
191 |
| - |
192 |
| -where a and b are constants, and |
193 |
| - |
194 |
| -X := `max(type_signature_size(expected), type_signature_size(actual))` |
195 |
| - |
196 |
| -## Function Application |
197 |
| - |
198 |
| -Static analysis of a function application in Clarity requires |
199 |
| -type checking the function's expected arguments against the |
200 |
| -supplied types. |
201 |
| - |
202 |
| -The cost of applying a function is: |
203 |
| - |
204 |
| - |
205 |
| -``` |
206 |
| -a + sum(type_check_cost(expected, actual) for each argument) |
207 |
| -``` |
208 |
| - |
209 |
| -where a is a constant. |
210 |
| - |
211 |
| -This is also the _entire_ cost of type analysis for most function calls |
212 |
| -(e.g., intra-contract function calls, most simple native functions). |
213 |
| - |
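As a rough sketch of how the type check and function application formulas compose, the following Python works directly on type signature sizes. The constants are unspecified in this document, so every value passed below is a placeholder chosen for illustration, and the parameter names are assumptions.

```python
from typing import Iterable, Tuple


def type_check_cost(expected_size: int, actual_size: int, a: int, b: int) -> int:
    """a + b*X, where X is the larger of the two type signature sizes."""
    return a + b * max(expected_size, actual_size)


def function_application_cost(
    arg_sizes: Iterable[Tuple[int, int]],  # (expected_size, actual_size) per argument
    base: int,                             # the constant term of the application cost
    check_a: int,                          # constant `a` of type_check_cost
    check_b: int,                          # constant `b` of type_check_cost
) -> int:
    """base + sum of type_check_cost over every argument."""
    return base + sum(
        type_check_cost(expected, actual, check_a, check_b)
        for expected, actual in arg_sizes
    )


# Placeholder constants; real values would come from benchmarks.
# Two arguments: an int checked against an int (sizes 1, 1) and a
# size-9 tuple checked against a size-3 list (sizes 9, 3).
cost = function_application_cost([(1, 1), (9, 3)], base=10, check_a=2, check_b=3)
print(cost)  # 10 + (2 + 3*1) + (2 + 3*9) = 44
```

Because each check is charged on the larger of the two signatures, a mismatched call is billed at least as much as a well-typed one.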
214 |
| -## Iterating the AST |
215 |
| - |
216 |
| -Static analysis iterates over the entire program's AST in the type checker, |
217 |
| -the trait checker, and in the read-only checker. This cost is assessed |
218 |
| -as a constant cost for each node visited in the AST during the type |
219 |
| -checking pass. |
220 |
| - |
221 |
| -# Special Function Costs |
222 |
| - |
223 |
| -Some functions require additional work from the static analysis system. |
224 |
| - |
225 |
| -## Functions on sequences (e.g., map, filter, fold) |
226 |
| - |
227 |
| -Functions on sequences need to perform an additional check that the |
228 |
| -supplied type is a list or buffer before performing the normal |
229 |
| -argument type checking. This cost is assessed as: |
230 |
| - |
231 |
| -``` |
232 |
| -a |
233 |
| -``` |
234 |
| - |
235 |
| -where a is a constant. |
236 |
| - |
237 |
| -## Functions on options/responses |
238 |
| - |
239 |
| -Similarly to the functions on sequences, option/response functions |
240 |
| -must perform a simple check to see if the supplied input is an option or |
241 |
| -response before performing additional argument type checking. This cost is |
242 |
| -assessed as: |
243 |
| - |
244 |
| -``` |
245 |
| -a |
246 |
| -``` |
247 |
| - |
248 |
| -## Data functions (ft balance checks, nft lookups, map-get?, ...) |
249 |
| - |
250 |
| -Static checks on intra-contract data functions do not require database lookups |
251 |
| -(unlike the runtime costs of these functions). Rather, these functions |
252 |
| -incur normal type lookup (i.e., fetching the type of an NFT, data map, or data var) |
253 |
| -and type checking costs. |
254 |
| - |
255 |
| -## get |
256 |
| - |
257 |
| -Checking a tuple _get_ requires accessing the tuple's signature |
258 |
| -for the specific field. This has runtime cost: |
259 |
| - |
260 |
| -``` |
261 |
| -a*log(N) + b |
262 |
| -``` |
263 |
| -where a and b are constants, and |
264 |
| - |
265 |
| -N := the number of fields in the tuple type |
266 |
| - |
267 |
| -## tuple |
268 |
| - |
269 |
| -Constructing a tuple requires building the tuple's BTree for |
270 |
| -accessing fields. This has runtime cost: |
271 |
| - |
272 |
| - |
273 |
| -``` |
274 |
| -a*N*log(N) + b |
275 |
| -``` |
276 |
| -where a and b are constants, and |
277 |
| - |
278 |
| -N := the number of fields in the tuple type |
279 |
| - |
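To compare how the `get` and `tuple` costs grow with the number of fields, here is a small Python sketch. The logarithm base is not stated in this document; base 2 is assumed here, which only rescales the constant `a`. The constants passed in are placeholders.

```python
import math


def tuple_get_cost(n_fields: int, a: float, b: float) -> float:
    """a*log(N) + b: cost of checking a `get` on a tuple type with N fields."""
    return a * math.log2(n_fields) + b


def tuple_construct_cost(n_fields: int, a: float, b: float) -> float:
    """a*N*log(N) + b: cost of checking the construction of an N-field tuple."""
    return a * n_fields * math.log2(n_fields) + b


# Placeholder constants (a=1, b=0) to expose the growth rates only.
for n in (1, 8, 64):
    print(n, tuple_get_cost(n, 1, 0), tuple_construct_cost(n, 1, 0))
```

With these placeholder constants, an 8-field tuple costs 3 units to look up a field and 24 units to construct, which reflects one logarithmic insertion per field.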
280 |
| -## use-trait |
281 |
| - |
282 |
| -Importing a trait imposes two kinds of costs on the analysis. |
283 |
| -First, the import requires a database read. Second, the imported |
284 |
| -trait is included in the static analysis output -- this increases |
285 |
| -the total storage usage and write length of the static analysis. |
286 |
| - |
287 |
| -The costs are defined as: |
288 |
| - |
289 |
| -``` |
290 |
| -read_count = 1 |
291 |
| -write_count = 0 |
292 |
| -runtime = a*X+b |
293 |
| -write_length = c*X+d |
294 |
| -read_length = c*X+d |
295 |
| -``` |
296 |
| - |
297 |
| -where a, b, c, and d are constants, and |
298 |
| - |
299 |
| -X := the total type size of the trait (i.e., the sum of the |
300 |
| - type sizes of each function signature). |
301 |
| - |
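The `use-trait` costs span all five cost categories, so they can be pictured as a five-component record. The sketch below is illustrative only: the `AnalysisCost` class is hypothetical, and the constants are placeholders rather than measured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalysisCost:
    """Hypothetical container for the five cost categories (illustration only)."""
    runtime: int = 0
    read_count: int = 0
    read_length: int = 0
    write_count: int = 0
    write_length: int = 0

    def __add__(self, other: "AnalysisCost") -> "AnalysisCost":
        # Costs accumulate component-wise across analysis steps.
        return AnalysisCost(
            self.runtime + other.runtime,
            self.read_count + other.read_count,
            self.read_length + other.read_length,
            self.write_count + other.write_count,
            self.write_length + other.write_length,
        )


def use_trait_cost(trait_type_size: int, a: int, b: int, c: int, d: int) -> AnalysisCost:
    """Costs of importing a trait whose total type size is X = trait_type_size."""
    x = trait_type_size
    return AnalysisCost(
        runtime=a * x + b,
        read_count=1,            # one database read for the import
        read_length=c * x + d,
        write_count=0,
        write_length=c * x + d,  # the trait is stored in the analysis output
    )


# Placeholder constants; real values would come from benchmarks.
total = use_trait_cost(20, a=1, b=2, c=3, d=4) + use_trait_cost(5, a=1, b=2, c=3, d=4)
print(total)
```

Summing records component-wise mirrors how per-expression costs would be folded into a single total that can be compared against a block limit.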
302 |
| -## contract-call? |
303 |
| - |
304 |
| -Checking a contract call requires a database lookup to inspect |
305 |
| -the function signature of a prior smart contract. |
306 |
| - |
307 |
| -The costs are defined as: |
308 |
| - |
309 |
| -``` |
310 |
| -read_count = 1 |
311 |
| -read_length = a*X+b |
312 |
| -runtime = c*X+d |
313 |
| -``` |
314 |
| - |
315 |
| -where a, b, c, and d are constants, and |
316 |
| - |
317 |
| -X := the total type size of the function signature |
318 |
| - |
319 |
| -## let |
320 |
| - |
321 |
| -Let bindings require the static analysis system to iterate over |
322 |
| -each let binding and ensure that they are syntactically correct. |
323 |
| - |
324 |
| -This imposes a runtime cost: |
325 |
| - |
326 |
| -``` |
327 |
| -a*X + b |
328 |
| -``` |
329 |
| -where a and b are constants, and |
330 |
| - |
331 |
| -X := the number of entries in the let binding. |
| 1 | +# SIP-008 Clarity Parsing and Analysis Cost Assessment |
332 | 2 |
|
| 3 | +This document formerly contained SIP-008 before the Stacks 2.0 mainnet launched. |
333 | 4 |
|
| 5 | +This SIP is now located in the [stacksgov/sips repository](https://github.com/stacksgov/sips/blob/main/sips/sip-008/sip-008-analysis-cost-assessment.md) as part of the [Stacks Community Governance organization](https://github.com/stacksgov). |