@@ -73,9 +73,7 @@ important benefits:
73 73   out the coverage counts of each unique instantiation of a generic function,
74 74   if invoked with multiple type substitution variations.
75 75
76- ## Components of LLVM Coverage Instrumentation in `rustc`
77- 
78- ### LLVM Runtime Dependency  
76+ ## The LLVM profiler runtime  
79 77
80 78 Coverage data is only generated by running the executable Rust program. `rustc`
81 79 statically links coverage-instrumented binaries with LLVM runtime code
@@ -94,209 +92,7 @@ When compiling with `-C instrument-coverage`,
94 92 [compiler-rt-profile]: https://github.com/llvm/llvm-project/tree/main/compiler-rt/lib/profile
95 93 [crate-loader-postprocess]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_metadata/creader/struct.CrateLoader.html#method.postprocess
96 94
97- ### MIR Pass: `InstrumentCoverage`
98- 
99- Coverage instrumentation is performed on the MIR with a [MIR pass][mir-passes]
100- called [`InstrumentCoverage`][mir-instrument-coverage]. This MIR pass analyzes
101- the control flow graph (CFG)--represented by MIR `BasicBlock`s--to identify
102- code branches, attaches [`FunctionCoverageInfo`] to the function's body,
103- and injects additional [`Coverage`][coverage-statement] statements into the
104- `BasicBlock`s.
105- 
106- A MIR `Coverage` statement is a virtual instruction that indicates a counter
107- should be incremented when its adjacent statements are executed, to count
108- a span of code ([`CodeRegion`][code-region]). It counts the number of times a
109- branch is executed, and is referred to by coverage mappings in the function's
110- coverage-info struct.
111- 
112- Note that many coverage counters will _not_ be converted into
113- physical counters (or any other executable instructions) in the final binary.
114- Some of them will be (see [`CoverageKind::CounterIncrement`]),
115- but other counters can be computed on the fly, when generating a coverage
116- report, by mapping a `CodeRegion` to a coverage-counter _expression_.
117- 
118- As an example:
119- 
120- ```rust
121- fn some_func(flag: bool) {
122-     // increment Counter(1)
123-     ...
124-     if flag {
125-         // increment Counter(2)
126-         ...
127-     } else {
128-         // count = Expression(1) = Counter(1) - Counter(2)
129-         ...
130-     }
131-     // count = Expression(2) = Counter(1) + Zero
132-     //     or, alternatively, Expression(2) = Counter(2) + Expression(1)
133-     ...
134- }
135- ```
136- 
137- In this example, four contiguous code regions are counted while only
138- incrementing two counters.
139- 
140- CFG analysis is used to not only determine _where_ the branches are, for
141- conditional expressions like `if`, `else`, `match`, and `loop`, but also to
142- determine where expressions can be used in place of physical counters.
143- 
144- The advantages of optimizing coverage through expressions are more pronounced
145- with loops. Loops generally include at least one conditional branch that
146- determines when to break out of a loop (a `while` condition, or an `if` or
147- `match` with a `break`). In MIR, this is typically lowered to a `SwitchInt`,
148- with one branch to stay in the loop, and another branch to break out of the
149- loop. The branch that breaks out will almost always execute less often,
150- so `InstrumentCoverage` chooses to add a `CounterIncrement` to that branch, and
151- uses an expression (`Counter(loop) - Counter(break)`) for the branch that
152- continues.
153- 
154- The `InstrumentCoverage` MIR pass is documented in
155- [more detail below][instrument-coverage-pass-details].
156- 
157- [mir-passes]: mir/passes.md
158- [mir-instrument-coverage]: https://github.com/rust-lang/rust/tree/master/compiler/rustc_mir_transform/src/coverage
159- [`FunctionCoverageInfo`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/coverage/struct.FunctionCoverageInfo.html
160- [code-region]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/coverageinfo/ffi/struct.CodeRegion.html
161- [`CoverageKind::CounterIncrement`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/coverage/enum.CoverageKind.html#variant.CounterIncrement
162- [coverage-statement]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.StatementKind.html#variant.Coverage
163- [instrument-coverage-pass-details]: #implementation-details-of-the-instrumentcoverage-mir-pass
164- 
165- ### Counter Injection and Coverage Map Pre-staging  
166- 
167- When the compiler enters [the Codegen phase][backend-lowering-mir], with a
168- coverage-enabled MIR, [`codegen_statement()`][codegen-statement] converts each
169- MIR `Statement` into some backend-specific action or instruction.
170- `codegen_statement()` forwards `Coverage` statements to
171- [`codegen_coverage()`][codegen-coverage]:
172- 
173- ```rust
174-     pub fn codegen_statement(&mut self, mut bx: Bx, statement: &mir::Statement<'tcx>) -> Bx {
175-         ...
176-         match statement.kind {
177-             ...
178-             mir::StatementKind::Coverage(box ref coverage) => {
179-                 self.codegen_coverage(bx, coverage, statement.source_info.scope);
180-             }
181- ```
182- 
183- `codegen_coverage()` handles inlined statements and then forwards the coverage
184- statement to [`Builder::add_coverage`], which handles each `CoverageKind` as
185- follows:
186- 
187- 
188- - For both `CounterIncrement` and `ExpressionUsed`, the underlying counter or
189-   expression ID is passed through to the corresponding [`FunctionCoverage`]
190-   struct to indicate that the corresponding regions of code were not removed
191-   by MIR optimizations.
192- - For `CoverageKind::CounterIncrement`s, an instruction is injected in the backend
193-   IR to increment the physical counter, by calling the `BuilderMethod`
194-   [`instrprof_increment()`][instrprof-increment].
195- 
196- ```rust
197-     fn add_coverage(&mut self, instance: Instance<'tcx>, coverage: &Coverage) {
198-         ...
199-         let Coverage { kind } = coverage;
200-         match *kind {
201-             CoverageKind::CounterIncrement { id } => {
202-                 func_coverage.mark_counter_id_seen(id);
203-                 ...
204-                 bx.instrprof_increment(fn_name, hash, num_counters, index);
205-             }
206-             CoverageKind::ExpressionUsed { id } => {
207-                 func_coverage.mark_expression_id_seen(id);
208-             }
209-         }
210-     }
211- ```
212- 
213- > The function name `instrprof_increment()` is taken from the LLVM intrinsic
214- call of the same name ([`llvm.instrprof.increment`][llvm-instrprof-increment]),
215- and uses the same arguments and types; but note that, up to and through this
216- stage (even though modeled after LLVM's implementation for code coverage
217- instrumentation), the data and instructions are not strictly LLVM-specific.
218- >
219- > But since LLVM is the only Rust-supported backend with the tooling to
220- process this form of coverage instrumentation, the backend for `Coverage`
221- statements is only implemented for LLVM, at this time.
222- 
223- [backend-lowering-mir]: backend/lowering-mir.md
224- [codegen-statement]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/mir/struct.FunctionCx.html#method.codegen_statement
225- [codegen-coverage]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/mir/struct.FunctionCx.html#method.codegen_coverage
226- [`Builder::add_coverage`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/builder/struct.Builder.html#method.add_coverage
227- [`FunctionCoverage`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/coverageinfo/map_data/struct.FunctionCoverage.html
228- [instrprof-increment]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/traits/trait.BuilderMethods.html#tymethod.instrprof_increment
229- 
230- ### Coverage Map Generation  
231- 
232- With the instructions to increment counters now implemented in LLVM IR,
233- the last remaining step is to inject the LLVM IR variables that hold the
234- static data for the coverage map.
235- 
236- `rustc_codegen_llvm`'s [`compile_codegen_unit()`][compile-codegen-unit] calls
237- [`coverageinfo_finalize()`][coverageinfo-finalize],
238- which delegates its implementation to the
239- [`rustc_codegen_llvm::coverageinfo::mapgen`][mapgen-finalize] module.
240- 
241- For each function `Instance` (code-generated from MIR, including multiple
242- instances of the same MIR for generic functions that have different type
243- substitution combinations), `mapgen`'s `finalize()` method queries the
244- `Instance`-associated `FunctionCoverage` for its `Counter`s, `Expression`s,
245- and `CodeRegion`s; and calls LLVM codegen APIs to generate
246- properly-configured variables in LLVM IR, according to very specific
247- details of the [_LLVM Coverage Mapping Format_][coverage-mapping-format]
248- (Version 6).[^llvm-and-covmap-versions]
249- 
250- [^llvm-and-covmap-versions]: The Rust compiler (as of <!-- date-check: --> Nov 2024) supports _LLVM Coverage Mapping Format_ 6.
251-     The Rust compiler will automatically use the most up-to-date coverage mapping format
252-     version that is compatible with the compiler's built-in version of LLVM.
253- 
254- ```rust
255- pub fn finalize<'ll, 'tcx>(cx: &CodegenCx<'ll, 'tcx>) {
256-     ...
257-     if !tcx.sess.instrument_coverage_except_unused_functions() {
258-         add_unused_functions(cx);
259-     }
260- 
261-     let mut function_coverage_map = match cx.coverage_context() {
262-         Some(ctx) => ctx.take_function_coverage_map(),
263-         None => return,
264-     };
265-     ...
266-     let mut mapgen = CoverageMapGenerator::new();
267- 
268-     for (instance, function_coverage) in function_coverage_map {
269-         ...
270-         let coverage_mapping_buffer = llvm::build_byte_buffer(|coverage_mapping_buffer| {
271-             mapgen.write_coverage_mapping(expressions, counter_regions, coverage_mapping_buffer);
272-         });
273- ```
274- _code snippet trimmed for brevity_
275- 
276- One notable first step performed by `mapgen::finalize()` is the call to
277- [`add_unused_functions()`][add-unused-functions]:
278- 
279- When finalizing the coverage map, `FunctionCoverage` only has the `CodeRegion`s
280- and counters for the functions that went through codegen; such as public
281- functions and "used" functions (functions referenced by other "used" or public
282- items). Any other functions (considered unused) were still parsed and processed
283- through the MIR stage.
284- 
285- The set of unused functions is computed via the set difference of all MIR
286- `DefId`s (`tcx` query `mir_keys`) minus the codegenned `DefId`s (`tcx` query
287- `codegened_and_inlined_items`). `add_unused_functions()` computes the set of
288- unused functions, queries the `tcx` for the previously-computed `CodeRegions`,
289- for each unused MIR, synthesizes an LLVM function (with no internal statements,
290- since it will not be called), and adds a new `FunctionCoverage`, with
291- `Unreachable` code regions.
292- 
293- [compile-codegen-unit]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/base/fn.compile_codegen_unit.html
294- [coverageinfo-finalize]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/context/struct.CodegenCx.html#method.coverageinfo_finalize
295- [mapgen-finalize]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/coverageinfo/mapgen/fn.finalize.html
296- [coverage-mapping-format]: https://llvm.org/docs/CoverageMappingFormat.html
297- [add-unused-functions]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/coverageinfo/mapgen/fn.add_unused_functions.html
298- 
299- ## Testing LLVM Coverage  
95+ ## Testing coverage instrumentation  
300 96
301 97 [(See also the compiletest documentation for the `tests/coverage`
302 98 test suite.)](./tests/compiletest.md#coverage-tests)
@@ -341,151 +137,3 @@ and `mir-opt` tests can be refreshed by running:
341 137 [`src/tools/coverage-dump`]: https://github.com/rust-lang/rust/tree/master/src/tools/coverage-dump
342 138 [`tests/coverage-run-rustdoc`]: https://github.com/rust-lang/rust/tree/master/tests/coverage-run-rustdoc
343 139 [`tests/codegen/instrument-coverage/testprog.rs`]: https://github.com/rust-lang/rust/blob/master/tests/mir-opt/coverage/instrument_coverage.rs
344- 
345- ## Implementation Details of the `InstrumentCoverage` MIR Pass
346- 
347- The bulk of the implementation of the `InstrumentCoverage` MIR pass is performed
348- by [`instrument_function_for_coverage`]. For each eligible MIR body, the instrumentor:
349- 
350- - Prepares a [coverage graph]
351- - Extracts mapping information from MIR
352- - Prepares counters for each relevant node/edge in the coverage graph
353- - Creates mapping data to be embedded in side-tables attached to the MIR body
354- - Injects counters and other coverage statements into MIR
355- 
356- The [coverage graph] is a coverage-specific simplification of the MIR control
357- flow graph (CFG). Its nodes are [`BasicCoverageBlock`s][bcb], which
358- encompass one or more sequentially-executed MIR `BasicBlock`s
359- (with no internal branching).
360- 
361- Nodes and edges in the graph can have associated [`BcbCounter`]s, which are
362- stored in [`CoverageCounters`].
363- 
364- [`instrument_function_for_coverage`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/coverage/fn.instrument_function_for_coverage.html
365- [coverage graph]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/coverage/graph/struct.CoverageGraph.html
366- [bcb]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/coverage/graph/struct.BasicCoverageBlock.html
367- [`BcbCounter`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/coverage/counters/enum.BcbCounter.html
368- [`CoverageCounters`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/coverage/counters/struct.CoverageCounters.html
369- 
370- ### The `CoverageGraph`
371- 
372- The [`CoverageGraph`][coverage graph] is derived from the MIR (`mir::Body`).
373- 
374- ```rust
375-         let basic_coverage_blocks = CoverageGraph::from_mir(mir_body);
376- ```
377- 
378- Like `mir::Body`, the `CoverageGraph` is also a
379- [`DirectedGraph`][directed-graph]. Both graphs represent the function's
380- fundamental control flow, with many of the same
381- [`graph trait`][graph-traits]s, supporting `start_node()`, `num_nodes()`,
382- `successors()`, `predecessors()`, and `is_dominated_by()`.
383- 
384- For anyone that knows how to work with the [MIR, as a CFG][mir-dev-guide], the
385- `CoverageGraph` will be familiar, and can be used in much the same way.
386- The nodes of the `CoverageGraph` are `BasicCoverageBlock`s (BCBs), which
387- index into an `IndexVec` of `BasicCoverageBlockData`. This is analogous
388- to the MIR CFG of `BasicBlock`s that index `BasicBlockData`.
389- 
390- Each `BasicCoverageBlockData` captures one or more MIR `BasicBlock`s,
391- exclusively, and represents the maximal-length sequence of `BasicBlocks`
392- without conditional branches.
393- 
394- [`compute_basic_coverage_blocks()`][compute-basic-coverage-blocks] builds the
395- `CoverageGraph` as a coverage-specific simplification of the MIR CFG. In
396- contrast with the [`SimplifyCfg`][simplify-cfg] MIR pass, this step does
397- not alter the MIR itself, because the `CoverageGraph` aggressively simplifies
398- the CFG, and ignores nodes that are not relevant to coverage. For example:
399- 
400-   - The BCB CFG ignores (excludes) branches considered not relevant
401-     to the current coverage solution. It excludes unwind-related code[^78544]
402-     that is injected by the Rust compiler but has no physical source
403-     code to count, which allows a `Call`-terminated BasicBlock
404-     to be merged with its successor, within a single BCB.
405-   - A `Goto`-terminated `BasicBlock` can be merged with its successor
406-     **_as long as_** it has the only incoming edge to the successor
407-     `BasicBlock`.
408-   - Some BasicBlock terminators support Rust-specific concerns--like
409-     borrow-checking--that are not relevant to coverage analysis. `FalseUnwind`,
410-     for example, can be treated the same as a `Goto` (potentially merged with
411-     its successor into the same BCB).
412- 
413- [^78544]: (Note, however, that Issue [#78544][rust-lang/rust#78544] considers
414- providing future support for coverage of programs that intentionally
415- `panic`, as an option, with some non-trivial cost.)
416- 
417- The BCB CFG is critical to simplifying the coverage analysis by ensuring graph path-based
418- queries (`is_dominated_by()`, `predecessors`, `successors`, etc.) have branch (control flow)
419- significance.
420- 
421- [directed-graph]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_data_structures/graph/trait.DirectedGraph.html
422- [graph-traits]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_data_structures/graph/index.html#traits
423- [mir-dev-guide]: mir/index.md
424- [compute-basic-coverage-blocks]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/coverage/graph/struct.CoverageGraph.html#method.compute_basic_coverage_blocks
425- [simplify-cfg]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/simplify/enum.SimplifyCfg.html
426- [rust-lang/rust#78544]: https://github.com/rust-lang/rust/issues/78544
427- 
428- ### `make_bcb_counters()`
429- 
430- [`make_bcb_counters`] traverses the `CoverageGraph` and adds a
431- `Counter` or `Expression` to every BCB. It uses _Control Flow Analysis_
432- to determine where an `Expression` can be used in place of a `Counter`.
433- `Expressions` have no runtime overhead, so if a viable expression (adding or
434- subtracting two other counters or expressions) can compute the same result as
435- an embedded counter, an `Expression` is preferred.
436- 
437- [`TraverseCoverageGraphWithLoops`][traverse-coverage-graph-with-loops]
438- provides a traversal order that ensures all `BasicCoverageBlock` nodes in a
439- loop are visited before visiting any node outside that loop. The traversal
440- state includes a `context_stack`, with the current loop's context information
441- (if in a loop), as well as context for nested loops.
442- 
443- Within loops, nodes with multiple outgoing edges (generally speaking, these
444- are BCBs terminated in a `SwitchInt`) can be optimized when at least one
445- branch exits the loop and at least one branch stays within the loop. (For an
446- `if` or `while`, there are only two branches, but a `match` may have more.)
447- 
448- A branch that does not exit the loop should be counted by `Expression`, if
449- possible. Note that some situations require assigning counters to BCBs before
450- they are visited by traversal, so the `counter_kind` (`CoverageKind` for
451- a `Counter` or `Expression`) may have already been assigned, in which case
452- one of the other branches should get the `Expression`.
453- 
454- For a node with more than two branches (such as for more than two
455- `match` patterns), only one branch can be optimized by `Expression`. All
456- others require a `Counter` (unless its BCB `counter_kind` was previously
457- assigned).
458- 
459- A branch expression is derived from the equation:
460- 
461- ```text
462- Counter(branching_node) = SUM(Counter(branches))
463- ```
464- 
465- It's important to
466- be aware that the `branches` in this equation are the outgoing _edges_
467- from the `branching_node`, but a `branch`'s target node may have other
468- incoming edges. Given the following graph, for example, the count for
469- `B` is the sum of its two incoming edges:
470- 
471- <img alt="Example graph with multiple incoming edges to a branch node"
472-  src="img/coverage-branch-counting-01.png" class="center" style="width: 25%">
473- <br />
474- 
475- In this situation, BCB node `B` may require an edge counter for its
476- "edge from A", and that edge might be computed from an `Expression`,
477- `Counter(A) - Counter(C)`. But an expression for the BCB _node_ `B`
478- would be the sum of all incoming edges:
479- 
480- ```text
481- Expression((Counter(A) - Counter(C)) + SUM(Counter(remaining_edges)))
482- ```
483- 
484- Note that this is only one possible configuration. The actual choice
485- of `Counter` vs. `Expression` also depends on the order of counter
486- assignments, and whether a BCB or incoming edge counter already has
487- its `Counter` or `Expression`.
488- 
489- [`make_bcb_counters`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/coverage/counters/struct.CoverageCounters.html#method.make_bcb_counters
490- [bcb-counters]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/coverage/counters/struct.BcbCounters.html
491- [traverse-coverage-graph-with-loops]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/coverage/graph/struct.TraverseCoverageGraphWithLoops.html