Only refer to LLVM symbol table in calls to Symbol#to_s #15486

Merged
straight-shoota merged 3 commits into crystal-lang:master from HertzDevil:perf/symbol-to_s-table-constant
Mar 13, 2025
HertzDevil (Contributor) commented Feb 18, 2025

The compiler places the string contents of all symbols defined in the source code into a special :symbol_table LLVM global variable in the main LLVM module:

@":symbol_table" = global [24 x ptr] [
  ptr @"'general'", ptr @"'no_error'", ptr @"'gc'", ptr @"'sequentially_consis...'",
  ptr @"'monotonic'", ptr @"'acquire'", ptr @"'evloop'", ptr @"'xchg'", ptr @"'release'",
  ptr @"'acquire_release'", ptr @"'io_write'", ptr @"'sched'", ptr @"'io_read'",
  ptr @"'skip'", ptr @"'none'", ptr @"'active'", ptr @"'done'", ptr @"'unchecked'",
  ptr @"'sleep'", ptr @"'relaxed'", ptr @"'add'", ptr @"'sub'", ptr @"'file'", ptr @"'target'"
]

Every LLVM module, without exception, also declares it:

@":symbol_table" = external global [24 x ptr]

Since the total number of symbols is part of the variable type, adding or removing any symbol invalidates all object files in the Crystal cache, even though only Symbol#to_s, or rather @[Primitive(:symbol_to_s)], has access to this variable.

This PR removes this declaration except where the primitive accesses it. Changes to the symbol table will only affect LLVM modules that call Symbol#to_s (the primitive body itself is inlined). Building an empty file, replacing it with x = :abcde, then rebuilding the same file will now give:

Codegen (bc+obj):
 - 315/317 .o files were reused

These modules were not reused:
 - _main (_main.o0.bc)
 - Crystal (C-rystal.o0.bc)

(Crystal is also invalidated because the return type of Crystal.main is now Symbol instead of Nil; appending nil to the file makes this module reusable.)
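The difference in cache reuse can be illustrated with a toy model. This is only a sketch, not the compiler's actual caching code: it assumes a module is reused when its generated text is byte-identical to the cached one, and the module names `Foo`, `Bar`, `Baz` and the helper functions are hypothetical. The main module, which defines the table, is not modeled.

```python
import hashlib


def module_ir(body: str, declares_table: bool, symbol_count: int) -> str:
    """Render a toy module. The declaration embeds the symbol count in its type."""
    lines = []
    if declares_table:
        lines.append(f'@":symbol_table" = external global [{symbol_count} x ptr]')
    lines.append(body)
    return "\n".join(lines)


def fingerprints(modules: dict, symbol_count: int, declare_everywhere: bool) -> dict:
    """Hash each module's text; a changed hash stands for a cache miss (assumption)."""
    result = {}
    for name, (body, calls_to_s) in modules.items():
        declares = declare_everywhere or calls_to_s
        text = module_ir(body, declares, symbol_count)
        result[name] = hashlib.sha256(text.encode()).hexdigest()
    return result


def invalidated(modules: dict, old_count: int, new_count: int, declare_everywhere: bool) -> list:
    """Names of modules whose fingerprint changes when the symbol count changes."""
    old = fingerprints(modules, old_count, declare_everywhere)
    new = fingerprints(modules, new_count, declare_everywhere)
    return sorted(name for name in modules if old[name] != new[name])


# Hypothetical modules: (body, whether the module calls Symbol#to_s)
MODULES = {
    "Foo": ("define void @foo()", False),
    "Bar": ("define void @bar()", False),
    "Baz": ("define void @baz()", True),
}

# Before this PR: every module declares the table, so one new symbol touches all of them.
print(invalidated(MODULES, 24, 25, declare_everywhere=True))
# After this PR: only modules that call Symbol#to_s declare the table.
print(invalidated(MODULES, 24, 25, declare_everywhere=False))
```

In this model, growing the table from 24 to 25 entries changes every module in the first case and only `Baz` in the second.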

Note that it is still possible to invalidate large portions of the cache from the addition or removal of symbols, because their indices are allocated sequentially, and inlined on every use. Something like #15485 but for Symbol values would solve that.
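The remaining problem with sequential indices can be sketched the same way. The model below assumes indices are handed out as 0, 1, 2, ... in some fixed order of the symbol list. The actual ordering the compiler uses is not stated here, and the helper names are hypothetical.

```python
def allocate_indices(symbols: list) -> dict:
    """Assign sequential indices in list order (assumed allocation order)."""
    return {name: index for index, name in enumerate(symbols)}


def shifted_symbols(before: list, after: list) -> list:
    """Symbols present in both lists whose index differs between the two."""
    old = allocate_indices(before)
    new = allocate_indices(after)
    return sorted(name for name in old if name in new and old[name] != new[name])


before = ["gc", "monotonic", "acquire"]

# A new symbol that lands in the middle shifts every symbol after it.
print(shifted_symbols(before, ["gc", "abcde", "monotonic", "acquire"]))

# A new symbol that lands at the end leaves existing indices alone.
print(shifted_symbols(before, ["gc", "monotonic", "acquire", "abcde"]))
```

Because each use of a symbol inlines its index as a constant, any module that mentions a shifted symbol would produce different code and miss the cache, even if it never calls `Symbol#to_s`.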

HertzDevil changed the title from "Only refer to symbol table in definition of Symbol#to_s" to "Only refer to LLVM symbol table in calls to Symbol#to_s" Feb 18, 2025
straight-shoota added this to the 1.16.0 milestone Mar 4, 2025
straight-shoota merged commit 945561f into crystal-lang:master Mar 13, 2025
32 checks passed
Blacksmoke16 pushed a commit to Blacksmoke16/crystal that referenced this pull request Mar 18, 2025
…ng#15486)

HertzDevil deleted the perf/symbol-to_s-table-constant branch April 5, 2025 22:18