refactorings to prepare the compiler for IC #15935

Araq · 2020-11-12T13:54:53Z

Also speeds up the compiler in general, saved 0.4s from bootstrapping.

Todo:

enable converters and TR macros.
make every test green.

planetis-m · 2020-12-15T16:37:24Z

so my code that breaks looks like:

proc rnd[T: SomeFloat](x: int, v: T) = discard
template rnd[T](x: int) = rnd[T](x, T(1))
proc rnd2[T: SomeFloat](x: int, v: T) = discard
template rnd2[T](x: int) = rnd2[T](x, T(1))

I remember having hard-to-pin-down bugs with these templates before (my custom destructor hooks were not getting used), and the workaround was to move the templates below each function. Which was weird, but it worked.

planetis-m · 2020-12-15T16:48:55Z

Yeah I'm getting expression 'randMatrix[float32](rowDimension(A), columnDimension(A))' cannot be called with devel nim by moving the template.

Araq · 2020-12-15T18:29:22Z

> Yeah I'm getting expression 'randMatrix[float32](rowDimension(A), columnDimension(A))' cannot be called with devel nim by moving the template.

Yes, I know. It's easy to reproduce and blocking this merge:

type
  Matrix[T] = object
    data: T


proc randMatrix*[T](m, n: int, max: T): Matrix[T] = discard
proc randMatrix*[T](m, n: int, x: Slice[T]): Matrix[T] = discard

template randMatrix*[T](m, n: int): Matrix[T] = randMatrix[T](m, n, T(1.0))

let B = randMatrix[float32](20, 10)

timotheecour · 2020-12-15T19:20:52Z

compiler/ic/packed_ast.nim

+  tree.nodes.add Node(kind: nkSym, operand: int32(s), info: info)
+
+proc addModuleId*(tree: var PackedTree; s: ModuleId; info: PackedLineInfo) =
+  tree.nodes.add Node(kind: nkInt32Lit, operand: int32(s), info: info)


use directIntLit here; even if it's the same right now, it's more readable

timotheecour · 2020-12-15T19:21:48Z

compiler/ic/packed_ast.nim

+    nkUInt8Lit,
+    nkUInt16Lit,
+    nkUInt32Lit,
+    nkUInt64Lit} # nkInt32Lit is missing by design!


What's the benefit of adding this odd special case?
Why not add a new nkDirectIntLit instead and avoid conflating an int32 with this? The saving is insignificant, but it will lead to fewer bugs.

Totally agree but I cannot easily extend the AST without touching the macro system.

but you can add to the end, it shouldn't break code (and other node kinds have been added in past without issues)

for now it maybe doesn't even need to be mapped to macros.NimNodeKind, but if needed it can with when defined(nimHasNkDirectIntLit) (can be DRY with nim-lang/RFCs#190, or simply with duplication)

timotheecour · 2020-12-15T19:26:49Z

compiler/ic/packed_ast.nim

+  externUIntLit* = {nkUIntLit, nkUInt8Lit, nkUInt16Lit, nkUInt32Lit, nkUInt64Lit}
+  directIntLit* = nkInt32Lit
+
+proc toString*(tree: PackedTree; n: NodePos; nesting: int; result: var string) =


(+ elsewhere)
result: var string should go first, more consistent with rest of language/stdlib and works better with dup + friends

Ok, but I don't know if I'll get around to it in this PR.
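
To illustrate the parameter-order suggestion above, here is a minimal sketch. It assumes a hypothetical `render` proc standing in for `toString`; its name, parameters and body are made up for this example and are not the compiler's code. With the `var string` output buffer first, the proc reads naturally in method-call syntax and composes with `dup` from `std/sugar`, which copies its argument, applies the in-place call to the copy and returns it.

```nim
import std/sugar

# Hypothetical stand-in for a renderer such as `toString`.
# The output buffer comes first; everything else follows.
proc render(result: var string; label: string; nesting: int) =
  # Indent by two spaces per nesting level, then append the label.
  for _ in 1 .. nesting:
    result.add "  "
  result.add label

# In-place use with method-call syntax.
var buf = ""
buf.render("nkSym", 1)

# Expression-style use: `dup` passes a copy of "" in place of `_`.
let fresh = "".dup(render(_, "nkSym", 1))
```

With the buffer as the last parameter, the same `dup` call still works via the `_` placeholder, but `buf.render(...)` would no longer be available.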

timotheecour · 2020-12-16T08:46:00Z

compiler/ic/packed_ast.nim

+  ModulePhase* = enum
+    preLookup, lookedUpTopLevelStmts
+
+  Module* = object


why can't this be a ref object? it should have ref semantics

timotheecour · 2020-12-16T10:14:03Z

If the main goal is IC and serialization of modules, why can't we simply add functionality to serialize/deserialize the existing data structures (PNode, PSym, PType, etc.) instead of introducing a totally parallel IR that duplicates them and must co-exist with them for the foreseeable future without actually replacing them?

This would result in a much less invasive change compared to this PR.

As for the stated benefits of packed AST (vs existing unpacked one) in terms of performance, they'd need to be measured carefully; I'm not convinced it'd justify the added complexity; the AST is after all representing a tree so representing it as a packed structure leads to impedance mismatch. In any case that's not apparently the main goal, which is IC (IC itself, not packed AST, will deliver large performance improvements for compile times).

The changes here are quite invasive (and will cause lots of PRs to break and go back to drawing board) so there needs to be a tangible benefit vs alternatives.

> Also speeds up the compiler in general, saved 0.4s from bootstrapping.

I'm not observing this, I get the same timings modulo noise.
before PR, ./koch boot -d:release takes 47.8s (similar if re-running a few times)
after PR, ./koch boot -d:release takes 47.5s (similar if re-running a few times)

But any such savings would be dwarfed by the gains from doing incremental compilation.

different approach for IC: compiler server

I've actually implemented a different approach based on compilation as a service; it's a lot less invasive and has the following features, inspired by nimsuggest:

a long living server process that listens through a port; it holds data structures (module data etc) in memory
a short lived client that users call from command line to compile / incrementally re-compile a nim project; it communicates through RPC to the server

I've used it for a working proof of concept of a REPL that can be used similarly to inim, except it's really fast (similar speed to nim secret, but unlike nim secret it generates and runs cgen'd code, not VM code, so importc works, etc.). It doesn't recompile everything on each newly issued command; instead it (for now at least) creates one new module per command, and existing data and imported modules don't need to be semchecked again. There are still unknowns, but it looks really promising, in between nim secret and inim. And it doesn't require any serialization/deserialization either.

Araq · 2020-12-16T19:50:29Z

> before PR, ./koch boot -d:release takes 47.8 (similar if re-running a few times)
> after PR, ./koch boot -d:release takes 47.5 (similar if re-running a few times)

So you observe a 0.3s difference. :-)

> As for the stated benefits of packed AST (vs existing unpacked one) in terms of performance, they'd need to be measured carefully; I'm not convinced it'd justify the added complexity; the AST is after all representing a tree so representing it as a packed structure leads to impedance mismatch. In any case that's not apparently the main goal, which is IC (IC itself, not packed AST, will deliver large performance improvements for compile times).

We modelled the AST as a directed acyclic graph, not as a tree, with bad mutability tracking and hacks. In the packed representation it's a tree by construction: you cannot create cycles or share nodes, even if you wanted to. It does take up less memory and it really is faster to process, but that's not even the main goal; it's about "correct by construction". And the risks are close to zero: for complex transformations you can convert it to PNode, do the transform and pack it again afterwards.
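
The "tree by construction" point can be sketched as follows. This is a toy illustration, not the layout in compiler/ic/packed_ast.nim: the node kinds, the proc names (`openNode`, `closeNode`, `span`, `sons`) and the choice to store the subtree length in `operand` are all assumptions made for this example. Nodes live in one flat `seq` in pre-order, and a parent only records how many slots its subtree occupies, so there are no pointers that could form a cycle or reference a shared child.

```nim
type
  NodeKind = enum
    nkIdent, nkIntLit, nkCall     # toy kinds; only nkCall has children

  PackedNode = object
    kind: NodeKind
    operand: int32   # leaf: payload; nkCall: subtree length incl. itself

  PackedTree = object
    nodes: seq[PackedNode]        # pre-order, children follow their parent

proc addLeaf(t: var PackedTree; kind: NodeKind; operand: int32) =
  t.nodes.add PackedNode(kind: kind, operand: operand)

proc openNode(t: var PackedTree; kind: NodeKind): int =
  ## Starts a parent node; its length is patched in by `closeNode`.
  result = t.nodes.len
  t.nodes.add PackedNode(kind: kind, operand: 0)

proc closeNode(t: var PackedTree; pos: int) =
  t.nodes[pos].operand = int32(t.nodes.len - pos)

proc span(t: PackedTree; pos: int): int =
  ## Number of flat slots occupied by the subtree rooted at `pos`.
  if t.nodes[pos].kind == nkCall: int(t.nodes[pos].operand)
  else: 1

iterator sons(t: PackedTree; pos: int): int =
  ## Yields the positions of the direct children of `pos`.
  var i = pos + 1
  let last = pos + span(t, pos)
  while i < last:
    yield i
    inc i, span(t, i)

# Build the shape of `f(1, g(2))`.
var t: PackedTree
let outer = t.openNode(nkCall)
t.addLeaf(nkIdent, 0)             # f
t.addLeaf(nkIntLit, 1)
let inner = t.openNode(nkCall)
t.addLeaf(nkIdent, 1)             # g
t.addLeaf(nkIntLit, 2)
t.closeNode(inner)
t.closeNode(outer)

var kids: seq[int]
for c in t.sons(outer): kids.add c  # positions 1, 2 and 3
```

Because a child is identified only by its position inside the parent's span, the same node cannot appear under two parents; reusing a subtree means copying its slots.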

Also, it is my hope that with the packed representation we can compute hashes more effectively, for example, to quickly deduplicate function bodies in the backend or to cache generic instantiations.
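
As a rough idea of what such hashing could look like, here is a sketch under stated assumptions: the `PackedNode` type, `hashRange` and `dedupe` are hypothetical names invented for this example, and nothing here is taken from the compiler. A flat node sequence can be hashed in one linear pass with `std/hashes`, and bodies with equal hashes are then compared structurally before being treated as duplicates.

```nim
import std/[hashes, tables]

type
  PackedNode = object      # toy node: a kind tag plus one operand
    kind: uint8
    operand: int32

proc hashRange(nodes: openArray[PackedNode]): Hash =
  ## Structural hash of a flat node range, computed in a single pass.
  var h: Hash = 0
  for n in nodes:
    h = h !& hash(int(n.kind))
    h = h !& hash(int(n.operand))
  result = !$h

proc dedupe(bodies: seq[seq[PackedNode]]): seq[int] =
  ## For each body, the index of the first structurally equal body.
  var byHash = initTable[Hash, seq[int]]()
  for i, b in bodies:
    var canon = i
    let h = hashRange(b)
    for j in byHash.getOrDefault(h):
      if bodies[j] == b:   # confirm equality; hashes alone may collide
        canon = j
        break
    if canon == i:
      byHash.mgetOrPut(h, @[]).add i
    result.add canon

let bodyA = @[PackedNode(kind: 1, operand: 3), PackedNode(kind: 2, operand: 42)]
let bodyB = @[PackedNode(kind: 1, operand: 3), PackedNode(kind: 2, operand: 42)]
let bodyC = @[PackedNode(kind: 1, operand: 3), PackedNode(kind: 2, operand: 7)]

let canonical = dedupe(@[bodyA, bodyB, bodyC])
```

Here `bodyB` maps to index 0 because it is structurally identical to `bodyA`, while `bodyC` keeps its own index.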

> The changes here are quite invasive (and will cause lots of PRs to break and go back to drawing board) so there needs to be a tangible benefit vs alternatives.

I don't see it this way. There is a refactoring so that module imports are now done lazily. Something that we really need when we seek to load modules from disk. The PackedTree is not used anywhere yet, it's part of this PR so that I don't lose this code. If we end up not using it, so be it.

> different approach for IC: compiler server

You are not the first to propose this. In my experience it works so badly for nimsuggest that I don't want to go down this path. It's also inferior: I like to save compilation artifacts on disk, not just in memory. The way we currently do it is a dead end: you have this fragile process, and when it crashes most of the information that caused the crash is lost. It's also not how the rest of the industry does it (precompiled headers, FPC's ppu files, C#'s bytecode).

Araq · 2020-12-16T19:57:39Z

@planetis-m Please update your package so that it uses:

proc randMatrix*[T](m, n: int): Matrix[T] {.inline.} = randMatrix[T](m, n, T(1.0))
proc randMatrix32*(m, n: int): Matrix32 {.inline.} = randMatrix(m, n, 1.0'f32)
proc randMatrix64*(m, n: int): Matrix64 {.inline.} = randMatrix(m, n, 1.0)

The template version only works by chance with Nim devel; we're tracking the bug in #16376.

planetis-m · 2020-12-16T21:48:00Z

@Araq done.

* added ic specific Nim code; WIP
* make the symbol import mechanism lazy; WIP
* ensure that modules can be imported multiple times
* ambiguity checking
* handle converters and TR macros properly
* make 'enum' test category green again
* special logic for semi-pure enums
* makes nimsuggest tests green again
* fixes nimdata
* makes nimpy green again
* makes more important packages work

Araq added 27 commits October 26, 2020 06:49

added ic specific Nim code; WIP

a8cc044

minor progress

cc74e10

baby steps

df40c17

baby steps

fca3a14

stuff

a6683e8

Merge branch 'devel' into araq-ic4

711687c

make the symbol import mechanism lazy; WIP

0ba8731

progress

d48d544

take into account lazy import lists; WIP

04d35b4

progress

ecb366c

progress, but it's really ugly crap

d92c578

'hello world' compiles again

c5c7130

progress: modules can be imported multiple times (yay)

08fa199

progress

e70b91a

Merge branch 'devel' into araq-ic4

50043bb

Progress: filter out duplicate symbols (can be caused by export'ed symbols)

6e2073c

bootstrapping works again

78e5f4f

ambiguity checking

c28a3c9

disable silly test for now

991e653

handle converters and TR macros properly

cbada55

added back the import scope

6867310

Merge branch 'devel' into araq-ic4

064f248

typo

62595b2

make 'enum' test category green again

2fa5360

special logic for semi-pure enums

b76569e

makes nimsuggest tests green again

668b000

fixes nimdata

2152577

Araq mentioned this pull request Dec 14, 2020

fix #2844 #3911; add --spellsuggest to suggest symbols in scope with similar spellings on undefined symbol error #16067

Merged

7 tasks

Araq added 2 commits December 14, 2020 18:03

makes nimpy green again

9c6cf06

makes more important packages work

63ea0a3

Araq added 2 commits December 15, 2020 15:54

updated important_packages.nim

a3e5656

see if that makes all the difference

ff0f348

timotheecour reviewed Dec 15, 2020

timotheecour reviewed Dec 16, 2020

enabled manu important package

99f3ae6

Araq merged commit 979148e into devel Dec 17, 2020

Araq deleted the araq-ic4 branch December 17, 2020 07:01

saem mentioned this pull request Dec 21, 2020

Saem nimsuggest re-enable tests #16401

Merged

timotheecour mentioned this pull request Dec 23, 2020

re-enable nimgame2 #16445

Closed

ghost mentioned this pull request Mar 6, 2021

regression: {.pure.} pragma fails to hide enum #16462

Open

ghost mentioned this pull request Sep 19, 2021

Term-rewriting macros don't work across module boundaries anymore #18863

Closed

ghost mentioned this pull request Dec 4, 2021

Unexported converters propagate through imports and affect code #19213

Closed
