v0.1.2
What's Changed
- Gemmini cleanup for the artifact by @yamaguchi1024 in #171
- Snapshot of gemmini results in the paper by @yamaguchi1024 in #176
- Bump pillow from 9.0.0 to 9.0.1 by @dependabot in #177
- Upgrade dependencies. Make PyTest more comfortable to use. by @alexreinking in #179
- Move gemmini code under platform by @yamaguchi1024 in #180
- Fix typo in README by @alexreinking in #183
- Remove deprecated par loops from reorder and split by @skeqiqevian in #184
- remove scipy and torch from requirements by @yamaguchi1024 in #164
- Improve simplify to normalize polynomial before constant evaluation by @yamaguchi1024 in #186
- AMX scheduling + par loop related changes by @skeqiqevian in #188
- Refactoring the API as part of Effect-Check/API audit by @gilbo in #189
- Fix code coverage by @alexreinking in #192
- Add initial cursor skeleton by @alexreinking in #193
- Fix bugs in Procedure by @alexreinking in #197
- Deprecate add_guard by @yamaguchi1024 in #195
- Improve golden output interop with PyTest diffs by @alexreinking in #196
- Fix update-golden by @alexreinking in #198
- Bump pillow from 9.1.0 to 9.1.1 by @dependabot in #199
- Accommodate Seq in bind_expr, fission_after, add_loop, double fission by @yamaguchi1024 in #203
- Fix exposure of LoopIR_pprint in `__init__.py` by @gilbo in #202
- Add failing test for stage_mem by @yamaguchi1024 in #204
- Allow size expression in add_loop bounds, and fixed parse_fragments by @yamaguchi1024 in #207
- Create LICENSE.md by @yamaguchi1024 in #209
- Update SDE version by @yamaguchi1024 in #211
- Separate malloc definition and forward declarations by @yamaguchi1024 in #208
- Fix delete_pass to delete the loop if the body is empty by @yamaguchi1024 in #212
- modified tests to run scheduling on platforms not supporting execution by @gilbo in #213
- Update Examples & README by @yamaguchi1024 in #214
- Implement product_loop by @yamaguchi1024 in #217
- Merge neon-example by @yamaguchi1024 in #218
- remove loop if the body got emptied by assert_if by @yamaguchi1024 in #221
- Cursor API Step 1 by @gilbo in #220
- Implement more precise memory analysis for LoopIR.Free by @yamaguchi1024 in #223
- adding a file to help with interactive demos by @gilbo in #226
- Deprecate par by @yamaguchi1024 in #222
- Implement commute() by @yamaguchi1024 in #224
- Add experiment scripts from PLDI paper by @alexreinking in #228
- Check generated C code from apps by @alexreinking in #230
- Add `static` to non-exported functions by @alexreinking in #232
- Update the code, name, and numbers of the examples to accommodate sche… by @yamaguchi1024 in #233
- Fix a scheduling error in gemmini conv by @yamaguchi1024 in #241
- Delete rotten gemmini code by @yamaguchi1024 in #242
- Move gemmini-rocc-tests from copy to separate repository by @yamaguchi1024 in #229
- Fixed bind_expr bug with LHS indices by @skeqiqevian in #237
- Run Gemmini tests in CI by @alexreinking in #243
- Rework Gemmini testing into standalone app by @alexreinking in #244
- Codegen cleanup by @alexreinking in #240
- Stop copying whole trees in scheduling passes by @alexreinking in #252
- Format repository, supported with tooling. by @alexreinking in #253
- Implement merge writes by @skeqiqevian in #236
- Bind expr fix by @skeqiqevian in #254
- Deprecate PatternParser by @yamaguchi1024 in #255
- Deprecate UAST.ForAll by @yamaguchi1024 in #258
- Fix lift_alloc to emit error when n_lifts is too large by @yamaguchi1024 in #259
- Expose cursor to scheduling; Delete four rotten primitives by @yamaguchi1024 in #260
- pass proc_cursors to scheduling directives by @yamaguchi1024 in #261
- Change scheduling interface to take procedure cursor and return Procedure Object by @yamaguchi1024 in #263
- Change scheduling API to take stmt cursors by @yamaguchi1024 in #266
- Use Ubuntu 20.04 for testing Gemmini by @alexreinking in #268
- Update requirements for macOS 13 by @alexreinking in #269
- Try pinning Python to 3.10 in Gemmini by @alexreinking in #272
- Update pyproject by @yamaguchi1024 in #273
- Add can_read to memories by @yamaguchi1024 in #267
- First attempt to merge #191 by @yamaguchi1024 in #277
- Fix inline_window to correctly handle stride dimension by @yamaguchi1024 in #276
- Mark slow tests as such in PyTest. by @alexreinking in #280
- Refactor pretty printer by @alexreinking in #279
- Pretty cursors by @alexreinking in #281
- Fix const-window argument passing by @alexreinking in #288
- New lift_scope rewrite operation by @skeqiqevian in #283
- CMake package fixes by @alexreinking in #290
- renamed fusion to fuse by @skeqiqevian in #291
- Effects merge attempt 2 by @yamaguchi1024 in #293
- Bump pillow from 9.1.1 to 9.3.0 by @dependabot in #295
- Fmla support added by @adcastel in #285
- Corrects expand_dim docstring by @SamirDroubi in #298
- Deprecate double_fission by @SamirDroubi in #299
- Deprecate stage_assn by @SamirDroubi in #300
- Add a test for reorder_stmts by @SamirDroubi in #303
- Fix bug of capturing stmt env in BuildEnv by @SamirDroubi in #306
- Amx memory by @skeqiqevian in #302
- Add Formatted Expressions by @SamirDroubi in #304
- Implement internal node-cursor forwarding by @alexreinking in #309
- Third attempt to merge #191 by @SamirDroubi in #307
- Start converting LoopIR_scheduling to cursors architecture by @alexreinking in #311
- Make pattern_match always return cursors. by @alexreinking in #316
- Convert reorder_stmts to use cursors by @alexreinking in #317
- Make cut_loop work on loops with a variable upper bound by @yamaguchi1024 in #315
- Div mod simplify in index expressions by @SamirDroubi in #313
- x86 vector instruction for select-builtin and reduction by @SamirDroubi in #322
- Filter1D example and memory aware replace by @yamaguchi1024 in #320
- Add -arch=arm64 when targeting neon by @yamaguchi1024 in #325
- Add fragment parsing for builtin calls by @SamirDroubi in #326
- Add PYTHONPATH feature to add_exo_library by @alexreinking in #328
- Add depfile support to CMake rules by @alexreinking in #329
- Drop stale githooks directory by @alexreinking in #330
- Use window for all the AVX2 instructions by @yamaguchi1024 in #332
- Add mechanical citation information to the repo by @alexreinking in #331
- avx2 add, broadcast, reg copy instructions by @SamirDroubi in #334
- Fix static_memory_check by @SamirDroubi in #337
- Fix assert_if and add tests by @yamaguchi1024 in #338
- Add alpha renaming in cut_loop by @yamaguchi1024 in #339
- x86 instrs tensor parameters change to windows by @SamirDroubi in #341
- Convert scheduling directives to use cursor editing APIs by @alexreinking in #323
- fixed type check in merge_writes by @skeqiqevian in #348
- Avx2 double precision by @SamirDroubi in #349
- rewrote specialize with internal cursors by @skeqiqevian in #355
- rewrote commute_expr with internal cursors by @skeqiqevian in #353
- rewrote add_loop with internal cursors by @skeqiqevian in #352
- rewrote fuse_if with internal cursors by @skeqiqevian in #354
- implemented scheduling operation for distributive property by @skeqiqevian in #344
- rewrote lift_alloc with internal cursors by @skeqiqevian in #357
- rewrote lift_scope with internal cursors by @skeqiqevian in #359
- rewrote expand_dim with internal cursors by @skeqiqevian in #360
- Add missing neon f32 instructions by @yamaguchi1024 in #358
- rewrote mult_dim with Internal cursors by @skeqiqevian in #363
- rewrote divide_dim with Internal cursors by @skeqiqevian in #365
- rewrote call_eqv with Internal cursors by @skeqiqevian in #366
- rewrote fission with Internal cursors by @skeqiqevian in #368
- rewrote write_config with internal cursors by @skeqiqevian in #367
- rewrote bind_config with Internal cursors by @skeqiqevian in #369
- Basic cursor forwarding at API level by @alexreinking in #370
- Fix divide loop by @yamaguchi1024 in #373
- Fix forwarding composition in divide_dim by @alexreinking in #375
- Fix _replace_pats_stmts by @alexreinking in #376
- rewrite divide_loop with internal cursors by @skeqiqevian in #372
- Add BLAS correctness test to Exo repo's CI by @yamaguchi1024 in #379
- replace types nodes directly by @skeqiqevian in #380
- add failing test case by @yamaguchi1024 in #381
- Makes cursor replaces less destructive by @skeqiqevian in #382
- implemented bind_expr with Internal cursors by @skeqiqevian in #387
- rewrote stage_mem with internal cursors by @skeqiqevian in #386
- rewrote replace with internal cursors by @skeqiqevian in #384
- simplify division by trying to split the denominator by @SamirDroubi in #385
- rewrote inline_window with Internal cursors by @skeqiqevian in #371
- Fix forwarding bugs by @skeqiqevian in #388
- filter1D cursor forwarding example, and some fixes in pretty printing by @yamaguchi1024 in #378
- Forwarding bug + API Cursors bug by @SamirDroubi in #391
- Fix cursor wrapper bug by @skeqiqevian in #393
- implemented simplify with internal cursors by @skeqiqevian in #389
- unexpose Sym in the cursors API by @SamirDroubi in #394
- Change Neon4f to Neon, support f64 by @yamaguchi1024 in #395
- Add missing instructions for neon by @yamaguchi1024 in #396
- Gap and Block refactoring by @alexreinking in #397
- Skeleton of argcursor by @yamaguchi1024 in #399
- Implement ExoType by @yamaguchi1024 in #400
- Undo changes to internal cursors for fnargs by @alexreinking in #401
- Fix bug in _move forwarding by @skeqiqevian in #403
- Fix reorder loops forwarding by @skeqiqevian in #404
- Forward gaps and internal cursor cleanup by @alexreinking in #406
- Fix new effectcheck bug, disable some checks in gemmini by @yamaguchi1024 in #402
- rewrite set_precision, set_memory, and set_window with internal cursors by @skeqiqevian in #408
- rewrite rearrange_dim with internal cursors by @skeqiqevian in #407
- Support arguments ordering in extract_subproc by @yamaguchi1024 in #409
- fixed bug in reorder_loops by @skeqiqevian in #411
- Try inlining constant strides at codegen time by @alexreinking in #412
- Add mask instruction for x86 by @yamaguchi1024 in #414
- set recursion limit by @yamaguchi1024 in #413
- Cleaned up _replace_pats functions by @skeqiqevian in #416
- rewrote get_reads and get_writes, cleaned up some imports by @skeqiqevian in #415
- replace_all blow through NotImplementedError by @SamirDroubi in #418
- Normalize and range analysis bug fix by @SamirDroubi in #421
- Remove unnecessary InferEffects from schedules by @yamaguchi1024 in #422
- Reduce cursor invalidation by @skeqiqevian in #417
- Add checks on conditions in specialize by @skeqiqevian in #424
- delete unused class by @skeqiqevian in #426
- CIR for index access simplification by @yamaguchi1024 in #423
- ArgCursor::mem assert bug fix by @SamirDroubi in #428
- ArgCursor introspection testing + bug fix by @SamirDroubi in #429
- implement transpose by @skeqiqevian in #431
- Fix typo in new_eff by @yamaguchi1024 in #434
- Delete outdated clamping code by @yamaguchi1024 in #435
- Allow DoSplit to check proc predicates for modulo asserts by @skeqiqevian in #437
- Half precision support added by @adcastel in #297
- Add alias check. Fixes #275. by @rachitnigam in #427
- Allow loops to have arbitrary expressions as `lo` by @skeqiqevian in #425
- Fix a typo in new_analysis core by @yamaguchi1024 in #442
- Propagate InferEffects to subprocedure calls by @yamaguchi1024 in #439
- Fix divide_loop type bug by @yamaguchi1024 in #445
- Add self-contained install via nix flake by @gdinh in #449
- Fix CI by @yamaguchi1024 in #457
- change pull_request to pull_request_target altogether by @yamaguchi1024 in #458
- Update example for clarity by @yamaguchi1024 in #459
- First RVV backend version by @adcastel in #453
- Change the loop iterator type from int to int_fast32_t by @yamaguchi1024 in #443
- Update setup-python's version to v4 by @yamaguchi1024 in #461
- Update example's replace_all after making it memory-aware by @yamaguchi1024 in #462
- bump setup-sde's version up by @yamaguchi1024 in #463
- Fix Bcast for RVV by @adcastel in #464
- New broadcast support for RVV by @adcastel in #465
- Implement unroll_buffer operation by @yamaguchi1024 in #460
- Move instructions for repo installation to the top of the README by @rachitnigam in #467
- Implement div denominator constant folding to simplify by @yamaguchi1024 in #473
- Rework the Scheduling Example by @rachitnigam in #468
- do not try to split denominator when lhs has div or mod by @yamaguchi1024 in #476
- Use sym repr to pat match within scheduling by @SamirDroubi in #480
- Fixed issue with forwarding when procs are equivalent. by @skeqiqevian in #481
- Fixed forwarding bug in simplify for window stmts by @skeqiqevian in #482
- Revised cut_loop and implemented shift_loop by @SamirDroubi in #491
- assert_if revision by @SamirDroubi in #492
- Evolve range analysis and remove redundant exo_floor_div by @SamirDroubi in #495
- Fix unification bug when unifying conditions by @SamirDroubi in #500
- Added `sink_alloc` by @skeqiqevian in #501
- Bump pillow from 9.3.0 to 10.0.1 by @dependabot in #513
- Generalize remove_if to eliminate_dead_code by @SamirDroubi in #518
- Parrot Blur Sprint by @skeqiqevian in #511
- Precision propagation bug fix by @andrewj31415 in #519
- Fixed incorrect cursor edit in set_window by @skeqiqevian in #520
- Add replace_once and unsafe flag to fission by @yamaguchi1024 in #525
- More general bounds inference by @skeqiqevian in #522
- Allow dividing loops by 1 by @SamirDroubi in #530
- Adds parallel loops and scheduling op to parallelize loops by @skeqiqevian in #526
- Dram stack by @skeqiqevian in #528
- Missing API from AllocCursor by @SamirDroubi in #535
- Changing shrink_dim to resize_dim by @skeqiqevian in #537
- Added support for uint16 and improved `replace_all` to handle bodies with length > 1 by @skeqiqevian in #536
- Add guards to load/store stage in stage_mem by @SamirDroubi in #527
- Update Github CI workflows by @yamaguchi1024 in #538
- Bind expr fixes by @SamirDroubi in #542
- Pattern matching on cursors by @skeqiqevian in #543
- Upgrade z3-solver version to 4.12.4.0 by @SamirDroubi in #548
- Add pldi24 combinator tests by @yamaguchi1024 in #534
- Fix a bug in extract_proc by @yamaguchi1024 in #532
- Update the SDE version by @SamirDroubi in #551
- Fix two bugs in unroll_buffer by @SamirDroubi in #550
- Fix bug in inline_assign by @SamirDroubi in #555
- Implement fold_into_reduce and reassociate_expr by @SamirDroubi in #558
- Fix failing Neon tests by @yamaguchi1024 in #559
- Update the github workflow by @yamaguchi1024 in #560
- Bump pillow from 10.0.1 to 10.2.0 by @dependabot in #556
- Remove unsafe_disable_checks from expand_dim by @yamaguchi1024 in #562
- Deprecate bound_alloc and fix a resize_dim bug by @yamaguchi1024 in #563
- Add Int type in the API types by @SamirDroubi in #564
- Update CI to use the new M1 macOS runner by @yamaguchi1024 in #567
- Forward when updating predicates in simplify by @SamirDroubi in #568
- Fix bug in stage_mem by @SamirDroubi in #571
- Change DoEliminateDeadLoop Check by @SamirDroubi in #572
- Unify index inequalities with different ops by @SamirDroubi in #575
- Only run CI for main push or PR by @yamaguchi1024 in #577
- Forwarding blocks for edit functions, allow replace to forward when it is 1-to-1 by @skeqiqevian in #546
- Validate the perfectness of divide_loop with z3 by @SamirDroubi in #579
- Accept API types as arguments to primitives by @SamirDroubi in #580
- Add type inference for constants by @kehemo in #581
- Snapshot of stdlib and rewrite gemm by @SamirDroubi in #583
- Implement delete_pass using cursors by @SamirDroubi in #584
- Rewrite extract_subproc by @SamirDroubi in #585
- Deprecate bound_and_guard operation by @SamirDroubi in #586
- Update README.md by @yamaguchi1024 in #587
- Fix various bugs in memory and type setting by @SamirDroubi in #588
- Call ExoBLAS reusable workflow by @SamirDroubi in #589
- extract_subproc assigns memories to subproc params by @SamirDroubi in #590
- Move gemmini matmul code from BLAS repo by @yamaguchi1024 in #591
- Specialize blocks by @SamirDroubi in #592
- Blur sprint by @skeqiqevian in #529
- Windows fixes by @SamirDroubi in #595
- Reorganizing Halide apps and fixed blur's test/documentation by @skeqiqevian in #598
- rearrange_dim and delete_pass bug fixes by @SamirDroubi in #600
- Cleaning up Halide scheduling implementation by @skeqiqevian in #604
- Bump black from 22.10.0 to 24.3.0 by @dependabot in #597
- Guard windows definitions with "include guards" by @SamirDroubi in #606
- Simplify logical expressions containing True/False literals by @SamirDroubi in #609
- Halide unsharp masking by @skeqiqevian in #599
- Bump pillow from 10.2.0 to 10.3.0 by @dependabot in #611
- Rewrite gemmini conv with user-level schedule ops by @yamaguchi1024 in #613
- Fixed bug in resize dim by @skeqiqevian in #616
- Implement split_write scheduling operation by @SamirDroubi in #608
- Scheduling operation for the circular buffer optimization by @skeqiqevian in #605
- Add dependabot updates for pip by @SamirDroubi in #619
- Updated Halide schedule to use circular buffer optimization by @skeqiqevian in #629
- Allow divide_dim to divide to non-literal expressions by @SamirDroubi in #628
- Update dependencies and requirements by @yamaguchi1024 in #630
- Fix bug in lift_reduce_constant by @SamirDroubi in #627
- Support Defining Globals for Instructions by @SamirDroubi in #607
- Bump dependencies dependabot has discovered by @yamaguchi1024 in #636
- Small benign bug fix by @skeqiqevian in #643
- Bump pytest-cov from 3.0.0 to 5.0.0 by @dependabot in #638
- Bump pre-commit from 3.6.0 to 3.7.0 by @dependabot in #639
- Bump numpy from 1.23.4 to 1.26.4 by @dependabot in #641
- Bump black from 24.3.0 to 24.4.2 by @dependabot in #642
- Bump pytest from 8.1.1 to 8.2.0 by @dependabot in #640
- Bump coverage from 7.5.0 to 7.5.1 by @dependabot in #646
- Fix stage_mem logic for which expressions should use the staged memory by @yamaguchi1024 in #614
- Bump pytest from 8.2.0 to 8.2.1 by @dependabot in #648
- Bump pre-commit from 3.7.0 to 3.7.1 by @dependabot in #647
- Remove Halide op's dependence on loop iters having same name + documenting limitations by @skeqiqevian in #649
- Bump coverage from 7.5.1 to 7.5.3 by @dependabot in #653
- Update the pip version by @yamaguchi1024 in #655
- bump version by @yamaguchi1024 in #657
- v0.1.2 release by @yamaguchi1024 in #658
New Contributors
- @adcastel made their first contribution in #285
- @rachitnigam made their first contribution in #427
- @gdinh made their first contribution in #449
- @andrewj31415 made their first contribution in #519
- @kehemo made their first contribution in #581
Full Changelog: v0.0.2...v0.1.2