v4.8.4
Changes
- compiler: Patch double buffering @FabioLuporini (#2247)
- compiler: Fix unexpansion w custom coeffs @FabioLuporini (#2242)
API
- api: Cleanup sparse setup for precomputed @mloubout (#2353)
- api: Add Hicks (sinc) interpolation api @mloubout (#2342)
- api: add priority to fd coefficients for mixed derivatives @mloubout (#2331)
- api: add support for 45 degree FD approx @mloubout (#2326)
- api: Fix custom fd for staggered @mloubout (#2323)
- api: Fix gpu-fit for TensorFunction @mloubout (#2285)
- misc: Switch off develop-mode @FabioLuporini (#2280)
- api: Minor fixes to arithmetic operations with scalar and tensors @mloubout (#2276)
- misc: Process args for subdimensions @mloubout (#2266)
- api: Add shift and fd order option to all FD operators: @mloubout (#2258)
- api: Always make subsampling factor symbolic @mloubout (#2259)
- api: prevent derivative shortcut with incompatible fd order @mloubout (#2254)
Examples
- examples: Interpolation tutorial notebook @mloubout (#2252)
- examples: Update MPI notebook to reference inner and outer halo terminology @EdCaunt (#2319)
- Correct the Poisson equation in the cavity flow example @rafael-fuente (#2308)
- example: small cleanup of tti for easier reuse @mloubout (#2294)
Documentation
- misc: Add MPI0 logging level @georgebisbas (#2130)
- examples: Fix typo in tutorial numbering @EdCaunt (#2356)
- misc: Docstring updates @ZoeLeibowitz (#2223)
Compiler
- compiler: Tweak check_stability to ensure cleanup is performed @FabioLuporini (#2335)
- compiler: Patch Guards.simplify_and @FabioLuporini (#2334)
- compiler: Enable generation of templated function declarations @FabioLuporini (#2333)
- compiler: Add optional pass for runtime stability check @FabioLuporini (#2327)
- compiler: Tweak Weights.value @FabioLuporini (#2328)
- compiler: Add Weights.value utility @FabioLuporini (#2322)
- compiler: Revamp lowering of IndexDerivatives @FabioLuporini (#2310)
- compiler: Revamp linearization @FabioLuporini (#2317)
- compiler: Adjust names used for cire-rotate dimensions @EdCaunt (#2305)
- compiler: Optimize normalize_reductions_dense @FabioLuporini (#2311)
- compiler: Generate less integer arithmetic @FabioLuporini (#2301)
- compiler: Misc codegen enhancements @FabioLuporini (#2300)
- compiler: Revamp data alignment @FabioLuporini (#2296)
- compiler: Improve IndexDerivative lowering @FabioLuporini (#2288)
- compiler: Misc code generation improvements @FabioLuporini (#2282)
- compiler: Fix handling of redundant derivatives @FabioLuporini (#2284)
- compiler: Introduce cluster-level Temp @georgebisbas (#2281)
- compiler: Add pass to abridge SubDimension names where possible @EdCaunt (#2269)
- compiler: Improve quality of generated code @FabioLuporini (#2263)
- compiler: Add missing numpy dtypes @mloubout (#2271)
- compiler: Machinery to generate vector types @FabioLuporini (#2253)
- compiler: Introduce symbolic fencing @FabioLuporini (#2244)
- compiler: Improve robustness of dspace derivation @FabioLuporini (#2238)
MPI
- misc: Add MPI0 logging level @georgebisbas (#2130)
- CI: revamp parallel marker @mloubout (#2347)
- mpi: Generate deterministic code for overlap mode @georgebisbas (#2303)
- MPI: Fix sparse subfunction handling when used without parent @mloubout (#2278)
- mpi: Fix haloupdate with inner dim [v2] @FabioLuporini (#2272)
- mpi: Add utility to get number of ranks on a single node @mloubout (#2265)
- dsl: Patch domain decomposition bug with SubDomains @EdCaunt (#2246)
Architectures and JIT
- Use get_nvidia_cc to get Nvidia gpu architecture @ggorman (#2343)
- arch: Add denormal flag for clang @mloubout (#2304)
- arch: patch compiler version @mloubout (#2297)
- example: small cleanup of tti for easier reuse @mloubout (#2294)
- arch: support rocm for gpu info @mloubout (#2261)
- compiler: add extra platforms and language to the custom compiler @mloubout (#2255)
- arch: Intel PVC mapping @FabioLuporini (#2215)
🐛 Bug Fixes
- compiler: Make code gen of elementary funcs dtype-aware @FabioLuporini (#2349)
- compiler: Tweak device-aware blocking @FabioLuporini (#2348)
- compiler: Hotfix unevaluation.Pow(1, ...) @FabioLuporini (#2321)
- compiler: Fix min/max reductions to be backend-portable @FabioLuporini (#2315)
- misc: Use str for generalization @mloubout (#2313)
- compiler: Block reductions irrespective of par-tile @FabioLuporini (#2309)
- compiler: Fix space conditions with loop blocking @FabioLuporini (#2302)
- data: Prevent allocator info to be lost at finalize @mloubout (#2295)
- misc: Fix gpu-fit for multiple tensors @mloubout (#2286)
- compiler: Fix minor codegen issues after pickling @FabioLuporini (#2283)
- misc: Replace dimension check in pull_dims @EdCaunt (#2275)
- misc: fix short/ushort codegen @mloubout (#2274)
- mpi: Fix haloupdate with inner dim [v2] @FabioLuporini (#2272)
- misc: fix UnboundTuple for None partile @mloubout (#2256)
- compiler: Hotfix compare-ops @FabioLuporini (#2251)
- compiler: Patch compare_ops for IndexDerivatives @FabioLuporini (#2250)
- dsl: Patch domain decomposition bug with SubDomains @EdCaunt (#2246)
- compiler: Patch symbolic coefficients over cross derivatives @FabioLuporini (#2248)
- compiler: Patch custom coefficients @FabioLuporini (#2243)
Testing
Continuous Integration
- docker: fix oneapi setup @mloubout (#2351)
- ci: Update actions for nodejs version deprecation @georgebisbas (#2312)
- deps: Update rocm version @mloubout (#2291)
- compiler: Check DeviceFunctions for SubDimensions @EdCaunt (#2279)
Installation
- pip prod(deps): update ipyparallel requirement from <8.8 to <8.9 @dependabot (#2346)
- pip prod(deps): bump pyrevolve from 2.2.3 to 2.2.4 @dependabot (#2337)
- pip prod(deps): update ipyparallel requirement from <8.7 to <8.8 @dependabot (#2324)
- deps: prevent codecov error on local docker @mloubout (#2318)
- pip prod(deps): update pytest requirement from <8.0,>=7.2 to >=7.2,<9.0 @dependabot (#2299)
- deps: Update rocm version @mloubout (#2291)
- deps: support python 3.12 @mloubout (#2270)
- pip prod(deps): update anytree requirement from <=2.12.0,>=2.4.3 to >=2.4.3,<=2.12.1 @dependabot (#2268)
- pip prod(deps): update anytree requirement from <=2.11.1,>=2.4.3 to >=2.4.3,<=2.12.0 @dependabot (#2249)
- pip prod(deps): update anytree requirement from <=2.10.0,>=2.4.3 to >=2.4.3,<=2.11.1 @dependabot (#2241)
New Contributors
- @rafael-fuente made their first contribution in #2308
- @ZoeLeibowitz made their first contribution in #2223
Full Changelog: v4.8.3...v4.8.4