Skip to content

Conversation

@crutcher
Copy link
Contributor

@crutcher crutcher commented Sep 21, 2025

  • copy workaround for ndarray and candle

@crutcher
Copy link
Contributor Author

@wingertge ping?

@crutcher
Copy link
Contributor Author

Added burn-tch support; realized that my impl and python's unfold put the new dim in a different place (had to work around it in the tch backend); unsure the best answer (bias towards compat); need to head to bed.

@crutcher
Copy link
Contributor Author

ndarray will need a workaround if upstream isn't responsive (according to @wingertge )

@crutcher crutcher changed the title [WIP] towards pytorch.unfold() [WIP] towards Tensor::unfold(dim, size, step) Sep 22, 2025
@crutcher crutcher marked this pull request as ready for review September 22, 2025 22:11
@crutcher
Copy link
Contributor Author

I implemented a copy-mode for candle and ndarray for now.
I'm getting some traction upstream on the candle PR

@crutcher crutcher changed the title [WIP] towards Tensor::unfold(dim, size, step) Tensor::unfold(dim, size, step) Sep 22, 2025
@crutcher
Copy link
Contributor Author

Maybe unfold_dim, and then we can have a general n-dim unfold op later?

@codecov
Copy link

codecov bot commented Sep 22, 2025

Codecov Report

❌ Patch coverage is 39.00574% with 319 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.09%. Comparing base (4002ecc) to head (4eab952).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
crates/burn-fusion/src/ops/float.rs 0.00% 30 Missing ⚠️
crates/burn-fusion/src/ops/boolean.rs 0.00% 28 Missing ⚠️
crates/burn-fusion/src/ops/int.rs 0.00% 28 Missing ⚠️
crates/burn-candle/src/ops/base.rs 0.00% 20 Missing ⚠️
crates/burn-router/src/ops/op_bool.rs 0.00% 19 Missing ⚠️
crates/burn-router/src/ops/op_float.rs 0.00% 19 Missing ⚠️
crates/burn-router/src/ops/op_int.rs 0.00% 19 Missing ⚠️
crates/burn-router/src/runner.rs 0.00% 18 Missing ⚠️
crates/burn-cubecl/src/ops/base.rs 0.00% 17 Missing ⚠️
crates/burn-autodiff/src/ops/bool_tensor.rs 0.00% 8 Missing ⚠️
... and 16 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3751      +/-   ##
==========================================
- Coverage   64.19%   64.09%   -0.11%     
==========================================
  Files        1094     1095       +1     
  Lines      128528   129050     +522     
==========================================
+ Hits        82512    82718     +206     
- Misses      46016    46332     +316     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@crutcher crutcher force-pushed the crutcher/unfold branch 2 times, most recently from ffd7a1d to f7079fc Compare September 23, 2025 17:11
@antimora
Copy link
Collaborator

I implemented a copy-mode for candle and ndarray for now. I'm getting some traction upstream on the candle PR

If you can, I'd recommend having default implementations (similarly I did in #3767 for trunc and fmod) for the candle backend. Eventually, I think the support for candle might be dropped and currently it's not fully supported anyway. We can come back and improve perf.

@crutcher
Copy link
Contributor Author

@antimora I can't have default implementations; because {Type}TensorOps<B> don't require BasicOps<B> (yet); so I don't have access to default versions of slice and cat

@crutcher
Copy link
Contributor Author

This is the impact on simulations of the pending unfold change.
This repo is bound to the pending burn PR.,
It runs a 1000x1000 grid of Conway's Game of Life for 10000 steps (sequentially).,
It starts tracking elapsed time 1/10th of the way through (for warmup).,
it uses cuda, autotune, fusion,

$ cargo run --profile release -p scratch
Args: Args { steps: 10000, grid_size: 1000, unfold_views: false, warmup_fraction: 10, progress: true }
553.86 steps/sec

$ cargo run --profile release -p scratch -- --unfold-views
Args: Args { steps: 10000, grid_size: 1000, unfold_views: true, warmup_fraction: 10, progress: true }
3555.08 steps/sec

https://github.com/crutcher/clockmill/blob/main/examples/scratch/src/main.rs

@crutcher
Copy link
Contributor Author

Added an optimization on unfold4d when padding = [0, 0] and dilation = [1, 1].
This could be improved to work with any dilation; but that should wait until a full n-dim windowing wrapper.

Copy link
Member

@laggui laggui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of minor comments, after which should be good to go!

- switched to pytorch's return shape.
- added burn-router
- exposed unfold calculation module.

- ndarray and candle both need either upstream support or work-arounds.

candle has a PR in-flight (from me):
huggingface/candle#3091
- Added `unfold` function for `candle`, `ndarray` backends with a copy-based implementation.
- Updated function docs and tensor operation traits accordingly.
- Incorporated window shape calculation for `unfold`.
…pe`.

- Replaced `calculate_unfold_windows` with `calculate_unfold_shape` across tensor backends.
- Updated function documentation to reflect the new shape calculation.
- Added unit tests for `calculate_unfold_shape`.
- Simplified shape handling in tensor unfold operations.
@crutcher
Copy link
Contributor Author

also extracted calculate_unfold_shape which simplified a lot of locations.

@antimora
Copy link
Collaborator

@crutcher probably failures on linux-no-std-tests will go away if you rebase with main

@crutcher
Copy link
Contributor Author

@antimora I did rebase on main

@laggui laggui merged commit d07de76 into tracel-ai:main Sep 24, 2025
8 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants