Feature request: allow lambdas in kernels when they can be evaluated at compile time #463
Comments
Hm, I started working on this, and I am seeing existing pieces of code that look very relevant: @MoFtZ
@lostmsu Thank you for your feature request. We have already discussed the feature in our weekly talk-to-dev sessions. We currently believe that we should add support for lambdas via ILGPU's dynamic specialization features. Also, we can translate calls to lambda functions into calls to "opaque" functions annotated with specific attributes. This avoids inlining and modifying these stubs that we generate. However, adding support for arbitrary lambdas also requires special care in capturing values and returning lambda closures within kernel functions. Moreover, we can add this feature to the v1.1 feature list 🚀
@m4rs-mt thanks for the promising response. Is there anyone already working on that feature? I started my own take at implementing it by replacing the key type in this dictionary (ILGPU/Src/ILGPU/Frontend/ILFrontend.cs, line 455 at 93b6551) with a MethodBase plus a Value?[] array of arguments whose values are known at compile time (in this case, a delegate pointing to a known method). This approach does not seem to align with the idea of "dynamic specialization features". Should I pause it?
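For illustration only, a composite key like the one described (a MethodBase plus an array of compile-time-known argument values) would need structural equality to serve as a dictionary key. A minimal hypothetical sketch in C# — none of these names exist in ILGPU, and this does not reflect its actual internals:

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical cache key: the target method plus any argument values
// known at kernel-compilation time (unknown arguments are null).
sealed class SpecializationKey : IEquatable<SpecializationKey>
{
    public MethodBase Method { get; }
    public object?[] KnownArguments { get; }

    public SpecializationKey(MethodBase method, object?[] knownArguments)
    {
        Method = method;
        KnownArguments = knownArguments;
    }

    // Structural equality: two keys match only if they refer to the same
    // method AND the same set of known argument values.
    public bool Equals(SpecializationKey? other) =>
        other is not null &&
        Method == other.Method &&
        KnownArguments.SequenceEqual(other.KnownArguments);

    public override bool Equals(object? obj) => Equals(obj as SpecializationKey);

    public override int GetHashCode()
    {
        var hash = new HashCode();
        hash.Add(Method);
        foreach (var arg in KnownArguments)
            hash.Add(arg);
        return hash.ToHashCode();
    }
}
```

With such a key, each distinct combination of method and known constants maps to its own compiled kernel body, which is the effect the comment above is after.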
@lostmsu Yes, you are correct that lambdas are implemented as instance methods on a hidden class. Originally, ILGPU only supported static methods, which do not have an implicit 'this' argument. If you find that it is easier to make your changes if the parameter offset is 1, then it is fine to change.
@lostmsu There is no one currently working on this feature, so if you have the time and passion, we would wholeheartedly welcome your contributions. We have previously discussed how to support lambda functions to provide the functionality requested. In your example, you have supplied the lambda function as a method parameter.

Regarding "dynamic specialization features", I believe @m4rs-mt is referring to a technique similar to ILGPU's existing kernel specialization support.

Note that this is still an open-ended discussion. For example, should we support lambdas that are static member variables, like #415? Is dynamic specialization the correct approach for how it will be used? Should capturing lambdas be supported? And if so, to what extent? Also note that it is not necessary to solve all these questions now - we can slowly build some functionality while deferring other more "problematic" functionality, like capturing lambdas.
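For context, ILGPU's dynamic specialization is exposed through SpecializedValue<T>: declaring a kernel parameter with that type causes the runtime to compile (and cache) a dedicated kernel version per distinct value, so the value behaves like a compile-time constant inside the kernel body. A rough sketch assuming the standard ILGPU API — not runnable without an accelerator, and simplified from the library's own specialization samples:

```csharp
using ILGPU;
using ILGPU.Runtime;

static class SpecializationExample
{
    // Each distinct 'factor' value triggers compilation of a dedicated
    // kernel version in which factor.Value is a true compile-time constant.
    static void ScaleKernel(
        Index1D index,
        ArrayView<int> view,
        SpecializedValue<int> factor) =>
        view[index] *= factor.Value;

    static void Run(Accelerator accelerator, MemoryBuffer1D<int, Stride1D.Dense> buffer)
    {
        var kernel = accelerator.LoadAutoGroupedStreamKernel<
            Index1D, ArrayView<int>, SpecializedValue<int>>(ScaleKernel);

        // Specialize on the value 2; calling again with a different value
        // would compile and cache another specialized version.
        kernel((int)buffer.Length, buffer.View, SpecializedValue.New(2));
    }
}
```

The relevance to lambdas: if a delegate argument could be treated the same way (specializing the kernel per concrete target method), the lambda body would effectively be inlined at compile time.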
@MoFtZ the problem I see with that approach was my reasoning behind the idea to propagate the lambda at the initial compile time.
@lostmsu I don't think we'll run into any problems with respect to the parameter offset.

Regarding your suggestion and implementation: I have experimented with different ways to implement lambdas in the compiler, as they involve handling class types inside kernels. I still believe that mapping these OpCodes to partial function calls + dynamic specialization of the call sites might be the best way to implement them. Anyway, we are always open to PRs that add new features 🤓👍 I was wondering about changing the mapping to a tuple.
Sorry for the delay here @MoFtZ @m4rs-mt. Have you given this any more thought? Do you have notes? I checked out the current code that handles this. @m4rs-mt mentioned dynamic specialization. Can you elaborate on the idea? Is it different from the above? I have not looked at it, but if ILGPU already has cross-function constant propagation, that might be another way to approach the problem.
@lostmsu We have not defined a preferred API, so you are welcome to design it as you see fit. I believe that "dynamic specialization" refers to the specialization concept already used elsewhere in ILGPU.
This might now be easier with the new C# static abstract interface members. Relevant IL changes: https://github.com/dotnet/runtime/pull/49558/files
@lostmsu We recently added support for Generic Math, which makes use of Static Abstract Interface members. If you would like to try it out, it is available in a preview release of ILGPU.
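For readers unfamiliar with the C# 11 feature being referenced: static abstract interface members attach an operation to a type rather than an instance, so a call compiles to a direct call with no delegate and no 'this' reference. A plain C# illustration of the pattern, independent of ILGPU (type and method names here are invented for the example):

```csharp
// Requires C# 11 / .NET 7+.
public interface IUnaryOp<T>
{
    // The operation lives on the type itself: no instance, no 'this'.
    static abstract T Apply(T value);
}

public readonly struct Negate : IUnaryOp<int>
{
    public static int Apply(int value) => -value;
}

public static class Ops
{
    // TOp is monomorphized per concrete operation, so TOp.Apply can be
    // resolved statically and inlined as if written into the loop body.
    public static void Map<TOp>(int[] input, int[] output)
        where TOp : IUnaryOp<int>
    {
        for (int i = 0; i < input.Length; i++)
            output[i] = TOp.Apply(input[i]);
    }
}

// Usage: Ops.Map<Negate>(input, output);
```

This achieves the "essentially static" lambda behavior requested in the issue without any delegate object existing at runtime.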
I need exactly that; assume I have a dynamic composition of different algorithms (NeuraSharp). Something like this would also be useful. Declare the interfaces with static methods:

public interface IAlgorithm1
public interface IFunction1
public class MyAlgorithm1 : IAlgorithm1 where T : IFunction1
public class NormalSum1 : IFunction1 // load this as kernel

Actually I'm looking at how to automatically generate inlined IL code, but it is a daunting task; if the feature is already there, that would be great. What kind of syntax exactly is supported in the preview, just out of curiosity?
hi @Darelbi. This is a long-running thread, so some of the information above is outdated. Currently, using lambdas within a kernel is still not supported. On the plus side, Generic Math and Static Abstract Interface Member support (for .NET 7.0 onwards) is no longer in preview and is available in the latest version of ILGPU, currently v1.5.1. There is also some sample code that might meet your requirements for using interfaces.
Generic math works really well! Here is a small snippet in F# if you're interested.

module ILGpu.GenericKernels

open System
open System.Numerics
open ILGPU
open ILGPU.Runtime
open En3Tho.FSharp.Extensions

// define a set of constraints, INumber + ILGpu default ones
type Number<'TNumber
    when 'TNumber: unmanaged
    and 'TNumber: struct
    and 'TNumber: (new: unit -> 'TNumber)
    and 'TNumber :> ValueType
    and 'TNumber :> INumber<'TNumber>> = 'TNumber

module Kernels =
    // use this constraint for generic parameter in the kernel
    let inline executeSomeNumericOperations<'TNumber when Number<'TNumber>> (index: Index1D) (input: ArrayView<'TNumber>) (output: ArrayView<'TNumber>) (scalar: 'TNumber) =
        if index.X < input.Length.i32 then
            output[index] <- (input[index] * scalar + scalar) / scalar - scalar

let runKernel<'T when Number<'T>> (accelerator: Accelerator) scalar (data: 'T[]) =
    use deviceData = accelerator.Allocate1D(data)
    let kernel = accelerator.LoadAutoGroupedStreamKernel(Kernels.executeSomeNumericOperations<'T>)
    kernel.Invoke(Index1D(deviceData.Length.i32), deviceData.View, deviceData.View, scalar)
    deviceData.CopyToCPU(accelerator.DefaultStream, data)
    data |> Array.iteri ^ fun index element -> Console.WriteLine($"{index} = {element}")

let genericMap() =
    use context = Context.CreateDefault()
    let device = context.Devices |> Seq.find ^ fun x -> x.Name.Contains("GTX 1070")
    use accelerator = device.CreateAccelerator(context)
    // run with ints
    runKernel accelerator 10 [| 0; 1; 2; 3; 4; 5; 6; 7; 8; 9; |]
    // and with floats
    runKernel accelerator 10.1f [| 0.1f; 1.1f; 2.1f; 3.1f; 4.1f; 5.1f; 6.1f; 7.1f; 8.1f; 9.1f; |]
Rationale

This request is syntax sugar for creating C# classes that provide some GPGPU capabilities. Imagine you are trying to implement an ISqlCalc that needs to be able to perform a few ops on arrays using ILGPU. The point is that it should be possible to inline v => -v. The delegate instance will have a MethodInfo pointing to a body, and that method will never reference this, so it is essentially static.

Workaround

Currently, the best way I came up with to have something analogous to a UnaryOpKernel shared across all unary ops is to use generic monomorphization. While this works, it is ugly and unnecessarily wordy. The struct restriction also prevents me from at least using a class for the op (e.g. a Neg class): that fails with "Class type 'Neg' is not supported", even though this is never used and Apply is essentially static.
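The workaround code itself was not captured in this thread, but the generic-monomorphization pattern it describes is usually written with a struct type parameter implementing an interface, so each op stamps out its own kernel. A hedged reconstruction of what such a UnaryOpKernel might look like (names taken from the issue text; the actual snippet may differ):

```csharp
using ILGPU;

public interface IUnaryOp
{
    int Apply(int value);
}

// Must be a struct: ILGPU rejects class types inside kernels,
// which is exactly the "Class type 'Neg' is not supported" error above.
public readonly struct Neg : IUnaryOp
{
    public int Apply(int value) => -value;
}

public static class UnaryOpKernel
{
    // TOp is monomorphized per op, so op.Apply compiles to a direct call;
    // the default-constructed struct carries no state, making Apply
    // "essentially static" in the sense the issue describes.
    public static void Run<TOp>(Index1D index, ArrayView<int> data)
        where TOp : struct, IUnaryOp
    {
        data[index] = default(TOp).Apply(data[index]);
    }
}
```

This sketch shows why the request is "syntax sugar": writing a struct per lambda works today, but v => -v would express the same zero-state operation in one token instead of a type declaration.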