Releases: computablee/DotMP

DotMP Pre-Release v2.0.0-pre1.2

26 Mar 06:57
Pre-release

This is the first pre-release of v2.0.0.

GPU Programming

This update incorporates support for GPGPU programming via an external package. Arrays can be created on the GPU, and GPU-based parallel-for loops can be run. There is limited support right now for GPU programming, and much work has yet to be completed.

Right now, the only loop types supported are basic linear and collapsed for loops:

double[] a = new double[50_000];
double[] x = new double[50_000];
double[] y = new double[50_000];
float[] res = new float[50_000];

{
    using var a_gpu = new DotMP.GPU.Buffer<double>(a, DotMP.GPU.Buffer.Behavior.To);
    using var x_gpu = new DotMP.GPU.Buffer<double>(x, DotMP.GPU.Buffer.Behavior.To);
    using var y_gpu = new DotMP.GPU.Buffer<double>(y, DotMP.GPU.Buffer.Behavior.To);
    using var res_gpu = new DotMP.GPU.Buffer<float>(res, DotMP.GPU.Buffer.Behavior.From);

    DotMP.GPU.Parallel.ParallelFor(0, a.Length, a_gpu, x_gpu, y_gpu, res_gpu,
        (i, a_kernel, x_kernel, y_kernel, res_kernel) =>
    {
        res_kernel[i] = (float)(a_kernel[i] * x_kernel[i] + y_kernel[i]);
    });
}
// once the buffers are disposed at the end of the block above, the data is copied back to the CPU and the GPU memory is freed

For collapsed for loops, ParallelForCollapse is provided with an analogous API to the CPU API (i.e., passing in tuples of bounds).

I plan on updating the wiki over the next couple of weeks with details and tutorials.

CPU API Changes

Previously, in order to use CPU functions like Critical, Ordered, and Single, an ID parameter had to be passed in to tell the runtime which region it was seeing. This requirement has been removed: the runtime can now distinguish these functions when they are called from different locations (i.e., it infers the ID parameter). The old functions have been deprecated but remain available. To use the new overloads, simply remove the ID parameter.

Old:

DotMP.Parallel.Single(42, () =>
{
    // single region with ID 42
});

New:

DotMP.Parallel.Single(() =>
{
    // an ID is no longer required
});
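The release notes do not say how the ID is inferred. As a rough sketch of one way a library could do it, the snippet below keys each call on its source location using .NET's caller-info attributes. The CallSite class and its RunOnce method are hypothetical helpers written for this illustration. They are not part of DotMP, and the sketch is single-threaded, so it does not reproduce the real semantics of Single.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Two calls on different source lines count as two distinct regions.
CallSite.RunOnce(() => Console.WriteLine("first region"));
CallSite.RunOnce(() => Console.WriteLine("second region"));

// A loop re-enters the same call site, so the body runs only on the first pass.
for (int pass = 0; pass < 3; pass++)
{
    CallSite.RunOnce(() => Console.WriteLine("inside loop"));
}

Console.WriteLine(CallSite.SeenCount); // prints 3: three distinct call sites

// Hypothetical helper for illustration only; not DotMP's implementation.
static class CallSite
{
    private static readonly HashSet<(string, int)> seen = new();

    public static int SeenCount => seen.Count;

    // The compiler fills in path and line at every call site,
    // so the caller never has to pass an explicit ID.
    public static void RunOnce(Action action,
        [CallerFilePath] string path = "",
        [CallerLineNumber] int line = 0)
    {
        // HashSet.Add returns true only the first time a key is inserted.
        if (seen.Add((path, line)))
            action();
    }
}
```

One limitation of keying on file and line is that two calls written on the same source line would share an ID.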

What's Changed

  • Implement baseline GPU functionality by @computablee in #95
  • Update release/2.0 branch with GPU changes by @computablee in #125
  • Add GPU functionality by @computablee in #126
  • Add support for .NET Framework 4.7.1 and .NET Standard 2.1 by @computablee in #127
  • nuget: bump the xunit group in /DotMP-Tests with 4 updates by @dependabot in #136
  • nuget: bump the fluent group in /DotMP-Tests with 1 update by @dependabot in #135
  • actions: bump vsoch/pull-request-action from 1.0.24 to 1.1.0 by @dependabot in #134
  • actions: bump actions/setup-dotnet from 3 to 4 by @dependabot in #133
  • Add license to nupkg files by @computablee in #138
  • Fix code quality tests by @computablee in #137
  • nuget: bump the xunit group in /DotMP-Tests with 4 updates by @dependabot in #139
  • nuget: bump the xunit group in /DotMP-Tests with 3 updates by @dependabot in #141
  • nuget: bump the fluent group in /DotMP-Tests with 1 update by @dependabot in #140
  • Dependabot cannot recursively search directories by @computablee in #142
  • nuget: bump the bench group in /benchmarks/GPUHeatTransfer with 1 update by @dependabot in #143
  • nuget: bump the bench group in /benchmarks/ILGPUOverhead with 1 update by @dependabot in #144
  • nuget: bump the bench group in /benchmarks/GPUOverhead with 1 update by @dependabot in #145
  • nuget: bump the bench group in /benchmarks/Misc with 2 updates by @dependabot in #147
  • nuget: bump the bench group in /benchmarks/HeatTransfer with 2 updates by @dependabot in #148
  • nuget: bump the xunit group in /DotMP-Tests with 2 updates by @dependabot in #146
  • nuget: bump the fluent group in /DotMP-Tests with 1 update by @dependabot in #151
  • nuget: bump the xunit group in /DotMP-Tests with 3 updates by @dependabot in #152
  • nuget: bump the bench group in /benchmarks/GPUHeatTransfer with 1 update by @dependabot in #153
  • nuget: bump the bench group in /benchmarks/GPUOverhead with 1 update by @dependabot in #154
  • nuget: bump the bench group in /benchmarks/HeatTransfer with 2 updates by @dependabot in #155
  • nuget: bump the bench group in /benchmarks/Misc with 2 updates by @dependabot in #156
  • nuget: bump the bench group in /benchmarks/ILGPUOverhead with 1 update by @dependabot in #157
  • nuget: bump the xunit group in /DotMP-Tests with 2 updates by @dependabot in #158
  • nuget: bump the xunit group in /DotMP-Tests with 2 updates by @dependabot in #159
  • nuget: bump the xunit group in /DotMP-Tests with 2 updates by @dependabot in #160
  • actions: bump codecov/codecov-action from 3 to 4 by @dependabot in #161
  • nuget: bump the xunit group in /DotMP-Tests with 2 updates by @dependabot in #165
  • actions: bump vsoch/pull-request-action from 1.1.0 to 1.1.1 by @dependabot in #166
  • nuget: bump the coverlet group in /DotMP-Tests with 2 updates by @dependabot in #167
  • nuget: bump the fluent group in /DotMP-Tests with 1 update by @dependabot in #168
  • nuget: bump the xunit group in /DotMP-Tests with 2 updates by @dependabot in #169
  • nuget: bump Microsoft.NET.Test.Sdk from 17.8.0 to 17.9.0 in /DotMP-Tests by @dependabot in #170

Full Changelog: v1.6.1...v2.0.0-pre1.2

DotMP Release v1.6.1

27 Nov 06:29

This is a minor release. This release adds support for .NET Framework 4.7.1 and .NET Standard 2.1.

Full Changelog: v1.6.0...v1.6.1

DotMP Release v1.6.0

26 Nov 07:13
2db30d0

DotMP v1.6.0 doesn't incorporate a ton of new features (though there are some!). Rather, it aims to provide a wide range of much-needed performance improvements across the board.

Licensing Changes

The project license has been changed from MIT to LGPL 2.1.

Changes to the Scheduler API

The schedule parameter in Parallel.For and its derivatives now takes an implementation of the IScheduler interface instead of a DotMP.Schedule enum value. The changes are fully source-compatible with previous versions, but they break API and ABI compatibility. Beyond the performance benefits, the code is now simpler, more modular, more maintainable, and more readable, has less duplication, and is extensible by the user.

The IScheduler interface is public-facing and permits users to implement their own custom schedulers. Details are outlined in the new wiki!

We also introduce a new work-stealing scheduler. The work-stealing scheduler can be accessed via DotMP.Schedule.WorkStealing, and has been seamlessly integrated into the rest of DotMP. Details are also outlined in the new wiki.
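As a minimal usage sketch, the snippet below selects the work-stealing scheduler for a loop whose iterations have uneven cost. It assumes the DotMP NuGet package is referenced and that ParallelFor accepts the named arguments schedule and action. Those argument names are an assumption based on the parameter described above, so check them against the wiki.

```csharp
using System;

int n = 1_000;
double[] work = new double[n];

// Assumption: ParallelFor exposes parameters named "schedule" and "action".
DotMP.Parallel.ParallelFor(0, n, schedule: DotMP.Schedule.WorkStealing, action: i =>
{
    // Iteration cost grows with i, so idle threads benefit from
    // being able to take over work queued for busier threads.
    double sum = 0;
    for (int k = 0; k <= i; k++)
        sum += k;
    work[i] = sum;
});

Console.WriteLine(work[n - 1]); // sum of 0..999, i.e. 499500
```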

Performance Improvements

There have been minor performance improvements across the board with parallel-for loops. However, collapsed for loops see substantial performance improvements, over 3x in some of my benchmarks. Static scheduling sees a performance bump as well from better avoidance of false sharing issues.

GetThreadNum has also been optimized, though this is already such a lightweight function that it's hardly noticeable.

Tasking Improvements

Previously, it was possible to spawn tasks from within other tasks, but it was not possible for a task to wait on its child tasks to complete. Now, there is a new implementation of taskwait which permits you to specify which tasks to wait on, and this version does not act as a barrier if called from within a task. If the default taskwait without arguments is called from within a task, a deadlock is detected and an exception is thrown.

Bug Fixes

Prior to this release, if an exception was thrown from inside a parallel region, it could not be caught from outside the region. Now, exceptions thrown inside parallel regions are properly caught and re-thrown in serial space, so a try/catch block placed around a parallel region can catch exceptions that occur in parallel space.
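A short sketch of what this enables, assuming the DotMP NuGet package is referenced. The notes do not specify the type of the re-thrown exception, so the example catches the base Exception type rather than guessing.

```csharp
using System;

bool caught = false;

try
{
    DotMP.Parallel.ParallelFor(0, 100, i =>
    {
        // Thrown on whichever worker thread happens to run iteration 42.
        if (i == 42)
            throw new InvalidOperationException("failure in iteration 42");
    });
}
catch (Exception ex)
{
    // Reached on the calling thread, after the parallel region has ended.
    caught = true;
    Console.WriteLine($"Caught in serial code: {ex.GetType().Name}");
}
```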

Atomics

We have now implemented atomic subtraction for unsigned integer types. Two new methods were added to the Atomic static class: uint Sub(ref uint, uint), and ulong Sub(ref ulong, ulong).
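A minimal sketch of the uint overload, assuming the DotMP NuGet package is referenced and that the Atomic class lives in the DotMP namespace. Each iteration subtracts one from a shared counter, so with an atomic update the counter should end at zero no matter how iterations are distributed across threads.

```csharp
using System;

uint remaining = 1_000;

DotMP.Parallel.ParallelFor(0, 1_000, i =>
{
    // Atomic read-modify-write: concurrent decrements are not lost.
    DotMP.Atomic.Sub(ref remaining, 1u);
});

Console.WriteLine(remaining); // expected to print 0
```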

Reorganizing

There has been some internal reorganizing. DotMP exceptions have been moved to the DotMP.Exceptions namespace, and the actual scheduler implementations have been moved to the DotMP.Schedulers namespace.

Full Changelog: v1.5.0...v1.6.0

DotMP Pre-Release v1.6.0-pre2

21 Nov 06:59
33a8e90
Pre-release

This is the second pre-release of v1.6.0. Starting with this pre-release, we will no longer be providing binaries on GitHub. We recommend using the NuGet package manager instead. This saves me some time and energy.

Performance Improvements

Index calculations have been thoroughly optimized across the board. v1.6.0-pre1 optimized index calculations for 2D and 3D loops, and pre2 optimizes index calculations for 4D and higher by a significant margin.

GetThreadNum has also been optimized, though this is already such a lightweight function that it's hardly noticeable.

Atomics

We have now implemented atomic subtraction for unsigned integer types. Two new methods were added to the Atomic static class: uint Sub(ref uint, uint), and ulong Sub(ref ulong, ulong). These are not super optimized and could probably be better before the full v1.6.0 release.

Reorganizing

There has been some internal reorganizing. DotMP exceptions have been moved to the DotMP.Exceptions namespace, and the actual scheduler implementations have been moved to the DotMP.Schedulers namespace.

Bug fixes

There are a few bug fixes and performance improvements throughout DotMP.

Full Changelog: v1.6.0-pre1...v1.6.0-pre2

DotMP Pre-Release v1.6.0-pre1

09 Nov 07:15
149dbd8
Pre-release

DotMP v1.6.0 isn't planned to incorporate a ton of new features (though there are some!). Rather, it aims to provide a wide range of much-needed performance improvements across the board.

This is the first pre-release of v1.6.0.

Licensing Changes

The project license has been changed from MIT to LGPL 2.1.

Changes to the Scheduler API

The schedule parameter in Parallel.For and its derivatives now takes an implementation of the IScheduler interface instead of a DotMP.Schedule enum value. The changes are fully source-compatible with previous versions, but they break API and ABI compatibility. Beyond the performance benefits, the code is now simpler, more modular, more maintainable, and more readable, has less duplication, and is extensible by the user.

The IScheduler interface is public-facing and permits users to implement their own custom schedulers. Details are outlined in the new wiki!

We also introduce a new work-stealing scheduler. The work-stealing scheduler can be accessed via DotMP.Schedule.WorkStealing, and has been seamlessly integrated into the rest of DotMP. Details are also outlined in the new wiki.

Performance Improvements

There have been minor performance improvements across the board with parallel-for loops. However, collapsed for loops in 2D and 3D see substantial performance improvements, over 3x in some of my benchmarks. Static scheduling sees a performance bump as well from better avoidance of false sharing issues.

Bug Fixes

Prior to this release, if an exception was thrown from inside a parallel region, it could not be caught from outside the region. Now, exceptions thrown inside parallel regions are properly caught and re-thrown in serial space, so a try/catch block placed around a parallel region can catch exceptions that occur in parallel space.

Planned Improvements

Before DotMP v1.6.0 is fully released, some more improvements are planned:

  • Fix issue where tasks can't call taskwait without a deadlock, which limits opportunities for recursive parallelism.
  • Optimize 4+ dimensional collapsed for loops.
  • Do in-depth performance tuning across the entire tasking subsystem.

Full Changelog: v1.5.0...v1.6.0-pre1

DotMP Release v1.5.0

25 Oct 22:56

This is major release v1.5.0. There are lots of changes here:

Collapsed worksharing-for loops

Several new functions have been added to the Parallel class, including ForCollapse, ForReductionCollapse, ParallelForCollapse, and ParallelForReductionCollapse. Each of these has the ability to run n-dimensional collapsed for loops. Collapsed for loops work if you have a situation similar to the following:

for (int i = 0; i < M; i++)
{
    for (int j = 0; j < N; j++)
    {
        doWork(i, j);
    }
}

Previously, the only official way to parallelize this loop would have been to parallelize only the outermost loop, leaving the innermost one in serial:

DotMP.Parallel.ParallelFor(0, M, i =>
{
    for (int j = 0; j < N; j++)
    {
        doWork(i, j);
    }
});

For many use cases this is a sufficient amount of parallelism, but if the outermost loop has few iterations, it may be beneficial to parallelize across both loops, effectively multiplying the number of iterations that the scheduler has to work with. This means that larger chunk sizes can be used while maintaining efficient load balancing. The new way to do this is as follows:

DotMP.Parallel.ParallelForCollapse((0, M), (0, N), (i, j) =>
{
    doWork(i, j);
});

This allows the DotMP loop schedulers to have M*N iterations to work with, increasing flexibility while scheduling.
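To make the idea of collapsing concrete, here is a plain serial C# sketch (no DotMP involved) of how a 2D iteration space can be flattened into a single range of lenI * lenJ iterations and mapped back to (i, j) with a division and a remainder. This illustrates the general technique only. It is not DotMP's actual index calculation, and the variable names are chosen for the example.

```csharp
using System;

// Bounds in the same (start, end) form used above; end is exclusive.
(int start, int end) rangeI = (0, 3);
(int start, int end) rangeJ = (0, 4);

int lenI = rangeI.end - rangeI.start;
int lenJ = rangeJ.end - rangeJ.start;
int[,] visits = new int[lenI, lenJ];

// One flat loop over lenI * lenJ iterations. A scheduler could hand out
// chunks of this single range instead of chunks of the outer loop only.
for (int flat = 0; flat < lenI * lenJ; flat++)
{
    int i = rangeI.start + flat / lenJ; // outer index changes slowly
    int j = rangeJ.start + flat % lenJ; // inner index changes quickly
    visits[i - rangeI.start, j - rangeJ.start]++;
}

// Every (i, j) pair should have been visited exactly once.
int visitedOnce = 0;
foreach (int v in visits)
    if (v == 1) visitedOnce++;

Console.WriteLine(visitedOnce); // prints 12
```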

There are overloads allowing this tuple syntax for 2D, 3D, and 4D collapsed for loops. For 5D or higher, you instead pass an array of tuples representing each dimension's start and end indices, and the action takes an array of integers as indices.
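A hedged sketch of the array form for a 5D loop, assuming the DotMP NuGet package is referenced. The exact overload signature is not shown in these notes, so the parameter types used here (an array of (int, int) tuples and an action taking int[]) are inferred from the description above and should be checked against the documentation.

```csharp
using System;
using System.Threading;

// One (start, end) tuple per dimension; end is assumed to be exclusive.
var ranges = new (int, int)[] { (0, 2), (0, 3), (0, 2), (0, 3), (0, 2) };
int count = 0;

// Assumption: the action receives one index per dimension in idx.
DotMP.Parallel.ParallelForCollapse(ranges, (int[] idx) =>
{
    // idx[0] .. idx[4] hold the current index of each dimension.
    Interlocked.Increment(ref count);
});

Console.WriteLine(count); // 2 * 3 * 2 * 3 * 2 = 72 iterations expected
```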

New locking API

The locking API has been updated to be more object-oriented. Previously, the following format was used to create, lock, and unlock a lock:

DotMP.Lock l = new DotMP.Lock();
DotMP.Lock.Set(l);
DotMP.Lock.Unset(l);

The new API instead calls methods on the lock object:

DotMP.Lock l = new DotMP.Lock();
l.Set();
l.Unset();

Other changes

There are lots of other changes. A basic summary is here:

  1. The project has a new logo, thanks to @exrol.
  2. Lots of documentation issues have been fixed, including a lack of NuGet documentation.
  3. There have been mountains of bug fixes throughout the project.
  4. More rigorous error checking throughout the project.
  5. More rigorous testing on our end.
  6. Some internal optimizations, refactoring, tidying up, and removing dead code.

Full Changelog: v1.4.1...v1.5.0

DotMP Release v1.4.1

23 Sep 04:53

This is a minor release. This update adds support for .NET 7.0. There were no necessary changes to the codebase (hooray!) but the build pipeline is changed slightly. This does mean that future collaborators will need to have both the .NET 6.0 SDK and the .NET 7.0 SDK.

DotMP Release v1.4.0

20 Sep 06:42
655cd1b

This is a major release.

Changelog:

  • The Locking and Lock classes have been merged. Now there is just Lock.
  • Shared<T> now implements IDisposable and may be used within a using block.
  • SharedEnumerable<T> now exists.
  • Created factory classes for Shared<T> and SharedEnumerable<T>.
  • Added a K-nearest-neighbors example.
  • DotMP.Parallel.Schedule is now DotMP.Schedule.
  • Added a tasking system, including the DotMP.Parallel.Task, DotMP.Parallel.Taskloop, DotMP.Parallel.Taskwait methods as part of the public-facing API.
  • DotMP.Parallel.Section has been removed, and the API for DotMP.Parallel.Sections has been changed.
  • Better documentation.
  • Better code organization.
  • Better testing.

DotMP Release v1.3.0

31 Aug 17:52
3b9a92a

This release comes just hours after v1.2.1, but comes with a big change: the project has been renamed from OpenMP.NET to DotMP. The previous releases have been renamed to DotMP, but the provided .dll/.pdb and source code still reflect the old branding. The reason for the change is to abide by OpenMP's trademark usage guidelines, since I hope to progress to publishing on NuGet soon.

There is one new feature: the DotMP.Parallel.Schedule.Runtime schedule. The Runtime schedule allows the schedule to be set via an environment variable (e.g., dynamic,32 for Dynamic scheduling with a chunk size of 32).

DotMP Release v1.2.1

31 Aug 13:44
fbae539

This release doesn't change much with the actual .dll and .pdb files, but does add significant documentation across the entire codebase. Doxygen can now generate documentation from the source code, and I've added a Makefile for easy building of the source yourself.