Commit 4d0b90b

Update LICENSE, README and Usage
1 parent 745cb2a commit 4d0b90b

3 files changed: +72 -61 lines changed

LICENSE

+33-30
Original file line number | Diff line number | Diff line change
@@ -1,37 +1,40 @@
1-
mallocMC: Memory Allocation for Many Core Architectures
1+
/*
2+
mallocMC: Memory Allocation for Many Core Architectures
23

3-
based on the work of ScatterAlloc:
4-
Massively Parallel Dynamic Memory Allocation for the GPU
4+
based on the work of ScatterAlloc:
5+
Massively Parallel Dynamic Memory Allocation for the GPU
56

6-
http://www.icg.tugraz.at/project/mvp
7-
https://www.hzdr.de/crp
7+
http://www.icg.tugraz.at/project/mvp
8+
https://www.hzdr.de/crp
89

9-
Copyright (C) 2012 Institute for Computer Graphics and Vision,
10-
Graz University of Technology
11-
Copyright (C) 2014-2015 Institute of Radiation Physics,
12-
Helmholtz-Zentrum Dresden - Rossendorf
10+
Copyright (C) 2012 Institute for Computer Graphics and Vision,
11+
Graz University of Technology
12+
Copyright (C) 2014-2024 Institute of Radiation Physics,
13+
Helmholtz-Zentrum Dresden - Rossendorf
1314

14-
Author(s): Markus Steinberger - steinberger ( at ) icg.tugraz.at
15-
Bernhard Kainz - kainz ( at ) icg.tugraz.at
16-
Michael Kenzel - kenzel ( at ) icg.tugraz.at
17-
Rene Widera - r.widera ( at ) hzdr.de
18-
Axel Huebl - a.huebl ( at ) hzdr.de
19-
Carlchristian Eckert - c.eckert ( at ) hzdr.de
15+
Author(s): Markus Steinberger - steinberger ( at ) icg.tugraz.at
16+
Bernhard Kainz - kainz ( at ) icg.tugraz.at
17+
Michael Kenzel - kenzel ( at ) icg.tugraz.at
18+
Rene Widera - r.widera ( at ) hzdr.de
19+
Axel Huebl - a.huebl ( at ) hzdr.de
20+
Carlchristian Eckert - c.eckert ( at ) hzdr.de
21+
Julian Lenz - j.lenz ( at ) hzdr.de
2022

21-
Permission is hereby granted, free of charge, to any person obtaining a copy
22-
of this software and associated documentation files (the "Software"), to deal
23-
in the Software without restriction, including without limitation the rights
24-
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
25-
copies of the Software, and to permit persons to whom the Software is
26-
furnished to do so, subject to the following conditions:
23+
Permission is hereby granted, free of charge, to any person obtaining a copy
24+
of this software and associated documentation files (the "Software"), to deal
25+
in the Software without restriction, including without limitation the rights
26+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
27+
copies of the Software, and to permit persons to whom the Software is
28+
furnished to do so, subject to the following conditions:
2729

28-
The above copyright notice and this permission notice shall be included in
29-
all copies or substantial portions of the Software.
30+
The above copyright notice and this permission notice shall be included in
31+
all copies or substantial portions of the Software.
3032

31-
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
32-
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
33-
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
34-
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
35-
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
36-
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
37-
THE SOFTWARE.
33+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
34+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
35+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
36+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
37+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
38+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
39+
THE SOFTWARE.
40+
*/

README.md

+21-23
@@ -5,39 +5,41 @@ mallocMC: *Memory Allocator for Many Core Architectures*
55

66
This project provides a framework for **fast memory managers** on **many core
77
accelerators**. It is based on [alpaka](https://github.com/alpaka-group/alpaka)
8-
to run on many different accelerators and implements the *ScatterAlloc* algorithm.
9-
8+
to run on many different accelerators and comes with multiple allocation
9+
algorithms out-of-the-box. Custom ones can be added easily due to the
10+
policy-based design.
1011

1112
Usage
1213
-------
1314

1415
Follow the step-by-step instructions in [Usage.md](Usage.md) to replace your
1516
`new`/`malloc` calls with a *blazingly fast* mallocMC heap! :rocket:
1617

17-
1818
Install
1919
-------
2020

2121
mallocMC is header-only, but requires a few other C++ libraries to be
2222
available. Our installation notes can be found in [INSTALL.md](INSTALL.md).
2323

24-
2524
Contributing
2625
------------
2726

28-
Rules for contributions are found in [CONTRIBUTING.md](CONTRIBUTING.md).
27+
Rules for contributions are found in [CONTRIBUTING.md](./CONTRIBUTING.md).
2928

30-
On the ScatterAlloc Algorithm
29+
On the Algorithms
3130
-----------------------------
3231

33-
This library implements the *ScatterAlloc* algorithm, originally
32+
This library was originally inspired by the *ScatterAlloc* algorithm,
3433
[forked](https://en.wikipedia.org/wiki/Fork_%28software_development%29)
3534
from the **ScatterAlloc** project, developed by the
3635
[Managed Volume Processing](http://www.icg.tugraz.at/project/mvp)
3736
group at [Institute for Computer Graphics and Vision](http://www.icg.tugraz.at),
38-
TU Graz (kudos!).
37+
TU Graz (kudos!). The currently shipped algorithms are using similar ideas but
38+
differ from the original one significantly.
39+
40+
From the original project page (which is no longer existent to the best of our
41+
knowledge):
3942

40-
From http://www.icg.tugraz.at/project/mvp/downloads :
4143
```quote
4244
ScatterAlloc is a dynamic memory allocator for the GPU. It is
4345
designed concerning the requirements of massively parallel
@@ -51,21 +53,18 @@ execution time is almost independent of the thread count.
5153
ScatterAlloc is open source and easy to use in your CUDA projects.
5254
```
5355

54-
Original Homepage: http://www.icg.tugraz.at/project/mvp
55-
56-
Our Homepage: https://www.hzdr.de/crp
57-
58-
59-
Branches
60-
--------
61-
62-
| *branch* | *state* | *description* |
63-
| ----------- | ------- | ----------------------- |
64-
| **master** | [![Build Status Master](https://travis-ci.org/alpaka-group/mallocMC.png?branch=master)](https://travis-ci.org/alpaka-group/mallocMC "master") | our latest stable release |
65-
| **dev** | [![Build Status Development](https://travis-ci.org/alpaka-group/mallocMC.png?branch=dev)](https://travis-ci.org/alpaka-group/mallocMC "dev") | our development branch - start and merge new branches here |
66-
| **tugraz** | n/a | *ScatterAlloc* "upstream" branch: not backwards compatible mirror for algorithmic changes |
56+
Our Homepage: <https://www.hzdr.de/crp>
6757

58+
Versions and Releases
59+
---------------------
6860

61+
Official releases can be found in the
62+
[Github releases](https://github.com/alpaka-group/mallocMC/releases).
63+
We try to stick to [semantic versioning](https://semver.org/) but we'll bump
64+
the major version number for major features.
65+
Development happens on the `dev` branch.
66+
Changes there have passed the CI and a code review but we make no guarantees
67+
about API or feature stability in this branch.
6968

7069
Literature
7170
----------
@@ -81,7 +80,6 @@ Just an incomplete link collection for now:
8180
- Junior Thesis [![DOI](https://zenodo.org/badge/doi/10.5281/zenodo.34461.svg)](http://dx.doi.org/10.5281/zenodo.34461) by
8281
Carlchristian Eckert (2014)
8382

84-
8583
License
8684
-------
8785

Usage.md

+18-8
@@ -13,21 +13,23 @@ There is one header file that will include *all* necessary files:
1313
Step 2a: choose policies
1414
-----------------------
1515

16-
Each instance of a policy based allocator is composed through 5 **policies**. Each policy is expressed as a **policy class**.
16+
Each instance of a policy based allocator is composed through 5 **policies**.
17+
Each policy is expressed as a **policy class**.
1718

1819
Currently, there are the following policy classes available:
1920

2021
|Policy | Policy Classes (implementations) | description |
2122
|------- |----------------------------------| ----------- |
22-
|**CreationPolicy** | Scatter`<conf1,conf2>` | A scattered allocation to tradeoff fragmentation for allocation time, as proposed in [ScatterAlloc](http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6339604). `conf1` configures the heap layout, `conf2` determines the hashing parameters|
23-
| | OldMalloc | device-side malloc/new and free/delete syscalls as implemented on NVidia CUDA graphics cards with compute capability sm_20 and higher |
24-
|**DistributionPolicy** | XMallocSIMD`<conf>` | SIMD optimization for warp-wide allocation on NVIDIA CUDA accelerators, as proposed by [XMalloc](http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=5577907). `conf` is used to determine the pagesize. If used in combination with *Scatter*, the pagesizes must match |
23+
|**CreationPolicy** | Scatter`<conf1,conf2>` | A scattered allocation to tradeoff fragmentation for allocation time, as proposed in [ScatterAlloc](http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6339604). `conf1` configures the heap layout, `conf2` determines the hashing parameters|
24+
| | FlatterScatter`<conf1,conf2>` | Another scattered allocation algorithm similar in spirit to `Scatter` but with a flatter hierarchy and stronger concurrency invariants. `conf1` and `conf2` act as before.
25+
| | OldMalloc | Device-side malloc/new and free/delete syscalls as implemented on the given device.
26+
|**DistributionPolicy** | XMallocSIMD`<conf>` | SIMD optimization for warp-wide allocation on NVIDIA CUDA accelerators, as proposed by [XMalloc](http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=5577907). `conf` is used to determine the pagesize. If used in combination with *Scatter*, the pagesizes must match |
2527
| | Noop | no workload distribution at all |
2628
|**OOMPolicy** | ReturnNull | pointers will be *nullptr*, if the request could not be fulfilled |
2729
| | ~~BadAllocException~~ | will throw a `std::bad_alloc` exception. The accelerator has to support exceptions |
28-
|**ReservePoolPolicy** | SimpleCudaMalloc | allocate a fixed heap with `CudaMalloc` |
30+
|**ReservePoolPolicy** | AlpakaBuf | Allocate a fixed-size buffer in an `alpaka`-provided container. |
2931
| | CudaSetLimits | call to `CudaSetLimits` to increase the available Heap (e.g. when using *OldMalloc*) |
30-
|**AlignmentPolicy** | Shrink`<conf>` | shrinks the pool so that the starting pointer is well aligned, applies padding to requested memory chunks. `conf` is used to determine the alignment|
32+
|**AlignmentPolicy** | Shrink`<conf>` | shrinks the pool so that the starting pointer is well aligned, applies padding to requested memory chunks. `conf` is used to determine the alignment|
3133
| | Noop | no alignment at all |
3234

3335
The user has to choose one of each policy that will form a useful allocator
@@ -51,6 +53,7 @@ struct ShrinkConfig : mallocMC::AlignmentPolicies::Shrink<>::Properties {
5153
5254
Step 2c: combine policies
5355
-------------------------
56+
5457
After configuring the chosen policies, they can be used as template
5558
parameters to create the desired allocator type:
5659
@@ -86,7 +89,6 @@ Notice, how the policy classes `Scatter` and `XMallocSIMD` are instantiated with
8689
template arguments to use the default configuration. `Shrink` however uses the
8790
configuration struct defined above.
8891

89-
9092
Step 3: instantiate allocator
9193
-----------------------------
9294

@@ -100,8 +102,14 @@ The allocator object offers the following methods
100102
101103
| Name | description |
102104
|---------------------- |-------------------------|
105+
| getAllocatorHandle() | Acquire a handle from the allocator that can be used in kernels to allocate memory on device.
103106
| getAvailableSlots(size_t) | Determines number of allocatable slots of a certain size. This only works, if the chosen CreationPolicy supports it (can be found through `mallocMC::Traits<ScatterAllocator>::providesAvailableSlots`) |
104107
108+
One should note that on a running system with multiple threads manipulating
109+
memory the information provided by `getAvailableSlots` is stale the moment it's
110+
acquired and so relying on this information to be accurate is not recommended.
111+
It is supposed to be used in initialisation/finalisation phases without dynamic
112+
memory allocations or in tests.
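
The staleness described above can be illustrated with a minimal host-side sketch. It does not use mallocMC: `ToySlotPool`, `availableSlots` and `tryAcquire` are hypothetical names, and the pool models only a counter of free slots.

```c++
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical pool of equally sized slots; only the free-slot counter is modelled.
class ToySlotPool
{
    std::atomic<std::size_t> free_;

public:
    explicit ToySlotPool(std::size_t slots) : free_(slots)
    {
    }

    // Returns a snapshot of the counter. Another thread may change the
    // counter right after the load, so the value can be outdated on return.
    std::size_t availableSlots() const
    {
        return free_.load();
    }

    // Atomically claims one slot. Returns false if no slot is left.
    bool tryAcquire()
    {
        std::size_t current = free_.load();
        while(current > 0)
        {
            if(free_.compare_exchange_weak(current, current - 1))
                return true;
            // On failure `current` holds the latest value; the loop retries.
        }
        return false;
    }
};
```

A thread that reads `availableSlots() == 1` and then assumes its next request will succeed can be wrong if another thread claims the slot in between. The robust pattern is to attempt the operation and handle failure, which in mallocMC terms would mean checking the returned pointer (for example against `nullptr` with the `ReturnNull` OOMPolicy) rather than querying the slot count first.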
105113
106114
Step 4: use dynamic memory allocation in a kernel
107115
-------------------------------------------------
@@ -114,9 +122,11 @@ The handle offers the following methods:
114122
|---------------------- |-------------------------|
115123
| malloc(size_t) | Allocates memory on the accelerator |
116124
| free(size_t) | Frees memory on the accelerator |
117-
| getAvailableSlots() | Determines number of allocatable slots of a certain size. This only works, if the chosen CreationPolicy supports it (can be found through `mallocMC::Traits<ScatterAllocator>::providesAvailableSlots`) |
125+
| getAvailableSlots() | Determines number of allocatable slots of a certain size. This only works, if the chosen CreationPolicy supports it (can be found through `mallocMC::Traits<ScatterAllocator>::providesAvailableSlots`).|
118126
127+
The comments on `getAvailableSlots` from above hold all the same.
119128
A simplistic example would look like this:
129+
120130
```c++
121131
#include <mallocMC/mallocMC.hpp>
122132
