
Implemented linear synaptic bit-slicing layer #631

Open
wants to merge 11 commits into base: master
Conversation

HeatPhoenix
Contributor

Related issues

Pull request opened to contribute to Issue #287.

Description

I felt that bit-slicing may give a more realistic approximation of how current crossbar arrays are implemented in hardware (to represent 8 or 16 bits, multiple devices are often used), and issue #287, with @maljoras's instructions on how to implement this, gave me enough to try contributing a solution. The main difference from that outline is that the number of slices can be set arbitrarily by the user.

This pull request mainly adds linear_sliced.py, which contains the implementation of the new module. The implementation is fairly naive and likely lacking some details that would stand out to someone more experienced with the codebase. I have tested its functionality by comparing its performance to a regular AnalogLinear using the simple_layer.py example.

I would highly appreciate feedback on the implementation.

Details

linear_sliced.py implements AnalogLinearBitSlicingLayer. The constructor adds the following parameters to AnalogLinear:
- number_slices: the number of slices the weights should be divided over.
- evenly_sliced: when True, all slices have equal significance (factor of 1); when False, slices are generated as MSB to LSB (factors of 2^x).
- significance_factors: allows setting the factors arbitrarily; it must have a length equal to number_slices.

The to_digital method (recombine all weights/factors and assign them) and to_analog (take digital weights and divide/split them over the number of slices) are also implemented, but since Linear would map to AnalogLinear, I'm not sure how necessary or useful this is.
Specifically for to_analog, I did not see an elegant way to let the user set how many slices they'd like to generate, or other parameters. As stated previously, feedback is highly appreciated; I'd like this contribution to be both useful and properly implemented.
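To make the slicing/recombination behaviour concrete, here is a minimal standalone sketch of the idea. The helper names are hypothetical and it uses plain NumPy instead of analog tiles; the real layer stores each slice on its own analog tile, and may distribute the weights differently, but recombination works the same way.

```python
import numpy as np

def significance_factors(number_slices, evenly_sliced):
    """Per-slice factors: all 1s when evenly sliced, else 2^x from LSB to MSB."""
    if evenly_sliced:
        return [1.0] * number_slices
    return [float(2 ** x) for x in range(number_slices)]

def slice_weights(weights, factors):
    """One simple split such that sum_k factors[k] * slice_k == weights."""
    total = sum(factors)
    return [weights / total for _ in factors]

def recombine(slices, factors):
    """What to_digital would do: weighted sum of the slices."""
    return sum(f * s for f, s in zip(factors, slices))
```

Under either factor scheme, recombine(slice_weights(W, f), f) reproduces W exactly.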

@maljoras
Collaborator

maljoras commented Apr 2, 2024

@HeatPhoenix very nice contribution, many thanks.

@maljoras maljoras self-requested a review April 2, 2024 07:22
@maljoras maljoras added the enhancement New feature or request label Apr 2, 2024
return forward_output # type: ignore

@classmethod
def from_digital(
Collaborator

It would be better to add some arguments in this from_digital call to set the slices etc. To use convert_to_digital one has to use a custom mapping anyway. So the best would be to add a convert_to_digital_sliced function that converts all Linear to `AnalogLinearSliced`, which also accepts and passes the slicing parameters during construction.

Contributor Author

@HeatPhoenix, Jun 12, 2024

convert_to_analog_sliced, you mean?

Contributor

Right
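For illustration, a rough sketch of what such a conversion helper could look like. All names here are hypothetical, not aihwkit API, and AnalogLinearBitSlicing is a plain stand-in for the PR's class; the sketch only shows the recursive replacement of Linear layers with the slicing parameters passed through.

```python
from torch import nn

class AnalogLinearBitSlicing(nn.Linear):
    """Stand-in for the PR's sliced layer (the real one lives in linear_sliced.py)."""
    def __init__(self, in_features, out_features, bias=True,
                 number_slices=2, evenly_sliced=True):
        super().__init__(in_features, out_features, bias)
        self.number_slices = number_slices
        self.evenly_sliced = evenly_sliced

def convert_to_analog_sliced(module, number_slices, evenly_sliced=True):
    """Recursively replace every nn.Linear with the sliced analog variant."""
    for name, child in module.named_children():
        if isinstance(child, nn.Linear):
            sliced = AnalogLinearBitSlicing(
                child.in_features, child.out_features, child.bias is not None,
                number_slices=number_slices, evenly_sliced=evenly_sliced)
            setattr(module, name, sliced)
        else:
            convert_to_analog_sliced(child, number_slices, evenly_sliced)
    return module
```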

from aihwkit.nn import AnalogLinear


class AnalogLinearBitSlicing(AnalogLayerBase, Linear):
Collaborator

Actually, it would be better to just write a new module which inherits from AnalogContainerBase, or simply from Module, instead of AnalogLayerBase. This is because using AnalogLinear layers as children of an AnalogLayer will cause issues. See for instance the RNN module, which internally also uses AnalogLinear layers, see here and here.

Collaborator

It might also not be necessary to inherit from Linear at all, as this will cause additional issues with the Linear weight etc.

@maljoras
Collaborator

Many thanks @HeatPhoenix for this great contribution. I think it needs some adaptation, see my comments above. It would also be great if you could add some unit tests and an example.

@kaoutar55 kaoutar55 requested a review from charlesmackin July 22, 2024 15:48
@kaoutar55
Collaborator

@charles-mackin can you please review this PR and check how it is aligned with your implementation of bit-slicing?

@HeatPhoenix
Contributor Author

FWIW, I still intend to follow up on the comments by @maljoras; I just haven't had time yet.

@charlesmackin
Collaborator

charlesmackin commented Sep 18, 2024

It seems weight bit slicing can be implemented (within a unit cell) by simply writing a new g_converter class as seen here:

class CustomPairConductanceConverter(BaseConductanceConverter):

The CustomPairConductanceConverter class already almost does this, just in a non-quantized way (interpolated for continuous weights). The f_lst parameter can be used to set the significances to equivalent 1s, 2^N, or arbitrary values. I would suggest creating a new class, modifying the convert_to_conductance method to quantize the weights as opposed to interpolating (as is done currently), and adding a new convert_back_to_weights method to accomplish this. It seems a much simpler way to implement weight bit slicing. It should also be compatible with linear layers, convs, etc., and with activation slicing (e.g. bit-wise and split-PWM modes).

For example, you should be able to simply provide parameters below for +/- weight programming for a 4-device unit cell with 2^N weighting.

f_lst = [1, 2]
g_lst = [[0, 0, 0, 0, 1, 0, 1],  # gp0
         [1, 0, 1, 0, 0, 0, 0],  # gm0
         [0, 0, 0, 0, 0, 1, 1],  # gp1
         [1, 1, 0, 0, 0, 0, 0]]  # gm1
#         -1 -0.67 -0.33 0 0.33 0.67 1  in unitless weights
# replace 0, 1 with g_min, g_max to specify how to program device

Use the modified convert_to_conductance method to convert continuous weight distributions to a quantized one using a scheme similar to the one above.

I think you could even substitute with non g_min, g_max values if you want to enable less trivial programming strategies. For instance, programming at the absolute g_max of a device may have some undesirable characteristics. You could easily substitute 1 = 0.9 * g_max which may program better at a slightly reduced weight range. Same goes with g_min.
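As a rough illustration of the suggested quantization (a standalone sketch, not the actual CustomPairConductanceConverter API), assuming binary devices (g_min = 0, g_max = 1) and per-pair significances f_lst, the achievable unitless weight levels and a nearest-level quantizer could look like:

```python
import numpy as np

def achievable_levels(f_lst):
    """All unitless weight levels reachable when each significance-weighted
    device pair contributes -1, 0, or +1, normalized by sum(f_lst)."""
    total = sum(f_lst)
    levels = {0.0}
    for f in f_lst:
        levels = {lvl + s * f for lvl in levels for s in (-1, 0, 1)}
    return sorted(lvl / total for lvl in levels)

def quantize(weights, f_lst):
    """Snap each continuous weight to the nearest achievable level."""
    levels = np.array(achievable_levels(f_lst))
    idx = np.abs(weights[..., None] - levels).argmin(axis=-1)
    return levels[idx]
```

For f_lst = [1, 2] this yields exactly the seven levels in the scheme above (-1, -2/3, -1/3, 0, 1/3, 2/3, 1).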

@HeatPhoenix
Contributor Author

> It seems weight bit slicing can be implemented (within a unit cell) by simply writing a new g_converter class as seen here: CustomPairConductanceConverter [...]

I'm not deep enough into this anymore to tell: does this imply that what I've implemented is superfluous and should basically be re-implemented in an entirely different way?

@charlesmackin
Collaborator

@HeatPhoenix Yes, I believe it would be more appropriate (and likely simpler) to implement the synaptic bit slicing via a new ConductanceConverter class. It would be great if @maljoras could provide his thoughts as well here.

@HeatPhoenix
Contributor Author

> @HeatPhoenix Yes, I believe it would be more appropriate (and likely simpler) to implement the synaptic bit slicing via a new ConductanceConverter class. It would be great if @maljoras could provide his thoughts as well here.

I see, in that case it might be better to close this pull request and maybe open a new issue with instructions outlining this. I probably won't re-implement this, though.

@maljoras-sony
Contributor

maljoras-sony commented Oct 8, 2024

Hi @charlesmackin, g_converter indeed could handle one form of bit-slicing, and it makes sense in some use cases. However, it would only slice the weights after convergence of the DNN, so there is no gradient pass through the bit-slices of the weights. Also, the weight slices sum currents in analog, and only then is the result subject to digital conversion. The approach by @HeatPhoenix, on the other hand, I think was to use AnalogLinear as the base of the weight slices, which would mean that the gradient can pass through the individual slices during training, and also that the summing of the outputs is done in digital instead. So there are slight differences between the approaches.
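The digital-summation variant can be sketched as follows: each slice is its own trainable linear layer (a plain nn.Linear standing in for AnalogLinear), and the per-slice outputs are scaled by their significance and summed in the digital domain, so autograd reaches every slice during training. Names here are illustrative, not aihwkit API.

```python
import torch
from torch import nn

class BitSlicedLinear(nn.Module):
    """Digital summation over weight slices: y = sum_k f_k * (x @ W_k^T)."""
    def __init__(self, in_features, out_features, factors):
        super().__init__()
        self.factors = factors
        # one sub-layer per slice; gradients flow into each slice's weights
        self.slices = nn.ModuleList(
            nn.Linear(in_features, out_features, bias=False) for _ in factors
        )

    def forward(self, x):
        # per-slice outputs are scaled and summed digitally, after each
        # slice's (analog, in the real layer) matrix-vector product
        return sum(f * layer(x) for f, layer in zip(self.factors, self.slices))
```

In the g_converter approach, by contrast, the slicing happens only at weight-programming time, after training.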

@HeatPhoenix
Contributor Author

> Hi @charlesmackin, g_converter indeed could handle one form of bit-slicing and it makes sense in some use cases. [...]

In that case, should I move forward with the improvements outlined by @maljoras-sony previously, as an alternative to @charlesmackin's implementation?

@charlesmackin
Collaborator

@HeatPhoenix Yes, you should move forward with @maljoras-sony's suggestions in this case.

@PabloCarmona
Collaborator

@HeatPhoenix did you have the chance to take a look at @maljoras-sony's suggestions, as @charlesmackin advised? Let us know if you need any further help to finish this PR. Thanks!

@HeatPhoenix
Contributor Author

> @HeatPhoenix did you have the chance to take a look at @maljoras-sony's suggestions, as @charlesmackin advised? [...]

Still interested in finishing up, just really strapped for time, generally. Will try to get it done near the start of the new year.

@kaoutar55
Collaborator

Thanks @HeatPhoenix please let us know when you have any new updates!

@PabloCarmona
Collaborator

Hello @HeatPhoenix! Do you have any updates on this? Can we help you out in some way? For us this is a good addition to the kit's capabilities and we want to add it. Thanks for your work!

@HeatPhoenix
Contributor Author

> Hello @HeatPhoenix! Did you have any updates on this? [...]

I'll try to finish the things @maljoras commented on by next week; it would also be good for me to know what kind of tests/unit tests are expected of me.

@PabloCarmona
Collaborator

> I'll try to finish the things @maljoras commented on by next week; it would also be good for me to know what kind of tests/unit tests are expected of me.

Hello @HeatPhoenix! In terms of the examples, try to add one that showcases the use of the linear_sliced layer in a simple neural network; you can find inspiration in the various examples in the examples folder at the root of the project.

And for the tests, try to add one or two that make sense for this new AnalogLinearBitSlicing class you created and its different methods. Thanks for this great work!

Labels
enhancement New feature or request

6 participants