Implemented linear synaptic bit-slicing layer #631
base: master
Conversation
… as to why it should be floats, and some code formatting to match codebase style
@HeatPhoenix very nice contribution, many thanks.
return forward_output  # type: ignore

@classmethod
def from_digital(
It would be better to add some arguments in this from_digital call to set the slices etc. To use convert_to_digital one has to use a custom mapping anyway. So the best would be to add a convert_to_digital_sliced function that converts all Linear layers to AnalogLinearSliced, and which also accepts and passes the slicing parameters during construction.
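The conversion helper suggested above might look roughly like the schematic sketch below. All class and function names here (Container, Linear, AnalogLinearSliced, convert_to_analog_sliced) are illustrative stand-ins for the real torch/aihwkit types, not the actual API; the point is only the recursive replace-and-pass-parameters pattern:

```python
class Linear:
    """Stand-in for torch.nn.Linear."""
    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features

class AnalogLinearSliced:
    """Stand-in for the proposed sliced analog layer."""
    def __init__(self, in_features, out_features, number_slices, significance_factors):
        self.in_features = in_features
        self.out_features = out_features
        self.number_slices = number_slices
        self.significance_factors = significance_factors

class Container:
    """Stand-in for a torch Module holding named children."""
    def __init__(self, **children):
        self.children = children

def convert_to_analog_sliced(module, number_slices, significance_factors):
    """Recursively replace every Linear child with AnalogLinearSliced,
    passing the slicing parameters during construction."""
    for name, child in list(module.children.items()):
        if isinstance(child, Linear):
            module.children[name] = AnalogLinearSliced(
                child.in_features, child.out_features,
                number_slices, significance_factors,
            )
        elif isinstance(child, Container):
            convert_to_analog_sliced(child, number_slices, significance_factors)
    return module

net = Container(fc1=Linear(4, 8), inner=Container(fc2=Linear(8, 2)))
net = convert_to_analog_sliced(net, number_slices=4, significance_factors=[8, 4, 2, 1])
```

In a real implementation the loop would walk `named_children()` of a torch Module and call the sliced layer's `from_digital` instead of the constructor.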
convert_to_analog_sliced, you mean?
Right
from aihwkit.nn import AnalogLinear

class AnalogLinearBitSlicing(AnalogLayerBase, Linear):
Actually it would be better to just write a new module which inherits from AnalogContainerBase, or simply from Module, instead of AnalogLayerBase. This is because AnalogLinear layers used as children of an AnalogLayerBase will cause issues. See for instance the RNN module, which internally also uses AnalogLinear layers, see here and here.
It might also not be necessary to inherit from Linear at all, as this will cause additional issues with the Linear weight etc.
Many thanks @HeatPhoenix for this great contribution. I think it needs some adaptation, see my comments above. It would also be great if you could add some unit tests and an example.
@charles-mackin can you please review this PR and check how it aligns with your implementation of bit-slicing?
FWIW, I still intend to follow up on the comments by @maljoras, just haven't had time to as of yet.
It seems weight bit slicing can be implemented (within a unit cell) by simply writing a new g_converter class as seen here:
The CustomPairConductanceConverter class already almost does this - just in a non-quantized (interpolated for continuous weights) way. The f_lst parameter can be used to set the significance to equivalent 1s, 2^N, or arbitrary values. I would suggest creating a new class and modifying the convert_to_conductance method to quantize the weights as opposed to interpolating (as is done currently) and a new convert_back_to_weights method to accomplish this. It seems a much simpler way to implement weight bit slicing. It should also be compatible with linear layers, convs, etc. It should also be compatible with activation slicing (e.g. bit-wise and split-PWM modes). For example, you should be able to simply provide parameters below for +/- weight programming for a 4-device unit cell with 2^N weighting.
Use the modified convert_to_conductance method to convert continuous weight distributions to quantized ones using a scheme similar to the one above. I think you could even substitute non g_min, g_max values if you want to enable less trivial programming strategies. For instance, programming at the absolute g_max of a device may have some undesirable characteristics. You could easily substitute 1 = 0.9 * g_max, which may program better at a slightly reduced weight range. Same goes for g_min.
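A minimal numpy sketch of the quantization idea described above: signed weights are mapped onto a unit cell of paired devices with 2^N significances (the f_lst), quantized greedily MSB-first rather than interpolated. The function names, the g_max value, and the rounding scheme are illustrative assumptions, not the actual CustomPairConductanceConverter implementation:

```python
import numpy as np

def convert_to_conductances(weights, f_lst, g_max=25.0):
    """Quantize weights in [-1, 1] onto len(f_lst) signed device pairs.
    Each device is programmed to either g_max or g_min (assumed 0 here)."""
    total = sum(f_lst)
    # integer quantization level in [-total, total]
    levels = np.rint(np.clip(weights, -1.0, 1.0) * total)
    sign, mag = np.sign(levels), np.abs(levels)
    g_plus, g_minus = [], []
    for f in sorted(f_lst, reverse=True):  # assign most-significant slice first
        bit = (mag >= f).astype(float)
        mag = mag - bit * f                # greedy residual decomposition
        g = bit * g_max
        g_plus.append(np.where(sign > 0, g, 0.0))
        g_minus.append(np.where(sign < 0, g, 0.0))
    return g_plus, g_minus

def convert_back_to_weights(g_plus, g_minus, f_lst, g_max=25.0):
    """Significance-weighted differential readout back to weights."""
    total = sum(f_lst)
    return sum(f * (gp - gm) / g_max
               for f, gp, gm in zip(sorted(f_lst, reverse=True), g_plus, g_minus)) / total
```

With f_lst = [4, 2, 1] the round-trip error is bounded by half a quantization step, 0.5 / 7 of the weight range.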
I'm not really deep enough into it anymore to understand: does this imply that what I've implemented is superfluous and should basically be re-implemented entirely differently?
@HeatPhoenix Yes, I believe it would be more appropriate (and likely simpler) to implement the synaptic bit slicing via a new ConductanceConverter class. It would be great if @maljoras could provide his thoughts as well here.
I see, in that case it might be better to close this pull request and maybe open a new issue with instructions outlining this. I probably won't re-implement this, though.
Hi @charlesmackin,
In this case, should I move forward with the improvements as outlined by @maljoras-sony previously, as an alternative to @charlesmackin's implementation?
@HeatPhoenix Yes, you should move forward with @maljoras-sony's suggestions in this case.
@HeatPhoenix did you have the chance to take a look at @maljoras-sony's suggestions, as @charlesmackin advised? Let us know if you need any further help to finish this PR. Thanks!
Still interested in finishing up, just really strapped for time, generally. Will try to get it done near the start of the new year.
Thanks @HeatPhoenix, please let us know when you have any new updates!
Hello @HeatPhoenix! Do you have any updates on this? Can we help you out in some manner? For us this is a good addition to the kit's capabilities and we want to add it, thanks for your work!
I'll try to finish the things @maljoras commented on by next week; it would also be good for me to know what kind of tests/unit tests are expected of me.
Hello @HeatPhoenix, in terms of the examples, try to add one that showcases the use of that linear_sliced layer on a simple neural network; you can find inspiration by going through the various examples in the examples folder at the root of the project. For the tests, try to add one or two that make sense for the creation of this new AnalogLinearBitSlicing class and the different methods it has. Thanks for this great work!
Related issues
Pull request opened to contribute to Issue #287.
Description
I felt that bit-slicing may give a more realistic approximation of how current crossbar arrays are implemented in hardware (to represent 8 or 16 bits, multiple devices are often used), and found issue #287 with @maljoras's instructions on how to implement this sufficient to attempt contributing a solution. The main difference from that implementation is that the number of slices can be set arbitrarily by the user.
This pull request mainly includes linear_sliced.py, which contains the implementation of the new module. The implementation is fairly naive, and is likely lacking some details that may stand out to someone more experienced with the codebase. I have tested its functionality by comparing its performance to a regular AnalogLinear using the simple_layer.py example. I would highly appreciate feedback on the implementation.
Details
linear_sliced.py implements AnalogLinearBitSlicingLayer. The constructor adds the following parameters to AnalogLinear: number_slices, which sets the number of slices the weights should be divided over; evenly_sliced, whether the slices should have equal value (factor of 1) when True, or should be generated as MSB to LSB (factor of 2^x) when False; and significance_factors, which allows arbitrary setting of the factors and must have a length equal to number_slices.
The to_digital (recombine all weights/factors together and assign) and to_analog (take digital weights and divide/split over the number of slices) methods are also implemented, but since Linear would map to AnalogLinear I'm not sure how necessary or useful this is. Specifically for to_analog, I did not see an elegant way to let the user set how many slices they'd like to generate, or other parameters.
As stated previously, feedback is highly appreciated; I'd like this contribution to be both useful and properly implemented.
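The parameters above can be sketched in a few lines of numpy. This is an illustrative reconstruction of the slicing scheme the description implies, not the PR's actual code: the factor generation follows the evenly_sliced rule, and the only property demonstrated is that to_digital's factor-weighted sum recombines to_analog's slices exactly; the real layer may distribute weights across slices differently.

```python
import numpy as np

def make_factors(number_slices, evenly_sliced):
    """evenly_sliced=True -> equal significance (all 1s);
    False -> MSB-to-LSB powers of two, e.g. [4, 2, 1] for 3 slices."""
    if evenly_sliced:
        return [1] * number_slices
    return [2 ** x for x in reversed(range(number_slices))]

def to_analog(weights, significance_factors):
    """Split a digital weight matrix so that the factor-weighted sum of
    the slices reproduces the original weights."""
    total = sum(significance_factors)
    return [weights / total for _ in significance_factors]

def to_digital(slices, significance_factors):
    """Recombine all slices: sum weighted by each slice's significance."""
    return sum(f * s for f, s in zip(significance_factors, slices))

w = np.array([[0.25, -0.5], [1.0, 0.0]])
factors = make_factors(3, evenly_sliced=False)  # [4, 2, 1]
assert np.allclose(to_digital(to_analog(w, factors), factors), w)
```

A user-supplied significance_factors list simply replaces the output of make_factors, provided its length equals number_slices.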