Units, 1e9 vs 1G vs 1000000000 #56

Closed
AdriaanRol opened this issue Mar 8, 2016 · 14 comments

Comments

@AdriaanRol
Contributor

@alexcjohnson
Yesterday we had a quite intense discussion in the lab on units.
I am personally a big fan of using SI units for everything and then adding e9 (or e-9 or e6 etc) to it to convert it to whatever quantity we need.
There are several pros and cons for this way of doing it
Pros:

  • The same everywhere
  • Intuitive to remember/convert

Cons:

  • type eX every time you set a value/range/linspace, everything!
  • hard to read when printed or output to terminal unless explicit formatter is specified

displaying problem

I looked a bit in the print problem but it looks like there is no easy solution to this. I found the following things which are related but don't cover it completely: numpy set printoptions, converting to a custom float class.
I guess other alternatives (which I haven't found) would be to add a very short conversion function to a "pretty float" for easy printing, i.e. print(pretty(x)), or to modify the repr of the base Python float class (not possible, as it is immutable and would require recompiling Python).
Note that if we go down the pretty-float road we can also incorporate the G, M, k, (none), m, u, n prefixes.
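
A minimal sketch of what such a `pretty` helper could look like, assuming only the prefixes listed above are supported. The name `pretty`, the `unit` and `digits` arguments, and the clamping behaviour are illustrative assumptions, not anything that exists in qcodes.

```python
import math

# SI prefixes keyed by power-of-ten exponent (only the ones named above).
SI_PREFIXES = {9: "G", 6: "M", 3: "k", 0: "", -3: "m", -6: "u", -9: "n"}


def pretty(x, unit="", digits=3):
    """Format x with a power-of-1000 SI prefix, e.g. 1.24e9 -> '1.24 GHz'."""
    if x == 0 or not math.isfinite(x):
        return f"{x:g} {unit}".strip()
    # Round to the requested significant digits first, so that a value such
    # as 999.9 is treated as 1000 when the prefix is chosen.
    x = float(f"{x:.{digits - 1}e}")
    exponent = 3 * math.floor(math.log10(abs(x)) / 3)
    # Clamp to the range of prefixes in the table.
    exponent = max(-9, min(9, exponent))
    scaled = x / 10.0 ** exponent
    return f"{scaled:.{digits}g} {SI_PREFIXES[exponent]}{unit}".strip()


print(pretty(1.24e9, "Hz"))  # 1.24 GHz
print(pretty(2.5e-7, "V"))   # 250 nV
```

The value itself stays a plain float; only the printed string changes, so nothing downstream has to know about prefixes.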

Specifying input problem

I do not know the best way to address this one; from what I understood, typing e9 every time is considered a pain. A valid alternative would be to put an SI prefix after the number, i.e. x=1.24G would set x to 1.24e9.
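
Python itself will not accept `1.24G` as a literal, but the same idea can be applied wherever a number arrives as a string. The sketch below is one hedged way to do that; `parse_number` and the prefix table are hypothetical names chosen for illustration.

```python
import re

# Multipliers for the prefixes mentioned in this thread.
SI_FACTORS = {"G": 1e9, "M": 1e6, "k": 1e3, "": 1.0,
              "m": 1e-3, "u": 1e-6, "n": 1e-9}

# A decimal number (optionally with an e-exponent) followed by at most one prefix.
_NUMBER = re.compile(
    r"^\s*([+-]?\d+(?:\.\d*)?(?:[eE][+-]?\d+)?)\s*([GMkmun]?)\s*$"
)


def parse_number(text):
    """Convert a string such as '1.24G' or '5m' to a float in unprefixed units."""
    match = _NUMBER.match(text)
    if match is None:
        raise ValueError(f"cannot parse {text!r} as a prefixed number")
    mantissa, prefix = match.groups()
    return float(mantissa) * SI_FACTORS[prefix]


x = parse_number("1.24G")  # approximately 1.24e9
```

Plain inputs like `"3e-3"` still work, because the prefix group is optional.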

@alexcjohnson
Contributor

Intense indeed, I'm sure everyone is going to have different ideas about this. Here are mine, but I'd love to hear from more people: @damazter @guenp @MerlinSmiles @akhmerov @spauka @MarkusJacobsen and anyone else who has an opinion!

  • You're right that if we stick to SI then everyone knows immediately how to enter any number. But very often users don't think in SI, they think in GHz or mV or nA, so allowing them to use those units as the way numbers are entered, stored, and displayed will not only save keystrokes but will save the mental conversion they would otherwise have to do at every step.
  • Re: entering things like x=1.24G - I don't know of a way to change how tokens like this are interpreted in code... we could do it in places we're interpreting a string as a number (like when we get to GUI features) but offhand that sounds like it would engender more confusion than clarity/efficiency.
  • For widest applicability the exponent should be separate from the unit itself, so we don't need ugly logic to handle things like miles (mi): is that a milli-i? I'd vote that you have to specify an exponent as a number, so if you want GHz you'd specify unit='Hz', exponent=9 but we could support unit='Hz', exponent='G' as well if people would prefer. Alternatively, we could do it with a separator like G*Hz or 1e8*Hz? If we do that it has to be a character you're not allowed to use in the unit name (hence the *, which is nice as this is multiplication, is that an OK restriction that you can't use * in units name?).
  • We should, whenever possible, display and save numbers in a unit-aware way.
    • On a plot, if there is either no exponent or the data gets too far from the given exponent, we should pick the closest prefix. So if the range is 0-0.1 GHz we should stick with GHz rather than convert to MHz, but if it's 0 - 0.00001 GHz or 0-1e4 Hz we should display it as 0 - 10 kHz. This may take some tweaking and should anyway be easily overridden.
    • When saving data (either datasets or log files) we should not alter the exponent (makes reloading and analysis more difficult) but we should include exponent and units. It would seem more readable if we could make this a single field, as in the separator version above.
    • So there are going to be several versions of pretty-printing, which means it's probably going to have to be a separate function rather than trying to overload what's already there.
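
The plot rule in the list above (keep the stored exponent while the data stays close to it, otherwise pick the closest prefix) could be sketched roughly as follows. The function name `display_exponent` and the `tolerance` knob are assumptions made for this example; the right threshold would need the tweaking mentioned above.

```python
import math


def display_exponent(max_abs_value, exponent=None, tolerance=2):
    """Pick the power-of-ten exponent to show on a plot axis.

    max_abs_value: largest magnitude on the axis, in the stored units
        (so 0.1 for a 0-0.1 GHz sweep stored with exponent=9).
    exponent: exponent the data was stored with, or None if unspecified.
    tolerance: number of decades the data may stray from the stored
        exponent before switching (hypothetical default).
    """
    if max_abs_value == 0:
        return exponent or 0
    # Keep the user's exponent while the data stays within the tolerance.
    if exponent is not None and abs(math.log10(max_abs_value)) <= tolerance:
        return exponent
    # Otherwise fall back to the closest multiple-of-3 exponent.
    si_value = max_abs_value * 10.0 ** (exponent or 0)
    return 3 * math.floor(math.log10(si_value) / 3)


print(display_exponent(0.1, exponent=9))   # 9: stay in GHz
print(display_exponent(1e-5, exponent=9))  # 3: show as kHz
print(display_exponent(1e4))               # 3: show as kHz
```

This only chooses what to display; the saved data would keep its original exponent.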

@AdriaanRol
Contributor Author

  • But very often users don't think in SI, they think in GHz or mV or nA, so allowing them to use those units as the way numbers are entered, stored, and displayed will not only save keystrokes but will save the mental conversion they would otherwise have to do at every step.

This was the exact argument in our discussion; the thing I am worried about is mistakes caused by inconsistency in how values are entered.

  • Re: entering things like x=1.24G - I don't know of a way to change how tokens like this are interpreted in code... we could do it in places we're interpreting a string as a number (like when we get to GUI features) but offhand that sounds like it would engender more confusion than clarity/efficiency.

I think we want to support this for sure in GUI-like things, but unless we find a proper way of doing it, it will indeed cause more confusion than anything else.

  • For widest applicability the exponent should be separate from the unit itself.

👍

  • We should, whenever possible, display and save numbers in a unit-aware way.

I am worried that using custom datatypes (e.g. Giga floats?) will break compatibility with things like numpy arrays and other built in python functionality. I would say always use SI in the code but allow the set, get, and display functionalities to return it in G, M, m, u etc. I think it would be good to also be able to get and set the units (only possible to change the prefix) of every parameter. Such a unit conversion to SI would then work in a similar way as the get and set-parser.
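
As a rough illustration of "SI in the code, prefix only at get/set", here is a sketch with a hypothetical `ScaledParameter` class. It is not the qcodes parameter API; the class, its method names, and the prefix table are assumptions. The stored value is an ordinary float, so no custom number type is involved.

```python
SI_EXPONENTS = {"G": 9, "M": 6, "k": 3, "": 0, "m": -3, "u": -6, "n": -9}


class ScaledParameter:
    """Hypothetical wrapper: the value is always held in SI units and the
    prefix only affects what set() accepts and what get() returns."""

    def __init__(self, unit, prefix=""):
        self.unit = unit
        self.prefix = prefix  # may be changed at any time
        self._si_value = 0.0

    @property
    def scale(self):
        return 10.0 ** SI_EXPONENTS[self.prefix]

    def set(self, value):
        # Convert from the prefixed unit to SI, like a set-parser would.
        self._si_value = value * self.scale

    def get(self):
        # Convert from SI back to the prefixed unit, like a get-parser would.
        return self._si_value / self.scale

    def get_si(self):
        return self._si_value


freq = ScaledParameter("Hz", prefix="G")
freq.set(1.3)          # entered in GHz
print(freq.get_si())   # stored in Hz
freq.prefix = "M"      # only the prefix changes, not the stored value
print(freq.get())      # now reported in MHz
```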

  • On a plot, if there is either no exponent or the data gets too far from the given exponent, we should pick the closest prefix. ...
  • When saving data (either datasets or log files) we should not alter the exponent (makes reloading and analysis more difficult) but we should include exponent and units. ....

Agree, this also relates to another point which I think is starting to get important, standard analysis classes (find-data, provide basic plot and do basic fitting routine). I think all should be using the same way to do the unit conversions.

  • So there are going to be several versions of pretty-printing, which means it's probably going to have to be a separate function rather than trying to overload what's already there.

Why would there be separate functions? Wouldn't a single one that always rounds to the nearest power of 3 and gives G, M, etc. as a suffix be the way to go? (Maybe "pretty" is not the right name, though.)

@alexcjohnson , I asked @cdickel to send you an email. He previously had an invite to QCodes but it seems to have gotten lost.

@ghost

ghost commented Mar 8, 2016

I like SI units with exponents too.

Re: pretty printing: In Jupyter, the %precision magic command can be used to set the default output format. For example,

%precision %.5e
5.5**10

outputs 2.53295e+07. It doesn't work with print() though, so I'm not sure how useful it is.

@AdriaanRol
Contributor Author

I like the precision magic, however I would prefer if it did rounding in powers of 3 (e.g. 2.53e7 -> 25.3e6 or 25.3 M) I guess we can do that with a default formatter.
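
For the powers-of-3 rounding, the standard library's `decimal.Decimal.to_eng_string()` already produces engineering notation. A small sketch, where the helper name `eng` and the `sig_digits` argument are made up for this example:

```python
from decimal import Decimal


def eng(x, sig_digits=3):
    """Format x with an exponent that is a multiple of 3."""
    # Round to the requested significant digits in scientific notation first,
    # then let Decimal shift the exponent to a multiple of 3.
    rounded = f"{x:.{sig_digits - 1}e}"
    return Decimal(rounded).to_eng_string()


print(eng(5.5**10))  # 25.3E+6
```

Unlike `%precision`, this is an ordinary function, so it also works inside `print()`.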

@guenp
Contributor

guenp commented Mar 8, 2016

  • For widest applicability the exponent should be separate from the unit itself, so we don't need ugly logic to handle things like miles (mi): is that a milli-i? I'd vote that you have to specify an exponent as a number, so if you want GHz you'd specify unit='Hz', exponent=9 but we could support unit='Hz', exponent='G' as well if people would prefer. Alternatively, we could do it with a separator like G*Hz or 1e8*Hz? If we do that it has to be a character you're not allowed to use in the unit name (hence the *, which is nice as this is multiplication, is that an OK restriction that you can't use * in units name?).

Why not enable usage of e.g. unit = 'GHz' and parse the string for exponents?

  • On a plot, if there is either no exponent or the data gets too far from the given exponent, we should pick the closest prefix. So if the range is 0-0.1 GHz we should stick with GHz rather than convert to MHz, but if it's 0 - 0.00001 GHz or 0-1e4 Hz we should display it as 0 - 10 kHz. This may take some tweaking and should anyway be easily overridden.

Agree

  • When saving data (either datasets or log files) we should not alter the exponent (makes reloading and analysis more difficult) but we should include exponent and units. It would seem more readable if we could make this a single field, as in the separator version above.

Same as above. If the units say 'GHz', the data should be saved in GHz.

@alexcjohnson
Contributor

Why not enable usage of e.g. unit = 'GHz' and parse the string for exponents?

that works for most units... but it's fragile - the example I gave above is miles (mi - is that milli-i?)

@guenp
Contributor

guenp commented Mar 8, 2016

Oh, are miles SI units? :)

@alexcjohnson
Contributor

OK, how about 'mol', we need some lookup table that tells us 'ol' isn't a unit? Anyway I don't think anyone is going to be happy if we only support preexisting SI units... what about e^2/h? And it gets worse if we want qcodes to get use in the wider scientific world.

We could take the opposite tack and support some whitelisted set of prefixed units... but then that seems liable to confuse people even more if they ever stray outside that, like if we define THz, GHz, MHz, kHz, then someone has a really fine sweep and wants mHz and it all breaks. I'd much prefer to do something totally robust, even if it isn't the prettiest.
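
To make the fragility concrete, here is a deliberately naive splitter that treats the first character as a prefix whenever it looks like one. `naive_split` and the prefix table are illustrative only.

```python
PREFIXES = {"G": 9, "M": 6, "k": 3, "m": -3, "u": -6, "n": -9}


def naive_split(unit):
    """Return (exponent, base_unit), assuming any leading prefix letter is a prefix."""
    if len(unit) > 1 and unit[0] in PREFIXES:
        return PREFIXES[unit[0]], unit[1:]
    return 0, unit


print(naive_split("GHz"))  # (9, 'Hz')   as intended
print(naive_split("mol"))  # (-3, 'ol')  wrong: read as milli-"ol"
print(naive_split("mi"))   # (-3, 'i')   wrong: read as milli-"i"
```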

@guenp
Contributor

guenp commented Mar 8, 2016

I would do it the other way around. Try to parse 'mol' as an SI unit; if that fails, then check whether the first character is an exponent symbol. The code doesn't have to be that idiot-proof. 2e^2/h is just G0. :) Also, for those cases where it's dubious you can have users add a space (e.g. 'mol' is mol, 'm ol' is milli-ol, whatever that may be). This looks pretty robust to me.
Let's just focus on QDev-QTech-Sydney for now, and ignore the rest of the scientific world. They can always make pull requests if they want miles/mols/apples/etc :)
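
The lookup order proposed in this comment (known unit first, then prefix, with a space as an explicit separator) might look like the sketch below. The `KNOWN_UNITS` set is a made-up, incomplete table and `split_unit` is a hypothetical name.

```python
PREFIXES = {"G": 9, "M": 6, "k": 3, "m": -3, "u": -6, "n": -9}

# Hypothetical table of units that must never be split.
KNOWN_UNITS = {"Hz", "V", "A", "s", "m", "mol"}


def split_unit(unit):
    """Return (exponent, base_unit) using the order: space, known unit, prefix."""
    # 1. An explicit space always separates prefix from unit: 'm ol'.
    if " " in unit:
        prefix, base = unit.split(" ", 1)
        if prefix in PREFIXES:
            return PREFIXES[prefix], base
        return 0, unit
    # 2. A known unit is taken as-is: 'mol' stays 'mol'.
    if unit in KNOWN_UNITS:
        return 0, unit
    # 3. Only then is the first character tried as a prefix.
    if len(unit) > 1 and unit[0] in PREFIXES:
        return PREFIXES[unit[0]], unit[1:]
    # 4. Anything else is an opaque user-defined unit.
    return 0, unit


print(split_unit("mol"))    # (0, 'mol')
print(split_unit("m ol"))   # (-3, 'ol')
print(split_unit("GHz"))    # (9, 'Hz')
print(split_unit("e^2/h"))  # (0, 'e^2/h')
```

Any unit that starts with a prefix letter and is missing from the table is still split, so the result is only as good as the table.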

@MerlinSmiles
Contributor

I prefer to have units and exponents separated too, it seems more robust to me.
And I guess this is much harder to change at a later step, no?

@guenp
Contributor

guenp commented Mar 8, 2016

It would make sense if the units/exponents are separated on the lowest level. The parsing can then happen on a higher level and is optional. This way we can all be happy 😀

@spauka
Contributor

spauka commented Mar 8, 2016

I tend to agree with @adriaan, use of SI units is clear, and the added cost of having to type e9, e-9 after entering ranges is not huge.... If we have pieces of equipment where it is more natural to think in terms of mV or GHz, specifying an exponent manually also does not seem like a huge burden, but I am not sure that automatic parsing of prefixes makes sense, particularly for a wider release.

Apart from the examples given above, an additional difficulty may arise for people in CS trying to specify prefixes in powers of 2, for example the difference between units such as MB (megabyte, 10^6 bytes) and MiB (mebibyte, 2^20 bytes).

Depending on how the higher level is handled, it also may make comparison of traces more difficult. For two different pieces of equipment, one specified in units of Hz, the other in GHz, would d1.units == d2.units be true?
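
One way to make that comparison explicit is to keep the base unit and exponent as separate fields, so that "same kind of quantity" and "same scale" become different questions. The `Unit` dataclass below is a hypothetical sketch, not an existing qcodes type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    base: str          # e.g. "Hz"
    exponent: int = 0  # power of ten, e.g. 9 for GHz

    def same_base(self, other):
        """True if both describe the same base unit, whatever the exponent."""
        return self.base == other.base

    def factor_to(self, other):
        """Multiplier that converts a value in self to a value in other."""
        if not self.same_base(other):
            raise ValueError(f"cannot convert {self.base} to {other.base}")
        return 10.0 ** (self.exponent - other.exponent)


hz = Unit("Hz")
ghz = Unit("Hz", exponent=9)
print(hz == ghz)            # False: different scale
print(hz.same_base(ghz))    # True: traces are comparable after rescaling
print(ghz.factor_to(hz))    # 1000000000.0
```

With plain strings such as 'Hz' and 'GHz', `d1.units == d2.units` would simply be false, and the rescaling factor would have to come from parsing.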

@damazter
Contributor

I would like to add that some numerical simulations are not done in SI units at all; for example, it can be useful in simulations to use eV, m_e, nm as a unit system. This would break if qcodes parsed nm as nanometer instead of as a base unit 'nm'. The problem is similar to kg, which is the base unit rather than 1000 g, so a gram should be stored as 10^-3 kg in this system.

Hence I would think that hardcoding any units in qcodes is not a good idea. I would always enter the value of a parameter as a float number in python without any other stuff attached.
To enable units in qcodes, I would add the unit as a separate variable inside the parameter class, which can then be used to do conversion etc., but the unit itself would be free for the user to define.
Hence I would do:

class parameter():
    def __init__(self):
        self.value = 1.3e9  # plain float, nothing attached
        self.unit = "Hz"

Which would also allow for

class parameter():
    def __init__(self):
        self.value = 1.3
        self.unit = "GHz"

or maybe:

class parameter():
    def __init__(self):
        self.value = 1.3
        self.unit = ["G", "Hz"]  # prefix and base unit kept apart

if that is appropriate for the problem at hand.

I know that this hybrid approach can lead to problems. But users should be free to define their own unit systems.

@giulioungaretti
Contributor

Closing as per #494.


7 participants