Commit b845a76

Update to new UCUM version v2.2 (June-2024)
1 parent 3dd0b73 commit b845a76

11 files changed: +65 -48 lines changed

README.md

+2 -2
@@ -16,7 +16,7 @@ Note that UCUM does non provide a canonical representation, e.g. `m/s` and `m.s-
 
 - Parser for UCUM unit strings that implements the full grammar.
 - Converter for creating [pint](https://pypi.org/project/pint/) units from UCUM unit strings.
-- A pint unit definition file [pint_ucum_defs.txt](https://github.com/dalito/ucumvert/blob/main/src/ucumvert/pint_ucum_defs.txt) that extends pint´s default units with UCUM units. All UCUM units from Version 2.1 of the specification are included.
+- A pint unit definition file [pint_ucum_defs.txt](https://github.com/dalito/ucumvert/blob/main/src/ucumvert/pint_ucum_defs.txt) that extends pint´s default units with UCUM units. All UCUM units from the new version 2.2 of the specification (June 2024) are included.
 
 **ucumvert** generates the UCUM grammar by filling a template with unit codes, prefixes etc. from the official [ucum-essence.xml](https://github.com/ucum-org/ucum/blob/main/ucum-essence.xml) file (a copy is included in this repo).
 So updating the parser for new UCUM releases is straight forward.
@@ -126,7 +126,7 @@ To (re)generate this tsv-file from the official xlsx-file in the [UCUM repositor
 
 ```bash
 pip install openpyxl
-python src/src/ucumvert/vendor/get_ucum_example_as_tsv.py
+python src/ucumvert/vendor/get_ucum_example_as_tsv.py
 ```
 
 ## Useful links

src/ucumvert/parser.py

+1 -1
@@ -85,7 +85,7 @@
 # instead of deca-r which does not exist.
 
 UCUM_GRAMMAR = """
-# Based on UCUM specification (Version 2.1, 2017-11-21)
+# Based on UCUM specification (Version 2.2, 2024-06-28)
 # Includes ucumvert-specific fixes to handle all common UCUM units
 # and some edge cases not present in the official examples.
 # This file is auto-created by parser.update_lark_ucum_grammar_file

src/ucumvert/pint_ucum_defs.txt

+4 -1
@@ -41,7 +41,7 @@ homeopathic_potency_of_quintamillesimal_korsakovian_series = 1 = _ = kp_Q
 high_power_field = 1 = _ = HPF
 low_power_field = 1 = _ = LPF
 international_unit = 1 = _ = i.U. = IU = iU
-arbitary_unit = 1 = _ = arb_U
+arbitrary_unit = 1 = _ = arb_U
 US_pharmacopeia_unit = 1 = _ = USP_U
 GPL_unit = 1 = _ = GPL_U
 MPL_unit = 1 = _ = MPL_U
@@ -88,6 +88,9 @@ diopter = 1 / meter = _ = diop
 slope = tan(1 rad)
 prism_diopter = 100 * tan(1 rad) = _ = p_diop
 
+nephelometric_turbidity_unit = 1 = _ = NTU
+formazin_nephelometric_unit = 1 = _ = FNU
+
 mil_i = inch / 1000
 cml_i = π/4 * mil_i**2
 hd_i = 4 * inch
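
The definition lines touched in this file follow pint's `name = relation = symbol = alias ...` layout, where `_` stands for "no symbol". The following is a minimal sketch of how such a line breaks down. The `parse_definition` helper and the `PintDefinition` type are hypothetical names used for illustration only; they are not part of pint or ucumvert, and the sketch ignores pint features such as offsets, prefixes and dimension definitions.

```python
from typing import NamedTuple, Optional, Tuple


class PintDefinition(NamedTuple):
    """Parts of a simple pint unit definition line (illustrative only)."""

    name: str
    relation: str
    symbol: Optional[str]
    aliases: Tuple[str, ...]


def parse_definition(line: str) -> PintDefinition:
    """Split 'name = relation [= symbol [= alias ...]]' into its parts.

    Simplified assumption: fields are separated by '=' and a symbol
    of '_' means that the unit has no symbol.
    """
    parts = [part.strip() for part in line.split("=")]
    if len(parts) < 2:
        raise ValueError(f"not a unit definition: {line!r}")
    name, relation, *rest = parts
    symbol = rest[0] if rest and rest[0] != "_" else None
    return PintDefinition(name, relation, symbol, tuple(rest[1:]))


if __name__ == "__main__":
    for text in (
        "nephelometric_turbidity_unit = 1 = _ = NTU",
        "formazin_nephelometric_unit = 1 = _ = FNU",
        "mil_i = inch / 1000",
    ):
        print(parse_definition(text))
```

Read this way, both new turbidity units are dimensionless (`relation` is `1`), have no symbol, and are reachable through the aliases `NTU` and `FNU`.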

src/ucumvert/pint_ucum_defs_mapping_report.txt

+16 -14
@@ -27,7 +27,7 @@
 # Ti --> tebi (default registry)
 
 # === metric ===
-# mol --> mole (default registry) # mol = 6.0221367 * 10*23 # METRIC, mole, amount of substance (si)
+# mol --> mole (default registry) # mol = 6.02214076 * 10*23 # METRIC, mole, amount of substance (si)
 # sr --> steradian (default registry) # sr = 1 * rad2 # METRIC, steradian, solid angle (si)
 # Hz --> hertz (default registry) # Hz = 1 * s-1 # METRIC, hertz, frequency (si)
 # N --> newton (default registry) # N = 1 * kg.m/s2 # METRIC, newton, force (si)
@@ -53,18 +53,18 @@
 # ar --> are (ucumvert registry) # ar = 100 * m2 # METRIC, are, area (iso1000)
 # t --> metric_ton (default registry) # t = 1e3 * kg # METRIC, tonne, mass (iso1000)
 # bar --> bar (default registry) # bar = 1e5 * Pa # METRIC, bar, pressure (iso1000)
-# u --> unified_atomic_mass_unit (default registry) # u = 1.6605402e-24 * g # METRIC, unified atomic mass unit, mass (iso1000)
+# u --> unified_atomic_mass_unit (default registry) # u = 1.66053906660e-24 * g # METRIC, unified atomic mass unit, mass (iso1000)
 # eV --> electron_volt (default registry) # eV = 1 * [e].V # METRIC, electronvolt, energy (iso1000)
 # pc --> parsec (default registry) # pc = 3.085678e16 * m # METRIC, parsec, length (iso1000)
 # [c] --> speed_of_light (default registry) # [c] = 299792458 * m/s # METRIC, velocity of light, velocity (const)
-# [h] --> planck_constant (default registry) # [h] = 6.6260755e-34 * J.s # METRIC, Planck constant, action (const)
-# [k] --> boltzmann_constant (default registry) # [k] = 1.380658e-23 * J/K # METRIC, Boltzmann constant, (unclassified) (const)
+# [h] --> planck_constant (default registry) # [h] = 6.62607015e-34 * J.s # METRIC, Planck constant, action (const)
+# [k] --> boltzmann_constant (default registry) # [k] = 1.380649e-23 * J/K # METRIC, Boltzmann constant, (unclassified) (const)
 # [eps_0] --> vacuum_permittivity (default registry) # [eps_0] = 8.854187817e-12 * F/m # METRIC, permittivity of vacuum, electric permittivity (const)
 # [mu_0] --> vacuum_permeability (default registry) # [mu_0] = 1 * 4.[pi].10*-7.N/A2 # METRIC, permeability of vacuum, magnetic permeability (const)
-# [e] --> elementary_charge (default registry) # [e] = 1.60217733e-19 * C # METRIC, elementary charge, electric charge (const)
-# [m_e] --> electron_mass (default registry) # [m_e] = 9.1093897e-28 * g # METRIC, electron mass, mass (const)
-# [m_p] --> proton_mass (default registry) # [m_p] = 1.6726231e-24 * g # METRIC, proton mass, mass (const)
-# [G] --> newtonian_constant_of_gravitation (default registry) # [G] = 6.67259e-11 * m3.kg-1.s-2 # METRIC, Newtonian constant of gravitation, (unclassified) (const)
+# [e] --> elementary_charge (default registry) # [e] = 1.602176634e-19 * C # METRIC, elementary charge, electric charge (const)
+# [m_e] --> electron_mass (default registry) # [m_e] = 9.1093837139e-31 * kg # METRIC, electron mass, mass (const)
+# [m_p] --> proton_mass (default registry) # [m_p] = 1.67262192595e-27 * kg # METRIC, proton mass, mass (const)
+# [G] --> newtonian_constant_of_gravitation (default registry) # [G] = 6.67430e-11 * m3.kg-1.s-2 # METRIC, Newtonian constant of gravitation, (unclassified) (const)
 # [g] --> standard_gravity (default registry) # [g] = 980665e-5 * m/s2 # METRIC, standard acceleration of free fall, acceleration (const)
 # [ly] --> light_year (default registry) # [ly] = 1 * [c].a_j # METRIC, light-year, length (const)
 # gf --> force_gram (default registry) # gf = 1 * g.[g] # METRIC, gram-force, force (const)
@@ -290,7 +290,7 @@
 # [S] --> svedberg (default registry) # [S] = 1 * 10*-13.s # NON_METRIC, Svedberg unit, sedimentation coefficient (chemical)
 # [HPF] --> high_power_field (ucumvert registry) # [HPF] = 1 * 1 # NON_METRIC, high power field, view area in microscope (chemical)
 # [LPF] --> low_power_field (ucumvert registry) # [LPF] = 100 * 1 # NON_METRIC, low power field, view area in microscope (chemical)
-# [arb'U] --> arbitary_unit (ucumvert registry) # [arb'U] = 1 * 1 # NON_METRIC, arbitary unit, arbitrary (chemical)
+# [arb'U] --> arbitrary_unit (ucumvert registry) # [arb'U] = 1 * 1 # NON_METRIC, arbitrary unit, arbitrary (chemical)
 # [USP'U] --> US_pharmacopeia_unit (ucumvert registry) # [USP'U] = 1 * 1 # NON_METRIC, United States Pharmacopeia unit, arbitrary (chemical)
 # [GPL'U] --> GPL_unit (ucumvert registry) # [GPL'U] = 1 * 1 # NON_METRIC, GPL unit, biologic activity of anticardiolipin IgG (chemical)
 # [MPL'U] --> MPL_unit (ucumvert registry) # [MPL'U] = 1 * 1 # NON_METRIC, MPL unit, biologic activity of anticardiolipin IgM (chemical)
@@ -311,10 +311,10 @@
 # [PFU] --> plaque_forming_unit (ucumvert registry) # [PFU] = 1 * 1 # NON_METRIC, plaque forming units, amount of an infectious agent (chemical)
 # [FFU] --> focus_forming_units (ucumvert registry) # [FFU] = 1 * 1 # NON_METRIC, focus forming units, amount of an infectious agent (chemical)
 # [CFU] --> colony_forming_unit (ucumvert registry) # [CFU] = 1 * 1 # NON_METRIC, colony forming units, amount of a proliferating organism (chemical)
-# [IR] --> allergene_index_of_reactivity (ucumvert registry) # [IR] = 1 * 1 # NON_METRIC, index of reactivity, amount of an allergen callibrated through in-vivo testing using the Stallergenes® method. (chemical)
-# [BAU] --> bioequivalent_allergen_unit (ucumvert registry) # [BAU] = 1 * 1 # NON_METRIC, bioequivalent allergen unit, amount of an allergen callibrated through in-vivo testing based on the ID50EAL method of (intradermal dilution for 50mm sum of erythema diameters (chemical)
+# [IR] --> allergene_index_of_reactivity (ucumvert registry) # [IR] = 1 * 1 # NON_METRIC, index of reactivity, amount of an allergen calibrated through in-vivo testing using the Stallergenes® method (chemical)
+# [BAU] --> bioequivalent_allergen_unit (ucumvert registry) # [BAU] = 1 * 1 # NON_METRIC, bioequivalent allergen unit, amount of an allergen calibrated through in-vivo testing based on the ID50EAL method of (intradermal dilution for 50mm sum of erythema diameters (chemical)
 # [AU] --> allergen_unit (ucumvert registry) # [AU] = 1 * 1 # NON_METRIC, allergen unit, procedure defined amount of an allergen using some reference standard (chemical)
-# [Amb'a'1'U] --> allergen_unit_for_Ambrosia_artemisiifolia (ucumvert registry) # [Amb'a'1'U] = 1 * 1 # NON_METRIC, allergen unit for Ambrosia artemisiifolia, procedure defined amount of the major allergen of ragweed. (chemical)
+# [Amb'a'1'U] --> allergen_unit_for_Ambrosia_artemisiifolia (ucumvert registry) # [Amb'a'1'U] = 1 * 1 # NON_METRIC, allergen unit for Ambrosia artemisiifolia, procedure defined amount of the major allergen of ragweed (chemical)
 # [PNU] --> protein_nitrogen_unit (ucumvert registry) # [PNU] = 1 * 1 # NON_METRIC, protein nitrogen unit, procedure defined amount of a protein substance (chemical)
 # [Lf] --> limit_of_flocculation (ucumvert registry) # [Lf] = 1 * 1 # NON_METRIC, Limit of flocculation, procedure defined amount of an antigen substance (chemical)
 # [D'ag'U] --> D_antigen_unit (ucumvert registry) # [D'ag'U] = 1 * 1 # NON_METRIC, D-antigen unit, procedure defined amount of a poliomyelitis d-antigen substance (chemical)
@@ -324,11 +324,13 @@
 # Ao --> angstrom (ucumvert registry) # Ao = 0.1 * nm # NON_METRIC, Ångström, length (misc)
 # b --> barn (default registry) # b = 100 * fm2 # NON_METRIC, barn, action area (misc)
 # att --> technical_atmosphere (ucumvert registry) # att = 1 * kgf/cm2 # NON_METRIC, technical atmosphere, pressure (misc)
-# [psi] --> pound_force_per_square_inch (default registry) # [psi] = 1 * [lbf_av]/[in_i]2 # NON_METRIC, pound per sqare inch, pressure (misc)
+# [psi] --> pound_force_per_square_inch (default registry) # [psi] = 1 * [lbf_av]/[in_i]2 # NON_METRIC, pound per square inch, pressure (misc)
 # circ --> turn (ucumvert registry) # circ = 2 * [pi].rad # NON_METRIC, circle, plane angle (misc)
-# sph --> sphere (ucumvert registry) # sph = 4 * [pi].sr # NON_METRIC, spere, solid angle (misc)
+# sph --> sphere (ucumvert registry) # sph = 4 * [pi].sr # NON_METRIC, sphere, solid angle (misc)
 # [car_m] --> carat (ucumvert registry) # [car_m] = 2e-1 * g # NON_METRIC, metric carat, mass (misc)
 # [car_Au] --> carat_of_gold_alloys (ucumvert registry) # [car_Au] = 1/24 # NON_METRIC, carat of gold alloys, mass fraction (misc)
 # [smoot] --> smoot (ucumvert registry) # [smoot] = 67 * [in_i] # NON_METRIC, Smoot, length (misc)
 # [m/s2/Hz^(1/2)] --> meter_per_square_second_per_square_root_of_hertz (ucumvert registry) # [m/s2/Hz^(1/2)] = 1 * sqrt(1 m2/s4/Hz) # NON_METRIC, meter per square seconds per square root of hertz, amplitude spectral density (misc)
+# [NTU] --> nephelometric_turbidity_unit (ucumvert registry) # [NTU] = 1 * 1 # NON_METRIC, Nephelometric Turbidity Unit, turbidity (misc)
+# [FNU] --> formazin_nephelometric_unit (ucumvert registry) # [FNU] = 1 * 1 # NON_METRIC, Formazin Nephelometric Unit, turbidity (misc)
 # bit_s --> bit (ucumvert registry) # bit_s = 1 * ld(1 1) # NON_METRIC, bit, amount of information (infotech)
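
The constants in the `@@ -53,18 +53,18 @@` hunk of this report change in value, and `[m_e]` and `[m_p]` also switch their unit from `g` to `kg`. A quick way to confirm that the unit switch does not hide an order-of-magnitude slip is to bring old and new values to the same unit and compare them. The sketch below does that with plain Python; the numbers are copied from the hunk, while the dictionary layout and the `relative_change` helper are illustrative choices and not part of ucumvert.

```python
# Old (UCUM 2.1) and new (UCUM 2.2) values as listed in the mapping report.
# Masses given in kg in the new version are converted to g (factor 1e3)
# so that both entries of a pair use the same unit.
KG_TO_G = 1e3

OLD_AND_NEW = {
    "mol": (6.0221367e23, 6.02214076e23),
    "u": (1.6605402e-24, 1.66053906660e-24),  # g
    "[h]": (6.6260755e-34, 6.62607015e-34),  # J.s
    "[k]": (1.380658e-23, 1.380649e-23),  # J/K
    "[e]": (1.60217733e-19, 1.602176634e-19),  # C
    "[m_e]": (9.1093897e-28, 9.1093837139e-31 * KG_TO_G),  # g
    "[m_p]": (1.6726231e-24, 1.67262192595e-27 * KG_TO_G),  # g
    "[G]": (6.67259e-11, 6.67430e-11),  # m3.kg-1.s-2
}


def relative_change(old: float, new: float) -> float:
    """Return (new - old) / old."""
    return (new - old) / old


CHANGES = {code: relative_change(old, new) for code, (old, new) in OLD_AND_NEW.items()}

if __name__ == "__main__":
    for code, change in CHANGES.items():
        print(f"{code:6s} {change:+.3e}")
```

With the values above, every relative change stays well below one part in a thousand, and the largest one belongs to `[G]`. The `g` to `kg` switch for the electron and proton mass is therefore consistent with the new exponents.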

src/ucumvert/ucum_grammar.lark

+3 -2
@@ -1,4 +1,4 @@
-# Based on UCUM specification (Version 2.1, 2017-11-21)
+# Based on UCUM specification (Version 2.2, 2024-06-28)
 # Includes ucumvert-specific fixes to handle all common UCUM units
 # and some edge cases not present in the official examples.
 # This file is auto-created by parser.update_lark_ucum_grammar_file
@@ -74,7 +74,8 @@ UNIT_NON_METRIC: "10*" |"10^" |"[pi]" |"%" |"[ppth]" |"[ppm]" |"[ppb]"
 |"[CCID_50]" |"[TCID_50]" |"[EID_50]" |"[PFU]" |"[FFU]" |"[CFU]"
 |"[IR]" |"[BAU]" |"[AU]" |"[Amb'a'1'U]" |"[PNU]" |"[Lf]" |"[D'ag'U]"
 |"[FEU]" |"[ELU]" |"[EU]" |"Ao" |"b" |"att" |"[psi]" |"circ" |"sph"
-|"[car_m]" |"[car_Au]" |"[smoot]" |"[m/s2/Hz^(1/2)]" |"bit_s"
+|"[car_m]" |"[car_Au]" |"[smoot]" |"[m/s2/Hz^(1/2)]" |"[NTU]"
+|"[FNU]" |"bit_s"
 
 EXPONENT : ["+"|"-"] NON_ZERO_DIGITS
 FACTOR: NON_ZERO_DIGITS
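
The README states that the grammar is generated by filling a template with unit codes from `ucum-essence.xml`, which is why `[NTU]` and `[FNU]` show up in `UNIT_NON_METRIC` without a hand edit. The sketch below illustrates that idea with the standard library only. It is not ucumvert's implementation: `non_metric_codes` and `build_terminal` are hypothetical helpers, and `SAMPLE_XML` is a trimmed stand-in that assumes `unit` elements with `Code` and `isMetric` attributes. The real file also declares an XML namespace, so tag lookups would need to be namespace-qualified.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for ucum-essence.xml (assumed attribute names, no namespace).
SAMPLE_XML = """
<root>
  <unit Code="[smoot]" isMetric="no" class="misc"/>
  <unit Code="[NTU]" isMetric="no" class="misc"/>
  <unit Code="[FNU]" isMetric="no" class="misc"/>
  <unit Code="bit_s" isMetric="no" class="infotech"/>
  <unit Code="mol" isMetric="yes" class="si"/>
</root>
"""


def non_metric_codes(xml_text: str) -> list:
    """Collect the codes of all units flagged as non-metric, in file order."""
    root = ET.fromstring(xml_text.strip())
    return [
        unit.attrib["Code"]
        for unit in root.iter("unit")
        if unit.attrib.get("isMetric") == "no"
    ]


def build_terminal(name: str, codes: list, width: int = 72) -> str:
    """Render codes as a lark terminal: NAME: "a" |"b" ..., wrapped at width."""
    alternatives = [f'"{code}"' for code in codes]
    lines = []
    current = f"{name}: {alternatives[0]}"
    for alternative in alternatives[1:]:
        piece = f" |{alternative}"
        if len(current) + len(piece) > width:
            # Start a continuation line once the current one is full.
            lines.append(current)
            current = f"    |{alternative}"
        else:
            current += piece
    lines.append(current)
    return "\n".join(lines)


CODES = non_metric_codes(SAMPLE_XML)
TERMINAL = build_terminal("UNIT_NON_METRIC", CODES)

if __name__ == "__main__":
    print(TERMINAL)
```

Because the terminal is derived from the XML, adding a unit to the vendored file and regenerating is enough to extend the parser, which matches the two-line change in this hunk.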

src/ucumvert/ucum_pint.py

+3 -2
@@ -385,5 +385,6 @@ def run_examples(): # pragma: no cover
 
 
 if __name__ == "__main__":
-    run_examples()
-    # find_ucum_codes_that_need_mapping()
+    # run_examples()
+    find_matching_pint_definitions()
+    find_ucum_codes_that_need_mapping()

src/ucumvert/vendor/README.md

+1 -1
@@ -2,7 +2,7 @@
 
 This directory contain copies of files from the [UCUM repository](https://github.com/ucum-org/ucum) to enable running the code without internet access. The copied files fall under the [UCUM Copyright Notice and License](https://github.com/ucum-org/ucum/blob/main/LICENSE.md) (Version 1.0).
 
-* `ucum-essence.xml` - Version 2.1 (revision date: 2017-11-21 19:04:52 -0500).
+* `ucum-essence.xml` - Version 2.2 (revision date: 2024-06-17).
 * Used to build the terminals of the lark parser.
 * `ucum_examples.tsv` - Extracted from [TableOfExampleUcumCodesForElectronicMessaging.xlsx](https://github.com/ucum-org/ucum/blob/main/common-units/TableOfExampleUcumCodesForElectronicMessaging.xlsx), Version 1.5, released 06/2020
 * Used in unit tests. The tsv was created with the script `get_ucum_examples_as_tsv.py`.