Improving our safety with a physical quantities and units library (open-std.org)
67 points by limoce 8 months ago | 68 comments



Surprised that I'm the only person so far mentioning F#, where units of measure have been built in to the language since 1.0: https://learn.microsoft.com/en-us/dotnet/fsharp/language-ref...

I wrote a longer comment in reply to nraynaud elsewhere describing how I found it useful for certain things at a past job. It's hardly a killer feature, but it's very nice to have the compiler (partially) verify the semantic correctness of certain floating-point computations. It's not just physics - finance has "units" as well, and the compiler can complain if you gave a function an array of dollars/week when it wanted dollars/month.

If you don't mind writing some (admittedly tedious) overhead it can save a lot of headaches.


Given that months have varying amounts of days, does it even make sense to have both dollars/month and dollars/day in the same setting? I would figure you would use either one or the other.


> does it even make sense to have both dollars/month and dollars/day in the same setting?

Parking downtown often has daily and monthly rates: is it cheaper to pay for two months, or one month and ten days? If you're doing the math yourself it's immediately obvious if you accidentally swap the rates, but not so if you write a program that swaps the variables. You wouldn't write a program at all for this example, but it's a legitimate use case.

Your general point is correct: it typically doesn't make sense to have them in the same setting if by "setting" you mean "specific equation." But they could very well be in the same program.
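For readers who do not use F#, the parking example can be roughly sketched in C++ with tagged types. This is only an illustration: `Rate`, `PerDay`, `PerMonth` and `parking_cost` are made-up names, the prices are invented, and amounts are whole cents so the arithmetic stays exact.

```cpp
// Hypothetical tag types, one per billing period.
struct PerDay {};
struct PerMonth {};

// A rate tagged with its billing period. Rate<PerDay> and Rate<PerMonth>
// are unrelated types, so one cannot be passed where the other is expected.
template <class Period>
struct Rate {
    long long cents;
};

constexpr long long parking_cost(Rate<PerMonth> monthly, int months,
                                 Rate<PerDay> daily, int days) {
    return monthly.cents * months + daily.cents * days;
}

constexpr Rate<PerMonth> monthly_rate{20000};  // $200.00 per month (invented figure)
constexpr Rate<PerDay> daily_rate{1500};       // $15.00 per day (invented figure)

constexpr long long two_months = parking_cost(monthly_rate, 2, daily_rate, 0);
constexpr long long month_plus_ten_days = parking_cost(monthly_rate, 1, daily_rate, 10);

// parking_cost(daily_rate, 1, monthly_rate, 10);  // does not compile: rates swapped
```

With these invented prices, one month plus ten days comes out cheaper than two months, and swapping the two rates is a compile error rather than a silently wrong answer.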


Note that this is heavily dependent on C++20. I'm not even caught up on C++17, so I was thrown off by statements like this:

    quantity q1 = 42 * J;
where quantity is a class template:

    template<Reference auto R,
        RepresentationOf<get_quantity_spec(R).character> Rep = double>
    class quantity;
I didn't know you could use auto in a template parameter. I gather that Reference is a concept, something else I've yet to explore. Just when I thought I'd gotten the hang of "modern" C++.


One option for C++14 is the Au units library. Here's a comparison to other unit libraries, including mp-units: https://aurora-opensource.github.io/au/main/alternatives/


auto in this context creates a non-type template parameter (NTTP). They've been in the language since C++11 (std::array<int, 3>).


Non-type template parameters (NTTP) were already in the original standard, they've existed since C++ had templates.

Being able to use "auto" for the type of the NTTP was new in C++17; and using class types like std::array for NTTPs was only added in C++20.
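A minimal sketch of the first two stages, assuming a C++17 compiler (`FixedBuffer` and `Constant` are names invented for this example). The `Reference auto R` in the snippet quoted above additionally constrains the deduced type with a concept, which needs C++20 and is not shown here.

```cpp
#include <cstddef>

// Classic non-type template parameter with a fixed type (valid since C++98).
template <std::size_t N>
struct FixedBuffer {
    static constexpr std::size_t size = N;
};

// C++17: `auto` lets the parameter's type be deduced from the argument.
template <auto V>
struct Constant {
    static constexpr decltype(V) value = V;
};

static_assert(FixedBuffer<3>::size == 3, "size_t parameter");
static_assert(Constant<42>::value == 42, "type deduced as int");
static_assert(Constant<'J'>::value == 'J', "type deduced as char");
```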


If you're interested in this kind of thing, I wrote a python package called "unit-syntax" (https://github.com/ahupp/unit-syntax) that adds physical unit syntax to python:

   >>> speed = 5 meters/second
   >>> (2 seconds) * speed
   10 meter


I'd been using Jupyter notebooks as a calculator for engineering problems and was wishing for the clarity and type safety of real units, but with the succinctness of regular python.

It works in both Jupyter notebooks (with an input transform) and stock python (with an import hook). The actual unit checks and conversions are handled by the excellent `pint` package.


I don’t understand why you’d go to the enormous trouble of extending the syntax when you can write it with perfectly normal syntax. A value-with-unit is, after all, just a scalar multiplied by the unit, and I suppose this would work with the underlying library pint:

   >>> speed = 5*meters/second
   >>> (2 * seconds) * speed
   10 meter
Given the ecosystem problems and other problems extending the syntax gets you, I don’t get why you’d do this at all, when all it gets you is the ability to write ' ' instead of '*'.


IMO using operator overloading for this kind of thing makes it hard to read, since I have to be extra careful to mentally parse whether that `*` is a multiplication or units, remember what variables are in scope etc. Notation matters, and if I didn't care about that I'd just write `pint.Quantity(5, "meters/second")` and be done with it. Or more likely, not go to the trouble of using them at all.

> I don’t understand why you’d go to the enormous trouble

But more importantly, it was just really fun to get it working.


> IMO using operator overloading for this kind of thing makes it hard to read, since I have to be extra careful to mentally parse whether that `*` is a multiplication or units

Units are inherently exactly multiplication.

5 meters is:

5 (unitless) * (1) meter


Yes. There are many (infinitely!) ways to write an equivalent expression, most of which are not as clear to read as the standard format.


You neglected to say that this does automatically determine the type instead of type checking as in the article.

You could also use python type hints for this stuff.

Here are some links about units in python:

https://socialcompare.com/en/comparison/python-units-quantit...


All the motivating examples but one are about mixing units from different systems. And of those, all but one are about errors in mixing US Customary units with the metric system. Maybe Americans should just stop using the US Customary Units as a start?


Even in a metric monoculture, mixing code with different systems of units poses a problem. Worse is mixing code with no system at all, just a convention for what units the numbers are assumed to be in. Joining two simulation packages, one with length in mm and the other in cm, provides a bounty of potential bugs.

Sometimes this variation is happenstance but sometimes it can be important for precision. nm and ns are the relevant scales for particle physics, while parsecs and seconds are the relevant scales for astro, and yet astro-particle physics is a thing.


Parsec is a unit of distance, not time, even if Star Wars is confused about that.


The need would still exist. TFA covers the often-overlooked issue of different "kinds" of quantity that have the same dimensions but shouldn't be mixed. Eg. Hz and radian/s for frequency and angular frequency. Both have dimensions of 1/time but they're different. You have to be an extreme purist to refuse to use Hz and only express frequencies as angular frequencies.
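A rough C++ sketch of what "same dimension, different kind" can look like in code. This is not how the paper or mp-units does it; `Frequency`, `AngularFrequency` and `to_angular` are names invented for this example.

```cpp
// Two "kinds" that share the dimension 1/time but are distinct types.
struct Frequency {         // cycles per second (Hz)
    double hertz;
};
struct AngularFrequency {  // radians per second
    double rad_per_s;
};

constexpr double kTwoPi = 6.283185307179586;

// The only way from one kind to the other is an explicit, named conversion.
constexpr AngularFrequency to_angular(Frequency f) {
    return AngularFrequency{kTwoPi * f.hertz};
}

// Addition is defined within a kind only.
constexpr Frequency operator+(Frequency a, Frequency b) {
    return Frequency{a.hertz + b.hertz};
}

constexpr AngularFrequency mains = to_angular(Frequency{50.0});  // about 314.16 rad/s
// Frequency{50.0} + mains;  // does not compile: no operator+ for mixed kinds
```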

And by the way, if the metric system is so great, why does Avogadro's constant still exist? Since 2019, it's fortunately been downgraded to an exact arbitrary number but it's an ugly legacy of even metric-users refusing to accept change.


Switching costs would be very high. Just consider this library. There is no standard way to handle units in C++. This presumably means that every single piece of C++ software that was ad-hoc programmed to work in customary units needs modification, or some sort of shim/interface layer to become 'metric'. Or be saddled with decades of dual-systeming.

As many of the examples point out, the very act of switching systems can cause problems (the Air Canada flight is the classic example).

This is not to say that switching is a hopeless cause, but it's certainly not a cheap or easy task to perform.

Maybe we can leverage Y2K38 as a breakpoint to help this transition along =P


Yes that would work, but why stop there. Having hundreds of different spoken languages is wasteful, so why not standardize everyone on Esperanto while we're at it, unify all electric plug standards to the one used in the EU and change the last few countries over to driving on the right side of the road.

Don't even get me started on all the duplicate work being done on text editors, programming languages, web servers and databases.

/s of course.


It's easy enough to mess up conversions between magnitudes of a unit (e.g. milliamps vs amps), and having a standardized and well-tested way of doing so is important for reliability and safety-critical systems IMO.


We can’t because football fields are 100 yards and gun caliber is based on the inch. Changing those would result in civil war.


[flagged]


Actual scientific institutions like NASA already do, and all of the imperial units have metric backings and definitions now, so it’s just ignorance, stubbornness, or apathy at this point.


Oh boy, NASA, a scientific organization using metric. Lemme know when you’ve converted every recipe, ruler and wrench to metric. So ridiculously out of touch.


Practicality over purity. How are we to divide a meter into whole number thirds?

Easily done with yards.

Get stuffed, Talleyrand.


> How are we to divide a meter into whole number thirds?

We can't divide a metre into whole number anythings; and likewise a metre doesn't itself divide evenly into anything else. That's because the metre is the only unit of distance in metric/SI; so there's nothing else to divide into. That's a feature, not a bug, since it makes metric "coherent" https://en.wikipedia.org/wiki/Coherence_(units_of_measuremen...


You can divide a meter into four 25 centimeter pieces, or five 2 decimeter pieces.

Alternatively, you could restrict US units to just yards if you wanted, and get nearly the same inconvenient behavior as meters.

Just because most metric unit ratios are a power of ten doesn’t mean there is no ratio.


A meter is three 1/3-meter pieces.


> Just because most metric unit ratios are a power of ten doesn’t mean there is no ratio.

There is one metric unit for distance (the meter). Likewise, there is one unit for force (the Newton); one for energy (the Joule); one for mass (the kilogram); one for electric charge (the Coulomb); etc.

> You can divide a meter into four 25 centimeter pieces, or five 2 decimeter pieces.

The parent specifically asked for "whole numbers". The term "centi" means hundredth, and "deci" means tenth, so your examples are 25/100 metres and 2/10 metres, which are not whole numbers. If we allow fractions (which we should!) then sure, we can make any rational ratio we like.

> Just because most metric unit ratios are a power of ten doesn’t mean there is no ratio.

All ratios between units in metric/SI are 1, by design. You're probably referencing power-of-ten ratios between prefixes, like "centi", "kilo", "mega", "micro", etc. Those are not units, they're just an alternative way of writing numbers. For example, 2km = 2000m because 2k = 2000 (by definition of k); nothing to do with units, which in both cases are meters. Likewise, both "2 dozen meters" and "24 meters" are in units of meters; the first isn't in a separate unit of "dozen meters".

> Alternatively, you could restrict US units to just yards if you wanted

I've never seen anyone restrict to using yards; usually the restriction is to feet, like the foot-pound-second system https://en.wikipedia.org/wiki/Foot%E2%80%93pound%E2%80%93sec...

> and get nearly the same inconvenient behavior as meters

The explicit convenience of using meters (rather than the implicit convenience of not having any length conversions) is that all of its conversion factors to combinations of other units are one, by definition. For example:

Pushing a force over a distance takes energy (see https://en.wikipedia.org/wiki/Work_(physics) ) so distance = energy / force. In metric, 1 meter = 1 Joule per Newton. In contrast, 1 yard = 0.968 calories per pound-force, which is less convenient.

Newton's second law says force = mass × acceleration, so distance = force × time² / mass. In metric, 1 meter = 1 Newton second² per kilogram. In contrast, 1 yard = 3 pound-force second² per slug, which is less convenient.

etc.

I actually wrote a page about this on my blog a while ago http://www.chriswarbo.net/projects/units/metric_red_herring....


> I’m a firm believer that seemingly-innocuous complications, like those found in imperial units of measurement, are in fact significant risks; they impede learning, potentially turning people away from areas like maths and science; and their compounding, confounding behaviour on the large scale constrains what we’re capable of achieving as a species.

Hey now: one man's impediment is another man's business model. ;-)


If for some reason you really need to represent exactly a third of a meter (or anything else) rationals are not exactly difficult to use on a computer.


This is the weirdest justification to keep using a non-metric system.

Not being able to divide a meter into whole-number thirds is the least of the metric system's problems.

This is vastly better than having no unit smaller than an inch, and having to write a smaller length as a fraction of an inch (e.g. "3/16 inch").


Why are eggs sold by the dozen instead of 10?

Factors 2,3,4,6 >>> 2,5.

The real reason probably has more to do with switching costs for ports, &c.

That, and copious people making fat piles of cash off of the status quo.


  std::vector<quantity<si::milli<si::seconds>>> vec;
I've seen efforts like this in the past, but in my experience there's a substantial readability cost when you replace a single-line calculation with a multi-line one to accommodate giant type definitions like this.

And you just know anyone who's adopting this library is also going to have long variable names and a strict line length limit, both of which they will describe as "clean code" :)


Shorthands for some heavily used units are not forbidden. ;)

For vectors you can take that one step further even...

https://cpponsea.uk/2022/sessions/taking-static-type-safety-...


Doesn't your calculation happen after you've declared your types? I agree that this is needlessly long, but in my experience you can say what the physical units of your input variables and constants are up top and then let everything else be inferred, and your computations should look the same as ever, but with extra checking at compile time.


Scala's squants library is a nice implementation of units-of-measure/dimensional-analysis http://www.squants.com

In particular it uses types for dimensions; whilst units are just constructors. Hence `Meters(2)` and `Microns(7)` have the same type (`Length`).


What do you get if you add Meters(2) to Microns(7), out of curiosity?


Each of the dimensions has a "primary unit", which the constructors convert into. For Length it's Meters, so Meters(2) stores a 2, whilst Microns(7) will store a 0.000007 https://www.javadoc.io/doc/org.typelevel/squants_2.13/1.6.0/...
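Going only by the description above (and not by the squants source), the design might be sketched in C++ like this. `Length`, `Meters` and `Microns` mirror the names used in the comment; everything else is an assumption made for illustration.

```cpp
// One type per dimension; the value is stored in the primary unit.
struct Length {
    double meters;
};

// Units are just factory functions that convert into the primary unit.
constexpr Length Meters(double v) { return Length{v}; }
constexpr Length Microns(double v) { return Length{v * 1e-6}; }

constexpr Length operator+(Length a, Length b) {
    return Length{a.meters + b.meters};
}

constexpr Length total = Meters(2) + Microns(7);  // about 2.000007 m
```

Because both constructors return the same type, the addition compiles and the result is simply a `Length` in meters.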


Surely that spoils some cases where you use different units to keep the values of reasonable magnitude? Sometimes, such as with electromagnetics, the exponents are so extreme they lead to numerical error. Sometimes also, people (inappropriately) use arbitrary epsilons to compare to zero and end up nuking micron-scale dimensions expressed in meters. A volume of 1 μm^3 is 1e-18m^3 which you might not want to operate on too hard.


I have a project for which I wrote a simple units library. I don't think I'd be able to write any physics-related project now without using it (or a similar library). My Quantity class has a set of 8 parameters (7 SI base units and a hack that allows conversion between Hz and radians) + additional Scale and Offset parameters. Scale allows representing units other than SI (like Nautical Miles), Offset is for units like Celsius, for which 0°C == 273.15 K.

I can do things like:

    si::Length length = 15_m + 12_nm; // _nm for Nautical Miles
    si::Area area = 1_m * 1_km; // Equal to 1000_m2
    si::Power power = 1_m / 1_sec / 1_sec; // Compilation error, 1_m/s² is not a si::Power
I don't have every possible User-Defined Literal, of course, so I end up doing this for less common units:

    using SomeLocalTypeName = decltype(1_rpm / 1_V);
Something to think about when designing such a library:

* What is 0_degC + 1_degC? 1_degC or… 273.15_K + 274.15_K = 547.3_K = 274.15_degC? I forbid operations between units if any of them has an Offset parameter other than 0. I'm not sure if this is a good solution, though.

* Nm (Newton-meters) is the same unit as Joules. ;-)
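For anyone curious how the exponent bookkeeping can work, here is a much reduced sketch. It is not the library described above: it tracks only three base dimensions, has no Scale or Offset, and all names are assumptions made for this example.

```cpp
#include <type_traits>

// Exponents of meter, kilogram and second are carried in the type.
template <int M, int KG, int S>
struct Quantity {
    double value;
};

// Addition is only defined for identical dimensions.
template <int M, int KG, int S>
constexpr Quantity<M, KG, S> operator+(Quantity<M, KG, S> a, Quantity<M, KG, S> b) {
    return Quantity<M, KG, S>{a.value + b.value};
}

// Multiplication adds exponents.
template <int M1, int KG1, int S1, int M2, int KG2, int S2>
constexpr Quantity<M1 + M2, KG1 + KG2, S1 + S2>
operator*(Quantity<M1, KG1, S1> a, Quantity<M2, KG2, S2> b) {
    return Quantity<M1 + M2, KG1 + KG2, S1 + S2>{a.value * b.value};
}

// Division subtracts exponents.
template <int M1, int KG1, int S1, int M2, int KG2, int S2>
constexpr Quantity<M1 - M2, KG1 - KG2, S1 - S2>
operator/(Quantity<M1, KG1, S1> a, Quantity<M2, KG2, S2> b) {
    return Quantity<M1 - M2, KG1 - KG2, S1 - S2>{a.value / b.value};
}

using Length = Quantity<1, 0, 0>;
using Time = Quantity<0, 0, 1>;
using Acceleration = Quantity<1, 0, -2>;
using Power = Quantity<2, 1, -3>;  // W = kg·m²/s³

constexpr Length operator""_m(long double v) { return Length{static_cast<double>(v)}; }
constexpr Time operator""_s(long double v) { return Time{static_cast<double>(v)}; }

constexpr auto accel = 1.0_m / 1.0_s / 1.0_s;
static_assert(std::is_same<decltype(accel), const Acceleration>::value,
              "m/s/s is an acceleration");
// Power p = 1.0_m / 1.0_s / 1.0_s;  // does not compile: Acceleration is not Power
```

A scheme like this cannot tell torque from energy, since newton-meters and joules have the same exponents, which is the second point above.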


Yeah, Celsius is awkward because it's basically used as both an actual temperature and as an interval depending on context, and mixing them is bad news. A similar sort of distinction often shows up in datetime libraries between an instant and a duration/timespan, but I'm not sure it's worth doing in a unit library. Just forcing kelvin for intervals probably isn't too onerous for most uses, and I can't think of any unit it would really apply to other than °C/F, whosoever dares mix units of measure with time zones notwithstanding.


For your first point about temp, the proper way would be to convert to Kelvin, do the addition, and back to Celsius.


Yup. So I explicitly deleted operators for °C, so things like 5_degC + 10_degC won't compile.


But can't you just implement operator+(celsius, celsius) that does the conversion to kelvin and back, since you know the conversion math for certain?


Adding temperatures is not usually done. Mixing substances adds thermal energy. There's no reason I can think of why allowing a plus operator on temperatures is a good idea. You can add temperature differentials, but it is plain addition so long as units are in agreement. And a 0C differential is a 0F differential, no offsets in conversions.
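One way to encode that split between absolute temperatures and differentials, as a sketch only (`Temperature`, `TemperatureDelta`, `from_celsius` and `to_celsius` are hypothetical names, and everything is stored in kelvin):

```cpp
struct TemperatureDelta {  // an interval, e.g. "warmed by 5 degrees"
    double kelvin;
};
struct Temperature {       // an absolute point on the scale
    double kelvin;
};

constexpr Temperature from_celsius(double c) { return Temperature{c + 273.15}; }
constexpr double to_celsius(Temperature t) { return t.kelvin - 273.15; }

// point - point = interval
constexpr TemperatureDelta operator-(Temperature a, Temperature b) {
    return TemperatureDelta{a.kelvin - b.kelvin};
}
// point + interval = point
constexpr Temperature operator+(Temperature t, TemperatureDelta d) {
    return Temperature{t.kelvin + d.kelvin};
}
// interval + interval = interval (plain addition, no offset involved)
constexpr TemperatureDelta operator+(TemperatureDelta a, TemperatureDelta b) {
    return TemperatureDelta{a.kelvin + b.kelvin};
}
// Adding two absolute temperatures is rejected at compile time.
Temperature operator+(Temperature, Temperature) = delete;
```

With this split, `from_celsius(0.0) + from_celsius(1.0)` fails to compile, while `from_celsius(0.0) + TemperatureDelta{1.0}` is a temperature of about 1 °C.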


I did that, but then adding 0_degC + 0_degC gave 273.15_degC. I think that would be surprising to many people. In reality I don't remember ever having to deal with physics formulas that used °C instead of Kelvins, so the only time I need to use °C is when presenting the temperature in the UI:

    temperature.in<si::Celsius>()


Duh. I was thinking too much on the syntax side and not thinking about the physical world; which is ironic considering the discussion topic.


I’m not really a fan of the syntax in the article where you multiply the unit with the value.

I much prefer your user defined literal approach.

But is there any way to make it so you can introduce a new user defined literal for a new composed unit with reasonable syntax?


Something like this:

    auto
    operator"" _rpmPerVolt (long double x) {
        return decltype (1_rpm / 1_V)(x);
    }
I have a macro for that:

    #define SI_DEFINE_LITERAL(xUnit, xliteral) \
     [[nodiscard]] \
     constexpr Quantity<units::xUnit> \
     operator"" xliteral (long double value) \
     {  \
         return Quantity<units::xUnit> (value); \
     } \
     \
     [[nodiscard]] \
     constexpr Quantity<units::xUnit> \
     operator"" xliteral (unsigned long long value) \
     { \
         return Quantity<units::xUnit> (value); \
     }

    // Base SI units:
    SI_DEFINE_LITERAL (Meter, _m)
    SI_DEFINE_LITERAL (Kilogram, _kg)
    SI_DEFINE_LITERAL (Second, _s)
    SI_DEFINE_LITERAL (Ampere, _A)
    SI_DEFINE_LITERAL (Kelvin, _K)
    SI_DEFINE_LITERAL (Mole, _mol)
    SI_DEFINE_LITERAL (Candela, _cd)
    SI_DEFINE_LITERAL (Radian, _rad)
...and a loong list of other common units here.

The additional types are defined like this, in another file:

    using Foot          = ScaledUnit<Meter, std::ratio<1'200, 3'937>>;  // US survey foot; the international foot is exactly 0.3048 m
    using Mile          = ScaledUnit<Meter, std::ratio<1'609'344, 1'000>>;
    using NauticalMile  = ScaledUnit<Meter, std::ratio<1'852, 1>>;
    using Inch          = ScaledUnit<Meter, std::ratio<254, 10'000>>;
And then I use SI_DEFINE_LITERAL (NauticalMile, _nm); etc.
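The `ScaledUnit` implementation is not shown here, so the following is only a guess at how `std::ratio`-based scaling could work. `Length` and `convert` are made-up names, and the ratios express meters per unit.

```cpp
#include <ratio>

// A length whose unit is described by a std::ratio of meters per unit.
template <class MetersPerUnit>
struct Length {
    double count;
};

using Meter = std::ratio<1, 1>;
using NauticalMile = std::ratio<1852, 1>;
using Inch = std::ratio<254, 10000>;

// Convert between units by rescaling with the quotient of the two ratios.
// std::ratio_divide reduces the fraction at compile time.
template <class To, class From>
constexpr Length<To> convert(Length<From> len) {
    using Factor = std::ratio_divide<From, To>;
    return Length<To>{len.count * Factor::num / Factor::den};
}

constexpr Length<Meter> two_nm_in_m = convert<Meter>(Length<NauticalMile>{2.0});  // 3704 m
```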


This is the one area I most sorely miss in Rust. Not that I don’t miss it anywhere else (I absolutely do, currently writing lots of TS and trying to make do with nominal types), but that it’s a very important safety mechanism in all of engineering and a unit/system of measure built into the language would fit so well.


Just use a crate? First Google hit https://docs.rs/uom/latest/uom/


That looks nice, but appears to only handle SI.


The crate defines square feet etc.:

https://docs.rs/uom/latest/uom/si/area/index.html

Am I missing something?


It has feet, inches, yards, and other less-used(?) units like chains and rods. What units are missing? https://docs.rs/uom/latest/uom/si/length/index.html


Mateusz has been delivering high-quality C++ code for a long time. Having implemented a similar library with C++03 in 2008, I highly appreciate the amount of attention to detail he invests here. We had numerous bugs in formulas found by that library over the years.


If working in python, Pint is an excellent choice: https://pypi.org/project/Pint/


The manifold[1] project for Java lets you write unit expressions directly.

    Force force = 5kg * 9.807 m/s/s;

1. https://github.com/manifold-systems/manifold/tree/master/man...


There is an FAQ for the mp-units project on which this proposal is based about why they chose not to use user-defined literals:

https://mpusz.github.io/mp-units/latest/getting_started/faq/

If I understand correctly, though, the unit expressions that manifold implements aren't limited to literals, is that correct?


>the unit expressions that manifold implements aren't limited to literals, is that correct?

Correct. Unit expressions with manifold are more expressive than with mp-units, as I understand it. With mp-units the unit types themselves are expressively inactive.

For instance, with mp-units the type for velocity, _q_m_per_s (?), must be defined statically as a literal. But with manifold it may be defined as an expression in terms of the component unit types. As such the unit for Velocity, m/s, is an _expression_ of dimension units LengthUnit and TimeUnit.

    Velocity sixtyFiveMilesPerHour = 65 mi/hr;
And this provides for user defined literals.

    VelocityUnit mph = mi/hr;
    . . .
    Length distance = 80mph * 3.5hr;
It's also a convenience that applies to unitless expressions. For example, manifold defines a Rational number type, which defines a unit-based coercer:

    RationalCoercion r = RationalCoercion.INSTANCE;
    . . .
    var oneThird = 1r/3;
Which satisfies other uses, such as:

    var sixMillionDollars = 6M USD;


Adacore developed an interesting related system for the GNAT Ada compiler: https://www.adacore.com/gems/gem-136-how-tall-is-a-kilogram


Love this! I had to implement a half-assed version of this a long time ago at a job because there was a class of subtle bugs that popped up due to poor naming/documentation/spaghetti code that would've been eliminated by things like this.


I am curious about people's experience with this kind of system, I have been thinking about it for years, but never actually tried it.

I sometimes feel like finding the explicit unit of some sub-expressions in geometry might be complicated. I know "auto" avoids that particular problem, but I don't have a good policy for when to use "auto" to avoid expressing a complex type and when to spell out the type to block the propagation of errors.

Also during debugging we might need to display some customary unit for sub expressions/watchpoints because the SI unit means nothing to humans in some fields (pressure comes to mind, some people use mm of water, mm of mercury, bars, etc.)


It is built in to F# and I think it's a very underrated feature of the language for certain applications, although it comes with some overhead and tedium. There is a cost.

I used F# for financial analysis in the power industry, where there is just an atrocious lack of unit standardization leading to tons of subtle bugs. It's not just stuff about volts and kilowatt hours: I have nightmares about a function that wanted on-peak hourly averages but the caller gave it overall hourly averages - or did it??? By itself, the debugger does not really help you figure out what's going on here unless you are eagle-eyed about time zones. But the compiler can complain if you gave it a `float<price/hour> array` instead of `float<peakPrice/hour> array`, which makes the source of the bug much more transparent, or prevents the bug from occurring at all. Implementing certain computations in F# instead of C# saved me a lot of time and headaches, while also making the code more descriptive to humans. And of course the compiler also verifies things like kilowatts versus megawatts, etc. It's a specific use case: many intricate floating-point computations with highly sophisticated business logic, where common bugs can be avoided by simple labelling of the units.

It is all static, so there isn't any boxing or performance penalty, but it does have syntactical overhead. And of course the compiler doesn't know physics - you can invalidly coerce things by arbitrarily multiplying with `1<newUnit/oldUnit>`. What you're saying about some subexpressions is true, but the units will only ever be in simple algebraic combinations so you won't get any truly confounding `decltype` head-scratchers. It might be tedious but it'll never be particularly deep. In practice, yes, it gets annoying. But in many cases I found the annoyance was a helpful check on my thinking - "somehow I got dollars^2/week, that doesn't make sense, what did I do?"

Instead of `auto`, if the units are getting too tedious I think it's better to cast down to floats/etc and do things the old fashioned way, with helpful documentation, then cast to a unit further up the call stack, where things may be more transparent.

You can see how it works in F# here: https://learn.microsoft.com/en-us/dotnet/fsharp/language-ref... They are smaller examples, so it is hard to judge the safety-overhead from this alone. But I think they have a nice syntax, and clever ways to use generics.

> Also during debugging we might need to display some customary unit for sub expressions/watchpoints because the SI unit means nothing to humans in some fields (pressure comes to mind, some people use mm of water, mm of mercury, bars, etc.)

It seems like this would be in principle easy to add to a debugger, as long as all those units are in scope of the function. But it would be a very specific feature. And I am not sure if it's a good idea for software engineers to be debugging things in different units than the computer is currently working with - what if the bug is in the unit conversion itself? This just seems like a confusing practice.


Thank you, having a look at F#, I didn't know about its unit system.

As for the last point, it comes to mind because in electronics/mechanical engineering the teachers always tell you to check whether the numbers make sense in context, or that you should know the rough answer to a computation beforehand. That is folklore that can only happen in customary units.


> Although it makes physical sense to add heights of daily climbs, there is no sense in adding altitudes.

What if I want to compute the average altitude? This requires adding altitudes.
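One possible answer, offered as a sketch rather than as what the paper prescribes: a mean can be computed without ever adding two altitudes, by averaging height differences from a reference point and adding that back. `Altitude`, `Height` and `mean_altitude` are hypothetical names, and the loop in a constexpr function assumes C++14 or later.

```cpp
#include <cstddef>

struct Height {    // a difference between two altitudes
    double meters;
};
struct Altitude {  // an absolute point above some datum
    double meters;
};

constexpr Height operator-(Altitude a, Altitude b) { return Height{a.meters - b.meters}; }
constexpr Altitude operator+(Altitude a, Height h) { return Altitude{a.meters + h.meters}; }

// Mean of N altitudes, computed as reference + mean(offsets from reference).
// Only "altitude - altitude" and "altitude + height" are ever used.
template <std::size_t N>
constexpr Altitude mean_altitude(const Altitude (&points)[N]) {
    double offset_sum = 0.0;
    for (std::size_t i = 0; i < N; ++i) {
        offset_sum += (points[i] - points[0]).meters;
    }
    return points[0] + Height{offset_sum / N};
}

constexpr Altitude samples[] = {Altitude{100.0}, Altitude{200.0}, Altitude{300.0}};
constexpr Altitude average = mean_altitude(samples);  // 200 m
```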


Does anybody know about a unit library in C?


While this is a nice thing to have, I dread the day that codes start to require people to get with this. Yes, it's a source of bugs, but sometimes people need to accept that bugs naturally arise because of something called undecidability, and we need to stop shoehorning things into languages to make things safe. It makes sense for a host of features, but the problem with these sorts of features is that they make a lot of assumptions, including that codes with units will actually be better with units.


You think that a lot of arithmetic bugs are due to the math being undecidable?



