It should be noted that in C89, both behaviors of division and modulo are allowed: it is implementation-defined whether division results are truncated towards zero or rounded towards negative infinity. This changed in C99, which only allows truncation towards zero.
I've always considered it a mistake to refer to "%" as a "modulo" operator in C because of its ability to return negative numbers. That's not how modulo arithmetic works. C's "%" is a remainder operator.
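A quick illustration of the difference in Python (a small sketch: Python's % takes the sign of the divisor, while math.fmod follows the C remainder convention and takes the sign of the dividend):

    import math

    print(-7 % 3)            # 2    Python's %: floored, sign of the divisor
    print(7 % -3)            # -2
    print(math.fmod(-7, 3))  # -1.0 C-style remainder: sign of the dividend
    print(math.fmod(7, -3))  # 1.0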
We should be a bit careful with what one means by "modulo arithmetic" here. If we are talking about arithmetic in Z/nZ (read "integers modulo n"), then the objects being acted upon are no longer integers, but equivalence classes of integers: each class is a set of integers that are all equivalent under the relation "~", where "a ~ b" means a - b = k*n for some integer k. For example, one of the equivalence classes in Z/3Z would be the set {..., -4, -1, 2, 5, ...}.
Since every element of the set is equivalent under that relation, we typically will choose a representative of that equivalence class, e.g., [2] for the previous class. If we view the "mod" or "modulo" operator to be a mapping from the integers to a particular representative of its equivalence class, there's no reason that negative values should be excluded. [-1] refers to the same equivalence class as [2], [32], and [-7]. The details of how integers are mapped to a chosen representative seem to vary from language to language, but modular arithmetic works all the same between them.
But 2 mod 3 should still give the same as -1 mod 3. If the representative for the same equivalence class is allowed to be different depending on the input, you could just as well use the number itself.
At the risk of being snarky, I should certainly hope the output depends on the input.
On a more serious note, there are quite reasonable properties that are satisfied by keeping the sign of the first argument as the chosen representative. In particular, for any two nonzero integers a and b, we have that:
a = (a / b) * b + (a mod b)
Where division should be interpreted as integer division that rounds towards zero. If a=-1 and b=3, then (a/b) would be zero which would require (a mod b) to be -1 if we want the above to hold. Also note that other rounding choices (e.g., rounding towards negative infinity) could impose (a mod b) = 2.
So choosing a particular representative comes down to choosing what properties you want your function to have in relation to other arithmetic.
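A quick check of that identity under both conventions (a sketch: Python's divmod pairs flooring division with its %, while math.trunc and math.fmod mimic the C99 truncating pair for these small values):

    import math

    for a, b in [(7, 3), (-7, 3), (7, -3), (-7, -3)]:
        q, r = divmod(a, b)            # flooring pair (Python's convention)
        assert a == q * b + r
        qt = math.trunc(a / b)         # truncating pair (C99-style)
        rt = int(math.fmod(a, b))
        assert a == qt * b + rt
    print("identity holds under both conventions")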
Neither of them is wrong. Languages (like python) which only return a positive number are returning the least positive residue of division under the modulus. That is the most common interpretation of it in number theory, at least as far as I know.
C generally does return negative numbers when you use the % operator.
Under modulo math, -1, 9 and 19, for example, are all the same element of the modulo 10 congruence. 9 is called the "least positive residue", which is sometimes important.
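For a positive modulus, Python's % picks exactly that representative (a tiny sketch):

    print(-1 % 10, 9 % 10, 19 % 10)   # 9 9 9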
"operator" is being used as a term of art in programming language design, to describe a symbol used to indicate that an operation should be performed on one or more operands.
So the "%" symbol is an operator in C and Python, and it is standard usage to describe the operator that performs a mathematical function "X" as the "X operator".
As sibling comments have mentioned, "operator" here is being used in the programming language sense of the word, not the mathematical sense. Using the name "mod" [as in mod(a,n)] or "modulo" for the mapping that takes integers to a particular representative of its equivalence class in Z/nZ ("integers modulo n") doesn't seem like an unreasonable choice to me.
(mod N) is an operator which indicates that the arithmetic in the preceding formula or equation is understood as taking place in a modulo N congruence.
A triple equal sign is usually used for modular equations. That triple equal sign, together with the (mod N) out to the right, constitute an operator.
-1 ≡ 9 (mod 10)
Since it is written as part of the syntax of a formula or equation, and modifies its semantics, it is an operator. Just like d/dx, sigma notation and whatever not.
One thing that is still "implementation-defined" in C and C++ is the result of a right shift of a negative integer.
On pretty much all platforms, the >> operator shifts in the sign bit and does not round the result — which makes it equivalent to flooring division by a power of two.
It is consistent with the division operator only when the left-hand value is positive.
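The flooring equivalence is easy to see in Python, whose >> on negative integers is defined to agree with flooring division by a power of two, matching what a sign-extending hardware shift produces (a sketch, not a statement about any particular C implementation):

    for v in (-7, -1, 7):
        assert v >> 1 == v // 2       # arithmetic shift == flooring division
    print(-7 >> 1, -7 // 2)           # -4 -4 (a truncating division would give -3)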
However, the opposite shift (<<) is undefined if the left operand is negative or if a 1 bit would be shifted into the sign bit.
More precisely, regarding E1 << E2, it is written (in the April 2023 ISO C draft):
"If E1 has a signed type and nonnegative value, and E1 × 2ᴱ² is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined."
Thus if E1 is negative, or if the result overflows, UB.
On some CPU architectures, the operand size for some instructions can be larger than an `int`, in which case the upper part of a CPU register would become invalid on overflow instead of holding a sign extension.
There are also CPUs that do "saturating arithmetic" where overflow results in INT_MAX instead of wrapping.
Because the shift operations are arithmetically defined, and the situations that are undefined correspond to overflow. v << 1 means the same thing as v * 2 and is undefined if v * 2 is undefined.
This is mentioned in the comments on the post, which merit reading in this case.
i haven't read the rationale, but presumably the committee did this because it's what virtually all cpus do (because it's what fortran does), so it's the only thing that can be implemented efficiently on virtually any cpu with a division instruction, and so virtually all c implementations did it, and standardizing the behavior of virtually all c implementations is better than leaving it implementation-defined or standardizing a behavior that conflicts with virtually all existing implementations and can't be implemented efficiently on most high-end hardware
Too bad they made the wrong decision in not supporting the expected floating point behavior of division by zero (python throwing errors, rather than returning inf or nan).
I am soooo glad that Python throws errors there. Instead of being surprised when your math starts giving unexpected results, it blows up and makes you deal with the error. There's no situation where I'd prefer silent incorrectness.
But it is incorrect. A positive or negative number divided by zero is either +infinity or -infinity. For a positive dividend the result would be +infinity only if the divisor approached absolute 0 from the positive side, but at the time of doing the division all we have is absolute 0 as the divisor.
The usual floating point behavior of having signed zeros (+0 and -0) is not enough for this to be correct. You would also need a proper unsigned absolute 0. Then you could give +inf for division by +0, -inf for division by -0, and an error or nan for division by the absolute (unsigned) 0. As far as I understand there is no unsigned 0 in IEEE 754; 0 is +0, so in reality it is always signed.
Furthermore, it's almost never the case that you legitimately expect to divide by zero. In the case of Python, where exception handling is pervasive and idiomatic, it's far more reasonable to treat that as, well, an exception. If you're doing an operation that sometimes divides by zero, you can watch out for it and handle it appropriately. When you're not expecting to be doing that calculation, it's an error.
You're right about the unexpected results in that case, but I'd bet good money that accidentally dividing by zero in code like `avg=sum(lst)/len(lst)` is far more common in standard Python code than dividing very small numbers.
in https://stackoverflow.com/questions/78064239/does-python-not... it seems like ieee 754 does permit throwing errors on division by zero, and although it isn't the default behavior on any hardware i've used, the name of unix's division-by-zero exception signal strongly suggests that it's also the default behavior on the (non-ieee-754-compliant) pdp-11. https://stackoverflow.com/questions/12954193/why-does-divisi... makes the opposite assertion, though that ieee 754 does not permit division by zero traps. i'm not sure what to believe
people commonly opt out of python's behavior in this case by using numpy
python's decision is maybe suboptimal for efficient compilation, but it has a lot of decisions like that
edit: downthread https://news.ycombinator.com/item?id=39540416 pclmulqdq clarifies that in fact trapping on division by zero is not only ieee-754-compliant but (unlike flooring integer division!) efficiently implementable on common high-performance architectures
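A minimal sketch of the numpy opt-out mentioned above (assumes numpy is installed; numpy applies the IEEE 754 defaults, producing inf/nan and, by default, emitting a RuntimeWarning rather than raising):

    import numpy as np

    a = np.array([1.0, -1.0, 0.0])
    print(a / 0.0)    # [ inf -inf  nan]  with divide-by-zero / invalid warnings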
IEEE 754 has an infinity, so division by zero isn't the catastrophic thing that it is in integer. However, division by zero is still an exception as defined by the 754 standard.
What hardware does with these exceptions is a separate question, though. Some CPUs will swallow them for performance.
there's a whole set of terminology about 'exceptions', 'traps', and 'signaling' in ieee 754 that i don't really understand, but in particular the second thread i linked there seems to claim that an ieee 754 'exception' is almost, but not quite, completely unlike a python 'exception', which it apparently calls a 'trap' (as do many cpu architectures):
> When exceptional situations need attention, they can be examined immediately via traps or at a convenient time via status flags. Traps can be used to stop a program, but unrecoverable situations are extremely rare.
you know quite a bit about cpus. do you know of any ieee-754 cpus that trap on floating-point division by zero by default? is there a way to configure commonly-used cpus to do so?
I have been involved in the IEEE 754 committee on and off, so I will say that exceptions in the current standard are purposely sort of ambiguous in order to allow parallel and hardware implementations to also be standards-compliant (rather than just serial CPUs), and the exception-handling discussion has already been raised again for 754-2029. Exceptions as defined in the standard are the general form of error handling for IEEE 754.
Signaling NaNs (sNaN) are the only form of "signaling" that I am aware of, and sNaNs are just set up to trigger an "invalid operation" exception. I believe the idea is that an operation raises an sNaN when it creates a NaN, and a subsequent operation can catch it and issue the exception so you can figure out exactly where the calculation went wrong. If you ignore that exception, the sNaN becomes a quiet NaN (qNaN) and propagates.
The idea of traps has been written out of the standard in favor of exceptions. Traps had a defined behavior, which is what you would expect as an "exception" in most programming languages. A sibling comment indicated how to enable the FP exceptions as CPU interrupts, which become OS traps, and then become exceptions in Python/Java/C++ (any language with exceptions). Totally not confusing terminology at all.
I don't know off the top of my head of any architecture that treats all the exceptions in this way by default. One of the other exceptions is "inexact" which comes up on many calculations. I don't think you would want that to generally cause a CPU interrupt. Most implementations dump the exceptions as flags into a status register and let you see what happened if you care.
// Demonstrate C99/glibc 2.2+ feenableexcept
#define _GNU_SOURCE
#include <math.h>
#include <fenv.h>
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char **argv)
{
if (argc != 3) {
fprintf(stderr, "Usage: %s a b # floating-point divide a by b\n", argv[0]);
return 1;
}
feenableexcept(FE_DIVBYZERO);
double a = atof(argv[1]), b = atof(argv[2]);
printf("%g ÷ %g = %g\n", a, b, a/b);
return 0;
}
is the machine still ieee-754-compliant after you call feenableexcept? i'm guessing that if it weren't, intel wouldn't have defined the architecture to have this capability, given the historically close relationship between 8087 and ieee-754
IEEE 754 has infinity and it has signed zeros, but it doesn't have an absolute zero. That is not enough for the math to be entirely correct. I wonder why it was done like this.
+/-0 should generally be thought of as 0. It's just "0 from the right" and "0 from the left." The reason it has +/-0 is for a few specific applications:
- Some people really do want 1/(-x) to be negative infinity when a positive x reaches underflow. This helps with some kinds of numerical simulations since it can preserve the sign bit through underflow.
- Interval arithmetic implementations need to be able to disambiguate closed and open intervals at 0.
If you turn on fast math mode, you can also essentially disable the special treatment of the zeros.
But it is not possible as there is no unsigned/absolute zero: 0 means +0. I guess it makes things much simpler, but IMHO it makes it a bit less useful and strange.
I think if you treat infinity as a special case of NaN (which it basically is for many math systems), you get the behavior you want. A lot of people think that +/-0 represent +/-epsilon, but they really don't. +0 is the default 0, and -0 is what you get when you underflow specifically from the left.
you're in luck! that's such a frequently asked question that kahan has written several papers about it, and there's a wikipedia page too: https://en.wikipedia.org/wiki/Signed_zero
I confess I have grown increasingly in favor of how common lisp does this. The basic `/` creates rationals. And "floor" is a multiple value return where the first value is the floor and the second is the remainder.
Granted, getting used to multiple value calls takes getting used to. I think I like it, but I also think I emphatically did not like it initially.
Scheme in the form of R7RS does something similar, though since failure to capture all returned values is an error (unlike in Common Lisp), there are more functions involved. It also offers a choice between truncation and floor behavior, so there are `truncate-quotient` and `truncate-remainder`, their `floor-` counterparts, and `truncate/` and `floor/` for returning both values.
Continuing in fairness, lisp is a bit more involved, here. If you do `(/ 1.0 3)`, you do not get a rational. Similarly, any division that is not integer/rational there will get treated mostly as expected.
Basically, it seems as soon as you introduce a floating point number, it stays there. Which is roughly what I would expect.
Yeah, I think this is what I really didn't like about it at the start. Learning that you probably want `floor` if you want integer results took longer than feels natural.
Note the top comment by “ark” - there’s really no perfect solution here.
In the floating-point case, you have to choose between negative remainders and potentially inexact results. And you definitely want integer division to work the same as float division.
Python's floating point flooring division operator also has a pitfall: it gives a correctly rounded result as if you computed (a / b) with infinite precision before flooring. This can lead to the following:
>>> 1 / 0.1
10.0
>>> 1 // 0.1
9.0
This is because 0.1 is in actuality the floating point value 0.1000000000000000055511151231257827021181583404541015625, and thus 1 divided by it is ever so slightly smaller than 10. Nevertheless, fpround(1 / fpround(1 / 10)) = 10 exactly.
I found out about this recently because in Polars I defined a // b for floats to be (a / b).floor(), which does return 10 for this computation. Since Python's correctly-rounded division is rather expensive, I chose to stick to this (more context: https://github.com/pola-rs/polars/issues/14596#issuecomment-...).
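The difference is easy to reproduce (a small sketch contrasting floor-after-divide, as in the Polars definition described above, with Python's correctly rounded //):

    import math

    print(1 / 0.1)               # 10.0  the true quotient rounds up to 10.0
    print(math.floor(1 / 0.1))   # 10    floor of the already-rounded quotient
    print(1 // 0.1)              # 9.0   floor of the infinitely precise quotient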
In an abstract sense, floating-point division is a fundamentally different operation than integer division. There is not generally expected to be a remainder from floating-point division aside from the 0.5 ULP that will be rounded away.
If you use it without considering rounding modes carefully and then round to integer, you can get some funny results.
I mean "inexact results" is essentially float's life motto.
That said, I don't really see why you would necessarily want float and integer division to behave the same. They're completely different types used for completely different things. Pick the one that is appropriate for your use case.
(It seems like abs(a % b) <= abs(b / 2) might be the right choice for floats which is pretty clearly not what you want for integers. I also just learned that integer division // can be applied to floats in Python, but the result is not an integer, for some reason?)
to people who use floating-point math seriously, it's very important for floating-point results to be predictably inexact; if they aren't, floating point is at best useless and usually harmful
Sure, but the inexactness of this modulus operation would not be any more unpredictable than all other kinds of float inexactness. Unless you're talking about the case where you don't know whether your operands are ints or floats. But in that case, there are tons of other unpredictabilities. For example, a + b will round when adding (sufficiently large) integers-represented-as-floats while it will never round for integers. So if you take floating-point math seriously, knowing that you're actually dealing with floats is the first step.
it's true that it would be deterministic, but it would behave differently from the ieee 754 standard, which is at least surprising, which is another sense of the word 'unpredictable'. admittedly, floating-point math that is inexact in a surprising way is not necessarily useless, and to someone who isn't deep into numerical analysis, all floating-point math is inexact in surprising ways
still, it would have real pitfalls if numerical algorithms that give right answers in every other programming language gave subtly wrong answers in python (raising a division-by-zero exception is less troublesome)
> i also didn't know python supported // on floats
Like most surprising features in python, it would be terribly annoying if it didn't. Especially since you couldn't make sure the argument to the function you're writing wasn't a float. At that time, anyway.
This does something different. a // b tries to "fit" b into a as many times as possible. For example, 1.0 // 0.4 == 2.0 because 0.4 fits twice into 1.0 (with a remainder of 0.2). Though as the result should always be an integer, I'd argue that the result should actually be 2, not 2.0. But alas, it's not.
With your change, you calculate 1 // 0 instead, which crashes.
That said, I think checking isinstance(x, float) was always possible. (And nowadays, you can put a type annotation.)
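To make that concrete (a small sketch; the remainder is only approximately 0.2 because 0.4 itself is not exactly representable as a float):

    q, r = divmod(1.0, 0.4)
    print(q, r)    # 2.0 0.19999999999999996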
Assuming a and b are integers with b > 0, the following answers are exact (div_floor, div_trunc, and div_round in fact also work for negative b; div_ceil as written does not):
def div_floor(a, b):
return a // b
def div_ceil(a, b):
return (a + b - 1) // b
def div_trunc(a, b):
return a // b if (a < 0) == (b < 0) else -(-a // b)
def div_round(a, b):
return (2*a + b) // (2*b)
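A quick sanity check of these helpers (a sketch that assumes the definitions above and positive b, comparing against exact rational arithmetic):

    import math
    from fractions import Fraction

    for a in range(-7, 8):
        for b in (1, 2, 3, 7):
            q = Fraction(a, b)
            assert div_floor(a, b) == math.floor(q)
            assert div_ceil(a, b) == math.ceil(q)
            assert div_trunc(a, b) == math.trunc(q)
            assert div_round(a, b) == math.floor(q + Fraction(1, 2))  # halves round up
    print("ok")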
Because it's a float and not an int; it's strictly talking about integers here, not floats. Converting a float to int truncates towards zero, discarding the fractional part, afaik, but I could be wrong here (e.g. int(1.9) == 1, and int(-1.9) == -1)
edit: -3//2 == -2 for instance, since it's strictly integer division
>The submission is a tweet which doesn't really have a title, so the submitter was forced to make up one. Unfortunately the assertion in the chosen title is not correct. In Python the mod operator returns a number with the same sign as the second argument:
>Since its 1983 standard (Forth-83), Forth has implemented floored division as standard. Interestingly, almost all processor architectures natively implement symmetric division.
>What is the difference between the two types? In floored division, the quotient is truncated toward minus infinity (remember, this is integer division we’re talking about). In symmetric division, the quotient is truncated toward zero, which means that depending on the sign of the dividend, the quotient can be truncated in different directions. This is the source of its evil.
>I’ve thought about this a lot and have come to the conclusion that symmetric division should be considered harmful.
>There are two reasons that I think this: symmetric division yields results different from arithmetic right shifts (which floor), and both the quotient and remainder have singularities around zero.
>If you’re interested in the (gory) details, read on. [...]
Did you read the article? It's about what happens in negative numbers. Of course everyone agrees that 7/2=3 in integer division, but -7/2 is less obvious.
Changing well known behavior for something no one is really going to need. The justification makes sense, but it breaks convention and the relationship with modulo doesn't need to hold for negative numbers.
I strongly disagree. I would estimate that in 90% of cases where I use modulo in languages that truncate (instead of flooring), I write (a % b + b) % b, or something similar, just to get the right behaviour. The exceptional cases are those where I can convince myself that negative numbers simply won't come up. (It's never because I actually want the other behaviour for negative numbers.)
- When using modulo to access an array cyclically. (You might get lucky that your language allows using negative numbers to index from the back. In that case, both conventions work.)
- When lowering the resolution of integers. If you round to zero, you get strange artifacts around zero, because -b+1, ..., -1, 0, 1, ..., b-1 all go to zero when dividing by b. That's 2b-1 numbers. For every other integer k, there are only b numbers (namely bk, bk+1, ..., bk+b-1).
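A small sketch of both points (hypothetical helper names; trunc_div just mimics C99-style truncation for small values):

    def trunc_div(a, b):
        return int(a / b)              # rounds toward zero, like C99 (small values only)

    b = 3
    print([trunc_div(a, b) for a in range(-6, 7)])
    # [-2, -1, -1, -1, 0, 0, 0, 0, 0, 1, 1, 1, 2]   five inputs collapse onto 0
    print([a // b for a in range(-6, 7)])
    # [-2, -2, -2, -1, -1, -1, 0, 0, 0, 1, 1, 1, 2]  every bucket gets exactly three

    # The usual fix-up written in truncating languages, spelled here in Python:
    def floor_mod(a, b):
        return (a % b + b) % b         # same as plain a % b in Python (b > 0)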
I have never seen a case where truncation was the right thing to do. (When dealing with integers. Floats are different, of course, but they are not what this is about.)
A common approach (e.g. Cobol, Ada, Common Lisp, Haskell, Clojure, MATLAB, Julia, Kotlin) seems to be to provide two operators: One that uses truncated division, one that uses floored division. By convention, rem truncates, and mod floors.
> I have never seen a case where truncation was the right thing to do.
Splitting a quantity into units of differing orders of magnitude. For example, −144 minutes is −2 hours and −24 minutes, not −3 hours and 36 minutes. This is about the only case I know of, though.
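For example (a sketch spelling out truncating division inline via math.trunc/math.fmod, versus Python's flooring // and %):

    import math

    minutes = -144
    # Truncating division/remainder keep the parts' signs consistent with the total:
    print(int(minutes / 60), int(math.fmod(minutes, 60)))   # -2 -24
    # Flooring division gives the other decomposition:
    print(minutes // 60, minutes % 60)                      # -3 36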
In generic numerics work there are two broad cases when you have a sea of sample points (e.g. a 2D plane with samples taken all over) and you cast a mesh across the sea to divide the work into cells:

1. Samples have new coords relative to the cell's left edge and bottom edge. We're interested in sample distance from a "classic" first quadrant axis.

2. Samples are weighted by distance from the cell centre. We're pooling, combining channel layers, making representative central 'blobs'; we're interested in cell sample positions relative to the centre point of the cell.
Your example is a good one, and falls in the case 2 grouping.
This, of course, generalises to 3D and higher dimensions.
Oh, that's a somewhat reasonable one. (Though I would expect that you actually want −(2 hours and 24 minutes) in many cases instead, thus needing to handle the negative case separately anyway.)
Quantify "well known". Historically enough variation existed in this area [1], and C only happened to copy FORTRAN's behavior for the sake of compatibility.
Fortran is old – 1958 onwards. It has precedence here, though at what point it separated the two behaviours into mod and modulo functions I don’t know.
Edit: From what I can tell, standardised in Fortran 90, presumably older than that.
> Changing well known behavior for something no one is really going to need.
On the contrary I can't imagine when and why anybody would want truncation. That's just a side effect of the used algorithm and not something that actually makes much (any?) sense.
- given a number t of seconds after the epoch, what time of day does it represent? using python's definition, (t + tz) % 86400 (checked in the sketch after this list)
- given an offset d between two pixels in a pixel buffer organized into sequential lines, what is the x component of the offset? using python's definition, d % width is right when the answer is positive, width - (d % width) when the answer is negative, so you could write d % width - d > p0x ? d % width : width - d % width. this gets more complicated with the fortran definition, not simpler
edit: the correct expression is (d % width if p0x + d % width < width else d % width - width) or in c-like syntax (p0x + d % width < width ? d % width : d % width - width). see http://canonical.org/~kragen/sw/dev3/modpix.py
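For the time-of-day case this is easy to check (a sketch; tz is an offset in seconds, and t may be negative for times before the epoch):

    t, tz = -3600, 0                 # one hour before the epoch, UTC
    print((t + tz) % 86400)          # 82800, i.e. 23:00:00, which is correct
    # A truncating % (as in C) would give -3600, which is not a time of day.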