I don't think this is a naming issue at all. In the provided example, 'Accuracy' is the correct name for the parameter, as that's what the parameter represents: accuracy. The fact that accuracy should be given as a value in an interval from 0 to 1 should be a property of the parameter's type. In other words, the parameter should not be a float, but a more constrained type that allows floats only in [0,1].
EDIT: Some of you asked what about languages that don't support such more constrained types, so to answer all of you here: different languages have different capabilities, of course, so while some may make what I proposed trivial, in others it would be almost or literally impossible. However, I believe most of the more popular languages support creation of custom data types?
So the idea (for those languages at least) is quite simple - hold the value as a float, but wrap it in a custom data type that makes sure the value stays within bounds through accessor methods.
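For instance, a minimal C++ sketch of that idea (the names and the choice of throwing on bad input are just illustrative, not the one true way):

// Hypothetical wrapper: holds a float and keeps it inside [0, 1] via its accessors.
#include <stdexcept>

class UnitInterval {
public:
    explicit UnitInterval(float v) : value_(checked(v)) {}
    float get() const { return value_; }          // always in [0, 1]
    void set(float v) { value_ = checked(v); }
private:
    static float checked(float v) {
        if (v < 0.0f || v > 1.0f)
            throw std::out_of_range("expected a value in [0, 1]");
        return v;
    }
    float value_;
};

// Usage: FuncName(UnitInterval(0.8f));   // passing 3.5f fails loudly at construction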
velocity() simply returns the first argument, after doing validity checking based on the second argument.
You probably couldn't reasonably use this everywhere that you would use actual constrained types in a language that has them, but you could probably catch a lot of errors just using them in initializers.
This got me thinking: What about a situation where the accuracy is given in a real-life unit. For example, the accuracy of a GPS measurement, given in meters. I've sometimes used names like 'accuracyInMeters' to represent this, but it felt a bit cumbersome.
Edit: Thinking more about it, I guess you could typealias Float to Meters, or something like that, but that also feels weird to me.
More complex type systems absolutely support asserting the units of a value in the type system. For example, here's an implementation of SI types in C++: https://github.com/bernedom/SI
I've used "fraction" for this purpose .. but that isn't general enough. In fact a convention I've used for nearly 2 decades has been varName_unit .. where the part after the underscore (with the preceding part being camel case) indicates the unit of the value. So (x_frac, y_frac) are normalized screen coordinates whereas (x_px, y_px) would be pixel unit coordinates. Others are like freq_hz, duration_secs and so on.
Another thing you can do is define a "METER" constant equal to 1. You can then call your function like this: func(1.5 * METER), and when you need a number of meters, you can do "accuracy / METER". The multiplication and division should be optimized away.
The good thing about that is that you can specify the units you want; for example you can set FOOT to 0.3048 and do "5. * FOOT", and get your result back in centimeters by doing "accuracy / CENTIMETER". The last conversion is not free if the internal representation is in meters, but at least you can do it and it is readable.
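For example, a hedged sketch of that constant trick in C++ (assuming metres as the internal unit, so METER == 1):

constexpr double METER      = 1.0;    // internal unit
constexpr double CENTIMETER = 0.01;
constexpr double FOOT       = 0.3048;

// void func(double accuracy);   // expects metres internally
//
// Callers state their units explicitly:
//   func(1.5 * METER);
//   func(5.0 * FOOT);
// and read results back in whatever unit they like:
//   double cm = accuracy / CENTIMETER;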
If you are going to use such distances a lot, at least in C++, you can get a bit of help from the type system. Define a "distance" class with operator overloads, constants and convenience functions to enforce consistent units. Again, the optimizer should make it no more costly than using raw floats, if that's what you decide to use as an internal representation.
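A rough C++ sketch of that kind of distance type (just an illustration; a real library like the SI one linked above does far more):

// Thin distance type; the double inside is always metres.
struct Distance {
    double meters;
};

constexpr Distance operator"" _m(long double v)  { return {static_cast<double>(v)}; }
constexpr Distance operator"" _ft(long double v) { return {static_cast<double>(v) * 0.3048}; }

constexpr Distance operator+(Distance a, Distance b) { return {a.meters + b.meters}; }
constexpr Distance operator*(Distance d, double k)   { return {d.meters * k}; }

// void setAccuracy(Distance d);   // callers write setAccuracy(1.5_m) or setAccuracy(5.0_ft)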
Some languages provide more than just an alias. Eg Haskell lets you wrap your Float in a 'newtype' like 'GpsInMeters'.
The newtype wrapper doesn't show up at runtime, only at compile time. It can be set up in such a way that the compiler complains about adding GpsInMeters to GpsInMiles naively.
> While true, if your language doesn’t support such a type
You'd be surprised where the support is. In C#, you would declare a struct type with one read-only field of type double, and range validation (0 <= x <= 1) in the constructor.
Yes there's a bit of boilerplate - especially since you might want to override equality, cast operators etc. But there is support. And with a struct, not much overhead to it.
I don't think runtime validation is that special. You can bend most languages into this pattern one way or another. The real deal is having an actual compile-time-checked type that resembles a primitive.
Actually what I want is just the reading experience of seeing
"public Customer GetById(CustomerId id)" instead of "public Customer GetById(string id)" when only some strings (e.g. 64 chars A-Z and 1-9) are valid customer ids.
Compile-time validation would be ideal, but validation at the edges, well before that method, is good enough.
The main issue with techniques such as this, which are certainly easy to do, is that if it’s not in the type system and therefore not checked at compile time, you pay a run time cost for these abstractions.
Great that you like typed languages, and ones that allow for such constrained/dependent typing as well.
It seems disingenuous to me to suggest that anyone using other languages does not have this problem. And really, there are quite a few languages that do not have this form of typing, and even some reasons for a language not to want this form of typing.
So please, don't answer a question by saying "your question is wrong"; it is condescending and unhelpful.
Equally condescending is saying "It's great that you like X, but I don't so I'm going to ignore the broader point of your argument."
The point remains that the fact a given parameter's valid values are [0,1] is not a function of its name. You can check the values within the method and enter various error states depending on the exact business rules.
"your question is wrong" is indeed unhelpful, especially as a direct response to someone asking a question.
"here is what seems like a better question" is helpful, especially in a discussion forum separate from the original Q/A.
But if "here is what seems like a better question" is the _only_ response or drowns out direct responses, then thats still frustrating.
> condescending
As a man who sometimes lacks knowledge about things, when I ask a question, please please please err on the side of condescending to me rather than staying silent. (No, I don't know how you should remember my preferences separately from the preferences of any other human)
I'm genuinely sorry if I came across as condescending, that was not my intention at all.
I merely wanted to point out that, in my opinion, this property should be reflected in the parameter's type, rather than its name. Just like, if we wanted a parameter that should only be a whole number, we wouldn't declare it as a float, name it "MyVariableInteger", and hope that the callers would only send integers.
You mentioned that there are quite a few languages that do not permit what I proposed, would you mind specifying which ones exactly? The only one that comes to my mind is assembly?
So, then the user calling the library with foo(3.5) will get a runtime error (or, ok, maybe even a compile time error).
To avoid that, you need to document that the value should be between 0 and 1, and you could do that with a comment line (which the OP wanted to avoid), or by naming the variable or type appropriately. And that takes us back to the original question. (Whether the concept is expressed in the parameter name or the parameter type (and its name) is secondary.)
> So, then the user calling the library with foo(3.5) will get a runtime error (or, ok, maybe even a compile time error).
I'm not sure I understand this. See below, but the larger point here is that the type can never lie -- names can and often do because there's no checking on names.
I think what is being proposed is something similar to
newtype Accuracy = Accuracy Float
and then to have the only(!) way to construct such a value be a function
mkAccuracy :: Float -> Maybe Accuracy
which does the range checking, failing if outside the allowable range.
Any functions which needs this Accuracy parameter then just take a parameter of that type.
That way you a) only have to do the check at the 'edges' of your program (e.g. when reading config files or user input), and b) ensure that functions that take an Accuracy parameter never fail because of out-of-range values.
It's still a runtime check, sure, but by having a strong type instead of just Float, you ensure that you only need that checking at the I/O edges of your program and have absolute assurance that any Accuracy handed to a function will always be in range.
You can do a similar thing in e.g. C with a struct, but unfortunately I don't think you can hide the definition such that it's impossible to build an accuracy_t without going through a "blessed" constructor function. I guess you could do something with a struct containing a void ptr where only the implementation translation unit knows the true type, but for such a "trivial" case it's a lot of overhead, both code-wise and because it would require heap allocations.
Your solution is the ideal and safest one, although in the interest of maximum flexibility, since the goal here seems more about documentation than prescription, it could also be as simple as creating a type alias. In C for example a simple `#define UnitInterval float`, and then actual usage would be `function FuncName(UnitInterval accuracy)`. That accomplishes conveying both the meaning of the value (it represents accuracy) and the valid value range (assuming of course that UnitInterval is understood to be a float in the range of 0 to 1).
Having proper compile time (or runtime if compile time isn't feasible) checks is of course the better solution, but not always practical either because of lack of support in the desired language, or rarely because of performance considerations.
That's fair, but I do personally have a stance that compiler-checked documentation is the ideal documentation because it can never drift from the code. (EDIT: I should add: It should never be the ONLY documentation! Examples, etc. matter a lot!)
There's a place for type aliases, but IMO that place is shrinking in most languages that support them, e.g. Haskell. With DerivingVia, newtypes are extremely low-cost. Type aliases can be useful for abbreviation, but for adding 'semantics' for the reader/programmer... not so much. Again, IMO. I realize this is not objective truth or anything.
Of course, if you don't have newtypes or similarly low-cost abstractions, then the valuation shifts a lot.
EDIT: Another example: Scala supports type aliases, but it's very rare to see any usage outside of the 'abbreviation' use case where you have abstract types and just want to make a few of the type parameters concrete.
Sure, such other languages have the problem too, it's just that they are missing the best solution. It's possible for a solution to be simultaneously bad and the best available.
In languages with operator overloading you can make NormalizedFloat a proper class with asserts in debug builds, and change it to an alias of float in release builds.
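A hedged sketch of that debug/release switch in C++ (the names are made up, and keying off NDEBUG is just one way to do it):

#include <cassert>

#ifdef NDEBUG
using NormalizedFloat = float;     // release: plain float, zero overhead
#else
struct NormalizedFloat {           // debug: same size as a float, but checked
    NormalizedFloat(float v = 0.0f) : value(v) { assert(v >= 0.0f && v <= 1.0f); }
    operator float() const { return value; }   // implicit conversion keeps call sites unchanged
    float value;
};
#endif

// float blend(NormalizedFloat t);   // blend(1.5f) trips the assert in debug builds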
Similarly, I wonder why geometry libraries don't define separate Point and Vector classes; they almost always use a Vector class for both vectors and points.
I understand the math checks out, and sometimes you want to add or multiply points, for example:
Pmid = (P0 + P1) / 2
But you could cast in such instances:
Pmid = (P0 + (Vector)P1)/ 2
And the distinction would surely catch some errors.
Point - Point = Vector
Point + Point = ERROR
Vector +/- Vector = Vector
Point +/- Vector = Point
Point * scalar = ERROR
Vector * scalar = Vector
Point */x Point = ERROR
Vector * Vector = scalar
Vector x Vector = Vector
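A hedged C++ sketch of a few of those rules (real geometry libraries handle this with more care, but the idea is the same):

struct Vector { double x, y; };
struct Point  { double x, y; };

Vector operator-(Point a, Point b)   { return {a.x - b.x, a.y - b.y}; }   // Point - Point = Vector
Point  operator+(Point p, Vector v)  { return {p.x + v.x, p.y + v.y}; }   // Point + Vector = Point
Vector operator*(Vector v, double k) { return {v.x * k, v.y * k}; }       // Vector * scalar = Vector

Point operator+(Point, Point)  = delete;   // Point + Point = ERROR
Point operator*(Point, double) = delete;   // Point * scalar = ERROR

// Point mid = p0 + (p1 - p0) * 0.5;   // compiles: the midpoint example from above
// Point bad = p0 + p1;                // rejected by the compiler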
I work in games where these values are extremely common and 'accuracy' wouldn't be very descriptive in a lot of circumstances: explosion radius falloff damage, water flow strength, positional/rotational lerps or easing, and more.
I wish I were commenting here with an answer, but I don't have one. "brightness01" is a common naming convention for values of this type in computer graphics programming, but niche enough that it got raised in review comments by another gameplay programmer.
That's a very good observation. We could still use a (new) term for this common type. Maybe floatbit, softbit, qubit(sic), pot, unitfloat, unit01 or just unitinterval as suggested?
This begs an interesting tangential question: which programming languages allow such restricted intervals as types?
type percentage:=int[0,100]
type hexdigit:=int[0,15]
…
Since this might be overkill, sane programming languages might instead encourage assert statements inside the functions.
I think this is right, but it's still IMO basically a natural language semantics issue. For instance in haskell (which has a pretty advanced static type system), I would still probably be satisfied with:
-- A float between 0 and 1, inclusive.
type UnitInterval = Float
foo :: UnitInterval -> SomeResultPresumably
foo accuracy = ...
i.e. I think the essential problem in the SO question is solved, even though we have no additional type safety.
A language without type synonyms could do just as well with CPP defines
Looks like it’s actually possible to string something like this together in Python; custom types are of course supported, and you can write a generic validation function that looks for your function’s type signature and then asserts that every UnitInterval variable is within the specified bounds.
You’d have to decorate/call manually in your functions so it’s not watertight, but at least it’s DRY.
Nothing wrong with the name ZeroToOneInclusive. Seems like a great type to have around, and a great name for it. UnitFloat or UnitIntervalFloat or other ideas ITT are cuter but not much clearer.
It's a succinct way to say "No, not types that will automatically work as primitive types (which normally the variable passed for 0 to 1 would be), and that will work with numeric operators".
Or in other words, a succinct way to say "Technically yes, but practically useless, so no".
In Elm (and many other languages, I assume, I'm just most familiar with Elm) there's a pattern called "opaque data type". [0] You make a file that contains the type and its constructor but you don't export the constructor. You only export the getter and setter methods. This ensures that if you properly police the methods of that one short file, everywhere else in your program that the type is used is guaranteed by the type system to have a number between zero and one.
-- BetweenZeroAndOne.elm
module BetweenZeroAndOne exposing (get, set)
type BetweenZeroAndOne
= BetweenZeroAndOne Float
set : Float -> BetweenZeroAndOne
set value = BetweenZeroAndOne (Basics.clamp 0.0 1.0 value)
get : BetweenZeroAndOne -> Float
get (BetweenZeroAndOne value) = value
You would just make the constructor return a possible error if it's not in range, or maybe some specialty constructors that may clamp it into range for you so they always succeed.
It's the same question of, how can you convert a string to a Regexp type if not all strings are valid Regexps?
Does anyone know of a good site for these kinds of questions, other than the English Stack Exchange?
I frequently run into programming-related naming issues (who doesn't, eh?). But I struggle to find accurate search terms to help answer them... and the results are usually drowned out by non-technical language Q&As.
E.g. I was trying to name a table yesterday that would store events related to boxes, pallets, containers and vessels and was looking for a generic name to group them, e.g. goods_events, entity_events, object_events, domain_object_events etc. but I had no idea how to phrase my question and not get a bunch of junk back
Would be awesome if you could tweak some suggestion algo to be trained on repos specific to your domain and have it spit out suggestions based on human-language questions, e.g. GPT-3 but focused on some domain.
It is the English Language and Usage Stack Exchange (for linguists and etymologists), as distinct from the English Language Learners Stack Exchange (for people learning English), and indeed from the Computer Science Stack Exchange (for computer scientists) and the Software Engineering Stack Exchange (for software engineers).
So the question to ask oneself is whether one wants answers to a programming-language question on how to name a variable from an audience of linguists or from an audience of software engineers. (-:
Woo, that's a tough one. Even as a native speaker, I'd be hard-pressed to find one word in English that describes all these similar but different terms.
I looked up some of these in a dictionary for definitions and synonyms. OK, the word "holder" seems to be the most general term that includes all these types of objects.
So I'd name it "holder_events_table", with a column "holder_type" being box, pallet, etc.
This issue of finding precise naming is related to "ontologies", how to establish an organization of agreed-upon terms to classify objects in the world. I agree with you, it would be valuable if there was a community-developed reference where we could search for the most appropriate names of things.
yeah I ended up leaning towards something similar, but avoided the whole issue of naming the "thing" by instead naming it based on the class of events contained in the table...ended up calling it tracking_events, and making it reference the "thing" as trackable_type,trackable_id (ala polymorphic relation in Rails world).
but yeah, naming things...wonder how much time I've spent pondering over names in my career (probably too much haha)
If you aren't leaning on the English Language at large, then you should ask the community of people who will use your software. Who are your stakeholders?
If you don't have a relationship with your stakeholders because your company has paid lots of money to become Agile[1][2], then you are indeed in trouble.
Names of types usually refer to individual members thereof: number, integer, vector, string, list… A variable of a UnitInterval type would be one that holds unit intervals. This is no fix.
"Belongs to the unit interval" is correct, but I like to think most of us have gotten enough exposure to cognitive science to know that the brain prefers to have a word or adjective for everything, and that's the problem here.
Also, Unit Interval is only an answer to this exact question, not a class of questions to which this one belongs. If the range was [0,5] then you couldn't even shorten it with Unit Interval.
IEEE 754 sort-of has that as a part of a float, too (https://en.wikipedia.org/wiki/Significand), and people use the term “mantissa” for it, but it has to special-case 0.
⇒ barring better suggestions, I think I would stretch the definition of mantissa a bit further.
Thing is, whilst a mantissa is indeed between 0 and 1, not every number between 0 and 1 is a mantissa. Specifically, a mantissa suggests the existence of a significand.
One problem with the word normalised is that it implies that there is some true original value from before normalisation that isn't necessarily between 0 and 1. Of course if that's true then the word is ideal, but if not then it's confusing.
Genuine question, I'm no math expert: if you're using the value by multiplying it with some other value, does that not imply it qualifies as normalised?
i.e. say you have a width of 500 and you want to move half way across so you have this value of 0.5 to get 250. By dividing 250 by 500, aren't we in fact normalising it?
> if you're using the value by multiplying it with some other value, does that not imply it qualifies as normalised
Certainly, "normalisation" can mean something more general than "rescale to the interval [0,1]".
For example, "rescaling to the interval [0, 20]" might make sense in some contexts and would usually count as a type of normalisation, and would still involve muliplying by a number. But that would be multiplying by (20 / max possible value) rather than the typical case of multiplying by (1 / max possible value).
So normalisation can mean something more general than the most usual common case, but it's not just any old "multiplying it with some other value". It has to specifically for the purpose of rescaling to some fixed, more useful/sensible range.
> say you have a width of 500 and you want to move half way across so you have this value of 0.5 to get 250
This is a great example of multiplication that isn't normalisation! You've multiplied by 0.5 to get the midpoint, but that process isn't normalisation because you've not ended up with some more sensible range. You started with [0,infinity) (the set of all possible widths) and ended up with that same infinite range.
> By dividing 250 by 500
Hang on, I'm confused about your example: are you asking about 500*0.5 (=250) or 250/500 (=0.5)? If it's the latter then that's normalisation, and not even something fancy or general but classic linear rescaling to the interval [0,1]. Yes, that's certainly normalisation.
----
Going back to my original comment: I was talking about values that were in the range 0 to 1 that hadn't reached that range by being multiplied by anything at all; they were just naturally in that range to begin with. For example, a probability would fit this bill.
And so, even though we aren't necessarily calculating the normal in the second equation, we can still name our variable a 'normal', as rearranging proves that is indeed what the value is.
I think generally normalized quantities maintain their original units, whereas in your ratio case the unit is dropped. Units in the abstract sense, I guess, such as "this is a measurement along this vector" or "this is a ratio of any vector".
I think it's the opposite way round: normalisation results in units being dropped.
The typical, most common, definition of normalisation is value ÷ max possible value, giving a result in [0,1]. (More general definitions of normalisation exist e.g. if you rescale so the standard deviation is a fixed value, or even use non linear rescaling, that could count, but never mind all that.) The parent comment's example of "position along width ÷ total width" certainly fits that bill.
Whenever you divide something by the max of that something, the max is going to have the same units as the original value and you're bound to end up cancelling them. Or put another way, if you rescale 10cm into 0.5, it's certainly not 0.5cm so the units are either dropped or, at least, changed e.g. you could argue you've got 0.5x where x is the unit equal to 20cm.
Hmm when I think of 'normalizing' I don't think of dividing by a max at all - in my experience it is more taking a quantity (perhaps in English measurement) and transferring to a more 'standard' unit (say metric).
In general I don't think normalization always includes a sense of being in a bounded interval. From a mathematical perspective you could perhaps say normalization is achieved by multiplying your quantity by a 1D operator. You can't change the dimensionality this way, but are certainly changing 'units' a la mm in the x direction -> m in the x direction for example. I guess what I'm saying is that 'normalized' and the like are not the best fit for the SO question.
If I could take my own shot at the SO challenge from a mathematical perspective, it would perhaps be sigmoid. Where the result of a normalization function takes a 1D value and maps it to a similar 1D value, the sigmoid takes a 1D value and maps it to a similar 1D value between (0, 1). So if I want to drop the previous information and only keep the resulting map, I can say 'this is my sigmoided value', i.e. it is impossible for it to be outside of that range. Unfortunately sigmoid also connotes a differentiable curve, which is extraneous information...
Of course the literal meaning of "normalisation" is to make more "normal", and that can mean almost anything at all. Even the Wikipedia article you linked to starts with "normalization can have a range of meanings". If you have heard that word most used with one meaning and I have heard it most used with another meaning then that doesn't invalidate either of those definitions.
The Wikipedia article you linked to gives two very broad definitions in the lead. The definition covered in the first paragraph is "adjusting values measured on different scales to a notionally common scale", which seems to be what you're talking about.
The definition covered in the second paragraph is "the creation of shifted and scaled versions of statistics", in particular "some types of normalization involve only a rescaling, to arrive at values relative to some size variable". I'm not comfortable with the use of "of statistics" in that second definition: the very first example is in the article standard score [1], which is about a rescaled element of the population, not a rescaled statistic. In any case, outside of statistics, a rescaling is a common meaning for this word, and a rescaled statistic is clearly just a special case of this. I think it was clear from the context that we were talking about this meaning originally. By far the most common case of this is a linear rescaling (including translation) to [0,1] but I was already up front that this is just a special case.
As for your sigmoid comment, I may have misunderstood, but it sounds like you're saying that if you have a variable in the range [0,1] then it can be described as the result of a sigmoid function. My objection to this is the same as my original objection to calling such a variable "normalised": it is a confusing variable name to use unless you actually did get it by applying a sigmoid function to something, not just because it holds a value that could hypothetically be obtained from a sigmoid function (but you didn't).
That's what I use, too: "normalized floats." I would prefer a better word, though. I work a lot with audio, and "signed normalized floats" doesn't exactly roll off the tongue.
This would be like calling numbers constrained between 0 and 100 "multiplied". Yes, in some common situations you normalize data and it ends up between 0 and 1. And in some common situations you multiply and end up with numbers between 0 and 100 (like percentages).
But normalization doesn't always result in numbers between 0 and 1 and multiplication doesn't always result in numbers between 0 and 100.
Similarly, not all numbers constrained between 0 and 1 have been normalized and not all numbers constrained between 0 and 100 have been multiplied.
Add in the fact that the normalization statisticians most commonly use is z-score standardization (subtracting by the mean and dividing by standard deviation, resulting in data centered around 0), and you're going to end up with a lot of confusion. In fact, this example highlights that while normalization does mean "to scale" it doesn't mean that the result will always be bounded to a particular interval.
Perhaps I'm misunderstanding you, but with floating point, denormalized/denormal numbers aren't just numbers with a magnitude less than 1, they're numbers so small that it's no longer possible to adjust the exponent in the usual way, so leading zeroes are used in the mantissa. This reduces precision.
That's a good suggestion. 'Normalizing' is also used to refer to taking a vector and scaling it to make its length equal to 1. Google tells me it's also used in an analogous way elsewhere in maths.
I don't think "normalised" is specific enough. Normalising can refer to division by values other than the maximum. I'm also not convinced it implies non-negativity.
That isn't the right term. The standard meaning of "normalisation" in most scientific and engineering domains is shifting and scaling to get the sample mean to 0 and sample standard deviation to 1. I.e. it will almost always include both negative numbers and numbers larger than 1.
I use this term for https://github.com/VCVRack/Rack. Sometimes I say "`x` is a normalized float between 0 to 1" to somewhat reveal the definition of the term.
You could also call that centred to clear up confusion, or mean-subtracted to be more specific. I think if you're specifically dealing with statistical quantities it should be obvious, e.g. standardised if you also scale to unit variance.
Personally if I saw normalised I would assume scaled by a constant, not offset (as in subtraction).
I typically use "_fraction" to indicate this in floating point variable names, along with a comment explicitly mentioning the range from 0 to 1 somewhere. Seems to minimize confusion pretty well.
I’ve also seen “fraction” used for this concept in Apple/NeXT APIs. E.g., “-[NSImage drawInRect:fromRect:operation:fraction:]” and “-[UIViewPropertyAnimator fractionComplete]”.
Shoot. I thought "Proportion" was the One True Correct word, but it's not.
Float values between 0 and 1 are very very useful, and I use them all over the place, and I actually thought everyone did (and everyone called them 'Proportions')
Unlike percentages, you can just multiply them with the number you want the 'proportion' of.
One of the answers was edited to “portion”, which is the same idea, I think either portion or proportion is elegant and simple. If anyone has been using proportion then I’d stick with that, though portion is nice and succinct.
In general usage, 'proportion' is not limited to the unit interval [1][2]. Offhand, however, I cannot think of a use of 'portion' that does not refer to part of a whole [3]. I think this is the best suggestion so far.
I like it, but as I do not work in a Dutch-speaking environment, if I were to do this, someone would eventually come along and 'fix' it (assuming I could get it through review in the first place.)
Not constrained to the interval [0,1], though: The change in the stockmarket was -5 percent or -0.05 perunage today, but +300 percent or +3 perunage over the last xyz years.
I like this idea, although I don't think English speakers would know how to pronounce "zadehan" by default. "Zadean" might be better. I don't think it will catch on, though, because "boolean" is really easy to say but "zadean" isn't.
I'd only use Zadehan if it not only was a value on the interval [0, 1] but a fuzzy membership value on that interval to which the Zadeh fuzzy logic operators apply.
If it was a probability value to which Bayesian operators apply, Zadehan would be a singularly inappropriate name for the type. Types aren't just ranges but they also define the valid operations (whether syntactically functions, methods, or operators) on the type.
Bayesian probability is, from a certain perspective, a fuzzy logic, but the operators of Bayesian logic are not the operators of Zadeh's fuzzy logic. Since types define operators as well as values, I'd argue that a "bayes(ian)" type should be distinct from a "zadeh(an)" type, even if they might usefully have a common "fuzzy_membership" supertype.
"Zad" means "rear" or "behind" in slavic languages and it's closely related to "butt" and "ass". So the appeal of your suggestion, as clever as it is, is definitely not universal :)
I think I like portion. It is one of the only nouns that suggests an upper limit of 1. The other contender is 'unitized'. I don't like normalized because it feels too much like something that was actively normalized.
I would actually call this a bit of an antipattern.
`accuracy` seems fine to me -- although a more descriptive function name than `FuncName` might suggest a better parameter name as well.
Digging through the dictionary to find the perfect word means that whoever reads this code is likely going to have to do the same -- why would you ask them to do that?
If you aren't referring to a common mathematical or physical concept, and the word you need isn't part of your domain language, you're better off with "accurate but slightly vague" over "almost precise" or "precise but obscure".
It seems like the author's need would be better met by readable tests, guard conditions, and -- if this is part of an API -- documentation that describes how `accuracy` is used.
If you can't express it using the features of the language itself (custom domain-specific type, compile contracts?), limit yourself to putting this info into the docs, but keep naming simple.
Don't come up with fancy names. One of the commenters already declared (jokingly??) they'd "call it PropUnity, for proportion of unity, unity being 1".
Holy smokes Batman. As a code maintainer, I imagine encountering "accuracy" (even though it isn't fully accurately named itself) would confuse me somewhat less than "propUnity".
Weird that it's been flagged as off-topic with no explanation. "I'm looking for a word" seems very on-topic for a Stack Exchange about a human language.
It seems every Stack Exchange site is overrun by sophomoric gatekeepers, searching (& competing with each other?) for any possible reason to shoot down questions.
It's so much easier to hit that 'downvote' or 'close' button than to actually try to understand the asker, or compose a meaningful answer, or even explain clearly why a question might truly need work.
So, I see a lot of 'closes' from people who hardly ever answer any questions, and plenty by people who simply don't understand the question's domain enough to see that yes, that is a meaningful question with a compact, well-matched answer. And once closed, it seems no one really looks at 'reopen' requests, even after the question text improved.
Not speaking to this question... but I largely don't even participate in stackoverflow anymore as most questions seem to be what should be an RTFA, have an existing answer, or homework help for what is way beyond the scope of the site.
It's frustrating when 9/10 of the questions aren't even deserving of an answer (at least that's my opinion) for the above reasons.
Gotcha on the edit... I saw a question that was more pertaining to documenting what a parameter is, which seems to be a valid question. Like other answers, portion or proportion seem like the most appropriate answer as part of a whole.
Anyone inclining towards the view that comments are harmful, as good code is always self-documenting, should ponder the amount of debate that has already gone into this one tiny issue.
Perunage, yes. Lamentably that is not restricted to [0,1], just as a percentage isn't restricted to [0%, 100%] (look at the stock market return over the last few days, and the last decade).
I've called this 'unit real' in the past, but wasn't happy with it. 'Portion' is good, from one of the answers. ('Proportion' and 'fraction' unfortunately both suggest a ratio of integers instead of a unit real. It'd sure be convenient if these two different words weren't fixed to that same meaning.)
For neologisms, 'tweening' comes to mind. (Think of the coefficient of a convex combination. 'Tweenness' feels just a little too long.) 'Fractile' from the stackexchange thread would also work.
Booleans have the same problem that we lack a plain English short word for "yes or no answer". I think we ought to start pushing 'bool' or 'boole' out of our jargon and into common English.
>'Proportion' and 'fraction' unfortunately both suggest a ratio of integers instead of a unit real.
I don't mean to pick on you in particular abecedarius, but I find it amusing to be worried about signalling that irrational numbers are OK given the context is computing and all the numbers are rational anyway.
You got me there, but I think this kind of discussion is "you know what I mean" territory: ratios aren't limited to 0..1, and the central examples the terms evoke involve small integers.
A fraction or proportion in general doesn't have to be representable as a ratio of natural numbers. That's called a "common fraction", and as it happens, floating point representation stores the value as a ratio of natural numbers anyway. It's like asking someone to cut you 1/pi of a pie. It isn't total nonsense, but it is weird and you're going to get an approximation. They might get confused by your odd request and give you half a pie instead of a third of a pie, though!
I understand, but context typically gets you the rest of the way. We're talking about natural languages after all; this is about as good as you can hope for.
Is there a dictionary or thesaurus or something similar specifically designed to help programmers name things? I run into questions like this all the time. “There has to be a commonly used word for this concept, but I don’t know what it is”
While percentages are denoted as numbers between 0-100, they actually represent numbers between 0-1. Though you can have things like 120% so it's not perfect.
Really the percentage sign just denotes a centi-1. Just like a centimeter is one hundredth of the unit 'a meter', a percent is one hundredth of the (unitless) unit '1'.
Not really. The 1 often represents, for instance, a population of 100M; then 0 means 0, 1 (or 100%) means 100M, and 1.2 (or 120%) means 120M (in case of population growth).
We are really arguing unimportant semantics here, but I don't agree with you: when you say 50% of the population, you are saying 0.5 of the population, or 0.5 of 100M. The 50% always represents 0.5 when used correctly; in fact the % sign is a different way of writing "/100" (with some artistic freedom).
In my code, rather than use a generic term to describe the number, I try to name what it actually represents in context. For example, if I'm scoring items so I can rank them, and I need to normalize some number to do that, I'll call it a "score" or a "rank".
Calling it a "proportion" or "ratio" or "normalized" is like calling a variable "myinteger" or "myfloat". Your naming convention should be related to a variable's purpose, not its data type.
Is there a resource for "help naming stuff"? Naming is hard. Especially so when you dabble in domains you're not learned in. I have almost no math background so sometimes I really struggle to know what to name things.
My latest example is, "I need to provide a flag to switch if this element is positive or negative. What do I name the flag?" (came up with simply: sign, or polarity)
I don't like "ratio" because, though this might just be me, I already have a bad habit of assuming, incorrectly, that every ratio in my code is meant to stay smaller than +1.
FWIW, that's not dependent typing, because the type doesn't depend on run-time values. A bool is either true or false, a [0...1] is either 0 or, ..., or 1.
Isn't that just the decimal part? In my mother tongue (Serbo-Croatian) it's common to call this just the decimals - e.g. in 12.34 the decimals would be 34... I'm really surprised that English doesn't have a similar short name for this?
Would that include 1 however?
If you take the decimal part of something then there is no difference between 0.00 and 1.00 which is I think part of the problem here
Related question: what would be the best way to store such a value? A float would only use the mantissa-bits, so it would be rather inefficient. Perhaps as a uint16 with a representation of uint16 / (2^16)?
I'm not an expert on the ins-and-outs of floating-point numbers, but I've always heard fp types have a disproportionately large amount of their accuracy at values < 1. My guess is that it's not that wasteful sticking with floats or doubles.
What I do wish the languages I used had support for, is number types that the user could clamp to a specific range. It would be nice to have my code throw an error if some function resulted in clipped audio, for example. I mean, without the programmer writing any extra code, of course.
You're likely thinking of subnormal or denormal numbers, which helps floating point representation better represent tiny numbers near zero. The typical floating point number is normalized so that it's 1.xxxx*base^exponent. Since the 1 is always there, it is made implicit as a rule and only xxxx and exponent is stored in memory. There's a limit to the range of the exponent, though, and so for very tiny numbers near zero, the exponent is capped at its smallest value and the rule changes to have an implicit 0 instead of 1. Then, you can get a few more orders of magnitude 0xxx, 00xx, 000x, etc. You gradually trade significant digits for more negative exponents.
I don't think it really helps for the interval 0 to 1, but rather for the interval 0 to 0.000000000000000000000000000000000001 or whatever. It doesn't give you more significant digits, but rather a gradual degradation of significant digits as they get used for exponents.
Subnormal numbers are a bit controversial because of their relative utility vs. added complexity to implement them, especially in hardware.
It depends on how you intend to use it. For example, in digital signal processing (audio), most processing is done using uint32 values. It would make no sense to receive the value as a float, only to convert it back to int before using it.
Case in point: most software volume controls allow you to specify the output volume (or rather, control the attenuation) using the full uint16 range. In this case, the entire volume range is expressed from 0x0000 to 0xFFFF. But in practice, this is just an amplification (multiplier) value between 0 and 1.
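As a hedged sketch (int16 samples and reading the gain as value/65535 are assumptions, just one common convention), applying such an attenuation might look like:

#include <cstdint>

// gain is a uint16_t in [0x0000, 0xFFFF], read as gain / 65535.0,
// i.e. an amplification factor between 0 and 1, with no float involved.
int16_t apply_volume(int16_t sample, uint16_t gain) {
    int32_t scaled = static_cast<int32_t>(sample) * gain;    // 16 x 16 -> fits in 32 bits
    return static_cast<int16_t>(scaled / 65535);              // back into the 16-bit range
}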
Because chances are, I am going to be multiplying other floats by this number. And the performance penalty (not to mention the maintenance penalty) of converting back-and-forth between whatever you choose and floats will probably out-weigh the gains of storing it efficiently.
That said, something like a uint would probably be most efficient if you did care. At that point, alignment issues also start cropping up performance-wise.
This is about as good as it gets for the concept, but lacks as a name. I propose boof.
But actually there's two "natural" hardware implementations for this. First is like floating point with no sign bits (mantissa is positive or zero, exponent is negative). The other is fixed point, basically re-interpreting unsigned ints as being implicitly divided by 0xfff...f -- both have advantages and drawbacks. And then comes the operations. Multiplication is the only closed operation -- addition, subtraction and division need special cases: should they be clamped, or wrap around? Both, probably -- wrapping is useful for trig functions, for example.
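A hedged C++ sketch of that second option (a "unorm16" stored as value * 0xFFFF; the rounding and clamping choices here are assumptions):

#include <cstdint>
#include <algorithm>

uint16_t to_unorm16(float f) {
    f = std::min(std::max(f, 0.0f), 1.0f);                 // clamp into [0, 1] first
    return static_cast<uint16_t>(f * 65535.0f + 0.5f);     // scale and round
}

float from_unorm16(uint16_t u) { return u / 65535.0f; }

// Addition is not closed over [0, 1]; saturating (clamping) is one of the two options:
uint16_t add_sat_unorm16(uint16_t a, uint16_t b) {
    uint32_t s = static_cast<uint32_t>(a) + b;
    return static_cast<uint16_t>(std::min<uint32_t>(s, 0xFFFF));
}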
My arbitrary suggestion for what I sometimes use in coding is "stage". As in "what stage has the animation reached?". At least it's short! Crops up a lot in gaming and graphic code.
I would love to have a symbol that represents this unit scale as an analogue to '%'. I.e. you could say 33% or 0.33x, where 'x' indicates a normalized decimal from 0.0 to 1.0.
Interesting, although that goes the other way, indicating different degrees of implied scaling (rather than none). If you were labeling a header or axis on a chart, or naming a variable you'd probably include the word "decimal", "fraction", or "scalar".
Since they didn't want to document, I think `float Accuracy0to1` is a nice compromise that most people would understand instantly. Add `Incl` or `Excl` if you really need to disambiguate that.
What I was taught in school is that "percent" doesn't refer to 0 - 100, it refers to 0% - 100% which is equal to 0 - 1 (not really a conversion, actually equal).
As 0% and 100% aren't floats (they're more like formatted values), seeing AccuracyPercent in the question had me thinking there was nothing more accurate.
So if you build a tax calculator and add a field with the label "Please add your tax rate in percent" you expect people to enter a value between 0 and 1?
No, but you're defining a unit here; I'd expect people to put in values as 0..100 with unit being percent. So if a person puts in 20, and you'll want to multiply it by 3, you'd not expect 60 as end result; you'd expect 60% which is 0.6
You're still raising a valid point with the ambiguity: if I were to see a variable named `population_percentage = 0.01` I'd still need to ask the author if the intended value here is 1 or 0.01, which I'd say disqualifies this naming.
The real answer to this XY is comments and documentation, imo. If you want to make it obvious without reading docs or code, get an IDE with doxygen-aware completion.
Edit: also, put it into typename, not into argname.
From a practical point of view, a new/academic word would still require a search-engine lookup in most cases. If it were in a type, like float01, one could easily jump to the function definition and then to the type definition, described by a comment. In the case of argument naming, there is no place to jump to; only googling remains.
Yeah, I like that. Or fractional_ratio. With part_ratio tho I'm still not sure which part... (the lesser or greater than 1) part. But I like it because it's shorter. Frac_ratio feels like cheating a bit.
But I guess a number that's always between 0 and 1 is an integer reciprocal. But again that's probably getting too deep for this.
I guess another way is UnitIntervalPoint or ProperFraction
I guess in Java we don't have to choose
FractionalPartRatioForAccuracyNormalized hehehe
I suppose we could make up a word
bition, or bitum
anabit or cobit - analog bit (continuous on 0 to 1)
proratio - proper ratio
Making up a word seems easier than the rest because it's more concise. :)
> “ Most things I can name easy enough, such as "Name" or "URL" or "MaxSizeN". The first 2 are self explanatory, the last one is the maximum size of something in relevant data units(as opposed to bytes or some other unit) but that one is easily understood by other programmers too.”
Programmers who think like this frighten me a lot. Those parameter names are not self explanatory at all, and cannot be assumed readily understood by developers picking up the code.
If you think “this is obvious” you need to add “to me” on the end, and “not necessarily to other people.”
This can go off the rails when people start making braindead claims that code should be self-documenting and that needing comments or docstrings indicates poor code. But it can also just lead to bad documentation, where you think a word is clear enough and you don't realize others will come into that segment of code with totally different use cases or contexts in mind, different assumptions about the code's usage constraints, different levels of skill and experience, and different levels of comfort with the natural language that comments are written in.
Over-communicating within documentation is a hugely beneficial thing. It only costs a small amount of hassle for the person writing it and a small amount of hassle dealing with the eyesore of it for people already familiar, but it saves so much cognitive load for everyone else.
The main downside is keeping it up to date, but the false adage that wrong documentation is worse than no documentation is no excuse.
I'd personally go with "percent", but because there's so many correct options, it shouldn't matter as long as it's consistent in the code base. Otherwise it just becomes bike-shedding bait.
Again, this is bike shedding bait. I'm going by my own experience with basic mathematics in my American education where "percent" is something you multiply a number by, and usually ranges from 0 to 1.
Ex. 50% of 12 marbles is 6 (*.5, 50%).
But just looking at all the different answers in the comments here means there's no consensus on a "right" way. Hence bike shedding. A decent code base should be consistent so it can focus on more important things.
FWIW, this post was inspired by a Twitter thread I created yesterday (https://twitter.com/v21/status/1301451641699340288), where I collected examples of people getting annoyed that this term doesn't exist... and most of those were inspired by frustration at this being called a percentage when it wasn't. So if you want a nice collection of people stubbing their toe on the word percentage, it's out there.
Technically even that's incorrect - percentages can be negative and greater than 100 as well.
A doubling expressed as a percentage of a given baseline would be 200%; economic growth during a recession is typically expressed by a negative percentage.
It's still not really "correct" IMO, I think it needs to be "UnitIntervalValue" as someone said above. I'd expect that things of type "UnitInterval" would be intervals, not values. "UnitInterval" would be like calling the type "Int32Range" instead of "Int32".
(NB: "In real life" I would never "correct" someone for calling it "UnitInterval", it's good enough -- but given this is a pure discussion of semantics...)
This is the correct answer IMHO. (The variable name could also be "proportion" or "probability" depending on context, and the type name could more precisely be ClosedUnitInterval, but the idea of putting it in the type is IMHO correct.)
I don't know much about C (? I assume that's what this is), but in other languages you can even build a wrapper type and enforce invariants in construction if really necessary.
Of course, all that ceremony only makes sense if this is really a type you use multiple times. Otherwise I wouldn't bother.
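For completeness, a hedged C++ sketch of "enforce the invariant in construction" (ClosedUnitInterval is the name from the comment above; throwing is just one possible policy):

#include <stdexcept>

class ClosedUnitInterval {
public:
    explicit ClosedUnitInterval(double v) : value_(v) {
        if (v < 0.0 || v > 1.0)
            throw std::invalid_argument("value must lie in [0, 1]");
    }
    double value() const { return value_; }
private:
    double value_;
};

// Every ClosedUnitInterval that exists is known to be in range,
// so functions taking one never need to re-check.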
ClosedUnitInterval surely describes the interval itself, not a member of that interval. The member would be ClosedUnitIntervalValue or something. This is the thing it would be nice to have a short name for.
Just as we say an integer is a member of the set called the integers.
We say our "X" (the word we are looking for) is a member of the set called the closed unit interval.