It looks like it's a metric constructed to justify Range-ish voting systems, a la Warren Smith's Bayesian regret models.
It performs the same core sleight of hand: it takes a very simple view of strategic voting, assuming a fixed share of the electorate votes "strategically" regardless of election method, and making big assumptions about what's a good strategy.
That is, to put it mildly, not a reasonable assumption. It varies widely how much information about other people's preferences you need to "vote strategically", and the strategies also vary widely in how risky they are. If I have false beliefs about other people's preferences, how likely is it that I shoot myself in the foot with "strategic" voting? And how likely is it that we get caught in a bad equilibrium where our false beliefs about the other side's preferences reinforce each other?
All this makes the sort of modelling attempted here a vain pursuit. You just can't model strategy this naively.
But some methods come MUCH better out of this hopelessly naive modelling of strategy than others.
For instance, Approval voting requires very little information about other people's beliefs to vote strategically: you just need to know who the top two contenders are. The strategy is to approve the "least evil" of those two, plus any from the rest of the field you prefer to those two. It degenerates to pretty much the same problem as Plurality voting, with maybe slightly better chance to get out of bad equilibriums in the long run.
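A minimal sketch of that approval strategy, assuming the voter already has a belief about who the two frontrunners are. The function name, candidate names and utility numbers are all hypothetical; only the ordering of the numbers matters.

```python
def strategic_approval(utilities, frontrunners):
    """Approve the preferred frontrunner and everyone liked at least as much.

    utilities: dict mapping candidate -> how much this voter likes them
               (hypothetical numbers, only their order matters here).
    frontrunners: the two candidates the voter believes are the top contenders.
    """
    # The "least evil": whichever frontrunner this voter likes more.
    lesser_evil = max(frontrunners, key=lambda c: utilities[c])
    cutoff = utilities[lesser_evil]
    # Approve the lesser evil plus anyone preferred to both frontrunners.
    return {c for c, u in utilities.items() if u >= cutoff}


# Hypothetical voter: likes Green best, then Blue, then Red, then Grey.
voter = {"Green": 9, "Blue": 6, "Red": 2, "Grey": 1}
print(sorted(strategic_approval(voter, ("Blue", "Red"))))  # ['Blue', 'Green']
```

Note that the only outside information used is the identity of the two frontrunners, which is the point being made above.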
And Range/Score voting is even more extreme: you can vote strategically there even knowing nothing about other people's preferences. Just always use extreme values (0 and 100).
By comparison, strategic voting in Condorcet systems requires really exact information about the electorate's full range of preferences - and even then, you usually need to coordinate your strategic voting without the opposition getting wind of it, otherwise they can counter it. It's just not practical.
(Likewise, Borda gets a bad rap in these comparisons. It's not a good system, but the likelihood that "strategic voting" will backfire horribly means that in all likelihood, voters would not pursue those strategies.)
> It degenerates to pretty much the same problem as Plurality voting, with maybe slightly better chance to get out of bad equilibriums in the long run.
No, in practice, because the top two contenders are likely to be two who are actually popular.
Plurality voting typically finds an equilibrium where the "two most popular" candidates are both extremists.
http://zesty.ca/voting/sim/ has some nice plots showing how the two most popular candidates change from centrists under approval voting to relative extremists under plurality. (I advocate IRV because when in doubt the voting system should degenerate into random chaos. 1 person in 100 shouldn't be a big enough margin to secure power as a principle either)
Probably the most important dimension, time, isn't ever included in any of these mathematical evaluations of voting systems.
Strategic voting doesn't just happen once, people apply a strategy over multiple elections and can exert influence before the election even happens. In a two-party system the party in charge knows they'll be punished for egregious transgressions - even by some of their core supporters - leading to opposite policies, whereas in multi-party if one party is caught cheating for instance they know voters will defect to a party with similar policies. The parties also act strategically.
If they want to use a satisfaction model it should be lifetime satisfaction, how well the system represented people overall. This may be basically impossible to model without a complete understanding of human nature, but it matters in the real world; if your country gets overthrown or ultimately makes poor decisions it doesn't matter how satisfied you were with the candidates at one point in time.
Condorcet methods performed well in this analysis, even under the various strategy models. Although it says that STAR performed slightly better under strategic voting, its scores are still very high. The main reason given in the text for preferring to highlight STAR was a (common) belief that the Condorcet methods are too difficult to explain and that explainability is important.
Personally I think the Condorcet methods are not that difficult to explain. The various ways of resolving not perfectly clear situations are a little involved, but everything else makes perfect sense. I built a visualisation a few years ago that created a directed graph of results, and then analysed the strongly connected components.
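To illustrate the directed-graph idea (this is not the visualisation mentioned above, just a sketch under my own assumptions): treat each head-to-head win as an edge and group candidates into strongly connected components. A component with more than one candidate is a cycle that needs a tie-breaking rule. The `beats` data is hypothetical, and the reachability approach here is only sensible for small candidate counts.

```python
def reachable(graph, start):
    """Return every node reachable from start by following edges."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def strongly_connected_components(graph):
    """Group nodes that can all reach each other (fine for small graphs)."""
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    reach = {n: reachable(graph, n) for n in nodes}
    components, assigned = [], set()
    for n in sorted(nodes):
        if n in assigned:
            continue
        # Everything n can reach that can also reach n back.
        comp = {m for m in reach[n] if n in reach[m]}
        components.append(comp)
        assigned |= comp
    return components


# Hypothetical result: an edge X -> Y means X beat Y head to head.
# A, B and C beat each other in a cycle; all three beat D.
beats = {"A": ["B", "D"], "B": ["C", "D"], "C": ["A", "D"], "D": []}
print(strongly_connected_components(beats))
```

Here A, B and C end up in one component (the cycle) and D sits alone below them.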
Having said that, I hadn't come across STAR before and I think it makes a fair amount of sense. Any of these high quality methods are enormously better than those used in lots of countries.
> Personally I think the Condorcet methods are not that difficult to explain.
To the voter, no. It is just a ranked choice/ordinal system (which is why people get mad at IRV being called RCV). Just rank your candidates in order of preference. People naturally understand this, yes. BUT that's not what people typically mean when they say the system is hard to understand. What matters is transparency and how easy it is to tabulate the votes. Actually this is where all ordinal systems fall short in comparison to cardinal systems. In cardinal systems you just sum some columns (in STAR you have to do this twice). But in ordinal systems you have to do many rounds of counting and eliminations. Look at the current NYC mayoral race. We have only one round counted and they have like 10 more days before round 2 (which is funny because IRV results typically follow the first round results, we expect Adams to win). So look into tabulating Schulze or Ranked Pairs. Condorcet methods require you to run on the order of n^2 elections because you have to run every pair. Now consider contested elections like what is going on in Arizona. The more complex a counting system is, the more either end can be abused (the original vote or the verification). If it is sufficiently complex a group will easily be able to justify foul play regardless of actual proof because the evidence against the claim is nuanced.
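For anyone who hasn't seen the "sum the columns, twice" point spelled out, here is a rough sketch of a STAR tally. The ballots are hypothetical, every ballot is assumed to score every candidate, and the tie-breaking rule is my own assumption rather than any official procedure.

```python
def star_winner(ballots):
    """Score Then Automatic Runoff on a list of {candidate: score} ballots."""
    candidates = sorted(ballots[0])
    # Pass 1: sum each candidate's column of scores.
    totals = {c: sum(b[c] for b in ballots) for c in candidates}
    first, second = sorted(candidates, key=lambda c: -totals[c])[:2]
    # Pass 2: each ballot counts for whichever finalist it scored higher.
    first_votes = sum(1 for b in ballots if b[first] > b[second])
    second_votes = sum(1 for b in ballots if b[second] > b[first])
    # Assumption: a tied runoff goes to the higher-scoring finalist.
    winner = second if second_votes > first_votes else first
    return totals, winner


# Hypothetical ballots on a 0-5 scale.
ballots = [{"A": 5, "B": 4, "C": 0}] * 3 + [{"A": 0, "B": 5, "C": 2}] * 2
totals, winner = star_winner(ballots)
print(totals)  # {'A': 15, 'B': 22, 'C': 4}
print(winner)  # A
```

In this made-up example B has the highest total, but A wins the runoff 3 to 2, which is the step that distinguishes STAR from plain Score.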
So yes, it isn't hard to explain to voters, but it is difficult to explain how you count the votes. But VSE is only one metric. Like you pointed out max(STAR) is only slightly less than max(condorcet) but min(STAR) has a decent gap over min(condorcet), which also matters. The smaller range on STAR suggests it is more resilient to strategic voting (or voter manipulation). But cardinal systems are not only easy to explain to voters but it is also easy to explain how votes are counted. I should also note that Approval/Score/STAR are monotonic and will not spoil elections (IRV increases spoilage where cardinals decrease). I'll also suggest that if you're a data person you should be in favor of Score and STAR because they are more expressive than any ordinal system because the distance between your preferences isn't constant and thus we can get more fine grained data on voter preference.
If we only look at the VSE metric, condorcet wins, but what makes voting complicated is that this isn't the only metric.
I do like STAR after having read this article, and I agree that it's easier to explain, but I'll defend my saying that condorcet isn't that hard:
Those virtual elections are not difficult to run (electronically) or explain though: "we use the ballots to work out for every person how they would do in a one on one election against each other person".
And the process for running the elections is straightforward too - count the number of ballots where one person appears higher in the list than the other person. There's no room for argument here, unless you're arguing about the ballots, which is something you can do with any system.
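As a sketch of that counting step (function names and ballots are hypothetical, and treating an unlisted candidate as "no preference" is an assumption on my part):

```python
from itertools import combinations


def pairwise_counts(ballots, candidates):
    """counts[(x, y)] = number of ballots that rank x above y."""
    counts = {(x, y): 0 for x in candidates for y in candidates if x != y}
    for ballot in ballots:
        position = {c: i for i, c in enumerate(ballot)}
        for x, y in combinations(candidates, 2):
            # Assumption: a candidate left off the ballot expresses no
            # preference, so pairs involving them are skipped.
            if x in position and y in position:
                if position[x] < position[y]:
                    counts[(x, y)] += 1
                else:
                    counts[(y, x)] += 1
    return counts


def condorcet_winner(counts, candidates):
    """Return the candidate who wins every one-on-one contest, or None."""
    for x in candidates:
        if all(counts[(x, y)] > counts[(y, x)] for y in candidates if y != x):
            return x
    return None


# Hypothetical ranked ballots, most preferred first.
ballots = [["A", "B", "C"]] * 4 + [["B", "C", "A"]] * 3 + [["C", "B", "A"]] * 2
counts = pairwise_counts(ballots, ["A", "B", "C"])
print(counts[("B", "A")], counts[("A", "B")])  # 5 4
print(condorcet_winner(counts, ["A", "B", "C"]))  # B
```

When `condorcet_winner` returns `None` there is a cycle, and that is where the more involved resolution rules come in.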
All of that stuff seems pretty straightforward to me.
Now, it is the case that the various ways of resolving situations where you have cycles is somewhat complex. I think that's just because those situations are genuinely complex, but there are things that can be done here to choose a relatively explainable one and work out good ways to visualize and explain it.
Finally, while cardinal systems can express the distance between preferences, there are ways an ordinal system can express things hard for a cardinal system. In an ordinal system you can allow a ballot not to include someone and have that mean 'no preference' between that person and all other candidates (perhaps because of lack of knowledge), while in a cardinal system an unscored person would have to be counted as having an actual score. There's also the question whether preferences really are consistent enough for a cardinal system to accurately reflect them - you might like X three times more than Y, and two times more than Z, but does that necessarily mean that you like Z one and a half times more than Y? Personally I also think there's a big difference between preferences between two candidates you'd be happy to win and candidates you actively don't want to win, and I don't think that difference can be captured by just making a scalar larger - not that ordinal approaches capture this either.
For the most part, yeah, Condorcet methods aren't that hard. But also consider who you and I are. We're talking on a very niche forum where there's an expectation that everyone here can program and do calculus. We are not the average people.
But what I do want to stress is that cardinal systems are _easier_ than condorcet. I actually use approval frequently (it is also extremely natural). One case is how to solve where to eat as a group. It is infinitely scalable and you can just keep adding places as you think of them. If you reach a unanimous decision you stop, if you exhaust the list you take the maximal (barring strong preferences, which makes it kinda score but this works out naturally). I think from this example it should be pretty clear that it is much harder to do this sort of voting using any ranking system (nor is it scalable and is very computationally expensive to add new items).
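The where-to-eat procedure is roughly the following, as a sketch. The names and the tie rule (first suggested wins) are made up for illustration, and it assumes at least one place gets proposed.

```python
def pick_place(group_size, proposals):
    """proposals: iterable of (place, set of people who approve), in the
    order the places are suggested. Stops early on a unanimous choice."""
    tallies = {}
    for place, approvers in proposals:
        tallies[place] = len(approvers)
        if len(approvers) == group_size:
            return place, tallies  # everyone approves: stop here
    # List exhausted: take the most-approved place (first suggested on ties).
    return max(tallies, key=tallies.get), tallies


# Hypothetical group of four.
suggestions = [
    ("Thai", {"ann", "bo"}),
    ("Pizza", {"ann", "bo", "cy"}),
    ("Sushi", {"bo"}),
]
print(pick_place(4, suggestions)[0])  # Pizza
```

Adding a new option is just one more tally, with no need to revisit anything already counted.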
It is often missed, especially by nerds (I fall in this camp too, I used to prefer Condorcet over cardinal), how important the transparency aspect of the voting system is. Arizona nailed the lid shut for me. If we're looking at it we're really not losing much VSE by using STAR over RP, but we do also gain a lot on every other metric (including spoilage, ease, transparency, and resistance to strategy). VSE isn't the only metric that matters and we shouldn't make large sacrifices in other metrics to gain a minute optimization in it.
> while in a cardinal system an unscored person would have to be counted as having an actual score.
This isn't necessarily true. You can modify cardinal ballots just like you do ordinal (the exact example you gave is not the theoretically perfect ordinal so I'm not sure why you compare to theoretically perfect cardinal). In practice people don't rate every candidate. In ranking you could effectively say that unranked candidates have a minimal score (and you are able to reduce the number of rounds through this, but comes with added complexity).
As to making the scale larger, yes this is correct that an infinitely large scale better encodes information. But there is a balance between that and simplicity. 10 is extremely effective. 5 is more than enough. But even 2 (approve/disapprove) seems to work out pretty well (aka "good enough").
The Condorcet methods do not require software to count up - you can easily replicate the results by hand on paper, it's just that there's such a lot to do it'll probably take you a while. In practice, you're going to want to count up electronically and then probably do a random sample by hand to check.
Condorcet is just needlessly complex and has worse behavior under high levels of strategy. Score voting (STAR, approval, etc.) is roughly as good and maybe even better, and dead simple, and very transparent.
The lesson of voting theory is to avoid ranked voting methods.
> It looks like it's a metric constructed to justify Range-ish voting systems, a la Warren Smith's Bayesian regret models.
Bayesian regret wasn't constructed to justify score voting (formerly known as range voting). Smith correctly used a utilitarian social welfare function, and just happened to find that score voting did best. This was by no means a given.
Indeed, the VSE calculations (conducted by a different mathematician, Jameson Quinn), used some different assumptions, and had STAR voting, 321 voting, and even certain Condorcet methods, beating score voting in some circumstances.
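For readers unfamiliar with the metric: as I understand it, VSE scales the winner's average utility so that 1 means the utility-maximising candidate won and 0 means no better than picking a candidate at random. The sketch below computes that ratio for a single hypothetical electorate; the published figures aggregate over many simulated electorates and voter models, which this does not attempt to reproduce.

```python
def voter_satisfaction_efficiency(utilities, winner):
    """utilities: {candidate: [utility for each voter]}; winner: a candidate.

    Assumes the candidates do not all have the same average utility
    (otherwise the denominator is zero).
    """
    mean = {c: sum(us) / len(us) for c, us in utilities.items()}
    best = max(mean.values())
    # Expected utility of picking a candidate uniformly at random.
    random_pick = sum(mean.values()) / len(mean)
    return (mean[winner] - random_pick) / (best - random_pick)


# Hypothetical utilities for three voters.
utilities = {"A": [9, 0, 0], "B": [6, 6, 6], "C": [0, 0, 0]}
print(voter_satisfaction_efficiency(utilities, "B"))  # 1.0
print(voter_satisfaction_efficiency(utilities, "A"))  # 0.0
print(voter_satisfaction_efficiency(utilities, "C"))  # -1.0
```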
> It degenerates to pretty much the same problem as Plurality voting
It was modeled in a very sophisticated way, not "naively". The VSE figures even used all kinds of variations, e.g. asymmetric strategy where one side strategizes more than the other.
> And Range/Score voting is even more extreme: you can vote strategically there even knowing nothing about other people's preferences. Just always use extreme values (0 and 100).
And even if you do, it still gets better results.
> Borda gets a bad rap in these comparisons. It's not a good system, but the likelihood that "strategic voting" will backfire horribly means that in all likelihood, voters would not pursue those strategies.
That doesn't make any sense. Strategy, by definition, is a move which has a positive expected value. You seem to be conflating "bad aggregate performance under strategy" with "bad individual performance for a voter who uses strategy", which is deeply wrong.
> All this makes the sort of modelling attempted here a vain pursuit. You just can't model strategy this naively.
Exactly. Voting systems that give voters strong incentives to vote strategically should be analyzed in a game-theoretic framework (analyzing Nash equilibria in payoff matrices defined by voter strategies).
Looking at the results, it's a wonder that IRV (aka "ranked-choice voting") gets more lip service than STAR and approval voting. These latter two seem obviously and measurably superior.
My guess is that it all comes down to a lack of education. Maybe civics ought to be a required course in high school and/or college?
In Seattle, many of us reached the same conclusion as you did. We’re now trying to change Seattle’s primary elections to use approval voting: https://seattleapproves.org/
Approval voting is extremely effective at capturing voters’ preferences, simple to implement, and incredibly easy to explain - not just the ballot, but how the ballots are tabulated and how the winner is chosen.
What is the simple way to pitch its advantage to IRV without using technical terms?
The spoiler effect is simple to explain because we’ve seen it. A hypothetical centre squeeze isn’t.
My belief is IRV opens the door to approval voting. (I am still sceptical of it, though I welcome experiments like the one you propose in Seattle.) When approval voting advocates oppose IRV in jurisdictions with plurality voting, however, it just splits the effort and leaves FPTP in place.
> What is the simple way to pitch its advantage to IRV without using technical terms?
Good question. It's probably that the tabulation works the way a voter thinks it would.
Neither IRV/RCV nor approval has a way to tell how much more you prefer your favorite to your #2 or #3 (something STAR/Score has, and kudos to it). What's that mean? They both have to assume. Most voters don't realize that IRV assumes massive gulfs between their #1, #2, and #3.
The example I give is Seattle's 2021 mayoral race, which has 15 candidates, at least 8 of whom are running serious races. With 8+ serious candidates, all sharing the left-of-center spectrum, there's tons of overlap. So, imagine you and a friend or spouse have similar views and support the same candidates, but rank them subtly differently. You rank A, B, C, your friend ranks B, C, A. In each runoff, your and your friend's opinions will end up competing with one another, even though both of you share the same opinions and support the same candidates.
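A small IRV tally can make this kind of runoff effect concrete. The sketch below uses a hypothetical 100-voter electorate (not Seattle data) in which the candidate most voters could live with is eliminated first. The tie-breaking rule is an assumption of mine.

```python
from collections import Counter


def irv_winner(ballots):
    """Instant-runoff: repeatedly drop the candidate with the fewest
    first-choice votes until someone has a majority of live ballots."""
    remaining = {c for b in ballots for c in b}
    while True:
        # Each ballot counts for its highest-ranked remaining candidate.
        tops = Counter(
            next(c for c in b if c in remaining)
            for b in ballots
            if any(c in remaining for c in b)
        )
        leader, votes = tops.most_common(1)[0]
        if votes * 2 > sum(tops.values()) or len(remaining) == 1:
            return leader
        # Assumption: ties for last place are broken alphabetically.
        loser = min(remaining, key=lambda c: (tops.get(c, 0), c))
        remaining.discard(loser)


# Hypothetical electorate, most preferred first.
ballots = (
    [["Left", "Centre", "Right"]] * 35
    + [["Right", "Centre", "Left"]] * 33
    + [["Centre", "Left", "Right"]] * 32
)
print(irv_winner(ballots))  # Left

# Yet most voters rank Centre above Left:
centre_over_left = sum(1 for b in ballots if b.index("Centre") < b.index("Left"))
print(centre_over_left)  # 65
```

Centre has the fewest first choices, so it is eliminated in round one even though 65 of the 100 ballots prefer it to the eventual winner.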
(For anyone else reading, here's a longer version that I didn't write: https://psephomancy.medium.com/how-ranked-choice-voting-elec.... Ignore the title - the problem isn't extreme or not extreme, it's that the result doesn't reflect the opinion of the electorate.)
And the reverse is also true. IRV unintentionally rewards candidates who have their own dedicated constituency who always put their candidate #1. While it's not my role to moralize about that, choosing a candidate because a small group of voters vehemently support a candidate is not likely to make the most voters happy.
Since you brought up simple explanations.. a related simple benefit is that essentially everyone can explain how winners are chosen in approval voting. Very few people, even many active proponents of IRV, can thoroughly explain tabulation of IRV ballots. In an era when election processes are becoming politicized, a system that the average person understands well enough to explain to someone else is valuable.
I'm ignoring the most obvious easy-to-explain reason, that it does a better job of "reading voters minds." That's part of the original submission, though.
(I've now talked or exchanged emails with a fair number of people who thought they were RCV/IRV supporters, and I learned that STAR delivers the benefits that many or most assumed that RCV provides. STAR lets you score every candidate on a 0-5 star scale, so you can rate candidates the same, rate them better/worse than one another, and critically, can decide for yourself how big each gap is. The vast majority of people who think they're strongly supportive of RCV/IRV would switch their support to STAR if they heard of it, and when I talk with them, they often do switch.)
> My belief is IRV opens the door to approval voting.
I support IRV as an improvement, but I'd need more evidence that IRV opens the door for approval - or that it doesn't. If you have evidence or at least a hypothesis how that would happen, I'd appreciate it.
For example, would IRV fail (what's that mean?), yet voters would not recoil, they'd decide they like the idea and just chose the wrong thing? Or would it succeed and people would look for something better? Something else?
My baseline is that it took us something like 200 years to get here (and 30 years since FairVote was created). If history is a guide, whatever we pick seems like it'll stick around for a while. That's a weakly held belief, though.
> When approval voting advocates oppose IRV in jurisdictions with plurality voting, however, it just splits the effort and leaves FPTP in place.
I'm probably the wrong person to ask about this, since I don't oppose IRV - I just don't support it anywhere near as much as I do approval or STAR. My take is simpler: Seattle doesn't have either right now, and if we're going to switch, getting something great on the ballot (an initiative to change to AV) doesn't require any more lobbying or fundraising effort than something good or okay.
A skeptic could have asked "But will it pass?" until a year or two ago. St. Louis just switched to approval last year (https://www.stltoday.com/news/local/govt-and-politics/overha...) and 68% of voters were in favor (!), and Fargo, ND did so in 2019, so that's been affirmed. The Center for Election Science was only created in 2011, so for 9 years (3-5 years of which was discovery), that's a fast adoption curve.
Also, approval is the only one that I think delivers enough benefit to be worth a significant amount of my own time. That's an entirely personal judgment though.
(I'd have a different answer if AV fans were running an explicit campaign against an IRV/RCV initiative that was on the ballot, but again, that's not me. At least in Seattle, I don't think either one is more or less difficult than the other.)
The solution is simple: ban FPTP and give regions X years to come up with a replacement. We can have one big natural experiment on which alternative method is best.
One thing about the US for example, states get a lot of latitude on exactly how to conduct elections.
I just don't see any evidence that IRV can be considered a stepping stone to truly better voting methods. Looking empirically at the matter, the places that adopt and later repeal IRV don't adopt better methods. They revert to choose-one. (Or in the case of Burlington, revert to choose-one and then confoundingly re-adopt IRV). Are there counterexamples I'm missing?
> What is the simple way to pitch its advantage to IRV without using technical terms?
You have 3 candidates that you like in preferential order. A > B > C. But you like A only a little more than B and REALLY dislike C. Ordinal systems do not allow you to express this. When you score, you can say A: 5, B:4, C:0. Approval is just the same thing but with a smaller range and is easy to integrate into existing tools (I don't think that's a great argument though because we should rebuild these tools anyways considering massive security flaws we've consistently found).
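To put the same point in code form: two score ballots with very different gaps collapse to the identical ranking once you throw the scores away. The second ballot is a hypothetical contrast I added; the first is the A: 5, B: 4, C: 0 example above.

```python
def ranking_from_scores(scores):
    """Collapse a score ballot to the ordering it implies (ties grouped)."""
    levels = sorted(set(scores.values()), reverse=True)
    return [sorted(c for c, s in scores.items() if s == lvl) for lvl in levels]


mild = {"A": 5, "B": 4, "C": 0}    # likes A only a little more than B
strong = {"A": 5, "B": 1, "C": 0}  # likes A far more than B

# Both ballots collapse to the same ranking A > B > C.
print(ranking_from_scores(mild))    # [['A'], ['B'], ['C']]
print(ranking_from_scores(strong))  # [['A'], ['B'], ['C']]
```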
> Looking at the results, it's a wonder that IRV (aka "ranked-choice voting") gets more lip service than STAR and approval voting. These latter two seem obviously and measurably superior.
They are obviously and measurably superior when you construct a measure designed around their shared features while ignoring the cultural variation in honest ballot marking for these methods resulting from the fact that the ballots for them ask a question that doesn't have an objective relationship to preferences.
> when you construct a measure designed around their shared features
dragonwriter thinks the metric being used in the article linked here was cherry-picked because it favours particular voting systems that the author likes for other reasons.
(I do not believe this, nor do I believe the similar allegation made by vintermann in a top-level comment that's currently the highest rated. Neither offered any actual evidence, and the metric used here seems obviously reasonable to me. What I could believe is if someone thinks that VSE / Bayesian regret captures the essence of what matters in a voting system, then they are likely to prefer systems like approval voting or range voting or STAR. I don't see anything wrong with that.)
> while ignoring the cultural variation in honest ballot marking for these methods
dragonwriter thinks that different voting systems, implemented in different cultures, will exhibit strategic ("dishonest") voting to different extents and in different ways, and thinks this is a problem not addressed in the OP.
(The linked article does consider strategic voting, of various different kinds, but it's certainly possible that it isn't realistic enough about what real voters might do. It would be easier to tell whether dragonwriter has found a real problem here, if dragonwriter had given some examples of variation in strategic voting that are important but neglected in the OP.)
> resulting from the fact that the ballots for them ask a question that doesn't have an objective relationship to preferences.
dragonwriter thinks that what voters primarily have is preferences: candidate A is better than candidate B who is better than candidate C. This is what an IRV ballot asks for. STAR or approval-voting ballots ask different questions (please give a score for how much you like each candidate / please say for each candidate whether you would find them acceptable); dragonwriter finds that unsatisfactory on the basis that voters will have to translate their preferences into answers to those questions, and there is no One True Way to do that.
(I disagree with dragonwriter about what's in voters' brains. At any rate, looking within myself, it is very common that I have a better idea of some candidates' acceptability to me than I do of their relative ranking. E.g., maybe there's the Nice Party, which I like a lot, the Mean Party, which I don't like much but concede has some competence, the Evil Party, which I really don't like but again concede has some competence, and a bunch of Crazy Fringe Parties, none of which I know enough about to rate them relative to one another, but all of which I consider obviously unfit to rule. With a range/STAR system, I might rate Nice at 100, Mean at 30, Evil at 10, and the Crazies at 0. Or I might rate Nice at 100, Mean at 70, Evil at 10, and the Crazies at 0. These are importantly different opinions I might have. A preference-ordering system like IRV doesn't let me express the difference between those, but would really like me to say exactly what my order of preference between the Crazies is. For me, range/STAR does the best job of matching the actual kinds of opinions I have about candidates; 3-2-1 is also pretty good; approval and preference-ordering are similar to one another and distinctly worse than those two; and of course plurality/first-past-the-post is hopeless.)
> dragonwriter thinks that different voting systems, implemented in different cultures, will exhibit strategic ("dishonest") voting to different extents
No, that's not it at all. I mean, now that you mention it, that sounds like something I would agree with, but it's neither what I was saying here nor particularly relevant to this discussion.
What I said here is about honest not dishonest votes. For ballots that do not ask a question with an objective mapping to preferences, honest ballots for the same preferences are a cultural variable. (This article implicitly recognizes that honest ballots have no consistent meaning for approval, when testing it with two different consistent mathematical models, though that doesn’t really address the issue and it ignores the same issue for other methods to which it applies.)
For typical public elections, this problem occurs mostly with ballots that ask the user to assign a score or rating to each candidate which is either more detailed than an unforced ranked preference ballot (a ranked preference ballot allowing ties) or less detailed than a forced preference ballot (a ranked preference ballot not allowing ties). Phrased another way, it applies to ballots that ask the user to assign scores or categories to candidates, where the maximum number of distinct scores or categories is either more or less than the number of candidates.
There's fairly extensive literature on scoring systems (mostly outside of voting research, because they aren't widely implemented as voting systems), whether with a wide (e.g., 0-100) or narrow (5 stars) range. It shows that, without an objective touchstone, how scores relate to what people actually think about the things rated, on an absolute or relative scale, is highly variable between individuals, in a way which strongly correlates with culture. The only consistent thing is that scores tend to be consistent with ordered preferences; the additional information (for score systems) or compression (for limited-categories) is inconsistent.
For that reason, approval/score/etc. ballots are, at best (that is, considering only personal variation and discounting any significance of the cultural correlations) just ranked preference ballots with either randomized noise or randomized compression, into which the associated tallying methods read false significance.
EDIT: NOTE that this applies to the use of these ballots in typical public elections of candidates. For other types of voting scenarios, the ballot markings involved may have concrete meanings that make these methods better, even ideal in some cases.
For instance, when planning a group activity for which you want maximum participation from the voter pool, approval where an “approve” mark is a binding commitment to participate if that option wins is an ideal method. Similarly, score voting where the assigned score is a binding commitment of that many units of some scarce resource to the common project in the event of victory for the chosen option makes perfect sense.
Oh, I see. That interpretation didn't occur to me because at that point I hadn't noticed that you think of ordinal preferences as what voters fundamentally have, and of scores / approval as being derived from those. Again, I just don't think that's true.
(But: Whatever the cause, I misunderstood your position and consequently misrepresented it, and I greatly regret that.)
It sounds as if you are arguing that it must be true because if you ask people to score things on a scale, people from different cultures tend to give different patterns of scores for a given preference ordering. I'll take your word for it that the "fairly extensive literature" does in fact show that they do, but I don't see how you get from that to the idea that preferences come first and scores are some sort of fundamentally unreal construct on top of them.
A person's culture may well influence their values. That doesn't make those values unreal or illegitimate or mean that when we're trying to aggregate people's values (in order to choose a candidate, policy, etc.) we should ignore that influence. That's obvious (or at least I think it is) in the case of specific values -- e.g., if a country has in it a lot of people whose culturally-influenced values favour peaceful foreign policy, they will tend to elect governments that favour peaceful foreign policy, and that's how democracy is supposed to work.
I think it applies equally if what's being influenced by culture is some other more abstract feature in the pattern of values. If members of one culture are more easygoing in a way that makes them find a wider range of political candidates "acceptable" than members of another, then candidates liked less by the latter culture will do worse than otherwise-similar candidates liked less by the former under approval voting, because being disliked by culture B will lead to more "not acceptable" votes than being disliked by culture A. I claim that's the right thing: being represented by the candidate B dislikes will cause more distress than being represented by the candidate A dislikes.
Now, it's potentially a different matter if what's happening is that people in different cultures have the same actual values -- both think X is a bit better than Y who is hugely better than Z -- but some cultural factor makes people from culture A who feel that way mark Y as "acceptable" and people from culture B who feel that way mark Y as "unacceptable". If you're saying that that "fairly extensive literature" demonstrates that that happens, as opposed to the situation where people in culture B really do feel a bigger gap between X and Y, then I agree that that's an argument against scoring-style or approval-style voting systems. I would be interested in some pointers to the literature in question, if so.
But to whatever extent there are actual differences in values that go beyond mere ordinal preferences, it is simply not true that "approval/score/etc. ballots are at best just ranked preference ballots with either randomized noise or randomized compression".
It seems obvious to me that real people, whatever their culture, can have values that genuinely and importantly don't reduce to ordinal preferences. Would you rather be given $30k, be given $25k, or have $30k taken from you? Would you rather eat a carrot, a radish, or a bowl of mud? Would you rather have your country led by Joe Biden, Adolf Hitler, or Josef Stalin? It's possible that people from different cultures would express their opinions about these options in different ways, but I flatly do not believe that there would be no information in what they say beyond "$30k > $25k > -$30k" or "Biden > Stalin > Hitler" or whatever.
Do you disagree with that? Or is it just that you are confident that the noise greatly exceeds the signal in typical cases? (Or some other option I haven't thought of?)
> I hadn't noticed that you think of ordinal preferences as what voters fundamentally have
I think that ordinal preferences are the objectively comparable thing that voters fundamentally have, relevant to the question of normal elections of candidates in public elections. That’s not all they have, and if we had an objective way to measure and an objectively correct way to aggregate expected experienced (dis)utility you could probably build a better public choice system around such a measure and aggregate than anything using ranked preference ballots. But we know from the research done in scoring/rating systems that the kind of things done in score/range voting or compressed-preference ballots like approval aren’t anything like a consistent measure of that, even before considering aggregation issues.
For lots of group choice situations that aren’t elections of candidates for public office, either approval or score style voting (and probably in some cases other compressed-preference systems besides approval) have concrete comparable meaning across different ballots and make sense, as I discussed with some examples upthread.[0]
> But to whatever extent there are actual differences in values that go beyond mere ordinal preferences, it is simply not true that "approval/score/etc. ballots are at best just ranked preference ballots with either randomized noise or randomized compression".
It is, across a population, in normal elections for public office (secret ballot, no personal consequence for the relation of ballot markings to actual winners, someone must win) [1], because there is no consistent mapping between what people have beyond ranked preferences and how they do scoring of the type those ballots call for. There’s been fairly extensive research on this when it comes to rating systems. It’s one of the reasons rating systems have trended away from generic forms (though there is a lot of inertia!) to context-specific forms that focus on what has consistent meaning across respondents for the particular use being made of the rating. E.g., Facebook a while ago dropping 5-star ratings for “Would you recommend?”, media platforms like Netflix and others that had star ratings switching to just Up/Down, etc.
[0] including some public elections, e.g., modified approval makes sense — and is in effect used in California (though it isn’t called that) — for conflicting ballot measures, where in the event of fundamental conflict of measures in the same election, the “winning” measure if multiple get at least the minimum required to pass is the one with the most yes votes.
[1] But even for public office elections, if you weaken any of those constraints you can find situations where this changes. E.g., plain approval would make sense in candidate elections if you eliminate the “someone must win” condition, and instead it is an optional position, and if no candidate (even the one with the most approval ratings) got more than some threshold (say, approval by a majority of voters), then the seat would be unfilled. Then, the approval/disapproval choice would have a concrete meaning rather than just being an arbitrary compression of preferences. (Actually, that’s effectively the same system as the one discussed above used by California for competing ballot initiatives, which works because in any such set of initiatives, “None are passed” is a valid final outcome.)
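A minimal sketch of the thresholded approval rule described in [1], where leaving the seat unfilled is a valid outcome. The function name, the ballot encoding (one set of approved names per voter) and the default majority threshold are assumptions for illustration, not any jurisdiction's actual rule.

```python
from collections import Counter

def approval_with_threshold(ballots, threshold=0.5):
    """Return the most-approved candidate, or None if nobody clears the threshold.

    ballots: list of sets, each holding the candidates one voter approves.
    threshold: fraction of all ballots a winner must strictly exceed (assumed rule).
    """
    counts = Counter(c for ballot in ballots for c in ballot)
    if not counts:
        return None
    # Ties for first place are broken arbitrarily here; a real rule would specify this.
    leader, approvals = counts.most_common(1)[0]
    return leader if approvals > threshold * len(ballots) else None

# Five voters; the best any candidate manages is 2 approvals, short of a majority.
ballots = [{"A", "B"}, {"A"}, {"B"}, {"C"}, set()]
print(approval_with_threshold(ballots))
```

With "nobody wins" on the table, an approval mark answers a concrete question (is this candidate better than an empty seat?) instead of being an arbitrary cut through a ranking.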
The main advantage of IRV is that it isn't a particularly strategic vote. You put the candidates in the order of your actual preference, and that's it (there are edge cases, but compared with other systems, pretty rare ones, and their complexity tends to actually discourage strategic behaviours like elevating a decent candidate you think is more likely to win than your first choices). The strategic considerations you actually might want to make are very basic (if you really, really want to make sure someone doesn't win, you can rank parties you're pretty unimpressed with ahead of them).
A ratings ballot, on the other hand, involves massive amounts of thinking. First of all, the concept of what might represent an "honest" approach to intensity of preferences is open to debate. It's US culture to give their Uber driver max marks and tip the serving staff 20%; is this an honest reflection of the intensity of the average American's enthusiasm for service workers relative to people from non-American cultures or simply what they've been taught to do? And real world electoral situations give you all kinds of incentives to do other things (like give your preferred candidate max marks and all the other candidates 0, or all the candidates you don't hate at least 9 out of 10). Is the person who bullet votes max marks for his preferred candidate because that's what the campaign literature suggested really worth more votes than the person who carefully evaluated all the candidates on their merits but doesn't think any of the candidates are worth more than 6 out of 10? With approval voting it's arguably even worse because there's a massive dilemma over whether to vote positively for the candidate(s) you want or negatively to eliminate the risk of candidates you want to lose. Obviously this applies unevenly to different demographics: put someone downright nasty towards a particular demographic on the ballot paper, and suddenly that demographic is under pressure to Approve every other candidate, whilst others continue to happily only vote for candidates they actually want.
If you design an efficiency metric with the arguably unjustifiable assumption that votes always represent either a perfect expression of the intensity of their preferences or a simple uniform strategic vote, and you assume that the proportion of people voting strategically is unaffected by voting system, it is unsurprising that voting systems whose main shortcomings are strategic complexity perform well.
Even looking at that data, IRV actually outperforms nearly all the systems if you assume IRV voters are honest and voters using the other systems are very strategic, which isn't unreasonable given the incentive structures of the different systems.
Excellent critique of ratings based system. I think this is the biggest problem with multi-point (more than 3 values for score) STAR voting. What constitutes "5 star"? What about 0/blank vs 1 star?
Framing is important, and I think STAR would be most optimal with good ol' Likert scale (five or seven point) rather than 0-N stars. But I think even simpler is the "3-point Likert" behavior of V321: approve, disapprove, and "meh, ok."
I don't like how IRV forces me to rank candidates I may know little to nothing about (e.g. downballot randos not on "my team"). I don't really care if Kick Puppies Party is above or below Trip Grandparents Party.
You're entitled to leave both the Kick Puppies and Trip Grandparents party off your ranking!
V321 is certainly more intuitive in the ratings (it's basically an upvote/downvote system) if not the counting, but it's still quite strategic. On the plus side, the most basic strategy of simply upvoting who you want to win and downvoting the candidates you hate ensures the most divisive candidates are removed from the pool and the obscure fringe candidates are unlikely to benefit, but there's a fair bit of juggling how many candidates to upvote or downvote depending on what you think of the candidates seen as most likely to win the final round.
> I don't like how IRV forces me to rank candidates I may know little to nothing about
That is an implementation issue. IRV can be implemented as either requiring complete ordering or allowing partial ordering (you have n candidates, you can assign each one level 1..n, but you may use the same level for more candidates to describe partial ordering).
> IRV can be implemented as either requiring complete ordering or allowing partial ordering (you have n candidates, you can assign each one level 1..n, but you may use the same level for more candidates to describe partial ordering).
Technically, that’s a complete ordering with unforced preferences, not a partial ordering.
A ballot with a partial ordering would be one for which, for some pair of candidates A, B, none of A>B, A<B, nor A=B would be true. Equal preference and absence of preference are different.
A ballot that worked this way would be a meaningful extension to existing ranked preference methods that aggregate pairwise preferences from individual ballots, like ranked pairs. (Whether the effect would be desirable is not something I’ve really analyzed, but that kind of method would naturally make distinct use of the information.)
A score or range system has more possible values than candidates and treats ballots with different numbers but the same ordering differently; it's not the same as an unforced preference ranked ballot (a ranked preference ballot that allows ties).
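One hypothetical way to keep "equal preference" and "absence of preference" distinct is to record explicit pairwise relations on each ballot instead of a single ranked list. The sketch below only tallies those relations; the ballot encoding and the function name are assumptions for illustration, and it is not an implementation of ranked pairs or any other existing method.

```python
from itertools import combinations

def pairwise_tally(ballots, candidates):
    """For each pair (a, b) with a before b alphabetically, count how many ballots
    say a > b, a < b, a = b, or state no relation at all ("none").

    Each ballot is a dict mapping an alphabetically ordered pair to ">", "<" or "=".
    No check is made that a ballot's relations are transitive.
    """
    pairs = list(combinations(sorted(candidates), 2))
    tally = {pair: {">": 0, "<": 0, "=": 0, "none": 0} for pair in pairs}
    for ballot in ballots:
        for pair in pairs:
            tally[pair][ballot.get(pair, "none")] += 1
    return tally

ballots = [
    {("A", "B"): ">", ("A", "C"): ">", ("B", "C"): "="},  # B and C explicitly tied
    {("A", "B"): ">"},                                     # no stated opinion about C
]
result = pairwise_tally(ballots, ["A", "B", "C"])
print(result[("B", "C")])
```

A counting method built on this tally could then decide separately how much an explicit tie and a missing relation should each count, which a ties-allowed ranked ballot cannot express.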
It's simply not accurate to say that IRV isn't particularly strategic. It removes strategy from basically one case: two viable major candidates and one or more nonviable minor candidates. As soon as there are more than two viable choices, IRV often requires you to betray your favorite and rank them lower in order to reduce the likelihood that your least favorite will win.
I think it’s because most often advocates are not promoting a specific methodology but a broad set of possible electoral reforms. Even people who study this in depth (like the article author) get substantially different results based on what they value and measure, and by methodology. That’s not something a civics class is going to address for people who want more fair choice at the ballot box but can’t possibly know the outcome of one approach or another without seeing it in situ.
Edit: I’m saying this from the perspective of someone who’s spent a lot of time reading on the subject and wouldn’t dare pick a favorite for my country (US) or locality. I just want the opportunity to try systems that are likely more fair.
This is one of the great aspects of states (localities in general) mostly governing themselves even in important things like voting. Some state tries something new, and everyone observes and can learn from the results.
If all of the new things states are trying were concentrated in just one state the churn would destabilize it too much. By spreading out the chaos we collectively get to try many new things at once, but avoid being overwhelmed since each locality is dealing with fewer new things.
One major drawback is that efficiencies of scale can’t be taken advantage of and inequities at scale are easier to propagate. So we have a very slow process that ends up disenfranchising a larger number of people than I’d be comfortable with in a modern developed democracy. This is extremely evident when you start comparing how well governments run in other Western democracies.
I meant to respond to GP earlier and you got to it before me admirably, but if I may be more terse: some of these states are still experimenting with rehabilitating the Confederacy.
I think there's a couple reasons. IRV has been around a lot longer and is used in other countries. (It was first used in Australia in 1893.) In the U.S. it's supported most prominently by FairVote, which seems to be a pretty well-funded organization. (FairVote calls the proposal RCV, and most media follows suit.)
The main proponent of approval voting seems to be the Center for Election Science, which is a pretty small organization, though I think they've gotten a little bigger over the last few years. There's also some STAR voting groups. None of these groups have huge piles of money to spend.
Then there's news organizations (including the NPR story posted here on HN today) promoting RCV/IRV.
On its face, IRV seems pretty straightforward and reasonable. Its problems are complicated and hard to describe. The examples of non-monotonic results where voters can cause their preferred candidate to lose by ranking them too highly are counterintuitive and hard to follow, and I wouldn't blame anyone for thinking: "What? that doesn't make sense. You must have made a mistake somewhere..."
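A concrete case may make the non-monotonic behaviour easier to follow. The sketch below is a bare-bones IRV count with made-up vote totals; the function name and ballot encoding are assumptions, and exact ties for last place are not handled the way a real election rule would require.

```python
from collections import Counter

def irv_winner(ballots):
    """ballots: list of (count, ranking) pairs, ranking being a tuple of names.
    Repeatedly eliminates the candidate with the fewest top-choice votes."""
    remaining = {c for _, ranking in ballots for c in ranking}
    while True:
        tally = Counter({c: 0 for c in remaining})
        for count, ranking in ballots:
            # Each ballot counts for its highest-ranked candidate still in the race.
            top = next((c for c in ranking if c in remaining), None)
            if top is not None:
                tally[top] += count
        leader, votes = tally.most_common(1)[0]
        if votes * 2 > sum(tally.values()) or len(remaining) == 1:
            return leader
        # Ties for last place are broken arbitrarily in this sketch.
        remaining.remove(min(remaining, key=lambda c: tally[c]))

before = [(39, ("A", "B", "C")), (35, ("B", "C", "A")), (26, ("C", "A", "B"))]
# Ten of the B>C>A voters change their minds and move A to the top: A>B>C.
after = [(49, ("A", "B", "C")), (25, ("B", "C", "A")), (26, ("C", "A", "B"))]
print(irv_winner(before), irv_winner(after))
```

In the first election C is eliminated and C's ballots transfer to A, who wins 65 to 35. In the second, the extra support for A changes who is eliminated first: B drops out, B's ballots transfer to C, and C beats A 51 to 49.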
There's also the common sentiments of "this is an emergency we need to fix right now" and "everyone's already on the RCV bandwagon" and "replacing a broken system with a half-broken system is usually politically more feasible than replacing a broken system with a not-broken system" and "don't let the perfect be the enemy of the good" and so on.
There are also criticisms of the other voting systems. Some people are bothered by approval voting's lack of ordered preferences, for instance, and score voting is susceptible to strategic voting where the most extreme rankings have the most influence. STAR voting mitigates that to some extent, but it's not perfect.
I worry that we'll replace FPTP with IRV, find out later that it didn't live up to expectations, and then when voting reform advocates start pushing for something better like approval voting or STAR, the voters will say, "why should we believe you when a bunch of voting reform advocates promised unicorns and rainbows and we got a system that doesn't work any better than FPTP?" And then they'll either say to heck with all this voting theory nonsense and go back to FPTP or just live with IRV. And that's optimistically assuming that our future selves don't just abandon democracy itself as an unworkable idea (and though I reject that premise I have to concede that there's no lack of evidence pointing in that direction).
IRV has a not-complicated-to-explain problem: you can screw up the ballot and invalidate your vote.
> Unlike many single-winner methods, instant-runoff cannot accept equal rankings, and must discard ballots with multiple first-preferred remaining alternatives: such ballots would be equivalent to casting multiple ballots in a plurality election
With all the hubbub on the last election about vote validity, and hanging chads before that, the last thing we need is screw-up-able ballots.
IMHO we should not even consider any methods in which you can violate invariants on a paper ballot. That rules out FPTP, IRV, and any others where the user with pen-and-paper can enter an invalid state.
Err, I guess that rules out 321, which is my new favorite. Simple fix: just count multi Bad/Ok/Good entries toward that total. In the first round, you only look at "good" scores, ignoring ok/bad. Then in the bad round, same thing. This ensures a) you can't produce a bad ballot b) ballot counts stay orthogonal for each category c) multi-entries kinda average out.
E2: well I guess you can manage to screw up ANY style of paper ballot and violate some invariant. I guess what matters is how difficult it is to screw up and whether it's recoverable.
I think 321 lets you have as many good/bad/ok candidates as you like. The thing it doesn't allow is marking a candidate as, say, both good and bad at the same time. (I'm not an expert though, I just heard about 321 yesterday, and my source of knowledge is the most plausible-looking google search result.)
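For reference, here is a rough sketch of a 3-2-1 count as it is usually summarized: three semifinalists by most "good" ratings, two finalists by fewest "bad" ratings, then one winner by head-to-head comparison. It leaves out any refinements the full proposal may include, and the function name, the ballot encoding and the choice to treat unrated candidates as "bad" are assumptions.

```python
RANK = {"good": 2, "ok": 1, "bad": 0}

def v321_winner(ballots):
    """ballots: list of dicts mapping candidate -> "good" | "ok" | "bad".
    Candidates missing from a ballot are treated as "bad" (an assumption)."""
    candidates = sorted({c for ballot in ballots for c in ballot})

    def count(candidate, rating):
        return sum(1 for b in ballots if b.get(candidate, "bad") == rating)

    # 3 semifinalists: the candidates with the most "good" ratings.
    semifinalists = sorted(candidates, key=lambda c: -count(c, "good"))[:3]
    # 2 finalists: the semifinalists with the fewest "bad" ratings.
    finalists = sorted(semifinalists, key=lambda c: count(c, "bad"))[:2]
    if len(finalists) < 2:
        return finalists[0] if finalists else None
    x, y = finalists
    # 1 winner: the finalist rated above the other on more ballots.
    x_over_y = sum(1 for b in ballots if RANK[b.get(x, "bad")] > RANK[b.get(y, "bad")])
    y_over_x = sum(1 for b in ballots if RANK[b.get(y, "bad")] > RANK[b.get(x, "bad")])
    return x if x_over_y >= y_over_x else y

ballots = (
    [{"A": "good", "B": "ok", "C": "bad", "D": "bad"}] * 4
    + [{"C": "good", "B": "ok", "A": "bad", "D": "bad"}] * 3
    + [{"B": "good", "A": "ok", "C": "ok", "D": "bad"}] * 2
    + [{"D": "good"}]
)
print(v321_winner(ballots))
```

Because each ballot is a mapping from candidate to a single rating, any number of candidates can share a rating, but one candidate cannot be both good and bad. In this made-up example B has fewer "good" ratings than A or C but the fewest "bad" ratings, and beats A 5 to 4 in the final comparison.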
I agree that a likelihood of higher-than-normal percentage of spoiled ballots is a potential problem with IRV, at least with mail-in or in-person paper ballots. Voting machines shouldn't have a problem with it.
Approval is about as simple as it gets. In order to spoil the ballot you'd have to do something pretty strange like tear the ballot in half. Or with mail-in ballots, the usual things like not signing the envelope (assuming that's required in your state) or mailing it too late to arrive on time.
I don't see ballot spoiling as necessarily a show-stopper for using RCV on its own, but there are other compelling reasons not to use it (such as failing monotonicity).
It's easy enough to have a scanner reject an invalid ballot and force you to fill out a new one. At least in NYC where I vote, you feed it into the scanner yourself.
And you've always been able to produce invalid ballots, like voting for two candidates when you're only allowed to pick one.
You're describing a UX problem which has UX solutions. It's not even remotely a justification for selecting a democratic procedure to express the will of the people.
Another benefit of Approval Voting ballots is you basically cannot spoil them. You can't rate/rank the same candidate more than once, and marking multiple candidates is valid.
> You're describing a UX problem which has UX solutions. It's not even remotely a justification for selecting a democratic procedure to express the will of the people.
Agreed but that's far from my only gripe with FPTP and IRV.
Personally, I fail to see how STAR and approval are superior.
If I were to lobby for some electoral change, I would lobby for approval votes, because they are easy to lobby for (just change "you get ONE vote" into "vote for as many people you like"). But compared to ranked choice, it creates a situation where you can't differentiate between candidates among whom you clearly have a preference, and this has large odds of changing the outcome.
OTOH, with ordinal methods you are forced to distinguish between candidates that you think are equally good. Say if you have three candidates, A, B, and C, where A and B are very similar and you like both, though with a slight preference to A, whereas you dislike C very much. With approval voting you'd approve A and B, with score voting (say the vote is an integer between 1 and 5) you could vote A=5, B=4, C=1 (or A=5, B=5, C=1 if you think A and B are really close). Whereas with an ordinal system you'd vote A-B-C, but you have no way of 'telling the voting math' that the distance between A and B is much shorter than between B and C.
Looking at the results of these simulations described in the article, and similar simulations using 'Bayesian regret', it seems that these issues tend to make cardinal systems better. Also, considering the various gotchas of voting systems, there is the famous Arrow impossibility theorem. However, it applies only to ordinal systems. There is no similar thing for cardinal systems.
> Whereas with an ordinal system you'd vote A-B-C, but you have no way of 'telling the voting math' that the distance between A and B is much shorter than between B and C.
Kenneth Arrow is famous for his impossibility theorem, but he also rejected cardinal systems out of hand, because you can't honestly judge the strength of your own preferences compared to other voters. He argued (pretty well, I thought) that the extra information in cardinal systems was useless.
> Kenneth Arrow is famous for his impossibility theorem, but he also rejected cardinal systems out of hand, because you can't honestly judge the strength of your own preferences compared to other voters. He argued (pretty well, I thought) that the extra information in cardinal systems was useless.
Indeed, that is a strong argument in favor of approval voting; if everybody votes strategically under score voting it reduces to approval voting, so why bother with the extra complexity of score voting? FWIW, I recall Arrow himself said in some interview that his personal favorite was approval voting.
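The "reduces to approval voting" step can be sketched with a toy exaggeration strategy. The 0 to 5 scale, the rule of using the voter's own mean score as the cutoff, and the function names are all hypothetical choices for illustration, not a claim about what the best strategy actually is.

```python
def exaggerate(scores, lo=0, hi=5):
    """Push every score to an extreme: hi if above the voter's own mean, else lo.
    Using the mean as the cutoff is a hypothetical heuristic, not a known optimum."""
    cutoff = sum(scores.values()) / len(scores)
    return {c: (hi if s > cutoff else lo) for c, s in scores.items()}

def score_totals(ballots):
    """Sum the scores each candidate received across all ballots."""
    totals = {}
    for ballot in ballots:
        for candidate, score in ballot.items():
            totals[candidate] = totals.get(candidate, 0) + score
    return totals

honest = [
    {"A": 5, "B": 4, "C": 1},
    {"A": 2, "B": 3, "C": 4},
    {"A": 1, "B": 5, "C": 2},
]
strategic = [exaggerate(b) for b in honest]
# Treat a maximum score as an approval and count approvals per candidate.
approvals = {c: sum(1 for b in strategic if b[c] == 5) for c in "ABC"}
print(score_totals(strategic), approvals)
```

Once every ballot uses only the two extreme values, the score totals are just the approval counts multiplied by the top score, so the two methods rank the candidates identically.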
Secondly, I'm not sure I understand the point about strength of preferences. Since everybody has the same range from within to assign preferences, it's all about the relative preferences among the candidates. Doesn't matter if I hate candidate X with the heat of a thousand suns, I can't give that candidate a score of -1000, only the minimum, just like everybody else.
With approval voting, a voter faces an impossible to solve dilemma. If they hate one candidate a lot, and like one candidate a lot, should they approve all candidates except the one they hate, or only approve the one candidate they like?
With strength of preferences, some people will give their scores as 2,3,4 (out of 5) because nobody is perfect and nobody is truly the worst imaginable. But other people will give out scores 1,3,5 if they feel that has more effect, or because they are comfortable using score extremes everywhere else in life - with no real difference in the strength of conviction between those two groups of people.
And some people will use only 1,5 because they think giving every candidate except the one you hate a 5 increases the chance of the one you hate not getting in, even if they don't really like any of the others, or like one more than the others. It doesn't mean they are "pro" any of the candidates.
And some will give their favoured candidate 5 and everyone else 1 or 2.
> Secondly, I'm not sure I understand the point about strength of preferences. Since everybody has the same range from within to assign preferences, it's all about the relative preferences among the candidates. Doesn't matter if I hate candidate X with the heat of a thousand suns, I can't give that candidate a score of -1000, only the minimum, just like everybody else
Are the relative preference intensities expressed real? Everybody has the same range to vote Uber drivers, but apparently they're all nearly perfect.
If a very successful campaigning organization convinces people that they have to give the Judean People's Front 0 for the People's Front of Judea to win even though the parties have similar policies and lots of Judeans strongly prefer them to the Romans, is the resulting distribution a real or artificial expression of the intensity of preferences?
1) Arrow's theorem isn't really about voting though. 2) Arrow's theorem doesn't apply to cardinal systems. 3) Arrow said that for voting cardinal systems were probably superior.
From wiki
> Arrow originally rejected cardinal utility as a meaningful tool for expressing social welfare, and so focused his theorem on preference rankings, but later stated that a cardinal score system with three or four classes "is probably the best".
Again note that he's talking about social welfare and not voting.
I personally favour ranked voting with the feature that you can put multiple candidates at the same rank, and you can leave off candidates you really dislike.
(Putting candidates in the "leave off" set is the same as ranking them below a "none of the below" pseudo-candidate because it means they won't inherit unused vote shares during counting).
That solves your distinguish and C-dislike objections, and feels more honest when voting.
It also lets you feel fair when there are two candidates you both like equally and don't want to add to signal-favouritism when the results come out.
At a local club (where people are friendly so it's not meant to be competitive like in government), having to rank multiple loved and personally known candidates for a committee occupied by multiple people, when you like several equally, doesn't change the outcome at all, assuming they win, but it tends to greatly amplify the tiny, irrelevant biases from the most petty things people have to draw on into very large differences in reported scores. This makes it look like one person is very much preferred to be in charge of the rest of the committee or team, much more than is actually true, i.e. it doesn't actually represent the voters' intentions. The election outcome is representative, but the outsized score differences are interpreted wrongly. So this seeds ungrounded internal divisions after the vote, peculiar strategic voting at subsequent elections, and peculiar behaviours to chase after those "tiny, irrelevant biases". This isn't good for anyone involved. So it's good to allow voters to rank candidates equally for this reason too, if the voter genuinely feels that way.
Can you point to a real example where you can't distinguish between candidates (I'm Australian, so vote with IRV, and I can't say it ever comes up, except perhaps when choosing who should be last vs second last ;)? Approval voting would seem to have the same convergence to a two party system as FPTP where you have a chamber composed of multiple members, a mayor or similar role would be different.
Multi-member representation would be ideal, but I can see the appeal of having a single person represent you.
> Can you point to a real example where you can't distinguish between candidates
With hundreds/thousands(?) of ranked voting elections worldwide, with certainly millions of voters, you're statistically guaranteed to have that situation for some voter somewhere.
> Approval voting would seem to have the same convergence to a two party system as FPTP where you have a chamber composed of multiple members, a mayor or similar role would be different.
Not sure the voting method used for single-winner elections matter that much for the dynamics leading to a duopoly or not. I think convergence to a duopoly is most strongly influenced by winner-take-all districts. The solution to that, as you pointed out, is multi-member districts with proportional representation.
> but I can see the appeal of having a single person represent you
... assuming that 'your' party wins. Otherwise you have no representation at all and you feel disfranchised?
I'd say the major appeal of single-winner districts is that the districts are much smaller, and you have a better chance to get to know your own representative (again, assuming it's somebody you find broadly acceptable).
> With hundreds/thousands(?) of ranked voting elections worldwide, with certainly millions of voters, you're statistically guaranteed to have that situation for some voter somewhere.
One could argue the same thing applies when a voter chooses whether or not to approve a vote for a candidate they don't like in order to prevent an even more disliked candidate from winning. To me, it seems a false choice either way (to order or to approve).
Possibly this is a framing problem: given I've grown up in a system where IRV/STV is used, I naturally compare candidates; maybe someone growing up in an approval voting system would naturally think about whether they like a specific candidate or not.
> ... assuming that 'your' party wins. Otherwise you have no representation at all and you feel disfranchised?
Having two houses where one uses STV and the other has single member districts does have its advantages as a "why not both" option.
I feel like STAR in practice ends up being approval voting with more complexity. I just learned about V321 though. It's an interesting concept for sure, and if I understood it correctly it shares an important property with approval voting: the ballots can be counted by hand in a single pass and you can figure out the results.
Probably because IRV is easier to actually do and doesn't really have any "free variables" in how you should behave. For example if I strongly like candidate A and I like candidate B a bit less then I should obviously rank them first and second. What about scoring? Should I give them 10, 9? But will that penalise B when A is eliminated? Similarly should I give them Good, Good, or Good, Approve?
You kind of have to keep the whole voting system model in your head to decide.
You might not think it's very difficult but even ranking candidates is too complicated for most people. Anything more complicated than that has no chance in the real world.
Honestly I think we just need some big name to advocate for STAR (or any cardinal system). IRV really took off after CGP Grey's video and gained WAY more support after Hasan Minhaj's episode. The only places I see people talking about cardinal systems are Hacker News and the Yang subreddit (though there are lots of armchair experts advocating for Condorcet methods because they don't see criteria other than VSE as being important). I would not expect cardinal methods to enter the public discourse until some famous person talks about them.
Maybe, but as I understand it, within voting system reform there's at least one big practical conflict: the tradeoff between 'the best' vs. the simplest voting system that you can convince people is marginally better. The average voter has no clue what's wrong with the current system.
Proponents of one system over the other worry that voters will get confused, or worse, that established groups that disagree with the reform could attack a new voting system as too complicated.
Personally I think this is part of the reason approval has gotten more traction than score or STAR systems, but who's to say. The Center for Election Science has better PR and web design than https://www.rangevoting.org/, which largely advocates for Score voting.
But it's a multifaceted issue. Academics argue over different methods, and even as a casual observer it seems pretty clear that no system is flawless.
For anyone who hasn't read "Gaming the Vote" by Poundstone, it's a useful introduction that discusses both the academic study and practical use of voting systems.
I'm not convinced eliminating primaries is a good idea. They have a useful role in reducing the effort and cost of campaigning in a general election. Consider a scenario where some party spends a hundred million dollars in the general election promoting one candidate versus another party that spends a hundred million dollars in the general election split between five different candidates. I would expect the former party to be more likely to get their candidate elected than the latter.
Also another nice thing about primaries is that they're staggered. I think it's ridiculous that Iowa and New Hampshire always go first, but ignoring that it's great that a candidate without a lot of funding only has to focus on a couple of states in the initial stage, and can use success in the early states to gain momentum (and funding). That means you don't have to be some self-funding rich person to run for president, which is a nice property.
> They have a useful role in reducing the effort and cost of campaigning in a general election.
If the goal is to reduce costs, the conventional wisdom is to shorten the campaign season. Time box them. Say 4 weeks.
Eliminating primaries reduces one whole campaign cycle. (Candidates consider primary and general election as two separate campaigns.)
> it's great that a candidate without a lot of funding
Third party access to the ballot is crucial. Support proven methods. Public financing of campaigns is probably the most effective. Fair redistricting lowers barriers. Greatly increase the number of representatives by reducing the size of districts. There are many other proposals.
I would really enjoy staying in touch with people who spend a good number of brain cycles on these topics. What forums or discussion groups would you all recommend?
Because elections and voting are trigger issues, it's a huge challenge.
Ages ago, before I burned out, Election Verification Network (EVN) was really good. Bringing together academics, activists, officials. Very low key, low profile. To avoid the outrage machine.
EVN mostly focused on gear and procedures. Not much overlap with adjacent topics.
I organized an every-partisan election integrity group for a while. We had some small local victories. (Prohibit putting serial number on paper ballot linked to voter ID. Delayed adoption of janky new gear.)
But drift into adjacent issues and coalitions fall apart. For instance: Voter registration databases. Correct answer is universal automatic voter registration where everyone is listed and has an eligibility flag, with audit history. But improving data quality is a no go because of misbelief that higher registration (and participation) has partisan advantage. (Spoiler: It doesn't.)
Another fundamental problem is being outside looking in. Asking the establishment to modify the very system that got them their power is really, really challenging. The vetocracy. That's why all these stupid obvious reforms with +70% support languish. Default answer is always "no", with demands for ever more ridiculous standards of proof, because reasons. This is a fundamental trait of all orgs, groups. Why would the public sphere be any different? Sadly, when change does finally happen, it's disruptive. Punctuated equilibrium vs slow and steady progress.
The Center for Election Science doesn't hide their preference* and advocacy:
> Let’s put approval voting ON THE MAP!
* Generally, each organization has some kind of bias, and I appreciate it when they are up-front about it.
My preferred group would have these biases / goals:
1. Valuing a diverse group of people
2. Valuing clear and honest discussion about member preferences in voting systems
3. Valuing understanding among the group members
4. Help group members form alliances with each other, should they desire to organize and do advocacy, because while each of us might have our 'preferred' voting system, we all benefit from moving beyond the 'worst' voting systems
What's kinda interesting about Election Science is how they've changed strategies. They used to push STAR till about a year ago and have since moved to approval voting. It seems they have found that approval is easier to sell to the average person and is still good enough.
> Valuing clear and honest discussion about member preferences in voting systems
This is a difficult thing tbh. A lot of people learn about voting systems from CGP Grey or Hasan Minhaj and get really passionate, so you get armchair expertise. Then I think many learn about VSE and see Condorcet methods as the obvious winners (I used to be in this camp). But it often takes a while to internalize all the nuances in voting: how to balance VSE, resiliency, simplicity, transparency, computational cost, and more. It is one of those problems that looks easier than it is. A lot of very smart people get into it because it is an interesting problem, but it is also easy to convince yourself that your understanding is far better than it is. Every time these threads come up I learn something new, and I've been interested in the subject for almost a decade. I think the biggest thing I've learned is to look at the people who have been studying the problem for a long time and see why they are making their decisions. It is difficult to separate the signal from the noise.
In addition to the discord link I also suggest following Clay Shentrup[0]. He's active with Election Science and the co-inventor of STAR (he also typically joins voting threads on HN. Username is his full name). The reason I follow him is that reading his comments and posts has led me to a lot more sources and brought up a lot of the above nuances I didn't understand when I first started getting into the subject.
You may be confusing CES with some other organization. I asked on CES's public discord:
ranicki: someone on HN claimed that Center for Election Science switched from pushing STAR to pushing approval. is it true that this organization formerly supported STAR, or was it one of the other organizations? I don't remember who was advocating for what
-redacted name-: Center for Election Science has broadly supported cardinal voting methods, score voting and approval voting basically since its inception. From everything I can tell it still is supportive of cardinal methods broadly, but as a political tactic has honed in on approval voting specifically due to its Pareto optimality.
...
-redacted name 2-: I don't think CES ever changed from STAR to approval. IIRC they existed long before STAR was invented
(Names are redacted because I didn't ask permission to quote, but the discussion is publicly visible and not sensitive at all.)
> "From everything I can tell it still is supportive of cardinal methods broadly, but as a political tactic has honed in on approval voting specifically due to its Pareto optimality."
Hmmm... the notion of Pareto optimality is driving political tactics (about what voting method to back)? I know what it means, and it seems strange to me. This suggests that a specialized economic term was used as a motivation for a tactical political decision.
CES was originally a research-based organization, but recently shifted to advocating Approval Voting and trying to get it implemented. (I think they should rename as the Center for Approval Voting.)
Their forum and Google Group for discussion of voting systems were good resources, but are now being shut down.
Funny enough, Washington wasn't a fan of parties. The argument against them is that they encourage lazy voting. Sure, there are advantages in that a party will help summarize a list of policies, but the disadvantage is tribalism. I think in modern day we don't have the same need for policy hinting (what parties give) because every major politician lists all their policies and positions on their campaign websites. 50 years ago this would be a logistical nightmare.
That said, if we adopted something like score or STAR I don't think there would be any problem with parties, nor would they be destroyed. You still get the advantages of pooling resources. Have a strong preference for members of the hippo party? You can donate and campaign for them even while you hash out which candidate best represents the coalition of people at the time. Think the hippo party played dirty and kicked out your favorite candidate Salamander-Gibbon Yack but really don't like the opposing party Tree Trunks? You can still vote for your favorite person without any detriment (the system is monotonic). This actually gives the hippo party less power to determine for you which candidate you prefer.
Better yet, you get a lot of information! So even if the hippo party's leading candidate wins but you find that Salamander-Gibbon Yack was an extremely close second then the hippo party will have to update their policies (hopefully). By rating, instead of ranking, candidates you have a substantially more expressive ballot and we can data mine and learn from it through this expressiveness. We live in the age of big data and yet in politics we compress data to a point that is almost meaningless.
Cardinal systems won't destroy political parties, but they prevent them from controlling and manipulating the elections.
We had political parties long before we had primaries. Presidential primaries were only introduced in the 1970s. Primaries for state office were introduced in the 1890s, died out, and were introduced again in the 1920s.
But we had parties going basically back to the beginning of the country, and the current two parties crystallized during the 1850s -- long before primaries. Primaries were introduced as a way of reducing the power of party bosses over the parties.
Good. Breaking up the duopoly of US parties with deeply calcified sets of positions and constituents in favor of a more flexible set of platforms (that may or may not overlap with others on various concerns) could lubricate the government (especially the legislature) into operating based on policies and priorities that more accurately represent the majority view on any given topic.
> Approval voting has the best balance between simplicity, fairness, and certainty.
Approval voting is completely inappropriate for common situations in Europe, where you have center-right, center-left, far-right and far-left candidates (e.g. presidential elections in France). Reasonable people should approve both center candidates despite strong political disagreement with the other side (which is still better than fascists or communists). That essentially removes their left-versus-right preference from the vote (although finding a balance between right and left is one of the most important aspects of elections).
But that assumes people would hold to the common-good strategy and not defect to vote only for their preferred candidate (while risking that extremists win). As the side that defects first wins, it is likely that it will just degrade to plurality voting.
I've been a proponent of STAR for some time, but today I'm happy to learn about 321. It sounds even easier to explain and implement than STAR (though approval is likely the method with the least friction using existing equipment).
I really don't see the appeal of IRV, other than momentum. It's harder to implement. It's not really intuitive why you have to rank them. I believe it has a higher chance of screwed-up ballots (e.g. forgetting to rank a candidate). And I don't like how it forces you to split hairs on two equally liked candidates. Minimizing mental friction is important when we struggle to get high voter turnout as-is.
I think 321 most closely matches the political climate in America. People tend to like a candidate, hate them, or "eh, they're fine."
Also, not totally sure (maybe someone can chime in), but I think 321 is "stateless" or "commutative" in that the counts of Goods, Oks, and Bads are independent. With any runoff methods, each ballot has a "state", because after a knockout, you need to apply the runoff votes based on other information on that ballot. This makes it hard to compose ballots, e.g. for our goofy ass electoral college system. With 321, (I think) you can decompose each "channel" total for each candidate, and aggregate them at a higher level.
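A minimal sketch of the "decompose each channel and aggregate at a higher level" idea, assuming ballots are simple Good/OK/Bad ratings per candidate. The function name and the sample precincts are hypothetical. It only covers the rating totals; if the final step of 3-2-1 compares the two finalists ballot by ballot, pairwise counts would be needed as well (those are also additive across precincts, but they are not modeled here).

```python
from collections import Counter

def tally(ballots):
    """Count Good/OK/Bad ratings per candidate for one precinct.

    Each ballot is a dict mapping candidate name -> "good" | "ok" | "bad".
    Returns a Counter keyed by (candidate, rating).
    """
    counts = Counter()
    for ballot in ballots:
        for candidate, rating in ballot.items():
            counts[(candidate, rating)] += 1
    return counts

# Hypothetical precinct data, for illustration only.
precinct_1 = [{"A": "good", "B": "ok", "C": "bad"},
              {"A": "bad", "B": "good", "C": "ok"}]
precinct_2 = [{"A": "good", "B": "bad", "C": "ok"}]

# Adding precinct tallies gives the same totals as counting all ballots at once,
# so no individual ballot has to travel beyond the precinct.
combined = tally(precinct_1) + tally(precinct_2)
print(combined == tally(precinct_1 + precinct_2))  # True
print(combined[("A", "good")])                     # 2
```

A runoff method like IRV has no such per-precinct summary, because which candidate a ballot counts for depends on who has been eliminated everywhere else.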
> I think 321 most closely matches the political climate in America. People tend to like a candidate, hate them, or "eh, they're fine."
I couldn't disagree more. I voted in yesterday's NYC Democratic mayoral primary (13 candidates on the ballot), and there were three candidates I considered "like", but with a very strong preference for the first over the second, as well as the second over the third. In today's political climate, people tend to be very passionate about their #1 choice, even if there are 3 they "like". That passion matters.
This is an argument for cardinal systems though... (more score/STAR rather than approval or 321). When ranking, you encode in such a way that your preference for one candidate over another is uniform. But in scoring you don't have this limitation. You can express that you REALLY are passionate about 2 candidates but have a slight preference for one over the other, are lukewarm about a few, and really dislike others. Ordinal systems do not allow you to express this.
Score voting does solve a lot of problems, you're right.
Unfortunately it also introduces a ton of new ones, because while it's easy to compare ordinal preferences, there's no single good intuitive way to compare cardinal preferences.
If you give voters a scale from 0 to 10, one person's 3 is another person's 5. It's not even remotely clear that one person's vote of 8 should count the same as two other people's vote of 4, or even that the same person's 8 should count double their 4. Is the preferences scale linear? Logarithmic? Exponential? Other?
The arbitrariness and therefore unknown meaning of cardinal voting scores, and cardinal preferences generally, is the reason why philosophers mainly abandoned cardinal measures of utility a long time ago, and most utilitarian philosophy is based on knowledge of ordinal preferences only.
> there's no single good intuitive way to compare cardinal preferences.
What? You can literally rank them if you're confused. Better yet, rank them with unequal spacing (A:5, B:4, C:2).
It is hard to take your claim seriously because it has been tested in real world settings. This isn't a theoretical proposition (and you're arguing over theory). Also did you really just say "logarithmic vs exponential?"
The reality doesn't match your hypothesis and that is testable.
You're claiming philosophers abandoned it and I don't see evidence for this. Also I'm not sure why philosophers matter here, this isn't philosophy. Cardinal methods are the preference of most researchers that study voting. I don't know about you, but I tend to go with expert opinion (not absolutely, but it is a strong signal that something is the right path to follow. Then you just vet the work).
> What? You can literally rank them if you're confused.
That's ordinal, not cardinal then. Your example "(A:5, B:4, C:2)" gives zero indication as to what 5 vs 4 means, or 4 vs 2, beyond simply being "greater than". That's the entire point.
> It is hard to take your claim seriously because it has been tested in real world settings.
I have no idea what that even means. Any voting system can be "tested". It doesn't mean it produces desirable results. The philosophies behind different voting methods aren't "testable". Political philosophy is philosophy because it isn't testable -- it's about what's desirable from some ethical viewpoint. E.g. we don't "test" majoritarian voting, we defend it from principles of equality.
> Also did you really just say "logarithmic vs exponential?"
Yes I did. y=2^x and y=log(x) have different shapes qualitatively. In other words, is voting 10 twice as strong as voting for 9, or only a teensy bit stronger? Not sure what's confusing you here, or what words you think would express that difference more clearly.
> You're claiming philosophers abandoned it and I don't see evidence for this.
It's just the basic history of utilitarianism or consequentialism. Any survey on that can point you in the right direction as to why cardinal preferences have been largely abandoned.
(I'm obviously not talking about economics where "utility" is commonly measured in dollars. Most analyses of voting methods are concerned with what should be the "right" outcome, not the dollar-maximizing outcome. Social choice theory has been the attempt by economists to essentially measure votes in dollars, but even that still works almost entirely with ordinal preferences that are then analyzed in dollar outcomes.)
> Also I'm not sure why philosophers matter here, this isn't philosophy.
Voting methods are a huge part of political philosophy. Of course it is. It basically straddles political philosophy and political science in the literature. (Then there are some additional contributions by economists, especially when it gets deeper into the mathematics of it, mainly in social choice theory.)
> Cardinal methods are the preference of most researchers that study voting.
That's absolutely the opposite of my experience and knowledge. There are absolutely some people who study them, but they're absolutely not the focus of "most" researchers, at least not in the realm of political philosophy and political science. Voting with cardinal preferences is also barely used in real-world political governance.
Again, the fundamental problem with cardinal voting is simply that there's no intuitive, easily agreed-upon meaning for what cardinal values even mean. Most succinctly: on a 10-point scale, define the difference between a "4" and a "5", and argue why your definition is superior to 10 other definitions that are trivial to come up with. Also explain how you'll get voters to reliably understand and vote according to that same definition as well.
On the other hand, ordinal preferences are entirely intuitive and unambiguous to express and interpret. Which isn't to say they're perfect, but they give researchers something they can actually work with, both theoretically and practically.
Ordinal preferences are fundamentally wrong, because they arbitrarily assign equal weight to preferences that are not equal. For example, if I am indifferent between two candidates, and a ranked voting system forces me to choose one over the other (randomly), while you love one and hate the other, it is undemocratic for my weak preference to have the same weight as your strong preference. Such a system is less likely to find the most representative candidate than one which takes that strength of preference into account.
> Again, the fundamental problem with cardinal voting is simply that there's no intuitive, easily agreed-upon meaning for what cardinal values even mean.
Cardinal values are perfectly meaningful. They can be interpreted as expressions of ordinal preference for probabilities of different outcomes, for instance.
For example, if I prefer Strawberry ice cream over Chocolate ice cream, and Chocolate ice cream over Garlic ice cream, and am given the choice between:
- A box that definitely contains Chocolate
- A box that has a 50/50 chance of containing Strawberry or Garlic
I would choose the safe bet of Chocolate, because my preference for C>G is much stronger than my preference for S>C. By varying the probability and making the choice again, we can estimate the ratio of the strengths of my preferences between the three options.
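The "vary the probability and choose again" procedure can be sketched as a bisection. This is an illustrative sketch, not anyone's published method: it assumes the chooser maximizes expected utility, fixes u(Garlic) = 0 and u(Strawberry) = 1, and uses a made-up chooser whose utility for Chocolate is 0.8. The function and variable names are hypothetical.

```python
def estimate_utility(prefers_sure_thing, tol=1e-6):
    """Bisect on the win probability p until the chooser is indifferent.

    prefers_sure_thing(p) returns True if the chooser takes the sure middle
    option over a lottery that yields the best option with probability p
    and the worst option otherwise.
    With u(worst) = 0 and u(best) = 1, the indifference point equals the
    utility of the middle option.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if prefers_sure_thing(mid):
            lo = mid   # lottery not attractive enough yet, so raise p
        else:
            hi = mid   # lottery already preferred, so lower p
    return (lo + hi) / 2

# Hypothetical chooser: true utility of Chocolate is 0.8 (an assumption).
TRUE_U_CHOCOLATE = 0.8

def chooser(p):
    # Expected utility of the lottery is p * 1 + (1 - p) * 0 = p.
    return TRUE_U_CHOCOLATE > p

u_c = estimate_utility(chooser)
ratio = u_c / (1 - u_c)   # strength of C>G relative to strength of S>C
print(round(u_c, 3), round(ratio, 2))  # 0.8 4.0
```

With these made-up numbers the chooser takes Chocolate over the 50/50 box, as in the example above, and the C>G preference comes out about four times as strong as S>C.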
Wow, I envy you. We don't get a lot of choice in Upstate NY. You're lucky if you even get 2 candidates for a lot of the smaller races.
I'm probably over-generalizing but I still think this disparity is a result of low/med density vs high density like NYC. You're always gonna have more choices in high density areas, and I'm definitely in favor of more granular signals whenever possible.
I'd love to see an IRV where you don't have to obey any strict ordering rules (which I guess is like score voting where the max score is the number of candidates) which I guess might be a thing?
These results aren't passing the sniff-test for me.
If I understand, "honest" voters in the score-like systems stick to some magically-shared mapping of their own utility to scores on the ballot (totally unreasonable, but let's grant it). Strategic voters, meanwhile, just use the ballots to maximize their own utility.
Let's take the simplest, common scenario of two clear frontrunners. In score voting's one-sided strategic scenarios:
- Honest voters who support candidate A will put down magical normally-distributed numbers. Their scores for A end up perhaps 10-50% higher than their scores for B.
- Strategic voters who support candidate B will maximize B's chance of winning by putting down the highest score for B and the lowest score for A. Scores for B will be uniformly 100% higher than scores for A.
In this scenario it takes perhaps 2-10 honest A voters for every strategic B voter for A to win. Yet somehow, the VSE values for 1-sided strategic score scenarios are all much better than plurality, and equal with approval?
This is a commonly-discussed downside for any score voting system, where it gives strategic voters extra power, and, in these 1-sided scenarios, is worse than plurality. The downside is appearing nowhere in this data.
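The 2-10 figure follows from simple arithmetic, sketched below. It assumes "10-50% higher" means 10-50% of the full ballot scale, and that a strategic voter's gap is the whole scale; both readings are assumptions, and the function name is hypothetical.

```python
def honest_voters_needed(honest_gap, strategic_gap=1.0):
    """Honest A voters needed to offset one strategic B voter.

    Gaps are on a ballot scale normalised to [0, 1]: honest_gap is how far
    an honest A supporter scores A above B; strategic_gap is how far a
    strategic B supporter scores B above A (1.0 means max minus min).
    """
    return strategic_gap / honest_gap

for gap in (0.10, 0.25, 0.50):
    print(gap, honest_voters_needed(gap))
# 0.1 10.0
# 0.25 4.0
# 0.5 2.0
```

These are break-even counts; A needs slightly more than this many honest supporters per strategic B voter to actually win.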
I believe you are stating that Score0to2 one-sided strategy should measure worse than plurality one-sided strategy. I don't think this is an obvious claim. It seems to depend on the underlying distribution of candidate quality (especially correlations of preferences).
I don't know what is up with 1-sided score vs 1-sided approval. That is an interesting observation you made and does look suspicious; we should expect the distortion of power to be greater than the loss of honest expressiveness.
When I looked at this project a year ago, I found other strange behaviors that led me to ignore its results, similar to how you are comparing its graphs to your intuition. There is no "correct" model in voting theory. Models are neither good nor bad, only to the author's taste or not. For example, with voters on a single dimension, should utility of a candidate be the squared distance, or the absolute distance? How should a candidate's positions be distributed, is n-dimensional Euclidean space good? Should some candidates be straight-up better than others, and what should the distribution be? How should voters be clustered, if at all? What does "strategic" voting mean in each context? What does "honest" voting mean in cardinal contexts?
The author's discussion of methodology was lacking, and the missing priors make its results partially meaningless. I also had no success looking through the code for even basic behavior, because it was too hard to navigate. However, this is only a graduate project, and most graduate projects (including mine) are of lower quality than this one, so I wasn't upset.
In score systems, voters can sometimes be made honest to an extent. For example, on a book rating site, you probably don't rate all books 0 or 5; you use the full scale. Your ratings have greater self-value than your desire to tip the global score. While your assertion is reasonable that it is not the expected "honest" voting behavior, it is also a behavior worth looking at. This goes back to discussion of methodology; all these assumptions need to be documented, rather than buried in hard-to-read code.
>Intuitive expectations are not the same as empirical data.
Sure, and I can't criticize the model with any certainty, or make my own firm claims, without doing modelling myself. But looking for concerns is fine. I can't say the model is wrong in this aspect even though it looks odd. However, I can think of alternative models which would definitely have the opposite result. (For example, squared distance as utility.)
>Of course there is. The ideal would be a model that is exactly like real life.
Practical considerations make this impossible. Validation of a model by comparing to reality only works for small subsets of voting theory at the moment, and weakly so. A complete model like this one has no hope. And if comparison to reality is the goal, I can already see gaps where it diverges from the best-known theory, by omission of features. So taste is important. I have my own preferences about what features are important, which others will disagree with. These preferences will be non-falsifiable for at least 50 years while we wait for experiments to catch up.
For example, I assume you mean affine scaling of utility into the [0, 1] range when you say "Normalization of sincere utilities". But we might add to each election a candidate whose goal is to nuke the country itself. His utility is massively negative for every voter. However, it isn't to my taste for this candidate's presence to distort "honest" votes; I'd rather there be some clamping at the ends. So even this "obvious" choice is not canonical.
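A small sketch of why the choice matters, using made-up utilities and a hypothetical clamping floor. It is not taken from the VSE code; the function names and numbers are assumptions for illustration.

```python
def affine_normalize(utilities):
    """Rescale so the worst candidate maps to 0 and the best to 1."""
    lo, hi = min(utilities.values()), max(utilities.values())
    if hi == lo:
        return {c: 0.0 for c in utilities}
    return {c: (u - lo) / (hi - lo) for c, u in utilities.items()}

def clamp_then_normalize(utilities, floor):
    """Clamp utilities below `floor` before rescaling (floor is a modelling choice)."""
    clamped = {c: max(u, floor) for c, u in utilities.items()}
    return affine_normalize(clamped)

# Hypothetical voter: A is best, B is in the middle, C is worst.
voter = {"A": 1.0, "B": 0.5, "C": 0.0}
# Same voter once a catastrophic candidate joins the race.
with_outlier = {**voter, "Nuke": -1000.0}

print(affine_normalize(voter))                        # B sits at 0.5
print(affine_normalize(with_outlier))                 # A, B, C all land above 0.99
print(clamp_then_normalize(with_outlier, floor=0.0))  # B is back at 0.5
```

Under plain affine scaling the outlier squeezes the three serious candidates into nearly identical "honest" scores, which is the distortion described above; clamping avoids it at the cost of introducing another free parameter.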
>Be specific. It just sounds like you don't understand how it was done.
Yes, that's the problem. I shouldn't have to go through code to understand necessary things. An expert in the field should be able to follow along by reading the paper. Although, this is not a paper, so holding it up to that standard is unfair. But the gaps in what I can figure out make the results non-interpretable in some of the aspects I most care about.
I think one of the results you most care about, from the model, is that Approval measures better than ranked choice. I think that's still a fair conclusion to arrive at.
The VSE metric implicitly rewards strategic voting, because strategic voting works. By "works" I mean it helps the individual voter maximize their chance of having their preferences reflected in the outcome.
> VSE cannot measure:
> - Any desirable characteristics of a voting method
> that do not directly relate to outcome (who wins).
This includes, for example, a value judgment that says "strategic voting is less desirable than honest voting".
An outcome where 30 strategic B voters win over 70 unstrategic A voters is one I would expect VSE to rate very poorly. That is not a value judgement, and is directly related to the outcome.
Your expectation is not aligned with how VSE is described. See my quotes above, and respond specifically to them if you would like me to continue discussing.
> If I understand, "honest" voters in the score-like systems stick to some magically-shared mapping of their own utility to scores on the ballot (totally unreasonable, but let's grant it). Strategic voters, meanwhile, just use the ballots to maximize their own utility.
Yes, this is how the terms are generally used in voting systems literature.
I was afraid some math would support a counterintuitive method, but I was glad to learn about 321 Voting for the first time as well as to see that it scored the best. Pretty neat!
> I was afraid some math would support a counterintuitive method
When you say 'math', what does that mean to you?
To me, especially in this case, 'math' as used here (a model) is ultimately a formalization of a decision criterion. The decision criterion used here, VSE, is a value judgment. Assuming the math is correct and the simulation bug free, I'm most interested in assessing the norms (human values and priorities) implicit in the mathematical scoring function.
Though mathematical logic is indeed deductive, using it to assess social systems is subjective.
It may be hard to model, but I would have liked to see how Asset Voting compared to the other voting systems.
Asset Voting seems to be almost unheard of, even among voting reform enthusiasts, despite being over 100 years old and one of the simplest systems out there. Basically it works the same way as a FPTP election, except that after the votes are counted, the losing candidates one by one get to reassign their votes to a remaining candidate.
It doesn't guarantee proportional representation, but it does solve the spoiler problem and doesn't force voters to decide whether to "approve" of a weakly-disliked candidate, or how to rank multiple candidates. The ballot papers wouldn't even have to change their instructions, and it can be easily counted by hand, allowing the votes for each candidate to be placed in their own stack, which makes the process more transparent.
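A rough single-winner sketch of the process described above. The big simplifying assumption is that each candidate has a fixed, known list of who they would hand their votes to; in a real Asset Voting election candidates would negotiate, so `transfer_prefs` and the elimination order used here are hypothetical modelling choices.

```python
from collections import Counter

def asset_vote(ballots, transfer_prefs):
    """Single-winner asset voting under a simplifying assumption.

    ballots: list of candidate names (one choice per voter, as in FPTP).
    transfer_prefs: candidate -> ordered list of candidates they would hand
        their votes to (hypothetical; real candidates would negotiate).
    Each round, the candidate holding the fewest votes is eliminated and
    passes all held votes to their most-preferred remaining candidate.
    """
    held = Counter(ballots)
    for c in transfer_prefs:
        held.setdefault(c, 0)
    remaining = set(held)
    total = sum(held.values())
    while len(remaining) > 1:
        leader = max(remaining, key=lambda c: held[c])
        if held[leader] * 2 > total:
            return leader                      # outright majority
        loser = min(remaining, key=lambda c: (held[c], c))  # ties broken by name
        remaining.remove(loser)
        heir = next((c for c in transfer_prefs.get(loser, []) if c in remaining), None)
        if heir is not None:
            held[heir] += held[loser]
        held[loser] = 0
    return remaining.pop()

# Made-up election: A leads on first count, but C's supporters are closer to B.
ballots = ["A"] * 40 + ["B"] * 35 + ["C"] * 25
transfer_prefs = {"A": ["C", "B"], "B": ["C", "A"], "C": ["B", "A"]}
print(asset_vote(ballots, transfer_prefs))  # B
```

Plain FPTP would elect A with 40 votes here; once C hands its 25 votes to B, B holds 60 of 100, which is the spoiler fix the description above refers to.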
I've seen some voting systems include a "delegation" feature, where instead of submitting a full ranking of candidates (or whatever), you can just vote for a single candidate and your full ballot will be determined by their full ballot. What you're describing seems to be basically IRV but where you are forced to delegate?
(OK, I guess the runoff isn't truly instant, but still.)
I worry that this might have the same sort of monotonicity problems as IRV normally does, only in terms of the candidates' ballots instead of the voters'.
That’s an interesting concept although it does set the likely stage that a candidate pisses off their base by making some kind of personally politically convenient deal. Also the strong opening it leaves for even the appearance of corruption isn’t my favorite property.
Are there real-world examples of how such systems operate in practice?
Yes, Swiss federal parliamentary elections have a similar system. The name of the other parties your vote goes to if the one you voted for didn’t pass is printed on the voting bulletin.
The system is a bit more complex, since it’s for the parliament and you aren’t really voting for parties, but for lists, and the votes get transferred depending on how many seats the list filled, but the basic idea is here.
This actually influenced my voting choice. I went for a more niche party who had little chance to get a seat. I knew that if it didn’t get enough votes to get a seat, my vote wouldn’t go to waste. It would go to the larger party which already has many seats but doesn’t align with my political views as well as the niche one.
It turned out well since the niche party got a seat despite the polling predictions.
Not necessarily; I think it's usually more of a prisoner's dilemma situation: if you optimize for getting what you want, then the overall outcome is worse.
For example, in the results Borda count is much worse when everyone strategic votes.
The new results contrast with an older simulation (graph shown down the page): https://electionscience.org/library/tactical-voting-basics/ Those results show honest voting always being (globally) better. That's one reason why I find the new results surprising.
I like range voting (of which approval is a special case). It's simple, and it always takes everybody's point of view into account (it doesn't lose any voter information), which is a really nice property IMHO.
Yes, you can argue that the more extremely somebody puts the ranges, they have an advantage. (I would argue we should always normalize the cast vote so that most wanted candidate gets 100 and the least wanted 0, but then we take away people's ability to express only mild interest.) And through that, they can in certain cases manipulate the outcome.
I feel like we cannot really meaningfully say that there is a difference between "real" preferences and "manipulated" ones. For example, this article (from my cursory reading) has a model of how media (or others) convince somebody to vote for B when his "real" preference is A. But how can we claim that he truly didn't change his "real" preference to vote for B, based on the media discussion?
Ultimately, I find it somewhat patronizing to claim that, if the real preferences are (internally in humans) expressed on numerical scale 0-100, and we ask for the number from that numerical scale 0-100, this number can somehow not correspond to real preference. I think there is then a contradiction in the definition of "preference", because then the preference has to be something more complicated, like some condition based on other people's choice, not a simple internal value that is independent of what other people want.
> I like range voting (of which approval is a special case). It's simple, and it always takes everybody's point of view into account (it doesn't lose any voter information), which is a really nice property IMHO.
I don't agree with the claim "it doesn't lose any voter information"; unless voter preferences are aligned, any preference aggregation method must lose information. (More broadly, this happens with any statistic.)
You probably know this, so I think you meant something slightly different? Can you elaborate?
My interpretation, not sure if correct, is that basically you're expressing each vote as a point in d-dimensional Euclidean space (where d is the number of choices), and the result of the vote is the center of mass of all these points. Stated like this it perhaps makes more sense than just linear range voting.
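That interpretation can be written out in a few lines. This is a sketch with made-up ballots on a 0-100 scale; the names are hypothetical.

```python
def center_of_mass(ballots):
    """Mean score per candidate; each ballot is a point in d-dimensional space."""
    n = len(ballots)
    d = len(ballots[0])
    return [sum(b[i] for b in ballots) / n for i in range(d)]

candidates = ["A", "B", "C"]
# Three hypothetical ballots, one score per candidate.
ballots = [
    [100, 60, 0],
    [0, 70, 100],
    [50, 80, 20],
]

centroid = center_of_mass(ballots)
# The range-voting winner is the coordinate where the centroid is largest.
winner = candidates[max(range(len(candidates)), key=lambda i: centroid[i])]
print(centroid, winner)  # [50.0, 70.0, 40.0] B
```

The centroid is a d-number summary of all ballots, and the winner reads off only its largest coordinate, so some information is discarded at each step even though every ballot contributes to the mean.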
Interesting (but unsurprising; this is routine, and hard to describe as anything other than an intentional “oversight” in analyzing these methods) that Approval gets two forms, but neither recognizes that the threshold for an honest Approval ballot isn’t either of those constants but a cultural variable that varies within an electorate, and that the similar issues with all restricted-preference (Approval and 3-2-1 from this list) and score-based (Score and STAR from the list) methods are completely ignored.
Not knowing on the same day who won the NYC primary is much more annoying than I thought it would be. I wonder if there is a psychological effect in play here, like lag in a game controller. Makes you feel less like you’re actually in control.
Do you know what the source of the lag is? There's a concept of "rounds" in IRV, but if you have access to computers, a "round" should resolve in less than a second, if election data is digitized. Is the lag digitizing the votes in the first place?
I guess (without any research) it’s because you have to fully count all the votes before you start simulating as results may change. You can’t just say “this person has an inescapable lead” because the drop-out of another candidate might net a win. The only case where you can “call” it I think is if someone is likely to get >50% on the first ballot.
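A short sketch of an IRV count showing why a first-round lead cannot be "called". The ballots are made up, and the tie-breaking rule (by candidate name) is an arbitrary assumption rather than any jurisdiction's actual rule.

```python
from collections import Counter

def irv_winner(ballots):
    """Instant-runoff count. Each ballot lists candidates, most preferred first."""
    remaining = {c for b in ballots for c in b}
    while True:
        # Count each ballot for its highest-ranked candidate still in the race.
        counts = Counter({c: 0 for c in remaining})
        for b in ballots:
            top = next((c for c in b if c in remaining), None)
            if top is not None:
                counts[top] += 1
        active = sum(counts.values())   # ballots not yet exhausted
        leader, votes = max(counts.items(), key=lambda kv: (kv[1], kv[0]))
        if votes * 2 > active or len(remaining) == 1:
            return leader
        # Eliminate the candidate with the fewest votes (ties broken by name).
        loser = min(counts.items(), key=lambda kv: (kv[1], kv[0]))[0]
        remaining.remove(loser)

# Hypothetical electorate: A leads round one 45-30-25 but has no transfers coming.
ballots = [["A"]] * 45 + [["B", "C"]] * 30 + [["C", "B"]] * 25
print(irv_winner(ballots))  # B
```

The count itself is fast; the catch is that every round needs the full rankings from every ballot, so nothing can be finalized until all ballots, including late or absentee ones, are gathered in one place.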
The machines at the polling places can only count first preferences. Distribution of preferences will occur once the votes are transferred to the central scrutiny centre.
The Junior Eurovision song contest has an interesting system - you must vote for 3 of the 12 entries. Voting for only 1 or 2 is not allowed. I'd be interested to see analysis of this. I don't think it's a good idea for government elections, I just wonder what it would actually do.
What VSE doesn't capture, but what I feel is an important part of voting, is signaling voter preferences.
An election where a Democrat wins with 70% of the vote conveys different information than one in which 30% of those 70% first went to the Green Party candidate.