Well, the idea is to attach responses to downvotes. So you spend several days finding mistakes, you write it up at the review stage, they say "nope" and reject your review. You don't lose any reputation. When they publish, you downvote and post your review as a response (you can just copy and paste - you're right, right now it disappears, but we can make sure reviewers retain access to review comments - that's easy). Sure the office mate can upvote, but other voters will see your critical review, see the mistakes you outline and can affirm you.
I guess what it comes down to is this: do you believe the average actor in science/academia is a good actor? I.e., are they honestly trying to be unbiased and advance good science/good work above all else? Or do you believe the average actor is a bad actor (OR that the incentive structure is so strong that even well-intentioned actors are corrupted to be mediocre actors)?
If the average actor is a good actor, the system works and work will be rated/rewarded correctly. If the average actor is a bad actor, or the incentive structure is too strong, then this system would absolutely fail in the sort of worst case scenarios that have been described here.
> do you believe the average actor in science/academia is a good actor?
I think you've over-simplified your model. I'll build on your analogy.
There are many roles. As you've argued, the current system is fundamentally broken, which means most if not all of the roles are bad roles.
What happens if a good actor is stuck in a bad role?
My simplistic view of modern scientific publication is that it started as a way to reduce knowledge hoarding by coupling prestige to publication. ("I'm sorry Mr. Newton, but as Herr Leibniz published first, he gets the credit for creating the calculus of infinitesimals.")
The loop is now tightly coupled, leaving us with "publish or perish", and with coarse and unreliable publication metrics used to determine career advancement.
As a result, many people deliberately game the system, with methods like "least publishable unit" and "salami slicing". How is the author order determined? Who gets to be on the author list?
Are those the methods of good actors? Or bad actors?
To be clear, my goal is not simply to point out that there is a grey area, but rather that there are competing goals. Your use of "good actor" and "bad actor" oversimplifies the issue by projecting it onto the single axis of "good science."
I recently reviewed a paper. It was 20 or so pages long. Nothing about it was all that new - it was about the level of a senior undergrad project. Most of it was a tutorial/walk-through about how to use the software. Had it been the late 1990s, it would have been cutting edge.
My review was something like "this should be a 1-2 page software announcement. Nothing seems wrong, but it isn't worth the time for anyone to read all these details in a journal paper, and it isn't worth my time to review it."
Is the full 20-page paper good science? Bad science? Is it the result of a bad actor? Or an uninformed actor? Should my review be acted upon or ignored? If published, would my post-publication review make a difference?
I might put that in the middle category - the category that might warrant a quick review with just that criticism and maybe a post-publish response, but no voting.
"Bad" here is used to mean dishonest, very poorly done methods, or conclusions way out of step with the actual empirical results. "Good" is used to mean simply "this was well and honestly done work, with conclusions that may be reasonably drawn from the results". Nothing more. There are massive gradations within "Good" and a large area between "Good" and "Bad". The intention is for "Good" papers to be upvoted, "Bad" papers downvoted, and everything in between to not be voted on, but rather to receive responses where warranted.
The reasoning behind this is that in a lot of ways we've gotten too into the first discovery pieces of this and forgotten that science works in the aggregate, it must be a collaborative enterprise. Any empirical experiment or theoretical work could be biased or flawed in any number of ways, and in ways which we can't necessarily catch in review. That's why replication is so important, but in so many fields it's not happening now due to ideas about what represents a "substantive contribution to the field". Hence, the Replication Crisis.
So the goal of using such a simple system and metric, in a lot of ways, is to take us back to brass tacks. This is about building humanity's collective knowledge base - together. All else is secondary. So a "Good" actor in that context is someone who does the thing that contributes to humanity's collective knowledge about the world, including but not limited to:
- replicating previous work
- getting a null result
- getting inconclusive results
- helping a peer polish their work so that it's best communicated
- helping a peer catch mistakes or spot unexplored avenues for further study
In a lot of ways, the prestige seeking is part of the problem, because it works against collaboration.
The nice thing about a reputation system where votes are granted for "Good" work in this context, is that it's not zero sum. And indeed, it does a much better job of incentivizing a lot of non-glamorous grind work, because there's a lot of that, and if it's well done, then it merits votes. It probably won't get as many votes in a single go as the ground breaking stuff, but it's a valid path to building reputation.
This system definitely does incentivize "Least Publishable Unit", but... I'm not necessarily convinced that's a bad thing. In software engineering, there are a ton of benefits to breaking up a large knowledge base or problem into its smallest digestible chunks. If all of those "Least Publishable Units" are well organized in a single database, that might actually be a benefit rather than a cost.
By the way, I'm still chewing on your other unanswered comments. You're raising a lot of good points and giving me a lot to think about and I really, truly appreciate it.
> "Good" is used to mean simply "this was well and honestly done work, with conclusions that may be reasonably drawn from the results"
Many high school science fair projects would fall into that category.
Would that be allowed? Or will there be some gating system?
Will you require ORCID?
Will you allow anonymous users? If so, what prevents nyms from being fly-by nattering nabobs of negativism? SO-style permissions based on rep? Being vouched for by other users?
> valid path to building reputation.
I don't know what to do with my SO or HN rep. As others have pointed out, unless this rep helps/hinders career advancement, I don't think people will care.
And I suspect you're still chewing on my comment that if it does help/hinder career advancement, then many will try to game it.
> I really, truly appreciate it
Thank you for that comment. I don't want to come across like I'm trying to rain on your parade.
> Many high school science fair projects would fall into that category.
> Would that be allowed? Or will there be some gating system?
I think that's up for the community to decide. There have been high school science projects that got published before and were considered significant contributions.
The idea is that anyone can register - though they will be asked to use their real name. To be determined how to verify that.
> Will you require ORCID?
> Will you allow anonymous users? If so, what prevents nyms from being fly-by nattering nabobs of negativism? SO-style permissions based on rep? Being vouched for by other users?
We will integrate with ORCID and provide space to link one's account, but my understanding is that ORCID adoption is still limited (growing, but limited) so we don't want to require it.
Yes, it's an SO style reputation permission system - only tied to the fields you publish in. So when you publish a paper, you gain reputation in the fields you tag it with. That then grants you permission to review and referee papers tagged with the fields you have reputation in.
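A rough sketch of what that field-scoped permission check might look like. Everything named here is a placeholder I made up for illustration: the threshold numbers, the `User` class, and the choice that clearing the threshold in any one of a paper's tagged fields is enough (it could just as easily be "all of them").

```python
# Hypothetical thresholds; the real numbers are still to be determined.
REVIEW_THRESHOLD = 10
REFEREE_THRESHOLD = 100


class User:
    def __init__(self, name):
        self.name = name
        # Reputation is tracked per field, not as one global number.
        self.reputation = {}

    def gain(self, fields, amount):
        """Credit reputation in every field a paper is tagged with."""
        for field in fields:
            self.reputation[field] = self.reputation.get(field, 0) + amount

    def _clears(self, paper_fields, threshold):
        # Assumption: reputation in ANY one tagged field is sufficient.
        return any(self.reputation.get(f, 0) >= threshold for f in paper_fields)

    def can_review(self, paper_fields):
        return self._clears(paper_fields, REVIEW_THRESHOLD)

    def can_referee(self, paper_fields):
        return self._clears(paper_fields, REFEREE_THRESHOLD)


grad_student = User("example")
grad_student.gain(["chemistry", "biochemistry"], 15)
print(grad_student.can_review(["chemistry"]))   # enough rep to review
print(grad_student.can_referee(["chemistry"]))  # not enough to vote
```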
The current thought is to aim to set the threshold so that a graduate student who's published a few times can review, but the refereeing (voting) threshold would be set closer to late postdoc or early professorship. Possibly higher, to prevent labs from attempting to game the reputation system.
What those numbers are is yet to be determined. The current plan is to initialize the reputation system using citations - 1 citation = 1 upvote, essentially. It's not perfect, but it's the closest analog. That would allow researchers with established records to help seed the community and carry over the reputation they've already earned. This won't work without that (because no one will have enough reputation to vote). We've got some analysis to do to figure out where the permissions thresholds should lie (and it might turn out that they need to be in a different place in different fields - which complicates the system somewhat, but isn't unmanageable).
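To make the seeding idea concrete, here's a minimal sketch under stated assumptions: each existing paper is reduced to a list of field tags plus a citation count, and `POINTS_PER_UPVOTE` is a made-up constant standing in for whatever an upvote ends up being worth. Crediting the full citation count to every tagged field is also just my assumption.

```python
# Hypothetical: how many reputation points one upvote is worth.
POINTS_PER_UPVOTE = 1


def seed_reputation(papers):
    """Turn an author's existing record into starting per-field reputation.

    papers: iterable of (fields, citation_count) pairs for one author.
    Each citation is treated as one upvote in every field the paper is tagged with.
    """
    reputation = {}
    for fields, citations in papers:
        for field in fields:
            reputation[field] = reputation.get(field, 0) + citations * POINTS_PER_UPVOTE
    return reputation


record = [
    (["chemistry"], 12),
    (["chemistry", "physics"], 3),
]
print(seed_reputation(record))  # {'chemistry': 15, 'physics': 3}
```

Note this does nothing about the harder part, which is matching a registered user to the right set of papers in the first place.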
> I don't know what to do with my SO or HN rep. As others have pointed out, unless this rep helps/hinders career advancement, I don't think people will care.
Yeah, at this point my SO rep seems kind of pointless too. That said, I'm pretty sure I got one of my first software jobs on the back of my SO rep a decade ago. They kind of squandered that aspect of the site by mismanaging the job board/cv aspect of it. In spite of that, SO still works. It's still filtering content appropriately, and it's still a very supportive community where you can go and get the help you need. It's still one of the best sources of technical information on the internet. So even though the reputation is little more than internet points, it still seems to be doing its job. Which, to be fair, was always to be a mediator for the permissions system first and foremost.
Where Peer Review is concerned there are kind of two modes of operation to think about (...okay, maybe four).
There's the "Trying to get traction, no one cares about the reputation yet." At that point, the reputation can't do much in terms of incentives (beyond what any internet point endorphin system does). But it's still the system that facilitates matching papers with the appropriate reviewers and referees, it's necessary. It's what allows this to be crowdsourced - that plus the evolutionary field system.
Then there's the "Established, the institutions have started to take the reputation seriously and count it as another citation metric." At that point, it can start to provide incentives. And that's where we can start using it to help encourage good science and counter some of the negative incentives currently built into the system - things like offering reputation bonuses for replications.
The middle ground might be "Just starting to gain traction, where it's not meaningless, but not super meaningful."
I suppose there's a fourth option "Succeeds beyond my wildest dreams" where Peer Review becomes the standard place to publish and we manage to get the entire literature into the database. At that point we can do all kinds of cool things - automating literature reviews, detecting and flagging P-hacking, incentivizing replications.
Somewhere in this growth curve, we hopefully gain enough funding to hire some folks to help detect and counter the bad actors. But as long as they are the minority, the system should be self correcting to a large degree.
My thought for field evolution is that anyone can propose a new field, but they have to propose a parent and some percentage of the users in that field with referee reputation have to approve it (quarter? tenth? with a minimum base number?). That's to prevent small groups with agendas from creating echo chambers.
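As a sketch of that approval rule: the fraction and the minimum are exactly the open questions above, so the defaults below (a quarter, minimum of 5) are arbitrary placeholders, and the function names are made up for illustration.

```python
import math


def approvals_needed(num_referees, fraction=0.25, minimum=5):
    """Approvals required: a fraction of the parent field's referees, with a floor."""
    return max(minimum, math.ceil(num_referees * fraction))


def proposal_approved(parent_referees, approvers, fraction=0.25, minimum=5):
    """Check a new-field proposal against the parent field's referee pool.

    parent_referees: users with referee reputation in the proposed parent field.
    approvers: users who approved the proposal; anyone outside the pool is ignored.
    """
    valid = set(approvers) & set(parent_referees)
    needed = approvals_needed(len(parent_referees), fraction, minimum)
    return len(valid) >= needed


pool = {f"referee{i}" for i in range(100)}
backers = [f"referee{i}" for i in range(25)]
print(approvals_needed(len(pool)))       # 25
print(proposal_approved(pool, backers))  # True
```

One consequence of the floor: a parent field with fewer referees than the minimum can never spawn a child field, which might or might not be what we want.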
There's the counterpoint that adherents of popular ideas can sometimes stifle valid challenges to those ideas at first. But I think ideas tend to be popular (or should be) because they have a certain weight of evidence and theory behind them. And it should be challenging to overthrow those at first. As long as the average actor is a good actor - i.e., interested first and foremost in the truth and good science and willing to affirm good science that goes against their biases or supports findings they may not like - then this will work out in the long run.
Yes, the part I'm still chewing on is that first stage. Getting traction is the hard part. If it's big enough that people are trying to game it, then hopefully we have enough funding to work to counter that gaming in the areas where the system can't self-correct. There's a degree to which trying to foresee what those ways will be now is a little bit of premature optimization. The gamers will come up with things we can't think of now, and things that seem like they would be obvious problems might turn out to be much smaller issues. Assuming the underlying assumption - that the average actor is a good actor as I defined it earlier - is true. If it isn't, then yeah, this system is fucked and won't work.
> I think that's up for the community to decide. There have been high school science projects that got published before and were considered significant contributions.
I wrote many science fair projects to mean many in each science fair, not the rare few that produce novel and publishable results in the scientific literature.
Follow the instructions from a book about science fair projects. The result will be "well and honestly done work, with conclusions that may be reasonably drawn from the results" - exactly as you described. How many publications of "I synthesized aspirin from salicylic acid and acetic anhydride" do you want?
> up for the community to decide
If this becomes a craze among high school science fair entrants, then the community will be high school science fair entrants. It will then be up to them to decide if tenured professors are allowed.
More to the point, I've long found "community" to be a difficult word to understand. How is it different from "users"? Are there users who aren't part of the community? Why aren't there multiple communities? We know there are multiple types of users (eg, HCI's "persona development".)
With no sense of what you want your community / user base to be, you can't tell if you've ended up with what you wanted.
> my understanding is that ORCID adoption is still limited (growing, but limited)
Last fall ORCID tweeted that many/most of the Nobel Prize winners had an ORCID. https://twitter.com/ORCID_Org/status/1445448209782894592 . One of the comments correctly pointed out "surprisingly a lot of researchers have their ORCIDs only as void placeholders required by their institutions to have."
That includes me.
> The current plan is to initialize the reputation system using citations - 1 citation = 1 upvote, essentially. It's not perfect, but it's the closest analog.
There are a lot of E. Smiths in the world. (This is what ORCID is trying to resolve.) Oh, and my name is $NOBEL_PRIZE_WINNER and email address is minecreeper666@washington-hs.state.us.
(Email addresses change as people move between institutions. Someone's current email address might not be on any of their papers.)
Do preprints count as publications?
I tried to get around the publication system by having blog posts which were essentially preprints/publications. That failed - there's a disdain for the grey literature.
> But it's still the system that facilitates matching papers with the appropriate reviewers and referees, it's necessary.
Which is why it seems like you could start as an overlay system over existing preprint servers. They've already resolved issues about user id, audience, etc., which you could use as a starting point.
> That's to prevent small groups with agendas from creating echo chambers.
Real-world example: what PZ Myers refers to as the "panspermia mafia" - "They use their connections to promote a small family of fellow travelers."
> that the average actor is a good actor as I defined it earlier
Again, I urge you to consider that "good actor" is overly simplistic. If you truly think the system is broken, how are so many good actors not able to change it?
I would say academics are just people, and in many cases they want to be employed and successful.
At the moment, while the system is very bad in many ways, it used to work (in my opinion, I haven't done proper research into this) because:
* People who run "good" journals want their journal to be considered good, so they publish good papers.
* People want to get into "good" journals, so they want to write good papers that will be accepted. The existence of these papers is then used as evidence for promotions and grants.
* Academics want to read "good" journals, so they get their University to pay for a subscription to get physical copies in the library.
Previously this cycle held reasonably well -- it's been broken by the reduced cost of online journals, and journals wanting higher payments.
I think a good future system needs to (somehow) recreate this "virtuous cycle" -- making reviews as important as journals could be a good idea, I could imagine following a reviewer I thought was excellent, and a group of reviewers could create their own "virtual journal".
I would personally trust "Person X who I know thinks this paper is good" much more than "100 people thought this paper was good, 40 thought it was bad, so +60".
The way I'm thinking it would work is that journals can effectively create a team of reviewers on the platform and when you submit a paper, you could request review from their team.
Right now I'm thinking you could choose to request open review from everyone in your fields, review from one or more journals, or both at the same time. The journal editors then give their feedback through the review system, but it's still up to the author to choose to publish. If you successfully satisfy the journal, you get their stamp of approval on your paper when you publish. If not, you can still publish at will, but you risk their review team downvoting you and you don't get the stamp of approval.
> I would personally trust "Person X who I know thinks this paper is good" much more than "100 people thought this paper was good, 40 thought it was bad, so +60".
I can definitely understand where that hesitation would come from - especially with experience in places like Reddit, where the crowd is not given guidelines and voting systems decay to the lowest common denominator.
My understanding of the research on the topic is that it actually supports the idea that the judgement of the crowd will often be better than the judgement of any individual -- when it's the right crowd, given the right guidelines. That's the model behind StackExchange and it's worked very well there. And that's the theory behind this (proposed) platform. The reputation and field model is geared towards identifying and creating groups of "the right people" for each topic/discipline/field and allowing people to submit their work to those groups.
It could create a similar virtuous cycle:
* People want to support good science and get good review feedback, so they give good review feedback.
* People want to support and highlight good science, so they upvote good science, downvote dishonest science, and post responses with critical feedback on mediocre science.
* People want to have their work upvoted and receive positive responses, so they take review feedback seriously.
That's the theory anyway - the jury seems to be very much out on whether or not that can work with academia.