Reason-based-voting increases the advantage of kindness

Short summary

Experiments were done to determine the effect of "reason based voting" on the quality of group decision-making in "request for proposals" surveys.

Users were shown a "good proposal" and a "bad proposal", then asked to agree or disagree with each proposal. Some users voted normally, while others were required to view / choose / write reasons for their vote.

Requiring reasons while voting on proposals increased users' accuracy on objective questions by about 20%. On subjective questions, reasons increased the advantage of "kind" proposals by about 70%, but did not affect the advantage of "efficient" proposals. The data can support the conclusion that reason-based-voting nudges group decision-making toward greater correctness & kindness.

Theory

Group decision making

Humans' intelligence raised us to the level of apex-predator, and allowed us to occupy every ecosystem on earth. Many other organisms have prospered through prolific reproduction, or biochemical advantages. But for humans, intelligence has been one of the greatest contributors to our expansion & quality of life.

Yet "human progress has been mostly a story of what groups of people - not individuals - have accomplished" (Superminds, Malone, 2018), so our intelligence working in groups matters more than our individual intelligence. Our individual failures of intelligence tend to harm only ourselves, and a few family & friends around us. But our failures of intelligence as communities, organizations, and nations, affect both the members of those groups, and large numbers of other uninvolved people around the world. Collective decisions affect the economic commons, like knowledge systems, transportation systems, nature reserves, the sea, and the sky.

Moreover, group-intelligence can be modified by the communication technology that connects members of the group. For example, in "social media that reward items of information for how far they spread, using algorithms that reinforce people's existing assumptions ... obvious falsehoods... circulate widely..." at a scale that has significantly impacted national politics. Groups' communication technology can affect the degree to which "democratic institutions - parliaments, congresses, and parties - look dumber than the societies they serve." (Big Mind, Mulgan, 2017)

Communication-technology has harmed group-decision quality, which demonstrates that communication-technology can change group-decision quality. Now, perhaps we can create communication-technology that alters group-decision quality for the better, to improve people's collective intelligence & quality of life.

Problems

Group decision-making suffers from well-known pitfalls:

Authority bias: Users with official roles tend to have extra influence.
Group-think, social influence: Rationality may be compromised to avoid conflict and achieve consensus.
Group polarization, group shift: Group discussion may push members into more extreme positions, if the member distribution lacks a representative sample of opposing positions to push back. Members converge to the norms of the group, regardless of how unrepresentative the sample of group members is.
Information cascades, bandwagon effect: Late-voters may be influenced by early-voters, without verifying the basis for early-voters' choices.
Shared-information bias: People tend to spend more effort discussing reasons & information that most already knew, rather than considering new information.
Filter-bubbles: Decision-systems' information-sources may only provide information that agrees with a user's bias.

Because of all these problems, people often prefer distributed decision-making, avoiding the need for group consensus. Yet despite all these problems, people continue to engage in group-decision-making, because group action is powerful.

Approval voting

To overcome these difficulties of group-decision-making, people have designed a variety of voting-methods, which aim to integrate individuals' choices into a choice for the whole group. Methods like "majority rule", "ranked choice", and "liquid democracy", each have shown benefits in some situations.

One of the most reliable voting-methods is "approval voting", used on city & state ballot-propositions, where voters must vote "yes" or "no" to each proposal. Approval-voting has a variety of desirable voting properties, such as Pareto, Condorcet, "monotone", "clone independence"... Additionally, approval-voting can be more cognitively efficient than ranked-choice-voting when the number of options is very large, by automatically disapproving proposals that do not warrant attention.
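As a rough illustration of the yes/no-per-proposal tallying described above, here is a minimal sketch. The function name, ballot format, and proposal names are hypothetical, chosen only for this example, and are not taken from any real ballot system.

```python
from collections import Counter

def tally_approvals(ballots):
    """Count the "yes" votes each proposal received.

    Each ballot maps a proposal name to True ("yes") or False ("no").
    """
    approvals = Counter()
    for ballot in ballots:
        for proposal, approved in ballot.items():
            # Adding 0 still registers the proposal, so rejected ones are listed.
            approvals[proposal] += int(approved)
    return approvals

# Hypothetical ballots: three voters, two proposals.
ballots = [
    {"bike lanes": True, "new stadium": False},
    {"bike lanes": True, "new stadium": True},
    {"bike lanes": False, "new stadium": False},
]
print(tally_approvals(ballots).most_common())
```

Each proposal is judged independently, so a voter never has to rank one proposal against another.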

However, approval-voting has not deterred voters from forming ill-conceived preferences, especially when voters were influenced by heavy marketing from vested interests. Governments have attempted to inform voters' choices on propositions, by supplying voter-guides with reasons "pro" & "con" each proposal. But the government-supplied reasons come from the propositions' polarized supporters / opponents, contain little evidence, and confuse most voters.

Reason based voting

To overcome these problems of making rational approval-voting choices with limited time & effort, we have re-invented a form of crowd-sourcing the best "pro" & "con" arguments for proposals. We use the term "reason based voting" to mean a voting system where voters can input the reason for their vote, and can also read other voters' reasons. Some versions require that the users provide a reason when voting.

In a sense, reason-based-voting is an online, lightweight, highly-structured version of "deliberative democracy" (When the People Speak, Fishkin, 2009). We attempted the difficult "trilemma" combining participation & equality & deliberation, by rigidly structuring the deliberation. There have been a variety of instances of reason-based approval-voting, including: Consider.it, Tricider, Thought Exchange, Regulation Room, Kialo, and more.

Converj created reason-based-voting methods that are similar to these predecessors. But Converj addressed group-decision-making pitfalls, using the following design points:

Problem	Design solution
Group-think	All discussion must be in the form of reasons shared with all users. No unstructured chat forums. Consensus is established by integrated voting.
Group polarization	All users see opposing reasons.
Filter-bubbles	Users can observe some of the information upon which others decide, supplied as reasons & links.
Information cascades	The decision process is largely parallel. Users can always input a new unique proposal or reason. Some amount of opinion convergence is a goal of the system, in order to cluster users around a few best answers.
Social influence	Anonymity, with no reputation scores.
Authority bias	There are no authorities. The host's only special abilities are to start & stop the discussion. The host's posts are not distinguished from the participants' posts.

Experimental method

Several reason-based-voting group-decision systems have been made freely available online. What benefits do these new styles of citizen-feedback offer to cities & organizations? Deliberative-democracy methods have demonstrated effects, increasing understanding of opposing viewpoints, and changing participants' initial less-informed opinions. But deliberative-democracy experiments usually take place in person, over many hours & days, with repeated polling. Because Converj surveys take place online, over short time-spans, with only a single round of polling, it was challenging to evaluate changes in users' opinions. To evaluate the effects of reason-based-voting, we tried several experimental methods.

With versus without reasons

The main experimental goal was to compare the quality of group decisions made with reason-based-voting, versus without. Since Converj normally requires users to input reasons when they vote, experimental options were implemented to hide & prevent reason-input. Then survey questions were asked, once with reasons required, and once with reasons blocked.

Survey types

Though Converj supports several types of reason-based surveys, the survey-type that best accommodates experimental comparison is the "request for proposals survey". A request-for-proposals survey keeps reasons more distinct from voting, compared to the "auto-complete survey" that slightly blends reason-entry and voting. Additionally, compared to "participatory budget surveys", request-for-proposals voting is relatively binary, which simplifies comparison of survey results.

Survey topics

The topic of a survey can change the effect of reason-based-voting. Therefore, surveys were run to cover both objective & subjective questions. For subjective questions, both controversial & less contentious topics were used.

Group composition

To start, each group was composed of about 100 Amazon Mechanical Turk workers. However, these participants showed low comprehension of the questions, and low willingness to provide sincere responses. So groups were filtered to use only local workers, in their native language. Then group sizes were reduced to around 50 crowd-workers per experimental group.

To ensure that both without-reasons & with-reasons conditions have comparable user-groups, it would be useful to run surveys both without-reasons and then with-reasons, on the same users. But repeated surveying increases user reflection, so answers to the second survey would show those confounding effects. Randomizing the order of with/out-reason surveys could help, but the reason effect is "sticky", such that having once seen reasons to vote, users cannot forget those reasons while voting in the without-reasons survey. So separate user groups were used for the with/out-reason survey conditions.
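The text does not say how workers were allocated to the two conditions. One common way to form separate, comparable groups is simple random assignment, sketched below. The function name, the condition labels, and the fixed seed are assumptions made for this illustration.

```python
import random

def assign_conditions(user_ids, seed=0):
    """Split users into two disjoint groups, one per survey condition.

    Assumption: simple random assignment; the article does not specify
    how participants were actually allocated.
    """
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "with_reasons": shuffled[:half],
        "without_reasons": shuffled[half:],
    }

# Hypothetical pool of 100 worker ids, split into two groups of 50.
groups = assign_conditions(range(100))
print(len(groups["with_reasons"]), len(groups["without_reasons"]))
```

Because no user appears in both groups, nobody votes in the without-reasons survey after having already seen reasons.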

Proposals

To ensure that proposals were comparable, the surveys were restricted to only experimenter-generated proposals. In a real request-for-proposals survey, users are allowed to enter any proposals they want, with better proposals being voted to the top of the page, and worse proposals being buried & ignored at the bottom. Each survey contains a different mix of similar user-generated proposals in a variety of phrasings. Therefore the quality of user-generated proposals was difficult to rate or compare objectively between surveys in different experimental conditions. So the experiment only evaluated the performance of identical proposals, supplied by the experimenter, in each survey-condition.

To ensure the visibility of experimenter-generated proposals, only a few proposals were allowed. During a real survey, proposals are introduced at a variety of timings, and the momentary subset of participants in attendance may bury a proposal that most participants would support. The "precision" of top-rated proposals is achieved by down-voting "bad proposals" from the top positions. But the "recall" for "good proposals" is not assured, since many good-proposals may be ignored by chance, or by users' incomplete search for proposals they would approve. So we limited proposals to only 1 good-proposal & a few bad-proposals written by the experimenter, so that users in all survey-conditions saw all proposals. The good-proposals were designed to be true / effective / efficient, and the bad-proposals were designed to be the opposite.

The variance between experimental-groups is smaller if the same users populate as many experimental-conditions as possible. Therefore we showed both the good-proposal & bad-proposals to the same experimental-group of users.

Data

Objective question surveys

To experiment with objective-questions, we used the request-for-proposals-survey to ask a multiple-choice-question. The experimenter proposed 1 correct answer, and several incorrect answers. Participants were instructed to agree with 1 proposed-answer, and disagree with the rest.

Though it was possible for a user to approve multiple conflicting answers, survey-results could nonetheless be measured by the "correct proposal agree share":

$$\text{correct proposal agree share} = \frac{\text{agree votes of correct proposal}}{\text{agree votes of all proposals}}$$
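As a minimal sketch of this measure, the share can be computed from per-answer agree counts. The function name and the vote counts below are hypothetical and are not data from the experiments.

```python
def correct_agree_share(agree_votes, correct_proposal):
    """Share of all agree votes that went to the correct proposal.

    agree_votes maps each proposed answer to its number of agree votes.
    Returns None when no agree votes were cast at all.
    """
    total = sum(agree_votes.values())
    if total == 0:
        return None
    return agree_votes[correct_proposal] / total

# Hypothetical counts for one multiple-choice question with answers A, B, C.
votes = {"A": 30, "B": 10, "C": 10}
print(correct_agree_share(votes, "A"))
```

Because the denominator counts agree votes rather than users, a user who approves several conflicting answers simply contributes to several counts.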

Surveys were run on objective questions about math, logic, and biology facts. The effect of reasons on correct-proposal agree-share was clear and large, with low variance. Requiring reason-based-voting increased agreement with the correct answer by over 20% on average.

Subjective question surveys

Experiments with subjective-questions asked questions that depended partly on participants' personal values. The experimenter provided "good proposals" and "bad proposals", where good-proposals included effective government policy based on research from economics, sociology, and other sciences.

For each proposal, the "agree percentage" is:

$$\text{agree percentage} = \frac{\text{agree votes}}{\text{agree votes} + \text{disagree votes}}$$

Though different proposals received varying engagement and vote-totals, their agree-percentage could still be compared.
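A small sketch shows why agree-percentage stays comparable across proposals with different vote totals. The function name and the counts are hypothetical, for illustration only.

```python
def agree_percentage(agree, disagree):
    """Percentage of a proposal's votes that were agree votes.

    Returns None when the proposal received no votes.
    """
    total = agree + disagree
    if total == 0:
        return None
    return 100.0 * agree / total

# Two hypothetical proposals with very different engagement.
print(agree_percentage(30, 20))  # 50 votes in total
print(agree_percentage(6, 4))    # 10 votes in total
```

Both calls give the same percentage, although the first proposal drew five times as many votes.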

One possible effect of reason-based-voting could be to reduce the agree-percentage for all proposals, both good & bad. Reasoned reflection might cool optimistic agreement. But the data from subjective-surveys showed no significant shift in agree-percentage due to reasons. However, reason-based-voting did show a slight widening of the variance in agree-percentage, perhaps due to reasons solidifying users' opinions.

For each request-for-proposals survey, the "good proposal advantage" is:

$$\text{good proposal advantage} = \text{agree percent of good proposal} - \text{agree percent of bad proposal}$$

Different surveys received higher / lower agreement for all proposals, due to varying participant groups in each survey. But a survey's good-proposal-advantage could still be compared to other surveys'.
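To illustrate why the advantage is comparable between surveys even when overall agreement differs, here is a minimal sketch with hypothetical agree-percentages. The numbers are not results from the experiments.

```python
def good_proposal_advantage(good_agree_pct, bad_agree_pct):
    """Difference, in percentage points, between good and bad proposals."""
    return good_agree_pct - bad_agree_pct

# Hypothetical surveys: group A agrees more readily with everything than group B.
survey_a = good_proposal_advantage(good_agree_pct=70.0, bad_agree_pct=40.0)
survey_b = good_proposal_advantage(good_agree_pct=50.0, bad_agree_pct=20.0)
print(survey_a, survey_b)
```

Subtracting within a survey cancels that group's general tendency to agree, so the two surveys end up with the same advantage.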

Disregarding reasons, efficient-proposals showed a noisy but visible agree-percentage-advantage compared to inefficient-proposals. And efficient-proposals appeared to have more consistent agreement, with lower variance.

But reasons did not increase the efficient-proposal-advantage. While participants were more likely to vote for efficient-proposals over inefficient-proposals, reason-based-voting did not make it more likely.

Yet there did appear to be significant voting-pattern shifts when requiring reasons. Since those shifts did not conform to ideals for efficient government policy, perhaps users' reasons were pushing them toward some other values besides efficiency.

One well-established political value is "kindness" / "compassion" [1, 2, 3]. To examine whether reason-based-voting was affecting participants' values other than efficiency, the experimental proposals were re-categorized as kind / unkind, and then the kind-proposal-advantage was calculated for with-reasons & without-reasons conditions.

Efficient-proposals & kind-proposals both showed nearly equal agree-percentage-advantage, above 25%. Both efficient-proposals & kind-proposals showed lower variance & more consistent agreement compared to inefficient-proposals & unkind-proposals.

But the kind-proposal advantage over unkind-proposals increased when reasons were required.

The reason-effect on kind-advantage was almost always positive. On average, reasons raised the kind-advantage by 15 points, a 70% increase.
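The distinction between an effect in points and a relative increase in percent can be sketched as follows. The function name and the input values are hypothetical, chosen to show the arithmetic rather than to reproduce the study's per-survey data.

```python
def reason_effect(advantage_without, advantage_with):
    """Return (effect in percentage points, relative change in percent).

    Both inputs are kind-proposal advantages in percentage points.
    """
    effect_points = advantage_with - advantage_without
    if advantage_without == 0:
        return effect_points, None  # relative change is undefined
    relative_pct = 100.0 * effect_points / advantage_without
    return effect_points, relative_pct

# Hypothetical topic: advantage of 20 points without reasons, 35 with reasons.
points, relative = reason_effect(advantage_without=20.0, advantage_with=35.0)
print(points, relative)
```

The point difference is an absolute shift in the advantage, while the percent figure expresses that shift relative to the without-reasons baseline.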

Sometimes the reason-effect on kind-proposal-advantage was small, though for some topics the effect was much larger.

The reason-effects on kind-proposal-advantage were distributed less like a wide noisy gaussian, more like a positive-skewed gamma.
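One rough way to check for this kind of asymmetry is the sample skewness of the per-topic reason-effects. The sketch below uses the population (biased) moment formula and hypothetical effect values. It does not fit a gamma distribution, and the article does not state which method was used.

```python
from statistics import mean, pstdev

def sample_skewness(values):
    """Third standardized moment; positive when the right tail is longer."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        return 0.0  # all values identical, so there is no asymmetry
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

# Hypothetical per-topic reason-effects, in percentage points:
# mostly small positive values plus one much larger effect.
effects = [1, 2, 3, 4, 5, 7, 12, 30]
print(sample_skewness(effects) > 0)
```

A clearly positive value would be consistent with many small effects and a few much larger ones, while a value near zero would suggest a symmetric distribution.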

Conclusions

The data show that reason-based-voting increases the accuracy of group-decision-making for objective issues. And data support the conclusion that on subjective issues, reason-based-voting increases the advantage of kind proposals over unkind proposals. Since making effective group decisions requires a combination of factual correctness & personal values, the objective & subjective advantages of reason-based-voting imply improvements to the quality of group decisions.

The population of quasi-volunteer test-subjects who participated in these experiments may be somewhat intellectual and predisposed to kindness as a value. Nonetheless, if reason-based-voting helps people to vote more in keeping with their values, no matter what those values are, it appears to be a step forward for democratic representation.

Legislative professionals tend to prefer government policy that efficiently & effectively achieves high quality-of-life. Shifting democratic processes not toward efficiency, but toward more kindness, would be a step in a different yet similar direction.

Reason-based direct-voting could contribute to a balance-of-power with elected representatives who provide expertise & moral-indifference. Many people are concerned that direct-voting may contribute to the disease of populist oppression. But populist oppression already flourishes in many representative democracies. Reasons & kindness may be part of the cure, inoculating direct-voting against majoritarianism.

Future work

Experimental improvements

The experiments described here could be improved in several ways.

The proposals could be re-classified according to various concepts of "good" & "bad" proposals. Then the reason-effect on good-proposal-advantage could be re-calculated, perhaps discovering more significant effects of reason-based-voting on certain concepts of good-proposals.

Additionally, it would be useful to provide multiple experimenter-written proposals at once, and evaluate their rank-ordering using more advanced statistical methods. Having a single experimental user group rank multiple proposals may provide more noise-tolerance and a clearer signal of reason effects.

How to do your own experiment

You can create a request-for-proposal with additional experimental-options.

The "done link" option will allow you to direct users who complete the survey, to a following survey or to a mock survey that displays end-of-survey instructions.
The "hide reasons" option will create a survey that does not require / allow reasons to vote.

Create initial proposals, then freeze them.

Once you create a request-for-proposals survey, you can add a few proposals yourself, then "block new proposals" to limit users to voting on your proposals.
Once you have completed running users through the survey, you may "block user input" to preserve the results without further changes.

Here is an example request-for-proposals survey, which blocks new proposals, but allows reason-based-voting on the existing proposals.

The Converj system is free to use, and open-source, so you can create your own instance and modify it. Or, build some of these non-proprietary features into your own existing projects.

Disclosures

This experimental work was self-funded by Converj LLC. No outside money or contributions were involved. Converj LLC is currently comprised of Chad Brower.