This essay seeks to advance a discussion to meet the needs of designers of technologies, including institutions, that are meant to help users accurately answer questions about reality, including questions about nature and morality. Specifically, it helps clarify the list of requirements that designs would have to satisfy in order to provide reasonable expectation that the technology would converge on truth. The design of the procedures of the Belleville Research Ethics Committee (BREC) is offered as a practical example of how such a list of requirements might inform design, and this essay aims to help other designers reason about potential improvements to the BREC procedures.
Chris Santos-Lang’s “The Method of Convergent Realism” will be presented in two parts. Please find below Part I. Please refer to Part II. The PDF of the entire article is linked in the Article Citation.
Santos-Lang, Chris. 2022. “The Method of Convergent Realism.” Social Epistemology Review and Reply Collective 11 (1): 33-49. https://wp.me/p1Bfg0-6t5.
The PDF of the article gives specific page numbers.
Why Care and What to Hope For
In 2011, IBM’s Watson artificial intelligence won Jeopardy because it could read much more than any human could. Now doctors around the world have “Ask Watson” buttons they can press to find out what treatment Watson recommends for their patient. IBM expects doctors to become dependent on such buttons as society’s collective knowledge about health grows to exceed what individual doctors can read. Watson already follows over 300 medical journals and can read in over 13 languages. What human doctor can compete with that?
But why should Watson stop with doctors? Why shouldn’t I have an “Ask Watson” button to inform my decisions when planning my meals and exercise regimen? For example, when I eat at a restaurant, perhaps Watson could recommend items from the menu, factoring in my personal health profile and eating history. Why shouldn’t legislators have an “Ask Watson” button they can press to inform their votes on health legislation? Why shouldn’t the button extend to legislation that impacts the health of our environment and economy? If we think doctors will need computers to inform their decision-making, why shouldn’t we expect policy-makers to need the same? Why shouldn’t I have an “Ask Watson” button that tells me which policy-makers to vote for in elections?
Even if Watson could help us be better voters, you might object that AI-guided voting would be pointless. If Watson, Siri, Alexa, Cortana, and Google all gave the same advice, and most voters trusted that advice, then outcomes would be decided before the vote, so it would be efficient to skip the actual voting ritual. Such profoundly informed voting might still be called “democratic,” but it would not be the form of government Winston Churchill referred to when he said the best argument against democracy “… is a five-minute conversation with the average voter.” One way such democracy would be new is that it would manifest new vulnerabilities.
Much as one might anticipate threats to a nation’s elections, this article anticipates threats to the social epistemic method that would guide decision-making in a technologically informed world. John Dewey referred to that epistemic method as “the scientific method” (1910), but it will instead be called the “method of convergent realism” herein because it can inform all kinds of decision-making, including moral and mathematical decision-making in so far as moral and mathematical facts are real.
All theories of convergent realism share two commonalities (e.g. Putnam 1982; Hardin and Rosenberg 1982):
• Supposition that natural laws/facts exist; and
• Supposition that at least one practical method exists which would reliably converge on those laws/facts.
Rather than defend these suppositions, this article simply assumes them and focuses on the puzzle, “What would have to be true of the most-efficient reliable method of convergent realism?”
In the past, some philosophers proposed that “convergence” should be understood as requiring a method that yields a sequence of theories, each one more accurate than the previous (e.g. Popper 1963; Post 1971). Philosophers employing that definition were trounced by Laudan (1981). In contrast, the definition of “convergence” employed in the current essay requires merely that the method of convergent realism settles on truth in the end. We prefer whichever method gets there the fastest, but any method which settles on truth qualifies, even if it takes detours along the way.
To exemplify convergence, consider four proposed methods to find the exit of a finite labyrinth:
1. Wait for the answer to come to you;
2. Wander randomly;
3. Wander randomly, but at each fork prioritize whichever path you’ve traveled least;
4. Wander randomly, but build a map as you explore and prioritize paths you haven’t explored yet.
The first method might not be reliable because one might wait forever. The other three methods involve randomness, so any of them could accidentally head away from the exit at some point (even if initially on the optimal path), and therefore fail to “converge” by the 1971 definition. By our new definition, however, the third and fourth methods both reliably converge on the exit, although the fourth is preferred over the third because it sometimes converges faster (and never slower).
Other proposals might need to be compared by applying them to sets of randomly generated labyrinths, but the four proposals listed above can be compared with mere reasoning: We can rule out the first two methods by recognizing the possibility of waiting forever and the (unlikely but real) possibility that random wandering could circle forever. We can recognize the inferiority of the third method compared to the fourth by recognizing a possibility like the following:
You come to a fork. Turning right brings you to a second fork in ten feet. Turning right at the second fork brings you to a third fork in 500 miles. Turning right at the third fork brings you to the left branch of the first fork in ten feet. Reaching the second fork a second time, the method requires you to go left because that is the only path at that fork you have not already traveled at least once. It dead-ends in ten feet, so you must return to the second fork for a third time. Now the method requires you to repeat the 500-mile path to the third fork because that is the only path you have not traveled at least twice. Clearly, this is inefficient: you already know that path goes to the third fork, and you know that you could instead get there in twenty feet by going the other way.
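The fourth method can be sketched concretely. The Python sketch below is an illustration, not part of the original proposal: the function names and the adjacency-map representation of the labyrinth are assumptions made here. The explorer wanders randomly among unexplored corridors, but uses its growing map to backtrack along the shortest known route whenever it runs out of unexplored options — exactly the move the third method cannot make.

```python
import random
from collections import deque

def explore_with_map(graph, start, exit_node, seed=0):
    """Method 4: wander randomly, but build a map and prioritize
    unexplored paths. `graph` maps each junction to a list of adjacent
    junctions. Returns the number of corridor segments walked before
    reaching the exit."""
    rng = random.Random(seed)
    visited = {start}
    pos, steps = start, 0
    while pos != exit_node:
        unexplored = [n for n in graph[pos] if n not in visited]
        if unexplored:
            pos = rng.choice(unexplored)  # random, but only among new paths
            visited.add(pos)
            steps += 1
        else:
            # Use the map: backtrack along the shortest known route to the
            # nearest junction that still has an unexplored neighbor.
            path = shortest_path_to_frontier(graph, pos, visited)
            steps += len(path)
            pos = path[-1]
    return steps

def shortest_path_to_frontier(graph, pos, visited):
    """Breadth-first search over already-visited junctions only."""
    parents, queue = {pos: None}, deque([pos])
    while queue:
        node = queue.popleft()
        if node != pos and any(n not in visited for n in graph[node]):
            path = []
            while node != pos:  # reconstruct the route back from `pos`
                path.append(node)
                node = parents[node]
            return list(reversed(path))
        for n in graph[node]:
            if n in visited and n not in parents:
                parents[n] = node
                queue.append(n)
    raise ValueError("labyrinth fully explored; no exit reachable")

# A toy labyrinth with one dead end, as an adjacency map:
maze = {
    'entrance': ['A'],
    'A': ['entrance', 'B', 'C'],
    'B': ['A', 'dead end'],
    'dead end': ['B'],
    'C': ['A', 'exit'],
    'exit': ['C'],
}
steps = explore_with_map(maze, 'entrance', 'exit', seed=0)
```

Because the explorer never re-walks a corridor except to backtrack over its map, its total walk is bounded, which is the sense in which the fourth method can converge faster than the third and never slower.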
Like mathematicians who take years to find a proof but recognize it in minutes once it is brought to their attention, you might not see the inefficiency of the third method until a possibility like the one above is brought to your attention. The rest of this article brings similar possibilities to the reader’s attention–ways one could (inadvertently) design “Ask Watson” buttons that could lead society astray. Each potential for disaster implies a restriction on what the ideal method of convergent realism would have to entail in order to mitigate that potential.
This article does not assume that the ideal method of convergent realism aligns with current practice in science. On the contrary, this essay is sympathetic with modern efforts to reform science. Furthermore, this article does not assume with Dewey that every step of the method must be implemented by humans. Today, we must confront the possibility that machines might implement some steps better than humans. This essay also entertains the possibility that certain steps might best be reserved for certain kinds of humans–a division of labor which can make the ideal method necessarily social (Kitcher 1990).
By ruling out alternative variations of the method of convergent realism, this article concludes that the ideal method of convergent realism must include at least these three steps:
1. Articulation of reasoning;
2. Independent tests of the reasoning;
3. Provision of means by which new discoveries will force retesting:
a. Transparency;
b. Conflict resolution;
c. Amendment process;
d. Expiration dates.
Any flaws in this conclusion could be expressed in terms of counterexamples in which ethics, mathematics, or science can reasonably expect to converge on truth despite skipping one or more of these steps. Anyone who is aware of any such counterexample is asked to please share it via the PubPeer page for this article so that all readers can find it. Any reader of this article is asked to please check that page before trusting anything herein. Thus, the value of this article may be less to directly guide a redesign of modern AI and human pursuits of truth than to open a productive conversation that can guide such design going forward.
In addition to justifying each step of the method in terms of the potential disasters it would avert, this article will provide examples to demonstrate the feasibility of each step. In particular, the author has constructed and implemented a full set of governance documents for the Belleville Research Ethics Committee (BREC). They serve as a “proof of concept” in the domain of research ethics.
Step 1: Articulation of Reasoning
Reasoning has two parts: (a) a claim and (b) what Longino (1990) called “background assumptions.” In science, the claim is composed of a testable hypothesis, and the background assumptions justify methods used to test that claim. For example, reasoning might be composed of a claim about nitrogen plus background assumptions that include the assumption that a certain method is appropriate to purify the nitrogen used to test that claim. The discovery of a more reliable or precise way to purify nitrogen could challenge the background assumption part of that reasoning which could, in turn, change what we believe about the claim by obliging us to repeat its test with more reliably pure nitrogen. In mathematics, the claim is composed of a conclusion, and the background assumptions are composed of axioms and lemmas cited to justify each step in mathematical arguments. For example, the axioms of Euclidean geometry are background assumptions for the Pythagorean Theorem.
In ethics, the claim is a prescription, and the background assumptions include a supposed complete list of relevant ethical considerations such as:
• Does the prescription treat others with respect?
• Is the prescription consistent with other accepted prescriptions (e.g. laws, traditions, religious beliefs)?
• Compared to alternative prescriptions (or doing nothing), what consequences would following the prescription have for the well-being of humans alive today?
• What consequences would following the prescription have for the well-being of biological, social or economic ecosystems?
The moral quality of a prescription is tested by application of the list of considerations. If a list were not complete–for example, if it excluded consideration of impacts on entities with whom we cannot empathize so directly (e.g. on the economy, on our ecosystem, on future generations, etc.)–then an initial endorsement of the prescription might be challenged by pointing out the relevance of excluded considerations. The possibility that lists may be incomplete makes moral knowledge subject to revision (but with potential to converge) just like scientific knowledge.
In research ethics, the claim would be a prescription that a certain research plan should be followed; the background assumptions would be a supposed complete list of relevant considerations. For example, before conducting the first research on radioactive chemicals, the researchers might have considered whether a different research plan would answer the same questions more efficiently. The subsequent discovery that radiation can cause harm produced a new background assumption for research ethics: consideration of whether the research plan includes sufficient protocols to mitigate risks of harm due to radiation. From then on, scientists became obliged to additionally consider radiation risks, which made them augment their experiments with new safety protocols (or, in some cases, forgo a given experiment altogether).
The method of convergent realism does not require that new claims contain previous claims, but it does require that new reasoning addresses previous background assumptions. For example, once significantly better procedures to purify nitrogen are discovered, any scientist who purified nitrogen in the obsolete way would be deviating from the method of convergent realism (unless they could justify the use of the obsolete method for the given case). The same would be true of any scientist who failed to account for radiation risks once radiation’s potential to harm was discovered. It is by expanding and refining background assumptions that science, ethics, and so forth converge on truth.
To show that the ideal method of convergent realism must include articulation of reasoning, imagine an example in which Watson is fed claims without articulated background assumptions. Imagine a researcher is planning research and asks Watson whether implementing their plan would be ethical. Imagine Watson has read the published opinions of three distinct ethics committees who have already considered an equivalent research plan; one rejected the proposal for failing to mitigate radiation risks, but the other two approved the plan without considering radiation risks. If the published opinions did not articulate what ethical considerations each committee made, then Watson could only report, “Two out of three committees consider this plan ethical” when it should instead say “The morality of the plan may depend upon the relevance of radiation risks. Only one committee considered radiation risks, and that committee rejected this plan.”
As another example, imagine someone asked Watson to report the current best estimate of a measurable property of nitrogen and Watson had read the reports of three teams who measured that property; none of the researchers knew how their nitrogen was purified (each simply acquired a cylinder of nitrogen from some supply center), and the team whose nitrogen was purified in a more precise way got a lower result than the other two teams did. If the scientists did not report how the nitrogen they measured was purified, then Watson would report a weighted average of all three teams’ results, when it may be more appropriate to report only the result of the team that used the purest nitrogen.
Ideally, the first time a certain kind of background assumption appears in reasoning, such as the first time anyone reported use of a more reliable way to purify nitrogen, Watson should automatically ask previous reporters of the same or contrary claims whether they dispute the relevance of the new background assumption. If the previous claimants do not dispute the relevance, it should ask them to add the same background assumption to their own report (i.e. repeat their measurement with the better purification process). It should also tell teams–before they execute a research plan–what background assumptions will be expected in their report (i.e. “if you plan to report on that claim, be sure to purify your nitrogen this way…”).
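The behavior just described can be made concrete with a small sketch. The data model below is a hypothetical illustration invented for this essay — the `Report` structure and function names are not any actual Watson API. Claims are stored alongside their articulated background assumptions, so that when a new assumption first appears, earlier reports that never addressed it can be flagged for retesting:

```python
from dataclasses import dataclass

@dataclass
class Report:
    """One team's published result: a claim plus the background
    assumptions articulated alongside it."""
    team: str
    claim: str
    assumptions: frozenset = frozenset()

def flag_for_retest(reports, claim, new_assumption):
    """Return the earlier reports on `claim` whose articulated reasoning
    never addressed `new_assumption` (e.g. a newly discovered, more
    reliable purification method). These reporters would be asked either
    to dispute the assumption's relevance or to repeat their test."""
    return [r for r in reports
            if r.claim == claim and new_assumption not in r.assumptions]

# The nitrogen example from the text: three teams, one of which used a
# more precise (hypothetical) "purification v2" process.
reports = [
    Report("Team 1", "nitrogen property X", frozenset({"purification v1"})),
    Report("Team 2", "nitrogen property X", frozenset({"purification v1"})),
    Report("Team 3", "nitrogen property X", frozenset({"purification v2"})),
]
stale = flag_for_retest(reports, "nitrogen property X", "purification v2")
```

Only the two teams that used the older process are flagged; Team 3’s result stands. Crucially, this flagging is only possible because each report articulates its background assumptions — with bare claims, the aggregator could do no better than the naive three-way average criticized above.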
This essay demonstrates the feasibility of Step 1 using the procedures of the Belleville Research Ethics Committee (BREC) as the example. The BREC procedures require investigators to articulate their ethical reasoning in three sections:
1. Facts include the research plan and any other information future investigators or committees might use to determine similarity to future cases;
2. Precedents list previously published opinions for similar cases;
3. Considerations and Reasoning list all ethical considerations made, explain how each was addressed, and defend any deviations from precedent.
Note that the BREC procedures place the responsibility to articulate ethical reasoning on those who propose research, rather than on the ethics committees who review it. It is doubtful, however, that the method of convergent realism must necessarily delegate all of this responsibility to the individuals who make a claim. For example, perhaps individuals could ask Watson to find precedent, and to list all background assumptions articulated in precedent, as a sort of checklist of what to consider with respect to their own research plan.
The first opinion authored by BREC cited 26 ethical considerations and 2 precedents (Committee Opinion about Replication of Merolla et al. 2017). The research plan was a replication study, so the first precedent was the opinion of the board which approved the original research. Instead of listing the ethical considerations it made, that board offered a letter noting that all approved plans must comply with “the Belmont principles, 45 CFR 46, and pertinent OHRP guidance” (Gerstein 2016). At least half of the ethical considerations listed in the BREC opinion went beyond that list of policies (partly because those policies intentionally ignore all ethical considerations beyond a scope called “human subjects”). Having better background assumptions made the BREC review more thorough than previous reviews (i.e. converging towards truth).
Furthermore, many of the 26 ethical considerations included in the first BREC opinion are relevant to many other research projects. Each research plan is different, so the list of considerations made by the next researcher might not include all 26 from BREC’s first review and also might include considerations beyond that list, but we should not be surprised if 99% of the millions of research plans implemented each year would be covered by the 50 most common ethical considerations. If so, convergence would have a sense of “settling” as the average number of new ethical considerations invented per research plan approaches zero and stays there. By incorporating information technology, an economy of scale can be achieved which makes articulation of reasoning very feasible.
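This “settling” sense of convergence can be illustrated with a toy simulation. All numbers below (a pool of 50 common considerations, eight drawn per plan) are assumptions chosen for illustration, not empirical estimates; the point is only the shape of the curve: as the shared catalogue of articulated considerations grows, the number of never-before-articulated considerations per new plan approaches zero and stays there.

```python
import random

def new_considerations_per_plan(num_plans, pool_size=50, per_plan=8, seed=1):
    """Toy model of 'settling': each research plan draws `per_plan`
    relevant considerations from a pool of `pool_size` common ones; we
    count how many were never articulated in any earlier plan."""
    rng = random.Random(seed)
    catalogue = set()   # all considerations articulated so far
    new_counts = []
    for _ in range(num_plans):
        drawn = set(rng.sample(range(pool_size), per_plan))
        new_counts.append(len(drawn - catalogue))
        catalogue |= drawn
    return new_counts

counts = new_considerations_per_plan(200)
```

The first plan contributes `per_plan` novelties, and the total across all plans can never exceed `pool_size`, so the per-plan count of novelties must eventually fall to zero — the simulated analogue of settling.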
It is worth noting that the BREC procedures described above are innovations inspired by formalization of the method of convergent realism. As far as we know, previous research ethics committees recorded merely “approved” or “disapproved” for each plan they reviewed. (Some probably recorded more, but will not disclose those records; this essay defers further discussion of transparency until Step 3.) As explained above, such records could mislead Watson because they do not specify which ethical considerations were made (or failed to be made). Treating the Belmont principles, 45 CFR 46, pertinent OHRP guidance, or even more complete lists of considerations as records of how ethicists actually reasoned assumes that those ethicists never made any mistakes (which is not plausible). Without amassing reliable precedent, even manual research ethics must constantly reinvent the wheel, and that makes it unreasonable to expect research ethics to converge on truth without reform. In this sense, the BREC procedures are an example not only that Step 1 is practically feasible, but also that it is not trivial: there is a real need to raise awareness of it.
Step 2: Independent Tests of the Reasoning
In science, the testing step consists in observing independent examples in which the claim manifests or does not. Often this takes the form of a controlled experiment and replications of it, but astronomy is a field in which hypotheses are created and tested empirically even though controlled experiments are impractical. In mathematics, the testing step consists in submitting the reasoning to independent peer-reviewers. In ethics, it consists in submitting the reasoning to some kind of independent committee or court (which may have a single judge, multiple judges, a jury, or even a court of public opinion).
Independence is relevant in all of these cases because all of these tests are fallible. Tests qualify as independent from each other to the extent that their differences allow them to correct for each other’s biases. Inability to correct for such bias is the problem with a mathematician providing their own peer-review, with a scientist collecting non-independent samples, or with a single ethicist serving as prosecutor, judge, and jury.
Independence may never be complete, but the method of convergent realism needs only enough independence to converge on truth eventually. In ethics, if a court fails to discern moral truth, then its opinion is expected to be overturned by a higher court or by the court of an independent (potentially future) community. In mathematics, if peer-reviewers fail to recognize an error, then it is expected that future readers will catch it. If an experiment yields a misleading result (which is statistically likely to happen from time to time), the error is expected to be caught via attempts to replicate it.
This essay held up BREC as a demonstration that Step 1 can feasibly be implemented in research ethics by gradually increasing articulation of background assumptions over time. Likewise, it now holds up BREC as a demonstration that Step 2 can be implemented in research ethics by gradually increasing independence of testing over time.
In the United States, special kinds of committees have evolved to provide independent tests for research ethics: Institutional Review Boards (IRBs) offer independent opinion about how well a research plan addresses considerations related to human subjects, Institutional Animal Care and Use Committees (IACUCs) offer independent opinion about how well the plan addresses considerations related to animal subjects, and Institutional Biosafety Committees (IBCs) offer independent opinion about how well the plan addresses considerations related to recombinant DNA. To increase independence in opinions, all IRBs, IACUCs and IBCs already strive to include non-academics—often called “community members”—who represent the perspective of the general community. Inclusion of community members is expected to mitigate two risks:
1. Professional researchers are biased not to reject too many proposals, lest they lose their income stream, and;
2. Professional researchers are less independent from each other because they share similar training.
Typical IRBs, IACUCs and IBCs currently have difficulty including community members (Klitzman 2012), but BREC-style procedures address the obstacles to inclusion.
The first obstacle to including regular people seems to be that many regular people find it intimidating to be a minority in a crowd of professional scientists. Imagine being expected to cast the deciding vote in a meeting full of professors who passionately debate each other using technical vocabulary you don’t understand. Traditional IRBs, IACUCs and IBCs recruit enough professional scientists to cover the full range of specializations relevant to all research proposals, and such a large number of professionals inevitably makes community members a minority. In contrast, BREC is composed mostly of community members because it accesses specialist expertise relevant to a given proposed plan via temporary consultants. Furthermore, the BREC procedures require all members to agree to a set of expectations which include the expectation that “Expert members of the Committee are expected to teach non-expert members whatever is needed to form their own legitimate opinions.” Each BREC ethics review thus serves a dual purpose: to advance knowledge at the expert level and to disperse expertise into the non-expert community.
Liability can be a second obstacle. People have proven willing to volunteer in many capacities, but few will volunteer to be held responsible if anything goes wrong, and that makes people hesitate to volunteer for an ethics committee. The Health Care Quality Improvement Act of 1986 gave medical peer-reviewers qualified immunity from liability so they would be more willing to offer independent review. Hoffman and Berg (2005) recommended giving similar immunity to IRB members for the same reason. But, even if an IRB member can’t be sued, they might still feel responsible for harm done by scientists “under their watch.” The problem is that scientists delegate ethics to an IRB the way patients delegate health decisions to their doctors. In contrast, the BREC procedures require scientists to fully understand their own ethics. The BREC procedures make the review process entirely transparent to scientists, and require scientists to seek additional independent review until the scientists determine that review was conducted adequately.
Commitment-level can be a third obstacle. Most non-commercial IRBs, IACUCs and IBCs serve a specific laboratory, university or hospital, and review all research at that institution. The commitment to review all research an institution conducts requires the average community member to review one plan per day. This burden is too high for many people. In contrast, BREC expects to review only one plan per year, and the BREC procedures empower its Chair to keep that burden low by rejecting additional requests for opinion. Investigators who cannot be served by BREC are given step-by-step instructions to launch additional BREC-style committees (https://goo.gl/LZr5nS).
One potential criticism of BREC-style committees compared to the current status quo is the worry that BREC-style committees might not provide assurance that all research will get reviewed (since BREC-style committees can reject requests for opinion). One answer to this criticism is that society can have both kinds of committees, so BREC-style committees don’t need to provide everything that other research ethics committees can provide. Another answer is that traditional research ethics committees might have even less potential to provide such assurance. Limited supply of expert labor sets a cap on the number of proposals traditional IRBs can review non-superficially, and that cap may already have been reached (US Department of Health and Human Services 1998). IRBs should be expanding to include more and more ethical considerations (e.g. beyond human subjects). Instead, policy-makers are advancing policies to exempt more research plans from review (US Department of Health and Human Services 2017).
One famous answer to the problem of limited supply of expert labor is citizen science. As an example, faced with a classification challenge that experts could not complete in their own lifetime, the Galaxy Zoo project recruited 100,000 volunteers who contributed more than 40 million classifications of nearly 900,000 galaxies in 175 days (Lintott et al. 2010). It remains to be seen whether hundreds of thousands of volunteers would likewise rise to the challenge of providing independent ethics review for every research plan scientists propose, but BREC-style procedures provide a way they could. We would have no reason to expect research ethics to converge on moral truth without non-superficial independent review, and that may require a flood of committees like BREC.
Other areas of ethics, or mathematics, or science could face the same obstacles to independent testing that BREC overcomes. For example, as science becomes more prolific, it might become impractical to independently test the replicability of each experiment without volunteer help. The most efficient method of convergent realism may need to bridge the gap between expert and non-expert scientists and mathematicians, and the similar gap between humans and AI. The competence of experts and AI both rely on independent testing, and the paths to assure independent testing may require that they elevate more of their community to similar competence. [Please refer to Part II of the article.]
Association for the Accreditation of Human Research Protection Programs, Inc. 2018. “2017 Metrics on Human Research Protection Program Performance for Academic Institutions.” Washington, DC. https://admin.aahrpp.org/Website%20Documents/2017%20Academics%20Metrics.pdf.
Belleville Research Ethics Committee Procedures. 2017. IRB 11228. https://doi.org/
Committee Opinion about Replication of Merolla et al. 2017. IRB 11228. http://doi.org/
Dewey, John. 1910. How We Think. Boston, MA: DC Heath.
Gerstein, Dean R. 2016. Correspondence. ClaremontLetter.pdf (Version: 1) https://osf.io/h95yv/.
Hardin, Clyde L. and Alexander Rosenberg. 1982. “In Defense of Convergent Realism.” Philosophy of Science 49 (4): 604-615.
Hoffman, Sharona and Jessica Wilen Berg. 2005. “The Suitability of IRB Liability.” Case Legal Studies Research Paper No. 05-4. http://doi.org/10.2139/ssrn.671004.
Kitcher, Philip. 1990. “The Division of Cognitive Labor.” The Journal of Philosophy 87 (1): 5-22.
Klitzman, Robert. 2012. “Institutional Review Board Community Members: Who Are They, What Do They Do, and Whom Do They Represent?” Academic Medicine: Journal of the Association of American Medical Colleges 87 (7): 975–981.
Lintott, Chris, Kevin Schawinski, Steven Bamford, Anze Slosar, Kate Land, Daniel Thomas, Edd Edmondson, Karen Masters, Robert Nichol, Jordan Raddick, Alex Szalay, Dan Andreescu, Phil Murray, Jan Vandenberg. 2010. “Galaxy Zoo 1: Data Release Of Morphological Classifications for Nearly 900,000 Galaxies.” Monthly Notices of the Royal Astronomical Society 410 (1): 166-178.
Laudan, Larry. 1981. “A Confutation of Convergent Realism.” Philosophy of Science 48: 19-48.
Longino, Helen E. 1990. Science as Social Knowledge: Values and Objectivity in Scientific Tradition and Change. Princeton: Princeton University Press.
Makel, Matthew C., Jonathan A. Plucker, and Boyd Hegarty. 2012. “Replications in Psychology Research: How Often do They Really Occur?” Perspectives on Psychological Science 7 (6): 537-542.
Popper, Karl. R. 1963. Conjectures and Refutations. London: Routledge.
Post, H. R. 1971. “Correspondence, Invariance and Heuristics: In Praise of Conservative Induction.” Studies in the History and Philosophy of Science 2: 213-255.
Putnam, Hilary. 1982. “Three Kinds of Scientific Realism.” The Philosophical Quarterly (1950-) 32 (128): 195-200.
Tononi, Giulio, Melanie Boly, Marcello Massimini, and Christof Koch. 2016. “Integrated information theory: from consciousness to its physical substrate.” Nature Reviews Neuroscience 17 (7): 450-461.
US Department of Health and Human Services. 2017. “Final Rule Enhances Protections for Research Participants, Modernizes Oversight System.” HHS Press Office. http://wayback.archive-it.org/3926/20170127095200/https://www.hhs.gov/
US Department of Health and Human Services. 1998. “Institutional Review Boards: A Time for Reform.” Office of Inspector General Publication OEI-01-97-00193. June G. Brown, Inspector General. Washington, DC. https://oig.hhs.gov/oei/reports/oei-01-97-00193.pdf.