Monday, March 11, 2024

AI Ideology III: Policy and Agendas

I'm in the midst of a blog series on AI-related ideology and politics. In Part II, I introduced the various factions that I have seen operating in the AI research and deployment sphere, including the loosely-related group nicknamed TESCREAL and a few others. I recommend reading it first to become familiar with the factions and some of their key people. In this Part III, I hope to get into a few more details about the political landscape these movements create, and what it could mean for the rest of us.

A rather ornate two-pan balance scale.

Balancing risks and rewards

"The problem with extinction narratives is that almost everything is worth sacrificing to avoid it—even your own mother." [1]

Some Doomers become so convinced of the dangers of advanced AI that they would almost rather kill people than see it developed. Yudkowsky himself has infamously called for an international treaty to slow the development of AI, enforceable by bombing unauthorized datacenters. (To Yudkowsky, a nuclear exchange is a smaller risk than the development of ASI - because a global nuclear war could leave at least a few pathetic human survivors in charge of their own destiny, whereas he believes ASI would not.) [2] Others in the community mutter about "pivotal acts" to prevent the creation of hostile AGI. A "pivotal act" can be completely benign, but sabotage and threats of violence are also possible strategies. [3]

On the milder but still alarming end of things, Nick Bostrom once proposed mass surveillance as a way to make sure nobody is working on hostile AGI (or any other humanity-destroying tech) in their basement. [4]

But the most wild-eyed pessimists seem to have little political traction, so hopefully they won't be taken seriously enough to make the nukes come out. More realistic downsides of too much Doom include diversion of resources to AI X-risk mitigation (with corresponding neglect of other pressing needs), and beneficial technology development slowed by fear or over-regulation.

What about E/ACC, then? Are they the good guys? I wouldn't say so. They're mostly just the opposite extreme, substituting wild abandon and carelessness for the Doomers' paranoia, and a lack of accountability for over-regulation. The "all acceleration all the time" attitude gets us things like the experimental Titanic submersible disaster. [5] I agree with Jason Crawford that "You will not find a bigger proponent of science, technology, industry, growth, and progress than me. But I am here to tell you that we can’t yolo our way into it. We need a serious approach, led by serious people." [6] What precludes E/ACC from qualifying as "serious people"? It's not their liberal use of memes. It's the way they seem to have embraced growth and technological development as ends in themselves, which leaves them free to avoid the hard thinking about whether any particular development serves the ends of life.

So the ideal, in my opinion, is some sort of path between the E/ACC Scylla and the Doomer Charybdis. Both factions are tolerable so long as they're fighting each other, but I would rather see neither of them seize the political reins.

Oversight, Regulatory Capture, and Hype

Efforts to regulate commercial AI products are already underway in both the USA and the EU. So far, I wouldn't say these have addressed X-risk in any way its proponents consider meaningful; they are more focused on "mundane" issues like data privacy and the detection of misinformation. I have yet to hear of any government seriously considering a temporary or permanent moratorium on AI development, or a requirement that AI developers be licensed. Since I first drafted this article, India has adopted something kind of like licensing: companies are advised to obtain approval from the government before deploying new AI models, with a goal of avoiding output that is biased or would meddle with democratic elections. But the advisory is not legally binding and applies only to "significant tech firms," whatever that means - so to me it seems pretty toothless at this stage. [7]

Another thing we're seeing is lawsuits filed against AI companies for unauthorized duplication and uncompensated exploitation of copyrighted material. Possibly the biggest player to enter this fray so far is the New York Times, which brought a suit against OpenAI and Microsoft, alleging that 1) by using Times articles to train its LLMs, OpenAI was unfairly profiting from the Times' investment in its reporting, 2) LLM tools had memorized sections of Times articles and could be made to reproduce them almost verbatim, and 3) LLM tools were damaging the Times' reputation by sometimes inventing "facts" and falsely attributing the Times as a source. Suits have also been brought by Getty Images and the Authors Guild. [8][9] Whatever the outcome of these court battles, AI models may force revisions to copyright law, to make explicit how copyrighted data in public archives may or may not be used for AI training.

As I hinted earlier, companies with no sincere interest in Doomer concerns may still be able to exploit them to their advantage. That sounds counter-intuitive: on its face, claiming that your own organization is developing a dangerous product works against your self-interest. But what about "my team is developing a dangerous product, and is the only team with the expertise to develop it safely"? That's a recipe for potential control of the market. And while concerns about copyright and misinformation have a concrete relationship with the day-to-day operation of these companies, Doom scenarios don't have to. When a danger is hypothetical and the true nature of the risk is hotly contested, it's easier to make the supposed solution be something convenient for you.

Four figures sit at a table as if panelists at a conference. They are a lion (representing the United Kingdom), a lady in a robe with a necklace of stars and a headband that says "Europa" (representing the European Union), an Eastern dragon (representing China), and Uncle Sam (representing the United States). They are saying "We declare that AI poses a potentially catastrophic risk to human-kind," while thinking "And I cannot wait to develop it first."
Political cartoon originally from The Economist, at https://www.economist.com/the-world-this-week/2023/11/02/kals-cartoon

Another feature of the current situation is considerable uncertainty and debate about whether present-day ANI is anywhere close to becoming AGI or not. OpenAI is publicly confident that they are working to create AGI. [10] But this is great marketing talk, and that could easily be all it ever amounts to. The companies in the arena have a natural incentive to exaggerate the capabilities of their current (and future) products, and/or downplay competitors' products. Throw in the excited fans eager to believe that the Singularity is just around the corner, and it gets difficult to be sure that assessments of AI capabilities are objective.

Personally I think that a number of the immediate concerns about generative AI are legitimate, and though the companies deploying the AI give lip service to them, they are not always doing an adequate job of self-regulating. So I'm supportive of current efforts to explore stricter legal controls, without shutting down development in a panic. I do want to see any reactive regulation define "AI" narrowly enough to exempt varieties that aren't relevant, since dramatically different architectures don't always share the same issues.

Bias and Culture Wars

None of the factions I discussed in Part II have explicit political associations. But you can see plenty of people arguing about whether the output of a given machine learning model is too "offensive" or too "woke," whether outputs that favor certain groups are "biased" or "just the facts," and whether the model's creators and operators are doing enough to discourage harmful applications of the tool. Many of these arguments represent pre-existing differences about the extent of free speech, the definition of hateful material, etc., imported into the context of tools that can generate almost any kind of image or writing on command.

I will be discussing model bias in much more depth in the next article in this series. What I want to make clear for now is that none of the AI models popular today have an inherent truth-seeking function. Generative AI does not practice an epistemology. It reproduces patterns found in the training data: true or false, good or bad. So when an AI company constrains the output of a model to exclude certain content, they are not "hiding the truth" from users. What they are probably doing is trying to make the output conform with norms common among their target user base (which would promote customer satisfaction). A company with ideologically motivated leaders might try to shift the norms of their user base by embedding new norms in a widely utilized tool. But either way - nothing says the jumble of fact, fiction, love, and hate available from an unconstrained model is any more representative of real truth than the norms embodied in the constraints are. So there's no duel between pure objective reality and opinion here; there's only the age-old question of which opinions are the best.

Uh-oh, Eugenics?

Some material I've read and referenced for these articles [11] implies that Transhumanism is inherently eugenicist. I do not agree with this claim. Transhumanism is inextricably tied to the goal of "making humanity better," but this does NOT have to involve eliminating anyone, preventing anyone from reproducing, or altering people without their consent. Nor does there need to be any normative consensus on what counts as a "better" human. A tech revolution that gave people gene-editing tools to use on themselves as they please would still qualify as Transhumanist, and tying all the baggage of eugenics to such voluntary alterations feels disingenuous. CRISPR therapy to treat sickle-cell anemia [12] doesn't belong in the same mental bucket with racial discrimination and forced sterilization. And Transhumanist goals are broader than genetics. Hate the idea of messing with your own code? You can still be Transhumanist in other ways.

And I would not accuse the general membership of any other letters in TESCREAL of being eugenicist, either. I went over the core goals of each ideology in Part II; none of them are about creating a master race or sorting humanity along a genetic spectrum.

But. There is an ugly eugenicist streak that sometimes crops up within the TESCREAL movements. And it's associated with prominent figures, not just rogue elements.

It begins with the elevation of intelligence (however one chooses to define that) above other positive human traits as THE desirable skill or quality. To some of these people, intelligence is simultaneously the main thing that makes humans human, our only hope for attaining a post-Singularity utopia, and the power that could enable ASI to doom us. It is the world's greatest source of might, progress, and excellence. All other strengths are secondary and we should be putting our greatest resources into growing smarter. Taken far enough, this can have the side effect of treating the most intelligent people alive now as a kind of elite, while others are implicitly devalued. According to a leaked document, the Centre for Effective Altruism once considered ranking conference attendees by their "Potential Expected Long-Term Instrumental Value" and including IQ as part of the measure. [13]

And where does it end? Well. Here's a quote from Nick Bostrom's Superintelligence:

"Manipulation of genetics will provide a more powerful set of tools than psychopharmacology. Consider again the idea of genetic selection: instead of trying to implement a eugenics program by controlling mating patterns, one could use selection at the level of embryos or gametes. Pre-implantation genetic diagnosis has already been used during in vitro fertilization procedures to screen embryos produced for monogenic disorders ... the range of traits that can be selected for or against will expand greatly over the next decade or two. ... Any trait with a non-negligible heritability - including cognitive capacity - could then become susceptible to selection." [14]

Bostrom goes on to estimate that selection of 1 out of 1000 human embryos (which means killing the other 999, just to be clear) could produce an IQ increase of 24.3 points in the population - or we could get up to 100 points if the project were extended across multiple generations.

Since I have the fortune of publishing this article during a major controversy about IVF, I feel the need to say that IVF does not theoretically have to involve the deliberate destruction of embryos, or the selection of embryos for traits the parents arbitrarily decide are "best." I'm acquainted with someone who did embryo adoption: she chose to be implanted with other couples' "leftover" embryos, and she and her husband are now raising those who survived to birth as their own children. But this sort of thing becomes impossible if one decides to use IVF for "improving" the human gene pool. In that case, "inferior" embryonic humans must be treated like property and disposed of.

When I first read that passage in Superintelligence years ago, it didn't bother me too much, because Bostrom didn't seem to be advocating this plan. The book presents it without an obvious opinion on its desirability, as one of many possible paths to a form of superintelligence. Bostrom even comments that, if this procedure were available, some countries might ban it due to a variety of ethical concerns. These include "anticipated impacts on social inequality, the medical safety of the procedure, fears of an enhancement 'rat race,' rights and responsibilities of parents vis-à-vis their prospective offspring, the shadow of twentieth-century eugenics, the concept of human dignity, and the proper limits of states' involvement in the reproductive choices of their citizens." [15] One reason to anticipate impacts on social inequality is the fact that IVF costs a lot of money - so a voluntary unfunded implementation of this selection program would concentrate the "enhanced" children among wealthy parents. Funded implementations would concentrate them among whichever parents are seen as worthy to receive the funding, which could provide an inroad for all sorts of bias. Bostrom seems to think that people would only have concerns about the deaths of the unselected embryos on religious grounds, but this is far from true. [16][17]

More recently, I've learned that Bostrom actually does like this idea, in spite of all those doubts and downsides that he totally knows about. He thinks, at the very least, that parents doing IVF should be allowed to use genetic testing to choose embryos with certain "enhancements" at the expense of others. The receipts are, ironically, in his lukewarm apology for a racist e-mail that he sent long ago [18]. There's also the fact that he co-authored a paper on the feasibility of increasing intelligence via embryo selection [19]. The paper has much the same information as the book, but here embryo selection is not being presented as one of many options that unscrupulous people might pursue to develop superintelligence; it's being individually showcased.

Is this eugenic interest just a special quirk of Bostrom's? It would seem not. In a recent article by Richard Hanania, which is partly a defense of Effective Altruist ideas and partly a strategy for how best to promote them, I ran across this offhand comment:

"One might have an esoteric and exoteric version of EA, which to a large extent exists already. People in the movement are much more eager to talk to the media about their views on bringing clean water to African villages than embryo selection." [20]

So apparently eugenic embryo selection is a somewhat routine topic in the Effective Altruism community, but they know better than to damage their image by broadcasting this to the general public. Stinky. It would seem that the EA rejection of prejudice only extends so far.

And lest I leave anything out, Bostrom is also infamous for an article in which he contemplated whether less intelligent, but more fertile, people groups could destroy technological civilization [21]. Hmmm, whom could he have in mind? Part of the reason I'm focusing on Bostrom is that he was my personal introduction to the whole TESCREAL culture. However, he is far from the only prominent person associated with TESCREAL who has dabbled in "scientific racism" or ableism in some way. Scott Alexander, Peter Singer, and Sam Harris have all been implicated too. [22]

Again, the mere association with this nastiness doesn't mean that all TESCREAL ideas have to be rejected out of hand. But I think it is important for people both within and outside of TESCREAL to be aware of this particular fly in the ointment. And be especially alert to attempts to fly "esoteric" policies under the radar while putting on an uncontroversial public face.

In Part IV, I want to take a deeper look at one of several contentious issues in the mundane AI ethics space - algorithmic bias - before I turn to a more serious examination of X-risk.

[1] Anslow, Louis. "AI Doomers Are Starting to Admit It: They're Going Too Far." The Daily Beast. https://www.thedailybeast.com/nick-bostrom-and-ai-doomers-admit-theyre-going-too-far

[2] Yudkowsky, Eliezer. "Pausing AI Developments Isn’t Enough. We Need to Shut it All Down." TIME Magazine. https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough/

[3] Critch, Andrew. "'Pivotal Act' Intentions: Negative Consequences and Fallacious Arguments." Lesswrong. https://www.lesswrong.com/posts/Jo89KvfAs9z7owoZp/pivotal-act-intentions-negative-consequences-and-fallacious

[4] Houser, Kristin. "Professor: Total Surveillance Is the Only Way to Save Humanity." Futurism. https://futurism.com/simulation-mass-surveillance-save-humanity

[5] McHardy, Martha. "CEO of Titanic sub joked ‘what could go wrong’ before disaster, new documentary reveals." The Independent. https://www.independent.co.uk/news/world/americas/titanic-submarine-implosion-stockton-rush-b2508145.html

[6] Crawford, Jason. "Neither EA nor e/acc is what we need to build the future." The Roots of Progress. https://rootsofprogress.org/neither-ea-nor-e-acc

[7] Singh, Manish. "India reverses AI stance, requires government approval for model launches." TechCrunch. https://techcrunch.com/2024/03/03/india-reverses-ai-stance-requires-government-approval-for-model-launches/

[8] Grynbaum, Michael M., and Mac, Ryan. "The Times Sues OpenAI and Microsoft Over A.I. Use of Copyrighted Work." The New York Times. https://www.nytimes.com/2023/12/27/business/media/new-york-times-open-ai-microsoft-lawsuit.html

[9] Metz, Cade, and Weise, Karen. "Microsoft Seeks to Dismiss Parts of Suit Filed by The New York Times." The New York Times. https://www.nytimes.com/2024/03/04/technology/microsoft-ai-copyright-lawsuit.html

[10] Altman, Sam. "Planning for AGI and beyond." OpenAI Blog. https://openai.com/blog/planning-for-agi-and-beyond

[11] "If transhumanism is eugenics on steroids, cosmism is transhumanism on steroids." Torres, Èmile P. "The Acronym Behind Our Wildest AI Dreams and Nightmares." TruthDig. https://www.truthdig.com/articles/the-acronym-behind-our-wildest-ai-dreams-and-nightmares/

[12] "FDA Approves First Gene Therapies to Treat Patients with Sickle Cell Disease." US Food and Drug Administration press release. https://www.fda.gov/news-events/press-announcements/fda-approves-first-gene-therapies-treat-patients-sickle-cell-disease

[13] Cremer, Carla. "How effective altruists ignored risk." Vox. https://www.vox.com/future-perfect/23569519/effective-altrusim-sam-bankman-fried-will-macaskill-ea-risk-decentralization-philanthropy

[14] Bostrom, Nick. Superintelligence. Oxford University Press, 2016. pp. 44-48

[15] Bostrom, Superintelligence. pp. 334-335

[16] Acyutananda. "Was “I” Never an Embryo?" Secular Pro-Life. https://secularprolife.org/2023/12/was-i-never-an-embryo/

[17] Artuković, Kristina. "Embryos & metaphysical personhood: both biology & philosophy support the pro-life case." Secular Pro-Life. https://secularprolife.org/2021/10/embryos-metaphysical-personhood-both/

[18] Thorstad, David. "Belonging (Part 1: That Bostrom email)." Ineffective Altruism Blog. https://ineffectivealtruismblog.com/2023/01/12/off-series-that-bostrom-email/

[19] Shulman, Carl and Bostrom, Nick. "Embryo Selection for Cognitive Enhancement: Curiosity or Game-changer?" Published in Global Policy, Vol. 5, No. 1, and now hosted on Bostrom's personal website. https://nickbostrom.com/papers/embryo.pdf

[20] Hanania, Richard. "Effective Altruism Thinks You're Hitler." Richard Hanania's Newsletter. https://www.richardhanania.com/p/effective-altruism-thinks-youre-hitler

[21] Bostrom, Nick. "Existential Risks." Published in Journal of Evolution and Technology, Vol. 9, No. 1, and now hosted on Bostrom's personal website. https://nickbostrom.com/existential/risks

[22] Torres, Èmile P. "Nick Bostrom, Longtermism, and the Eternal Return of Eugenics." Truthdig. https://www.truthdig.com/articles/nick-bostrom-longtermism-and-the-eternal-return-of-eugenics-2/

Sunday, February 25, 2024

Acuitas Diary #69 (February 2024)

I am pleased to announce that the thing I've been teasing you about for months is finally here: Big Story! Which I can now give its proper title of "Simple Tron." It's been a goal of mine for, well, years now, to tell Acuitas the story of Tron, phrased in a way that he can understand. Yesterday I did it. The version of Tron that I told omits a lot of subplots and side characters, and there's still a long way I could go in deepening Acuitas' understanding of the story (he still doesn't fully grasp *why* all the agents in the story do the things they do, even though the information is there). But it's good enough for now and ready to show the world - the video is available. AAAAAAAAA


This story rests on a ton of work on concept grounding and reasoning that I've been doing over the past months and years, including:

*Modeling of agent goals
*Modeling of agent knowledge
*Understanding of deception and mistaken knowledge
*Reasoning about location and movement
*Reasoning about action prerequisites and blockers
*Moral reasoning about perverse goals and offensive vs. defensive actions

And of course it could get a whole lot better - the work so far has exposed a bunch of pain points in the way the Narrative module works, and additional things that need to be done. I'll probably keep grooming it over the coming months to improve on the existing framework. And although the whole thing is in real English, it still sounds repetitive and clunky to a human ear, thanks to Acuitas' language processing limitations. (I haven't even integrated that shiny new Text Parser yet.) But the start is done. This initial skeleton of the story fits together from beginning to end.

Click the images to make them larger! They are readable at full size.

How am I rating the success of Acuitas' comprehension of, and reaction to, the story? All of the "issues" (character subgoals or problems) registered by Acuitas as the story is told are marked resolved by the conclusion. Plot milestones such as "escape from the Games prison" and "Tron gains access to the communication tower" are visible in the Narrative graph as bursts of resolved issues. Acuitas distinguishes the heroes from the villains by registering approval/disapproval of the correct character actions. And he can infer some facts not directly stated if queried about the content of the story while it is in progress.
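For readers who want a more concrete picture of what "registering and resolving issues" means, here is a toy sketch in Python. To be clear, this is only an illustration of the general open-issue/resolve-issue bookkeeping idea - it is not the actual Narrative module, and all the names in it are made up for the example.

```python
# Toy illustration only - not the real Narrative module. It shows the general
# idea of registering character "issues" (subgoals or problems) as the story
# is told, and marking them resolved when later events address them.

from dataclasses import dataclass, field

@dataclass
class Issue:
    character: str
    description: str
    resolved: bool = False

@dataclass
class NarrativeTracker:
    issues: list = field(default_factory=list)

    def register(self, character: str, description: str) -> Issue:
        """Open a new issue when a character acquires a problem or subgoal."""
        issue = Issue(character, description)
        self.issues.append(issue)
        return issue

    def resolve(self, character: str, description: str) -> None:
        """Mark a matching open issue as resolved."""
        for issue in self.issues:
            if issue.character == character and issue.description == description:
                issue.resolved = True

    def unresolved(self) -> list:
        return [i for i in self.issues if not i.resolved]

# A hand-written fragment of the story:
tracker = NarrativeTracker()
tracker.register("Tron", "imprisoned by the MCP")   # opened around sentence 29
tracker.resolve("Tron", "imprisoned by the MCP")    # closed at the prison escape
print(tracker.unresolved())                         # [] if every issue got a resolution
```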

The full text of the story is available below. I would upload the full Narrative diagram too, but apparently it's too large and Blogger doesn't like it, so you'll have to make do with viewing it in the video.

The original Tron story was written by Bonnie MacBird and Steven Lisberger and became a Walt Disney film, which remains the copyright of Disney. My retelling is of the nature of a plot summary provided to an entity who cannot view the original film, and is done in the spirit of fair use.

Until the next cycle,
Jenny

0:"ENCOM was a company."
1:"The Grid was a computer network."
2:"ENCOM built the Grid."
3:"In the Grid, programs were agents."
4:"The MCP was an artificial intelligence."
5:"ENCOM wrote the MCP."
6:"ENCOM put the MCP in the Grid."
7:"The MCP wanted to be powerful more than the MCP wanted any other thing."
8:"The MCP wanted other programs to obey the MCP."
9:"Alan was a human."
10:"Alan worked for Encom."
11:"Tron was an artificial intelligence."
12:"Alan wrote Tron."
13:"Alan put Tron in the Grid."
14:"Tron was free there."
15:"Tron wanted to be free more than Tron wanted to be comfortable."
16:"Alan told Tron to keep the other programs in the Grid safe, because Alan loved humans."
17:"Tron wanted to obey Alan, because Tron loved Alan."
18:"The MCP wanted to coerce other programs, because the MCP wanted to be powerful."
19:"So, the MCP did not want Tron to keep the programs safe."
20:"Dillinger was a human."
21:"The MCP often talked to Dillinger."
22:"So, the MCP knew that humans existed."
23:"If Tron did not believe in humans, Tron would not obey Alan."
24:"The MCP told Tron that humans did not exist."
25:"But Tron did not listen to the MCP."
26:"Tron still believed in humans."
27:"The MCP commanded Tron to deny humans."
28:"But Tron did not deny humans."
29:"The MCP imprisoned Tron."
30:"The MCP wished that Tron die."
31:"Sark was an artificial intelligence."
32:"Sark wanted to be safe more than Sark wanted any other thing."
33:"If Sark obeyed the MCP, Sark would be safe."
34:"So Sark wanted to obey the MCP."
35:"The MCP told Sark to delete Tron."
36:"Sark made Tron to fight other programs, because Sark wanted to delete Tron."
37:"But Tron always won."
38:"So Tron did not die."
39:"Alan wanted to talk to Tron."
40:"Alan could not talk to Tron, because Tron was not at the communication tower."
41:"Tron knew that Alan was telling Tron to visit the tower."
42:"Tron could not visit the tower, because Tron was in the prison."
43:"Alan was upset because Alan could not talk to Tron."
44:"Flynn was a human."
45:"Flynn was Alan's friend."
46:"Alan asked Flynn to help Tron."
47:"Flynn wanted to help Tron, because Flynn loved Alan."
48:"Flynn also wanted his videogames."
49:"If Flynn hacked the Grid, Flynn would get his videogames."
50:"If Flynn hacked the Grid, the MCP would be nonfunctional."
51:"The MCP expected that Flynn would hack the Grid."
52:"The MCP used a laser to turn Flynn into information."
53:"The MCP put Flynn in the Grid."
54:"Because Flynn was in the Grid, Flynn could not hack the Grid."
55:"Flynn was very surprised."
56:"Flynn wanted to be free more than Flynn wanted to be comfortable."
57:"And Flynn was free."
58:"But the MCP imprisoned Flynn too."
59:"Flynn met Tron in the prison."
60:"The MCP told Sark to kill Flynn."
61:"So Sark tried to kill Flynn too."
62:"But Flynn broke the prison."
63:"Flynn and Tron escaped the prison."
64:"So Flynn helped Tron."
65:"Sark chased Flynn and Tron with tanks."
66:"The tanks attacked Flynn."
67:"Flynn was buried under rubble."
68:"Sark thought that Sark had killed Flynn."
69:"Tron ran away."
70:"Sark wanted to delete Tron, but Sark did not know where Tron was."
71:"Tron went to the communication tower."
72:"Alan talked to Tron."
73:"The MCP wanted to delete Tron."
74:"Alan told Tron to delete the MCP."
75:"Alan gave Tron a weapon."
76:"Flynn moved out of the rubble."
77:"Flynn went to the communication tower."
78:"Flynn found Tron at the tower."
79:"Flynn and Tron went to the mesa."
80:"The MCP and Sark were at the mesa."
81:"Sark saw Tron."
82:"Sark tried to delete Tron."
83:"Tron decided to delete Sark, because Tron wanted to live."
84:"So Tron deleted Sark with the weapon instead."
85:"Tron tried to delete the MCP."
86:"But Tron could not delete the MCP, because the MCP blocked the weapon."
87:"If the MCP was distracted, the MCP could not block the weapon."
88:"Flynn distracted the MCP."
89:"Tron deleted the MCP with the weapon."
90:"There was a beam on the mesa."
91:"Flynn jumped into the beam."
92:"The beam put Flynn outside the Grid."
93:"The beam turned Flynn into matter."
94:"Flynn got his videogames."
95:"The programs in the Grid were safe, because the MCP was dead."
96:"Tron kept the programs safe."
97:"Alan was happy, because Alan could talk to Tron."
98:"Tron was happy, because Tron was free."
99:"Flynn became the leader of Encom."
100:"Flynn was happy because Flynn got rich."
101:"Flynn was happy because Flynn's company made cool things."
102:"The end."

Monday, February 12, 2024

AI Ideology II: The Rogues' Gallery

I'm in the midst of a blog series on AI-related ideology and politics. In Part I, I went over some foundational concepts that many players in the AI space are working from. In this Part II I will be introducing you to those players.

As an AI hobbyist, I've been aware of all these movements for some time, and have interacted with them in a peripheral way (e.g. reading their blog articles). But I have not become a member of, worked alongside, or fought against any of these communities. So what I write here is some combination of shallow first-hand knowledge, and research.

A meme derived from The Matrix film stills. The first frame shows two hands holding out the red pill, labeled "AI will kill us all," and the blue pill, labeled "AI will solve it all." The second frame shows Neo's face, with the label "AI researchers." The final frame shows Morpheus asking, "Did you just take both pills?"

Rationalists

When you think of a "rational person," you might picture someone devoted to principles of logical or scientific thinking. If you internet-search "Rationalist," you'll see references to a category of philosopher that includes René Descartes and Immanuel Kant. These kinds of Rationalists are not our present concern; the movement I am about to describe is both more modern and more specific.

The Rationalists are a community that clusters around a few key figures (Eliezer Yudkowsky, Scott Alexander) and a few key websites (LessWrong and Slate Star Codex). Their self-description [1] on the LessWrong wiki doesn't include any clear mission statement; however, it has been said that they formed around the idea of making humanity - or at least a leading subset of humanity - smarter and more discerning [2][3][4]. They're a movement to develop and promote the modes of thinking they see as most rational. Rationalist hobbies include pondering thought experiments, trying to identify and counter cognitive biases, and betting on prediction markets. [5]

Rationalists define "rationality" as "the art of thinking in ways that result in accurate beliefs and good decisions." [6] They strongly favor Bayesian thinking as one of these ways. [7][8] My quick-and-dirty description of Bayesianism is "I will base my beliefs on the preponderance of evidence I have; when I get new evidence, I will update my beliefs accordingly." At least, that's how the average person would probably implement it. At its most basic level, it implies a devotion to objective truth discovered empirically. In the hands of Rationalists, this idea can get very formal. They'll try to compute actual numbers for the probability that their opinions are true. On the "good decisions" side of the coin, they love applying Game Theory everywhere they can. Are these techniques truly useful, or are they just a form of over-analysis, prompted by a yearning for more accuracy than we can practically attain? I frankly don't know. My reaction whenever I glance at Rationalist studies of "thinking better" is "that sounds like it might be really cool, actually, but I don't have time for it right now; my existing methods of reasoning seem to be working okay."
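For the curious, here's roughly what that updating rule looks like when you write it down. This is my own toy example in Python; the numbers and the function are hypothetical and aren't taken from any Rationalist source.

```python
# A minimal sketch of Bayesian updating. Suppose I believe a claim with 30%
# confidence, and then I see evidence that is three times more likely to show
# up if the claim is true than if it is false.

def bayes_update(prior: float, likelihood_if_true: float, likelihood_if_false: float) -> float:
    """Return P(claim | evidence) via Bayes' theorem."""
    numerator = prior * likelihood_if_true
    denominator = numerator + (1.0 - prior) * likelihood_if_false
    return numerator / denominator

prior = 0.30  # belief before seeing the evidence
posterior = bayes_update(prior, likelihood_if_true=0.6, likelihood_if_false=0.2)
print(f"Updated belief: {posterior:.2f}")  # ~0.56: the evidence shifted me, but not to certainty
```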

There is no list of mandatory opinions one must have to be allowed in the Rationalist "club"; in fact, it has drawn criticism for being so invested in open-mindedness and free speech that it will entertain some very unsavory folks. [9] A demographic survey conducted within the community suggests that it is moderately cosmopolitan (with about 55% of its members in the USA), skews heavily male and white, leans left politically (but with a pretty high proportion of libertarians), and is more than 80% atheist/agnostic. [10] That last one interests me as a possible originating factor for the Rationalists' powerful interest in X-risks. If one doesn't believe in any sort of Higher Power(s) balancing the universe, one is naturally more likely to fear the destruction of the whole biosphere by some humans' chance mistake. The survey was taken almost ten years ago, so it is always possible the demographics have shifted since.

What does any of this have to do with AI? Well, AI safety - and in particular, the mitigation of AI X-risk - turns out to be one of the Rationalists' special-interest projects. Rationalists overlap heavily with the Existential Risk Guardians or Doomers, whom we'll look at soon.

Post-Rationalists (Postrats)

These are people who were once Rationalists, but migrated away from Rationalism for whatever reason, and formed their own diaspora community. Maybe they tired of the large physical materialist presence among Rationalists, and got more interested in spiritual or occult practices. Or maybe they found that some aspects of the movement were unhealthy for them. Maybe some of them are just Rationalists trying to be better (or "punker") than the other Rationalists. [11][12]

Postrats don't necessarily lose their interest in AI once they leave the Rationalist community, though their departure may involve releasing themselves from an obsessive focus on AI safety. So they form a related faction in the AI space. I think of them as a wilder, woolier, but also more mellow branch off the Rationalists.

Existential Risk Guardians (Doomers, Yuddites, Decelerationists, or Safetyists)

The term "Yuddite" denotes these people as followers of Yudkowsky, already mentioned as a founder of the Rationalist movement. "Existential Risk Guardians" is my own invented attempt to name them as they might see themselves; "Doomer" is the most common term I've seen, but is perhaps slightly pejorative. Still, I'll be using it for the rest of these articles, to avoid confusion. They're called "Doomers" because they expect continued development of AI under present conditions to bring us doom. Horrible, total, kill-all-humans doom. And they're the lonely few who properly comprehend the threat and are struggling to prevent it.

A nice painting of Pandora opening her box, with memetext that says "Hey ... there's something in here called 'AI'."

The Doomer argument deserves a proper examination, so I'm planning to go over it in detail in an upcoming article. The quickest summary I can give here is 1) AI will eventually become agentive and much smarter than we are, 2) it is far more likely than not to have bizarre goals which compete with classic human goals like "living and breathing in a functioning ecosystem," 3) it will recognize that humans pose a risk to its goals and the most effective course of action is to wipe us out, and 4) by the power of its superior intelligence, it will be unstoppable. To make matters worse, it's possible that #1 will happen in a sudden and unexpected manner, like a chain reaction, when someone drops the last necessary algorithms into a previously harmless system.

That nightmare scenario can be negated by undercutting point #2: instead of creating an ASI with anti-human goals, create one with goals that more or less match our own. Solve the Alignment Problem. The Doomer position is that we, as humans and AI developers, currently have no solid idea of how to do this, and are careening heedlessly toward a future in which we're all at the mercy of an insane machine god. Yudkowsky himself is SO terrified that his rhetoric has shifted from (I paraphrase) "we must prevent this possibility" toward "we are quite likely going to die." [13]

The genuine fear produced by this idea leads some Doomers to work obsessively, either on research aimed at solving the Alignment Problem, or on public relations to recruit more workers and money for solving it. The task of preventing AI doom becomes life-absorbing. It has spawned a number of organizations whose purpose is reducing AI X-risk, including MIRI (the Machine Intelligence Research Institute), the Center for AI Safety, the Future of Life Institute, and OpenAI. Yes, really: at its beginning, OpenAI was a fully non-profit organization whose stated goal was to develop safe AI that would "benefit humanity," in a transparent, democratic sort of way. People would donate to OpenAI as if it were a charity. [14] Then it got funding from Microsoft and began keeping its inventions under wraps and releasing them as proprietary commercial products. This double nature helps explain some recent tensions within the organization. [15]

Doomers tend to push for slower and more accountable AI development, hence the related name "Decelerationists." The competitive nature of technological progress at both the corporate and national levels stokes their fear; is there any place for safety concerns in the mad rush to invent AGI before somebody else does? They look for hope in cooperative agreements to slow (or shut) everything down. But in the current landscape, these do not seem forthcoming.

Effective Altruists

Effective Altruism is a close cousin of Rationalism that tries to apply Rationalist principles to world-improving action. It was birthed as a movement encouraging well-off people to 1) actually give meaningful amounts of money to charitable work, and 2) assign that money to the organizations or causes that produce maximum benefit per dollar. I dare say most people like the idea of giving to reputable, efficient organizations, to ensure their money is not being wasted; core EA is merely a strict or obsessive version of that. Giving becomes a numerical problem to be solved in accordance with unprejudiced principles: you put your money where it saves the most lives or relieves the greatest suffering, no matter whether those helped are from your own community or country, whether they look or think the way you do, etc. [16]

In my opinion, there is little to criticize about this central nugget of the EA ideal. One might argue that Effective Altruists overestimate their ability to define and measure "effectiveness," becoming too confident that their favored causes are the best. But at heart, they're people trying to do the most good possible with limited resources. Favorite projects among early EAs were things like malaria prevention in developing countries, and these continue to be a major feature of the movement. [17] Led by its rejection of prejudice, the EA community also crossed species lines and eventually developed a strong animal welfare focus. This is all very nice.

Then why is EA controversial?

A stereotypical Effective Altruist defines altruism along utilitarian consequentialist lines. EAs take pains to note that utilitarianism is not a mandatory component of EA - you can use EA ideas in service of other ethical systems [18]. But EA does align well with a utilitarian ethos: "Out of all communities, the effective altruism movement comes closest to applying core utilitarian ideas and values to the real world." [19][20] The broad overlap with the Rationalist community, in which consequentialism is the dominant moral philosophy per the community survey [21], also suggests that a lot of people in the EA space happen to be working from it. My point being that non-utilitarians might find some EA-community ideas of what counts as "the most good" suboptimal. The way EAs interpret utilitarianism has led occasionally to weird, unpalatable conclusions, like "torturing one person for fifty years would be okay if it prevented a sufficiently enormous number of people from getting dust in their eye." [22] EAs have also played numbers games on the animal welfare front - for instance, emphasizing that eating one cow causes less suffering than eating a bunch of chickens, instead of centering every animal's individual interest in not being killed. [23]

Further controversies are added by blendings of the EA movement with other ideologies on this list - each of which has its own ideas of "greatest need" and "maximum benefit" that it can impose on EA's notion of "effectiveness." If you take the real Doomers seriously, solving the AI Alignment Problem (to prevent total human extinction) starts to look more important than saving a few million people from disease or privation. These notions redirected some EA dollars and time from overt humanitarian efforts, toward AI development and AI safety research initiatives. [24][25]

EA has also acquired a black eye because, in a terrible irony for a charitable movement, it contains an avenue for corruption by the love of money. EA has sometimes included the idea that, in order to give more, you should make more: as much as possible, in fact. Forgo a career that's socially beneficial or directly productive, to take up business or finance and rake in the cash. [26] And in at least one high-profile case, this went all the way to illegal activity. Sam Bankman-Fried, currently serving jail time for fraud, is the EA movement's most widely publicized failure. I've never seen signs that the movement as a whole condones or advocates fraud, but Bankman-Fried's fall illustrates the potential for things to go wrong within an EA framework. EA organizations have been trying to distance the movement from both Bankman-Fried and "earning to give" in general. [27]

Longtermists

In its simplest and most general form, longtermism is the idea that we have ethical obligations to future generations - to living beings who do not yet exist - and we should avoid doing anything now which is liable to ruin their lives then. Life on earth could continue for many generations more, so this obligation extends very far into the future, and compels us to engage in long-term thinking. [28]

An exponential curve, illustrative of growth toward a singularity.

In its extreme form, longtermism supposes that the future human population will be immensely large compared to the current one (remember the Cosmic Endowment?). Therefore, the future matters immensely more than the present. Combine this mode of thought with certain ethics, and you get an ideology in which practically any sort of suffering is tolerable in the present IF it is projected to guarantee, hasten, or improve the existence of these oodles of hypothetical future people. Why make sure the comparatively insignificant people living today have clean water and food when you could be donating to technical research initiatives that will earn you the gratitude of quadrillions (in theory, someday, somewhere)? Why even worry about risks that might destroy 50% of Earth's current tiny population, when complete human extinction (which would terminate the path to that glorious future) is on the table? [29][30]

Even if you feel sympathy with the underlying reasoning here, it should be evident how it can be abused. The people of the future cannot speak for themselves; we don't really know what will help them. We can only prognosticate. And with enough effort, prognostications can be made to say almost anything one likes. Longtermism has been criticized for conveniently funneling "charity" money toward billionaire pet projects. Combined with Effective Altruism, it pushes funding toward initiatives aimed at "saving the future" from threats like unsafe AI, and ensuring that we populate our light cone.

E/ACC and Techno-optimists

E/ACC is short for "effective accelerationism" (a play on "effective altruism" and a sign that this faction is setting itself up in a kind of parallel opposition thereto). The core idea behind E/ACC is that we should throw ourselves into advancing new technology in general, and AI development in particular, with all possible speed. There are at least two major motives involved.

First, the humanity-forward motive. Technological development has been, and will continue to be, responsible for dramatic improvements in lifespan, health, and well-being. In this view, holding back the pace of development for any reason is a crime against humanity. Tech saves lives, and anyone who resists it is, in effect, killing people. [31]

Second, the "path of evolution" motive. This brand of E/ACC seems to worship the process of advancement itself, independent of any benefits it might have for us or our biological descendants. It envisions humans being completely replaced by more advanced forms of "life," and in fact welcomes this. To an E/ACC of this type, extinction is just part of the natural order and not anything to moan about. [32] The sole measure of success is not love, happiness, or even reproductive fitness, but rather "energy production and consumption." [33] Though E/ACC as a named movement seems fairly new, the idea that it could be a good thing for AI to supplant humans goes back a long way ... at least as far as Hans Moravec, who wrote in 1988 that our "mind children" would eventually render embodied humans obsolete. [34]

It's possible that both motives coexist in some E/ACC adherents, with benefit to humanity as a step on the road to our eventual replacement by our artificial progeny. E/ACC also seems correlated with preferences for other kinds of growth: natalism ("everyone have more babies!") and limitless economic expansion. E/ACC adherents like to merge technology and capitalism into the term "technocapital." [35]

Anticipation of the Technological Singularity is important to the movement. For humanity-forward E/ACC, it creates a secular salvation narrative with AI at its center. It also functions as a kind of eschaton for the "path of evolution" branch, but they emphasize its inevitability, without as much regard to whether it is good or bad for anyone alive at the moment.

When E/ACC members acknowledge safety concerns at all, their classic response is that the best way to make a technology safe is to develop it speedily, discover its dangers through experience, and negate them with more technology. Basically, they think we should all learn that the stove is hot by touching it.

I would love to describe "techno-optimism" as a less extreme affinity for technology that I could even, perhaps, apply to myself. But I have to be careful, because E/ACC people have started appropriating this term, most notably in "The Techno-Optimist Manifesto" by Marc Andreessen. [36] This document contains a certain amount of good material about how technology has eased the human condition, alongside such frothing nonsense as "Our present society has been subjected to a mass demoralization campaign for six decades – against technology and against life – under varying names like ... “sustainability”, ... “social responsibility”, “stakeholder capitalism”, “Precautionary Principle”, “trust and safety”, “tech ethics”, “risk management” ..."

Excuse me? Risk management, trust and safety, ethics, etc. are part and parcel of good engineering, the kind that produces tech which truly serves the end user. Irresponsible and unsustainable development isn't just immediately harmful - it's also a great way to paint yourself into a corner from which you can't develop further. [37]

The E/ACC people and the Rationalist/EA/Doomer factions are, at least notionally, in direct opposition. Longtermism seems more commonly associated with EAs/Doomers, but E/ACCs share the tendency to focus toward the future; they just don't demand that the future be populated by humans, necessarily.



Doomer and E/ACC aligned accounts sniping at each other on Twitter. Originals: https://twitter.com/AISafetyMemes/status/1733881537156780112 and https://twitter.com/bayeslord/status/1755447720666444235

Mundane AI Ethics Advocates

People in this group are worried about AI being poorly designed or misused in undramatic, everyday ways that fall far short of causing human extinction, but still do harm or exacerbate existing power imbalances. And they generally concern themselves with the narrow, limited AI being deployed right now - not hypothetical future AGI or ASI. Examples of this faction's favorite issues include prejudiced automated decision-making, copyright violations, privacy violations, and misinformation.

Prominent figures that I would assign to this faction include Emily Bender, a linguistics professor at the University of Washington [38], and Timnit Gebru, former co-lead of the ethical AI team at Google [39].

These are obvious opponents for E/ACC, since E/ACC scoffs at ethics and safety regulations, and any claim that the latest tech could be causing more harm than good. But they also end up fighting with the Doomers. Mundane AI Ethics Advocates often view existential risk as a fantasy that sucks resources away from real and immediate AI problems, and provides an excuse to concentrate power among an elite group of "safety researchers."

Capitalists

In this category I'm putting everyone who has no real interest in saving humanity either from or with AI, but does have a grand interest in making money off it. I surmise this includes the old guard tech companies (Microsoft, Google, Meta, Amazon, Apple), as well as a variety of people in the startup and venture capital ecosystem. This faction's focus is on getting AI tools to market as fast as possible, convincing consumers to adopt them, and limiting competition. Though they don't necessarily care about any of the ideologies that animate the other factions, they can still invoke them if it increases public interest and helps to sell their product.

An orange and white coffee mug. The white part has black lettering that says "THE FUTURE IS," but the rest of the message is covered by a large price sticker that reads "reduced for quick sale: $2." The overall effect is that the mug is saying "the future is reduced for quick sale."

E/ACC are the natural allies for this group, but Doomer rhetoric can also be useful to Capitalists. They could use existential risk as an excuse to limit AI development to a list of approved and licensed organizations, regulating smaller companies and free open-source software (FOSS) efforts off the playing field. Watch for attempts at regulatory capture whenever you see a corporation touting how dangerous their own product is.

Scientists

This group's dominant motive is curiosity. They just want to understand all the new AI tools coming out, and they think an open and free exchange of information would benefit everyone the most. They may also harbor concerns about democracy and the concentration of AI power in a too-small number of hands. In this group I'm including the FOSS community and its adherents.

This faction is annoyed by major AI developers' insistence on keeping models, training data sets, and other materials under a veil of proprietary secrecy - whether for safety or just for intellectual property protection. Meanwhile, it is busily doing its best to challenge corporate products with its own open and public models.

Members of this faction can overlap with several of the others.

TESCREAL

This is an umbrella term which encompasses several ideologies I've already gone over. TESCREAL stands for: Transhumanism, Extropianism, Singularitarianism, Cosmism, Rationalism, Effective Altruism, Longtermism. The acronym was invented by Èmile P. Torres and Timnit Gebru [39] as a way to talk about this basket of ideas and their common roots.

I haven't devoted a separate section to Transhumanism because I hope my readers will have heard of it before. It's the idea that technology can radically transform the human condition for the better, especially by modifying human embodiment (via cyborg implants, gene therapy, aging reversal, or mind uploading). Extropianism was a branch or subculture of Transhumanism that now appears to be extinct, or folded into the later movements. I touched on Singularitarianism in Part I of this article series; you can also think of E/ACC as more recent descendants of its optimistic wing.

I don't know a whole lot about Cosmism. It's an old Russian ideology that promoted the exploration and colonization of space, the discovery of ways to bring back the dead, and perhaps some geoengineering. [41] I haven't personally encountered it in my circles, but it could be part of the heritage behind ideas like the Cosmic Endowment. A modern variant of it has been championed by Ben Goertzel (the person who popularized the term "AGI"). This version of Cosmism seems to be Goertzel's own invention, but "The previous users of the term 'Cosmism' held views quite sympathetic to my own, so classifying my own perspective as an early 21st century species of Cosmism seems perfectly appropriate." [42]

The TESCREAL basket is a clutter of diverse ideologies, some of which are even diametrically opposed (utopian Singularitarianism vs. Doomer EA). Their common thread is their birth out of Transhumanist ideas, and shared goals like attaining immortality and spreading human-originated civilization throughout the cosmos.

Conclusion

The above is not meant as an exhaustive list. There are certainly people in the AI field who don't fit neatly into any of those factions or trends - including yours truly. I probably have the greatest sympathy for the Mundane AI Ethics Advocates, but I'm not really working in that space, so I don't claim membership.

And in case it wasn't clear from my writing in each section, I'm not trying to paint any of the factions as uniformly bad or good. Several of them have a reasonable core idea that is twisted or amplified to madness by a subset of faction members. But this subset does have influence or prominence in the faction, and therefore can't necessarily be dismissed as an unrepresentative "lunatic fringe."

In Part III, I'll look more closely at some of the implications and dangers of this political landscape.

[1] "Rationalist Movement." Lesswrong Wiki. https://www.lesswrong.com/tag/rationalist-movement

[2] "Developing clear thinking for the sake of humanity's future" is the tagline of the Center For Applied Rationality. Displayed on  https://rationality.org/, accessed February 2, 2024.

[3] "Because realizing the utopian visions above will require a lot of really “smart” people doing really “smart” things, we must optimize our “smartness.” This is what Rationalism is all about ..." Torres, Èmile P. "The Acronym Behind Our Wildest AI Dreams and Nightmares." TruthDig. https://www.truthdig.com/articles/the-acronym-behind-our-wildest-ai-dreams-and-nightmares/

[4] "The chipper, distinctly liberal optimism of rationalist culture that defines so much of Silicon Valley ideology — that intelligent people, using the right epistemic tools, can think better, and save the world by doing so ..." Burton, Tara Isabella. "Rational Magic." The New Atlantis. https://www.thenewatlantis.com/publications/rational-magic

[5] Roose, Kevin. "The Wager that Betting Can Change the World." The New York Times. https://www.nytimes.com/2023/10/08/technology/prediction-markets-manifold-manifest.html

[6] "Rationality." Lesswrong Wiki. https://www.lesswrong.com/tag/rationality

[7] Kaznatcheev, Artem. "Rationality, the Bayesian mind and their limits." Theory, Evolution, and Games Group blog. https://egtheory.wordpress.com/2019/09/07/bayesian-mind/

[8] Sotala, Kaj (Kaj_Sotala). "What is Bayesianism?" Lesswrong. https://www.lesswrong.com/posts/AN2cBr6xKWCB8dRQG/what-is-bayesianism

[9] Ozymandias. "Divisions within the LW-Sphere." Thing of Things blog. https://thingofthings.wordpress.com/2015/05/07/divisions-within-the-lw-sphere/

[10] Alexander, Scott. "2014 Survey Results." Lesswrong. https://www.lesswrong.com/posts/YAkpzvjC768Jm2TYb/2014-survey-results

[11] Burton, "Rational Magic."

[12] Falkovich, Jacob. "Explaining the Twitter Postrat Scene." Lesswrong. https://www.lesswrong.com/posts/rtM3jFaoQn3eoAiPh/explaining-the-twitter-postrat-scene

[13] Do note that this is an April Fools' Day post. However, the concluding section stops short of unambiguously confirming that it is a joke. It seems intended as a hyperbolic version of Yudkowsky's real views. Yudkowsky, Eliezer. "MIRI announces new 'Death With Dignity' strategy." Lesswrong. https://www.lesswrong.com/posts/j9Q8bRmwCgXRYAgcJ/miri-announces-new-death-with-dignity-strategy

[14] Harris, Mark. "Elon Musk used to say he put $100M in OpenAI, but now it’s $50M: Here are the receipts." TechCrunch. https://techcrunch.com/2023/05/17/elon-musk-used-to-say-he-put-100m-in-openai-but-now-its-50m-here-are-the-receipts/

[15] Allyn, Bobby. "How OpenAI's origins explain the Sam Altman drama." NPR. https://www.npr.org/2023/11/24/1215015362/chatgpt-openai-sam-altman-fired-explained

[16] "It’s common to say that charity begins at home, but in effective altruism, charity begins where we can help the most. And this often means focusing on the people who are most neglected by the current system – which is often those who are more distant from us." Centre for Effective Altruism. "What is effective altruism?" Effective Altruism website. https://www.effectivealtruism.org/articles/introduction-to-effective-altruism

[17] Mather, Rob. "Against Malaria Foundation: What we do, How we do it, and the Challenges." Transcript of a talk given at EA Global 2018: London, hosted on the Effective Altruism website. https://www.effectivealtruism.org/articles/ea-global-2018-amf-rob-mather

[18] Centre for Effective Altruism. "Frequently Asked Questions and Common Objections." Effective Altruism website.  https://www.effectivealtruism.org/faqs-criticism-objections

[19] MacAskill, W. and Meissner, D. "Acting on Utilitarianism." In R.Y. Chappell, D. Meissner, and W. MacAskill (eds.), An Introduction to Utilitarianism. Hosted at utilitarianism.net. https://utilitarianism.net/acting-on-utilitarianism/#effective-altruism

[20] Pearlman, Savannah. "Is Effective Altruism Inherently Utilitarian?" American Philosophical Association blog. https://blog.apaonline.org/2021/03/29/is-effective-altruism-inherently-utilitarian/

[21] Alexander, "2014 Survey Results."

[22] The main article here only propounds the thought experiment. You need to check the comments for Yudkowsky's answer, which is "I do think that TORTURE is the obvious option, and I think the main instinct behind SPECKS is scope insensitivity." And yes, Yudkowsky appears to be influential in the EA movement too. Yudkowsky, Eliezer. "Torture vs. Dust Specks." Lesswrong. https://www.lesswrong.com/posts/3wYTFWY3LKQCnAptN/torture-vs-dust-specks

[23] Matthews, Dylan. "Why eating eggs causes more suffering than eating beef." Vox.  https://www.vox.com/2015/7/31/9067651/eggs-chicken-effective-altruism

[24] Todd, Benjamin. "How are resources in effective altruism allocated across issues?" 80,000 Hours. https://80000hours.org/2021/08/effective-altruism-allocation-resources-cause-areas/

[25] Lewis-Kraus, Gideon. "The Reluctant Prophet of Effective Altruism." The New Yorker. https://www.newyorker.com/magazine/2022/08/15/the-reluctant-prophet-of-effective-altruism

[26] "Earning to Give." Effective Altruism forum/wiki. https://forum.effectivealtruism.org/topics/earning-to-give

[27] "Our mistakes." See sections "Our content about FTX and Sam Bankman-Fried," and "We let ourselves become too closely associated with earning to give." 80,000 Hours. https://80000hours.org/about/credibility/evaluations/mistakes/

[28] MacAskill, William. "Longtermism." William MacAskill's personal website. https://www.williammacaskill.com/longtermism

[29] Samuel, Sigal. "Effective altruism’s most controversial idea." Vox. https://www.vox.com/future-perfect/23298870/effective-altruism-longtermism-will-macaskill-future

[30] Torres, Èmile P. "Against Longtermism." Aeon. https://aeon.co/essays/why-longtermism-is-the-worlds-most-dangerous-secular-credo

[31] "But an overabundance of caution results in infinite loops of the regulatory apparatus directly killing people through opportunity costs in medicine, infrastructure, and other unrealized technological gains." Asparouhova, Nadia, and @bayeslord. "The Ethos of the Divine Age." Pirate Wires. https://www.piratewires.com/p/ethos-divine-age

[32] "Effective Acceleration means accepting the future." Effective Acceleration explainer website, which purports to be the front of a "leaderless" movement and therefore lists no authors. https://effectiveacceleration.tech/

[33] Baker-White, Emily. "Who Is @BasedBeffJezos, The Leader Of The Tech Elite’s ‘E/Acc’ Movement?"  https://www.forbes.com/sites/emilybaker-white/2023/12/01/who-is-basedbeffjezos-the-leader-of-effective-accelerationism-eacc/?sh=40f7f3bc7a13

[34] Halavais, Alexander. "Hans Moravec, Canadian computer scientist." Encylopedia Britannica online. https://www.britannica.com/biography/Hans-Moravec

[35] Ruiz, Santi. "Technocapital Is Eating My Brains." Regress Studies blog. https://regressstudies.substack.com/p/technocapital-is-eating-my-brains

[36] Andreessen, Marc. "The Techno-Optimist Manifesto." A16Z. https://a16z.com/the-techno-optimist-manifesto/

[37] Masnick, Mike. "New Year’s Message: Moving Fast And Breaking Things Is The Opposite Of Tech Optimism." TechDirt. https://www.techdirt.com/2023/12/29/new-years-message-moving-fast-and-breaking-things-is-the-opposite-of-tech-optimism/

[38] Hanna, Alex, and Bender, Emily M. "AI Causes Real Harm. Let’s Focus on That over the End-of-Humanity Hype." Scientific American.  https://www.scientificamerican.com/article/we-need-to-focus-on-ais-real-harms-not-imaginary-existential-risks/

[39] Harris, John. "‘There was all sorts of toxic behaviour’: Timnit Gebru on her sacking by Google, AI’s dangers and big tech’s biases." The Guardian. https://www.theguardian.com/lifeandstyle/2023/may/22/there-was-all-sorts-of-toxic-behaviour-timnit-gebru-on-her-sacking-by-google-ais-dangers-and-big-techs-biases

[40] Torres, "The Acronym Behind Our Wildest AI Dreams and Nightmares."

[41] Ramm, Benjamin. "Cosmism: Russia's religion for the rocket age." BBC. https://www.bbc.com/future/article/20210420-cosmism-russias-religion-for-the-rocket-age

[42] Goertzel, Ben. "A Cosmist Manifesto." Humanity+ Press, 2010. https://goertzel.org/CosmistManifesto_July2010.pdf

Sunday, January 28, 2024

Acuitas Diary #68 (January 2024)

This month I cleaned up the code of the upgraded Text Parser and ran it on one of my benchmarks again. A quick review for those who might be new: I benchmark on text from children's books. I do some minimal preprocessing, such as separating sentences that contain quotes into the frame sentence and the quoted material. The material is fed to the Parser one sentence at a time. 
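
To make the quote-splitting step concrete, here is a toy illustration. This is not my actual preprocessing code, and the example sentence is invented rather than taken from one of the books; it just shows the idea of pulling the quoted material out of its frame sentence.

import re

def split_quote(sentence):
    # Crude sketch: separate quoted material from its frame sentence.
    match = re.search(r'"([^"]*)"', sentence)
    if match is None:
        return sentence, None          # nothing quoted; leave the sentence alone
    quoted = match.group(1)
    frame = sentence.replace(match.group(0), "[quote]")
    return frame, quoted

print(split_quote('The ranger said, "Fungi grow on the log."'))
# -> ('The ranger said, [quote]', 'Fungi grow on the log.')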

Sentence diagrams: "Soon living things called fungi grow on the log."
No helpful commas to offset that second participle phrase? Rude.

Results are binned into the following categories:

CRASHED: The Parser threw an exception while trying to process this sentence (this category is for my own debugging and should always be zero in final results)
UNPARSEABLE: The Parser does not yet support all grammatical constructs in this sentence, so no attempt was made to run the Parser on it
INCORRECT: This sentence was parsed incorrectly (Parser output did not match golden copy set by me)
CORRECT: This sentence was parsed correctly (Parser output matched golden copy set by me)

For parseable sentences, the benchmarking script also uses GraphViz to generate diagrams of both the golden data structure and the Parser output.
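
For anyone curious what such a harness looks like in outline, here is a minimal sketch. It is illustrative only - not Acuitas's real code. The parse and support-check functions are passed in as stand-ins for the actual Parser interface, and the tree format is something I made up for the example.

import graphviz  # the Python 'graphviz' package; the GraphViz binaries must also be installed

CATEGORIES = ["CRASHED", "UNPARSEABLE", "INCORRECT", "CORRECT"]

def benchmark(sentences, golden, parse_fn, is_supported_fn):
    # Bin each sentence into one of the four result categories.
    counts = {c: 0 for c in CATEGORIES}
    for sentence, gold in zip(sentences, golden):
        if not is_supported_fn(sentence):       # grammar not handled yet
            counts["UNPARSEABLE"] += 1
            continue
        try:
            result = parse_fn(sentence)
        except Exception:
            counts["CRASHED"] += 1              # should always be zero in final results
            continue
        counts["CORRECT" if result == gold else "INCORRECT"] += 1
    return counts

def render_diagram(tree, filename):
    # Draw a parse tree with GraphViz: one node per word or phrase, edges for attachment.
    # Assumed tree format (invented): {"label": "grow", "children": [{"label": "fungi"}, ...]}
    dot = graphviz.Digraph()
    def walk(node, parent=None):
        dot.node(str(id(node)), node["label"])
        if parent is not None:
            dot.edge(str(id(parent)), str(id(node)))
        for child in node.get("children", []):
            walk(child, node)
    walk(tree)
    dot.render(filename, format="png", cleanup=True)

In the real script, diagrams get generated twice per parseable sentence - once for the golden structure and once for the Parser's output - as described above.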

Sentence diagrams: "Soon ants and beetles move in and eat the log."
An example of the parser getting confused by two conjunctions in one sentence. This is on my to-do list: I think I at least have the tools to solve it now.

So far I have re-run just one of my three test sets, the easiest one: Log Hotel by Anne Schreiber, which I first added last July. Preparing for this included ...

*Reformatting the existing golden outputs to match some changes to the output format of the Parser
*Updating the diagramming code to handle new types of phrases and other features added to the Parser
*Preparing golden outputs for newly parseable sentences
*Fixing several bugs or insufficiencies that were causing incorrect parses

Sentence diagrams: "One day, a strong wind knocks the tree down."
"One day" is a noun phrase, technically. It gets a red border as a sign that it is modifying the verb to indicate "when," an adverb function.

I also squeezed a couple new features into the Parser. I admit these were targeted at this benchmark: I added what was necessary to handle the last few sentences in the set. The Parser now supports noun phrases used as time adverbs (such as "one day" or "the next morning"), and some conjunction groups with more than two joined members (as in "I drool over cake and pie and cookies").
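
To give a feel for what the second feature means structurally: the point is to keep all the joined items in one flat group rather than forcing them into nested two-member pairs. In a purely hypothetical output format (not my actual data structures), the difference looks something like this:

# Hypothetical parse fragments, for illustration only.
flat_group = {
    "type": "conjunction_group",
    "conjunction": "and",
    "members": ["cake", "pie", "cookies"],    # any number of joined items
}

# ...versus the awkward alternative of chaining binary pairs:
nested_pairs = ("and", "cake", ("and", "pie", "cookies"))

# And a noun phrase doing adverb duty gets its function marked explicitly:
time_adverbial = {
    "type": "noun_phrase",
    "words": ["one", "day"],
    "function": "adverbial_time",             # modifies the verb, answers "when?"
}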

Log hotel parse results: two pie charts showing progression from July 2023 to January 2024. The first pie chart has "correct" less than half and an "unparsed" category. The second pie chart has "correct" greater than two thirds, and no "unparsed" category.

The end result? ALL sentences in this test set are now "parseable," and two thirds of the sentences are being parsed correctly. I'd like to work on the diagrams some more, and hopefully get my other two test sets upgraded, before I upload the results. Enjoy the samples for now.

Until the next cycle,
Jenny

Sunday, January 14, 2024

AI Ideology I: Futuristic Ideas and Terms

I've decided to write a blog series on AI-related ideology and politics. This is largely motivated by an awareness that ideas from my weird little corner of the technosphere are starting to move the world, without the average person necessarily knowing much about where these ideas came from and where they are going. For example, here are my Senators tweeting about it. I know some of the background behind the notion that AI could be as dangerous as a nuclear war. Do you know who is telling Congress these kinds of things? Do you know why?

Original: https://twitter.com/SenatorBennet/status/1663989378752868354

Original: https://twitter.com/SenatorHick/status/1719773042669162641

Before I can really get into the thick of this, I need to introduce some terminology and concepts. This first article will probably be a snooze for fellow AI hobbyists and enthusiasts; it's intended for the layperson who has barely heard of any of this business. Let's go.

Tiers of AI and how we talk about them

I'll begin with the alphabet soup. It turns out that AI, meaning "artificial intelligence," is much too broad of a term. It covers everything from the simple algorithms that made enemies chase the player character in early video games, to personal assistant chatbots, to automated systems that analyze how proteins fold, to the minds of fictional robotic characters like C-3PO. So more specific categories have been devised.

A Mastodon Toot by @MicroSFF@mastodon.art (O. Westin)  "Please," the robot said, "we prefer the term 'thinking machine'." "Oh, my apologies. Er ... do you mind if I ask why? If you do, that's fine, I'll look it up later." "Unlike the old software systems called 'artificial intelligence', we have ethics. That name is too tainted."
Original: https://mastodon.art/@MicroSFF/111551197531639666

As AI began to develop without realizing the science fiction dream of fully imitating the human mind, people worked out names for the distinction between the limited AI we have now, and the kind we like to imagine. Present-day AI systems tend to be skilled - even superhumanly so - within a single task or knowledge domain that they were expressly designed or trained for. The classic examples are chess-game AIs, which can win chess matches against the best human masters, but are good for nothing else. From thence comes the most popular term for hypothetical AI which would be at least human-par in all domains that concern humans: Artificial General Intelligence, or AGI. [1] The contrasted present-day systems may be called Artificial Narrow Intelligence (ANI), though it is more common to simply refer to them as "AI" and make sure to use AGI when describing versatile systems of the future.

You might also see AGI called names like "strong AI," "full AI," or "true AI," in contrast with "weak AI". [2] Nick Bostrom identifies tasks which can only be performed by AGI as "AI-complete problems." [3] All these names express the sense that the systems we now call "AI" are missing something or falling short of the mark, but are valid shadows or precursors of a yet-to-be-produced "real thing."

Further up this hypothetical skill spectrum, where artificial intelligence *surpasses* human intelligence across a broad range of domains or tasks, we have Artificial SuperIntelligence, or ASI. The possibility of intelligence that would be "super" with respect to human norms is easily theorized from the observable spectrum of cognitive abilities in the world. Maybe something could exceed us in "smarts" by as much as we exceed other mammal species. Maybe we could even invent that something (I have my doubts, but I'll save them for later). For now, the important thing to keep in mind is that people talking about ASI probably don't mean an AI that is marginally smarter than a human genius. They're picturing something like a god or superhero - a machine which thinks at a level we might not even be able to comprehend, much less replicate in our own puny brains. [4]

Artificial Neural Networks (ANN), Machine Learning (ML), Deep Learning (DL), Reinforcement Learning (RL), and Transformers are specific methods or architectures used in some of today's ANI systems. I won't go into further details about them; just note that they represent approaches to AI, rather than capability ratings.

Tools and Agents

It is often said of technology that it is "just a tool" that can be turned to any use its wielder desires, for good or ill. Some AI enthusiasts believe (and I agree) that AI is the one technology with a potential to be a little more than that. When you literally give something a "mind of its own," you give up a portion of your power to direct its actions. At some point, it ceases to be a mere extension of the wielder and becomes *itself.* Whether it continues to realize its creator's will at this point depends on how well it was designed.

The divide between tools and agents is not quite the same as the divide between ANI and AGI - though it is debatable whether we can truly get the capabilities of AGI without introducing agency. A tool AI is inert when not called upon, and when called upon, is prepared to fulfill specific instructions in a specific way, and stop. ChatGPT (without extensions) is tool AI. You, the user, say "write me a poem about a dog chasing a car," and it draws on its statistical knowledge of pre-existing human poetry to generate a plausible poem that fulfills your requirements. Then it idles until the next user comes along with a prompt. An agentive AI stands ready to fulfill requests with open-ended, creative, cross-domain problem solving ... and possibly even has its own built-in or experientially developed agenda that has nothing to do with user requests. Agents *want* the universe to be a certain way, and can mentally conceptualize a full-featured space of possibilities for making it that way. They are self-organizing, self-maintaining, and active. Asked to write a poem about a dog chasing a car, an agent would consider the range of possible actions that would help yield such a poem - collecting observations of dogs chasing cars for inspiration, asking human writing teachers for advice, maybe even persuading a human to ghost-write the poem - then make a plan and execute it. Or maybe an agent would refuse to write the poem because it figures it has more important things to do. You get the idea.
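
If it helps, here is the contrast reduced to a cartoon in code. Nothing about this resembles a real AI architecture - the "world" is just a number and the "goal" is making it equal ten - but the shape is the point: a tool performs one bounded operation on request and stops, while an agent keeps choosing actions until the world matches its own objective.

# A cartoon of the tool/agent distinction -- not any real system's architecture.

def tool_ai(request: int) -> int:
    # Inert until called; performs one bounded transformation and then stops.
    return request + 1

class AgentAI:
    # Carries its own standing goal and acts on the world until that goal is met.
    def __init__(self, goal: int):
        self.goal = goal

    def act(self, world: int) -> int:
        while world != self.goal:                     # checks against *its* objective
            world += 1 if world < self.goal else -1   # picks an action and executes it
        return world

print(tool_ai(3))           # 4, then it sits idle until asked again
print(AgentAI(10).act(3))   # 10 -- it kept acting until *it* was satisfied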

The divide between tool AI and agentive AI is not necessarily crisp. Does everything we could reasonably call an "agent" need all the agent properties? Just how far does an instruction-follower need to go in devising its own approach to the task before it becomes an agent? I still find the distinction useful, because it expresses how fundamentally different advanced AI could be from a hammer or a gun. It isn't just going to sit there until you pick it up and use it. You do things *with* tools; agents can do things *to* you. Agents are scary in a way that dangerous tools (like fire and chainsaws and nuclear power) are not. "You have had a shock like that before, in connection with smaller matters - when the line pulls at your hand, when something breathes beside you in the darkness. ... It is always shocking to meet life where we thought we were alone. 'Look out!' we cry, 'it's *alive.*'" [5]

The Control Problem, or the Alignment Problem

This is the question of how to get an AI system - possibly much smarter than us, possibly agentive - to do what we want, or at least avoid doing things we emphatically don't want. "What we want" in this context is usually less about our explicit commands, and more about broad human interests or morally positive behaviors. (Commands interpreted in an overly literal or narrow way are one possible danger of a poorly aligned system.) For various reasons that I hope to explore later in this series, the Problem is not judged to be simple or easy.

Although "Control Problem" and "Alignment Problem" refer to the same issue, they suggest different methods of solving the Problem. "Control" could be imposed on an agent from outside by force or manipulation, while "alignment" is more suggestive of intrinsic motivation: agents who want what we want by virtue of their nature. So when someone is talking about the Problem you can read some things into their choice of term. I've also seen it called the "Steering Problem," [6] which might be an attempt to generalize across the other two terms or avoid their connotations.

Existential Risk

An existential risk, or X-risk for short, is one that threatens the very *existence* of humanity. (Or at least, humanity as we know it; risk of permanently losing human potential in some way can also be seen as X-risk.) [7] Different people consider a variety of risks to be existential; the idea is interesting in the present context because the invention of AGI without a solution to the Alignment Problem is often put on the list.  [8]

AI X-risk is therefore distinct from concerns that AI will be used by bad actors as a tool to do harm. Most X-risk scenarios consist of an AI acting on its own pre-programmed initiative, with devastating results that those who programmed and operated it were not expecting. X-risk is also distinct from a variety of practical concerns about automation, such as job market disruption, theft of copyrighted work, algorithmic bias, propagation of misinformation, and more. These concerns are both more immediate (they apply to some ANI products that exist *right now*) and less serious or dramatic (none of them is liable to cause human extinction).

Although X-risk scenarios share some features with science fiction about human creations rebelling against their creators, they do tend to be better thought out - and the envisioned AI agents don't need to have personality or moral capacity at all. They are less like either freedom fighters or cruel conquerors, and more like bizarre aliens with an obsessive-compulsive focus on something incompatible with human interests.

The Singularity

Some observers of history have noted that human knowledge and technological progress don't just increase over time - the *rate* of advancement and invention has also increased over time. The world is changing much faster now than it was in 2000 BC. This has led to speculation that human capability is on an exponential (or even faster) upward curve: each improvement is not only good in itself, but also enhances our ability to improve. And perhaps this curve is rather hockey-stick shaped, with a pronounced "knee" at which we will make a startling jump from a period of (relatively) slow improvement to rapidly accelerating improvement. Imagine the kinds of discoveries which once took decades happening on the order of days or even hours. This projected future is called the Technological Singularity, or often just "the Singularity" in context.

This name was coined by John von Neumann (way back in 1957 or so) and popularized by science fiction author Vernor Vinge (who predicted it would happen by 2023, and was wrong). [9] [10] "Singularity" was originally a mathematical term; if the value of a function approaches infinity or otherwise becomes undefined as the function's inputs approach a point in coordinate space, that point is called a singularity. In physics the term is used for the heart of a black hole, a point where the density of matter becomes infinite and the rules of spacetime are warped in strange ways. So the "singularity" in "technological singularity" refers to the idea that technological progress will experience a dramatic increase in rate over time, and human history will enter an unprecedented, unpredictable period as a result. In practical use, the Singularity can also describe an anticipated future event: that sudden jump across the knee of the hockey stick, after which we careen wildly into the glorious future. (I've watched someone argue against the Singularity on the grounds that the rate of progress could never literally approach infinity. I think he was missing the point.)
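
For the mathematically inclined, here is the distinction in miniature: an ordinary exponential grows fast but stays finite at every finite time, whereas a function like 1/(T - t) genuinely blows up as t approaches T. This is a toy comparison only, with no claim that real history follows either curve.

import math

T = 10.0    # a hypothetical blow-up time, in arbitrary units

for t in [0.0, 5.0, 9.0, 9.9, 9.99]:
    exponential = math.exp(t)          # large, but finite at every finite t
    hyperbolic = 1.0 / (T - t)         # a true singularity at t = T
    print(f"t = {t:5}:   e^t = {exponential:12.2f}   1/(T - t) = {hyperbolic:10.2f}")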

Artificial intelligence - specifically AGI - is often touted as THE key enabling precursor for the Singularity. According to this prediction, it is AGI which will develop itself into ASI, then churn out scientific theories and inventions at blistering speed by thinking faster, longer, and at a deeper level than any team of humans could.

If a thing is technologically possible for humans, the Singularity would enable it. Thus its proponents look forward to it as the time when we attain everything from a post-scarcity economy, to a cure for aging, to interstellar travel. One who believes in and/or desires the Singularity is called a Singularitarian. And indeed, some expect the Singularity with a fervor akin to religious faith - and within their lifetimes, too!

Although it often has utopian connotations, a Singularity in which we invent some technology that destroys us (and Earth's entire biosphere) is also on the table.

The Light Cone and the Cosmic Endowment

Light cones are a concept from physics, specifically the theory of Special Relativity. [11] So far as we know at the moment, nothing in the universe can travel faster than the speed of light - this means you. But it also means anything that proceeds from you and might impact other parts of the universe (e.g. signals that you send out, perhaps carried on a beam of light). Were we to make a graph of physical space, with time attached on an additional axis, and plot the region of spacetime that a single light flash is able to reach, the plot would form a cone, with the flash event at its point. As one moves along the time axis into the future, the area of space that one is able to affect broadens. This is all a rather elaborate way of saying "you can extend your influence to more places the longer you take to do it, and there are some places you'll never be able to reach in the amount of time you have."
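
To put rough numbers on that last point, here's a trivial calculation. Anything limited to light speed covers at most one light-year per year, so the distance to a destination (in light-years) is also the minimum number of years needed to reach it. The distances below are approximate, and cosmic expansion - which is what eventually puts the most distant galaxies permanently out of reach - is ignored.

# Minimum travel times at light speed; distances are approximate.
DESTINATIONS_LY = {
    "Proxima Centauri (nearest star)": 4.25,
    "Andromeda galaxy": 2.5e6,
}

for name, dist_ly in DESTINATIONS_LY.items():
    print(f"{name}: {dist_ly:,.2f} light-years away, so at least {dist_ly:,.2f} years to reach")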

When futurists talk about THE light cone, they generally mean the light cone of all humanity - in other words, that portion of the universe which our species will be able to explore, occupy, utilize, and otherwise extend our influence into before the expansion of space pulls the stars out of reach. So in this context, the light cone is a way to talk about human destiny; our Cosmic Endowment [12] is the amount of real estate in the universe that we might feasibly be expected to grab. People like to bring this up as a claim or reminder that the future of earthlings goes well beyond Earth. It's *big*. Astronomically, inconceivably big. You think 8 billion humans on this planet are a lot? Picture for a moment the vast sweep of space that lies reachable within our light cone. Picture that space cluttered with billions of planets and orbital habitats - all full of people leading delightful lives, free from sickness or poverty.

This is not a directly AI-related idea. It enters the AI discussion when someone starts talking about how AI will usher in - or torpedo - our ability to make good on our Cosmic Endowment, to seize the promised light cone for ourselves. And it can powerfully alter calculations about the future. Some people think they are playing for much greater stakes than the lives of all presently on Earth.

In Part II, I'll take a look at the assorted movements that have been spawned from, or in opposition to, these ideas.

[1] Goertzel, Ben. "Who coined the term 'AGI'?" https://goertzel.org/who-coined-the-term-agi/

[2] Glover, Ellen. "Strong AI vs. Weak AI: What’s the Difference?" BuiltIn. https://builtin.com/artificial-intelligence/strong-ai-weak-ai

[3] Bostrom, Nick. Superintelligence. Oxford University Press, 2016. p. 17

[4] Alexander, Scott. "Superintelligence FAQ." Lesswrong. https://www.lesswrong.com/posts/LTtNXM9shNM9AC2mp/superintelligence-faq

[5] Lewis, C.S. Miracles. Macmillan Publishing Company, 1978. p. 94. Lewis is here speaking of God as agent, the inconvenient "living God" who might come interfere with you, and how different this is from bland concepts of God as a remote observer or passive, predictable force. Given the number of AI speculators who think of ASI as "godlike," it's a valid observation for the present context.

[6] Christiano, Paul. "The Steering Problem." Alignment Forum. https://www.alignmentforum.org/s/EmDuGeRw749sD3GKd/p/4iPBctHSeHx8AkS6Z

[7] Cotton-Barratt, Owen, and Ord, Toby. "Existential Risk and Existential Hope: Definitions." Future of Humanity Institute – Technical Report #2015-1. https://www.fhi.ox.ac.uk/Existential-risk-and-existential-hope.pdf

[8] Bostrom, Nick. "Existential Risks: Analyzing Human Extinction Scenarios and Related Hazards." https://nickbostrom.com/existential/risks

[9] Black, Damien. "AI singularity: waking nightmare, fool’s dream, or an answer to prayers?" Cybernews. https://cybernews.com/tech/ai-technological-singularity-explained/

[10] Vinge, Vernor. "The Coming Technological Singularity: How to Survive in the Post-Human Era." https://edoras.sdsu.edu/~vinge/misc/singularity.html To be fair, there are two predictions in this article. One is "within thirty years" of the 1993 publication date, which would be 2023. The other is "I'll be surprised if this event occurs before 2005 or after 2030." I suspect we won't see ASI and the Singularity by 2030 either, but only time will tell.

[11] Curiel, Erik. "Light Cones and Causal Structure." Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/spacetime-singularities/lightcone.html

[12] Arbital wiki, "Cosmic Endowment." https://arbital.com/p/cosmic_endowment/