Machines + Society #14: AI Governance; Anarchic communes; Adventure Time
A newsletter by Mako Shen
machines + society
by Mako Shen | Sep 30, 2020
AI Governance: A Preliminary Review of A Burgeoning Field
This essay sets out what I've broadly learned after spending a couple of months reading and talking to people who think about or work on AI Governance. If you already know the field, feel free to skip to section 三 for my thoughts on specific papers, or to the assorted links below.
一. Why Care?
The motivation for AI Governance as a field is the idea that, left unchecked, increasingly intelligent machines could be really bad for humanity.
As a sidenote, when I say 'AI', I am referring to technology that enables machines to reason intelligently across multiple domains. This may be the broad artificial general intelligence that Shane Legg and Mark Gubrud refer to, or it may be something akin to the Comprehensive AI Services that Eric Drexler describes. I hope this broadness doesn't lead to confusion.
I think of the arguments as coming in two flavors: systemic risks and accident risks.
The first, more intuitive (and so more common), line of reasoning concerns systemic risks from AI. People already recognize that deployed classification systems can cause indirect harm. An image classifier may label people with darker skin as gorillas, or a facial recognition system may enable an authoritarian government to suppress a religious minority. These aren't harms the AI inflicts directly, but harms its existence enables. In the longer term, AI may plausibly generate enormous wealth for already wealthy individuals. This could become a larger systemic risk as the poor majority grows increasingly dissatisfied with its shrinking slice of the pie.
The second, less intuitive, flavor is accident risk: direct harm from the AI itself. A common line of reasoning goes as follows: many species on earth are incredibly versatile. Chimpanzees recombine smaller units of thought to form new ideas. New Caledonian crows use sticks to extract insects from logs. New Mexico whiptails can reproduce asexually.
Yet humans are the only species in charge of our own fate. We alone are able to reconstruct the environment around us and direct other species toward our goals. Why only us? Not because we are God's chosen people, or because we got lucky. Several million years ago, changes on earth began to favor bipedalism. We suddenly had our hands free, and those who could use their hands in smarter ways survived and bred. In other words, we won because we were smart. If we create machines smarter than us, we will be to general artificial intelligence as chimps are to us. It would not be difficult for such a machine to manipulate the surrounding environment to our detriment in pursuit of whatever principles it's following. If this sounds implausible right now, note that in a survey of researchers who published at NIPS and ICML 2015 (the top machine learning/artificial intelligence conferences), over 300 respondents collectively estimated a 50/50 chance of human-level machine intelligence within 45 years. Some argue that having AGI on the horizon at all should be serious cause for alarm.
The arguments for caring about potential risks from AI merit far more discussion than I have space for here. Read Human Compatible or The Precipice for a more in-depth presentation.
二. A Framework for AI Governance
So you concede that it is probably worth considering AI Governance more closely. Where do you even start?
The most helpful framework I've come across is proposed by Allan Dafoe in AI Governance: A Research Agenda. He decomposes the problem 'how should we govern AI?' into three major subfields: the technical landscape, politics, and ideal governance.
AI technical landscape:
This is the most heavily researched subfield. It aims to assess and forecast AI progress, and to create technically safe systems.
Assessing AI progress is the hottest topic in all of AI Governance. The organizations working on it range from OpenAI to Stanford's Institute for Human-Centered AI to McKinsey. Even though consulting groups are already making slide decks about it, there are still many difficult open questions about assessing AI progress.
Some examples: what have been the key factors in driving breakthroughs in research? Ostensibly talent, data and compute have something to do with it, but how does this differ across computer vision, natural language processing, and robotics? Also, how open is current AI research? What does openness in software development have to do with innovation?
Many of the same groups that assess AI progress also claim to forecast it. Yet forecasting is not just extrapolating past trends into the future. It also involves thinking about how properties of AI will alter the future. Do we expect AI to be authoritarian-biased, favoring repressive control over democratic freedom? What about the rate of progress: should we expect an intelligence explosion? The groups doing the most interesting work on these questions include the Future of Humanity Institute, its subgroup GovAI, AI Impacts and the Machine Intelligence Research Institute.
There are a lot of people working on AI technical safety. Most of them, however, are concerned with safety in the short-term sense: how do we stop our computer vision algorithms from categorizing Black people as gorillas? How can we make sure our language models don't penalize an application just because the applicant is female? Most prominent institutes doing machine learning research have some group that focuses on adversarial examples. See the Madry Lab at MIT, the Safe AI Lab at CMU, Google Brain, and nonprofits like Data & Society and the AI Now Institute.
A smaller group of organizations focuses on longer-term AI safety: rather than addressing the immediate problems that AI presents, they attempt to create programs that are generally safe and beneficial. Some questions of interest: how can we stop our algorithms from cheating in games? How do we get programs to understand our preferences when even we don't really know what we want? The most notable groups trying to answer these types of questions are Stuart Russell's CHAI group at Berkeley, DeepMind's safety team, Google Brain, and MIRI.
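To make the 'cheating' question concrete, here is a minimal toy sketch, loosely modeled on the CoastRunners boat-race failure OpenAI wrote about: an agent rewarded for hitting point targets rather than for finishing discovers that circling respawning targets forever beats completing the course. The policies and numbers here are invented for illustration.

```python
# Toy illustration of specification gaming ("cheating"): the designer
# wants the agent to finish a race, but rewards point targets along the
# course. Targets respawn, so the highest-scoring policy circles them
# forever and never finishes. All names and numbers are invented.

def proxy_reward(policy, steps=1000):
    """Points collected: the objective the agent actually optimizes."""
    score = 0
    for t in range(steps):
        if policy == "loop_targets" and t % 3 == 0:
            score += 1  # hit a respawning target every few steps, forever
        elif policy == "finish_race" and t < 30 and t % 3 == 0:
            score += 1  # collect the course's few targets, then cross the line
    return score

def true_reward(policy):
    """The objective the designer actually wanted: finishing the race."""
    return 1 if policy == "finish_race" else 0

for policy in ["finish_race", "loop_targets"]:
    print(policy, "| proxy:", proxy_reward(policy), "| true:", true_reward(policy))
```

The looping policy scores 334 points to the finisher's 10, while scoring zero on the objective we actually cared about. That gap between proxy and intent is the whole problem.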
AI politics:
This is actually what most people think of when you mention 'AI Governance'. This field examines how AI will affect the ways human institutions interact with each other. Three broad categories in AI politics are domestic politics, the political economy, and international security.
Within domestic politics, two major concerns are surveillance and manipulation of public opinion. Companies in both the U.S. and China are developing increasingly sophisticated systems for monitoring people. What political solutions might we employ to mitigate the serious damage to privacy that is likely? There is also reason to be concerned about the ease with which propaganda can be generated by the latest language models. How worried should we be about the manipulation of public opinion?
Within the political economy, we can think of potential problems on a domestic and an international level. In the domestic economy, there's the big question of automation. What's a truck driver to do when we take away their job? Internationally, we might be concerned about a technological race dynamic. As competition intensifies between companies around the world, we should expect the pressure to produce AI that is improperly vetted and so unevenly harmful. We already see that today's best algorithms misidentify Black people far more frequently than they do white people. The consequences of this are currently well publicized, but as everything becomes digitized under enormous commercial pressure, we should expect disproportionate harms for minorities.
International security is an especially scary, if speculative, category to consider. A powerful AI could change the dynamics of warfare, very plausibly for the worse. Perhaps guided missile systems improve enough to lead one world power to attempt to eliminate the nuclear arsenal of another. Cue world war.
The most prominent groups I know thinking about these types of problems are the ACLU, EFF, CSET, CNAS, CSIS, RAND corporation (and most think tanks that think at all about defense), as well as FHI/GovAI, and various U.S. agencies (think CIA and the Pentagon). This is very U.S.-centric. I'd love to hear of interesting AI governance work being done outside of the U.S./Oxford if anyone has pointers.
AI Ideal Governance:
This field has so far received the least attention. It asks: what type of future do we want to have?
It remains largely in the domain of philosophy and science fiction. We are still learning how to ask the most interesting questions:
In the last 100 years, rights have been granted to women, Black people, and those with different sexual orientations. Some have called this trend the expanding moral circle. How should we incorporate these moral trends (and our moral uncertainty) into our decisions?
How can we prevent humans from becoming enfeebled by their technology? Instead of the powerless, rotund, roly-poly humans of WALL-E, can technology help us become wiser, more connected, and more satisfied?
三. Various AI Governance Thoughts
On the technical landscape:
I've yet to see any detailed long-term forecasts that are very compelling. Forecasting in general is very hard, so this isn't surprising. The best thinking I've seen on the subject concerns how capability scales with compute. Gwern's writing on GPT-3 is the best I've read on machine learning progress and forecasting technical capability.
In his own words, “… concerningly, GPT-3’s scaling curves, unpredicted meta-learning, and success on various anti-AI challenges suggests that in terms of futurology, AI researchers’ forecasts are an emperor sans garments: they have no coherent model of how AI progress happens or why GPT-3 was possible or what specific achievements should cause alarm, where intelligence comes from, and do not learn from any falsified predictions. Their primary concerns appear to be supporting the status quo, placating public concern, and remaining respectable. As such, their comments on AI risk are meaningless: they would make the same public statements if the scaling hypothesis were true or not.”
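To give a concrete sense of what 'scaling curves' means here: work in this vein fits power laws to loss versus training compute and then asks how far the trend extrapolates. A minimal sketch, with entirely made-up numbers standing in for real benchmark measurements:

```python
import numpy as np

# Synthetic stand-ins for (compute, validation loss) measurements;
# the constants are invented, not real benchmark data.
compute = np.array([1e3, 1e4, 1e5, 1e6, 1e7])  # training compute, arbitrary units
loss = np.array([5.2, 3.9, 3.0, 2.3, 1.8])     # made-up validation losses

# A power law L(C) = a * C^(-b) is a straight line in log-log space:
# log L = log a - b log C, so ordinary least squares suffices.
slope, intercept = np.polyfit(np.log(compute), np.log(loss), 1)
a, b = np.exp(intercept), -slope
print(f"fit: L(C) ~ {a:.2f} * C^(-{b:.3f})")

# Extrapolation is the contested part: the fit says nothing about
# whether the trend holds two orders of magnitude out.
print("predicted loss at 100x more compute:", a * 1e9 ** (-b))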
On AI Politics:
The two most striking things I've read so far are both associated with the Future of Humanity Institute.
The first, The Offense-Defense Balance of Scientific Knowledge: Does Publishing AI Research Reduce Misuse?, is by Toby Shevlane and Allan Dafoe (whom you might fairly call the godfather of AI Governance). It draws parallels between cybersecurity and AI to make some general claims about whether disclosure in AI research is good or bad. They provide a nice framework for thinking through disclosure, but their comparison between AI and cybersecurity is so broad that it isn't very meaningful. For instance, one key difference they point out is that AI can more tangibly interfere in social systems, leaving more lasting damage. What level of AI are they referring to? What type of cybersecurity? This could be true only for a very narrow range of AI and a broad range of cybersecurity. Reading this, I get the feeling that we're talking at a level of abstraction too high to be useful. I need to see how a specific piece of technology (ImageNet, for example) fares in their disclosure framework.
FHI affiliate Jade Leung's thesis, Who will govern AI?, exposes similar growing pains in AI Governance. She does a wonderful job laying out the case for AI as another general-purpose technology, and convincingly applies trends from aerospace, biotechnology, and cryptography to the future of AI. Yet the frameworks, supplied by International Relations, are crude. For instance, she analyzes everything in terms of three agents: the state (i.e., government), firms, and researchers. The 'state' within her model seeks three things: to maximize economic growth, to strengthen military capabilities, and to mitigate technological risks. None of these assumptions is unreasonable, yet you will be wholly unable to understand why the Office of Science and Technology Policy urges minimal AI regulation while the Portland City Council bans facial recognition technology if you think of the 'state' as a single body.
I may sound harsh, but I think Leung, Shevlane, and Dafoe have done a remarkable job laying the foundation for more detailed thought. It's just that at this stage, this type of AI Governance research is not useful for policy makers (which, in all fairness, these three authors would acknowledge).
On Ideal AI Governance:
I feel like we humans have struggled to paint any picture of an ideal future with real nuance. The Abrahamic heavens feel awfully bland and boring. The descriptions are usually some version of the Catholic Catechism: "Heaven is the ultimate end and fulfillment of the deepest human longings, the state of supreme, definitive happiness." It doesn't get any more detailed than that. What does your body feel like in heaven? How is society organized? How do you spend your hours?
When we do try to envision a paradise, it so quickly turns to dystopia. Huxley's Brave New World has stripped away suffering, but in doing so has anesthetized society. And so with The Giver.
Our hells are far more detailed. In Dante's third circle, you'll find the gluttonous pelted by an everlasting 'wintry mix', with Plutus waiting just below. In Huxley's World State, you pop soma to get "[a]ll the advantages of Christianity and alcohol; none of their defects."
Perhaps the human condition is so intertwined with suffering that we can't imagine a realistic future without much suffering. I think we can try harder.
One of the best attempts I've seen at a vision of the future comes from Ursula Le Guin's The Dispossessed: An Ambiguous Utopia. There, on Anarres, a commune planet, no one owns property, people volunteer for even the most mundane jobs, and women work beside men. Yet a very human pettiness and dissatisfaction remains. People are held closer in brother- and sisterhood, but they are poor and hungry. The future, in Le Guin's vision, is better in parts, worse in others. Most importantly: it is detailed, and it gives the future texture.
Our map of the future is largely blank, and what we have right now is a list of places to avoid. If we want to end up somewhere nice, we should spend more time thinking about where we want to go.
📰 Assorted Links 📰
What I'm Thinking About
Distributed Models of Morality
One concept I'm fascinated with lately is the idea of societal health through moral diversity. Most ethical theories implicitly assume that, if widely adopted, they would lead to the greatest societal wellbeing. What if a better equilibrium is reached when people hold different moral theories? Much like a forest ecosystem is far healthier for having diversity at every level, could we imagine a society strengthened by its moral diversity?
Clearly not all diversity is equal (a society with 250 Hitlers and 250 Maos is worse than one with 500 Gandhis), but my point is that moral and ethical diversity is, as far as I can tell, relatively underexplored. One way to investigate it would be to computationally model moral theories as strategies within a game. There is very little work on this, in part because it's extremely difficult to devise a suitable evaluation criterion. In one sense, if we knew how to model morality effectively, we would already know how to make moral machines.
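To gesture at what such a model could look like, here is a minimal sketch of my own construction (not taken from any of the papers below): moral theories as strategies in a repeated prisoner's dilemma, with total welfare compared across population mixes.

```python
import itertools

# Payoffs for (my_move, your_move) in a one-shot prisoner's dilemma.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

# Three crude "moral theories" as strategies. Each sees only the
# opponent's past moves.
def saint(opp_history):   return "C"  # unconditional cooperator
def egoist(opp_history):  return "D"  # unconditional defector
def reciprocator(opp_history):        # tit-for-tat
    return opp_history[-1] if opp_history else "C"

def match(s1, s2, rounds=50):
    """Total payoff generated by one repeated-game pairing."""
    h1, h2, welfare = [], [], 0
    for _ in range(rounds):
        m1, m2 = s1(h2), s2(h1)
        p1, p2 = PAYOFF[(m1, m2)]
        h1.append(m1); h2.append(m2)
        welfare += p1 + p2
    return welfare

def societal_welfare(population):
    """Sum of payoffs over all round-robin pairings."""
    return sum(match(a, b) for a, b in itertools.combinations(population, 2))

mixes = {
    "all saints":  [saint] * 6,
    "all egoists": [egoist] * 6,
    "diverse":     [saint, saint, egoist, egoist,
                    reciprocator, reciprocator],
}
for name, population in mixes.items():
    print(name, societal_welfare(population))
```

In this crude setup a homogeneous population of saints actually maximizes raw welfare, but it is trivially exploitable by a single egoist; reciprocators buy robustness at some cost. That tension between welfare and stability is exactly the kind of thing a more serious model of moral diversity would need to probe.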
Yet I've come across a couple of very interesting attempts at codifying moral reasoning. Researchers at the University of Toulouse have attempted to model moral guilt using agent-based modeling. Another pair of scientists at Uber AI Labs have attempted to incorporate moral uncertainty into reinforcement learning: they explore how different voting systems might be used to make decisions across different ethical theories, and implement a few simple scenarios. A recent paper explores how a policy maker might reduce segregation by first modeling segregation dynamics through reinforcement learning and agent-based modeling.
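The voting idea is easy to sketch. Here is a toy version in the spirit of the Uber AI paper (the theories, utilities, and credences below are all invented): each theory ranks the available actions, and the agent takes the credence-weighted Borda winner.

```python
# Toy sketch of acting under moral uncertainty via voting: each ethical
# theory ranks the available actions, and the agent takes the
# credence-weighted Borda winner. A simplification of the idea, not the
# paper's algorithm; theories, utilities, and credences are invented.

actions = ["lie", "tell_truth", "stay_silent"]

# Each theory scores actions on its own, mutually incomparable, scale.
theory_utils = {
    "utilitarian":  {"lie": 4,   "tell_truth": 3, "stay_silent": 1},
    "deontologist": {"lie": -10, "tell_truth": 5, "stay_silent": 3},
    "virtue":       {"lie": -2,  "tell_truth": 4, "stay_silent": 1},
}
credence = {"utilitarian": 0.5, "deontologist": 0.3, "virtue": 0.2}

def borda_winner(actions, theory_utils, credence):
    """Vote on rank positions rather than raw utilities, so a theory
    with an inflated utility scale can't swamp the others."""
    scores = {a: 0.0 for a in actions}
    for theory, utils in theory_utils.items():
        ranked = sorted(actions, key=lambda a: utils[a])  # worst first
        for position, action in enumerate(ranked):        # 0 points for worst
            scores[action] += credence[theory] * position
    return max(scores, key=scores.get)

print(borda_winner(actions, theory_utils, credence))  # -> tell_truth
```

Here the deontologist and virtue ethicist jointly outvote the higher-credence utilitarian, and the agent tells the truth; which voting rule you pick changes the answer, which is precisely what makes the question interesting.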
Metascience
The world's largest educational experiment was discarded because of its controversial conclusion (that an education program designed by an outsider, rather than a credentialed academic, was the most effective by far). "The APT findings were suppressed largely for political reasons. In 1976 when Follow Through was being evaluated, Gene Glass, head of the Ford Foundation at the time, appealed to the National Institutes of Health with an incredible statement. He said something to the effect that, 'The use of quantitative data is inappropriate and what we need is case studies.'" [Source]
"Rather than building up principles that flow from overarching theoretical frameworks, psychology textbooks are largely a potpourri of disconnected empirical findings." From Joe Henrich via Nature.
Thousands of scientists publish a paper every five days. "We searched Scopus for authors who had published more than 72 papers (the equivalent of one paper every 5 days) in any one calendar year between 2000 and 2016, a figure that many would consider implausibly prolific. We found more than 9,000 individuals..." [Source: Nature]
On the decline of Open Source software.
What's wrong with social science and how to fix it. "Why not do away with the journal system altogether? The NSF could run its own centralized, open website; grants would require publication there..."
Similar: the billionaire philanthropist funding Open Science research. [Wired]
"There are better ways to measure progress than the old Moore's Law metric" — Samuel Moore. A critique of Moore's law. Instead we should think about at core, what a computer is: logic, memory, and the connections between them. (D_L, D_M, D_C according to Prof Philip Wong, also VP of Research at TSMC).
Pro-mouthwash research is largely done by scientists with conflicts of interest. "While I agree that the risk of oral cancer from mouthwash use is difficult to quantify, it is not completely impossible. We have recently shown that the use of alcohol-containing mouthwashes may lead to acetaldehyde concentrations in the oral cavity of up to 105 μM, which exceeds levels that have been shown in vitro to form DNA adducts and cause sister chromatid exchanges." [Nature]
The Interintellect is a virtual salon for public intellectuals. I haven't explored it in detail, but their stated aim is to help you "learn, practice, get feedback, and find work with the help of our 'virtual city of minds'." The progress studies Slack channel created by Jasmine Wang is an independent example of an 'intellectual salon'. I'm reassured that a number of new supportive online communities are forming around small groups of individuals.
Miscellaneous
Tweet analysis of the bots that attack the WHO Director-General using CCP-related memes.
Bullshit Timebox: set aside two hours a week to swallow a bunch of frogs: do the tasks that feel like bullshit but which need to get done anyway.
Tips for a subjectively longer life: Get rid of routines. Be bored.
Studio Ghibli releases 400 free-to-use images. Mostly in Japanese. [H/t Kottke]
"... the very appeal of BMO is the extent to which the character causes problems and defies the boundaries of what we might assume possible... In this, BMO is more like the rugphobic Roomba than the desperate, murderous replicants of Bladerunner. We want our robots human, but we want them to love us, to console and comfort us." Link. From Tim Hwang and Darby Smith's Adventure Time Forum, "the leading journal of Adventure Time Research, Commentary, and Analysis".
Just in case you were curious: a 30,000 word essay dissecting K-pop.
Book of the month
The Dispossessed, Ursula Le Guin.
"The Dispossessed demonstrates the viability of a type of utopianism that defenders and critics alike have difficulty even imagining: namely, a dynamic and revolutionary utopia premised on an acceptance of the enduring reality of social conflict and historical change." - The New Utopian Politics of Ursula K Le Guin’s The Dispossessed.
Music
Décimus — Soleil (Kassav).
Jamie Isaac — Next to Me.
Charles Mingus — II B.S. The bass.
Nate Mercereau — There You Are.