Future of Life: AI Alignment and Evaluative Assumptions | Part 4 | (#16)
Series: Dear Chris
Dear Chris,
The Future of Life Institute (FLI) wants to give $ support to projects that nourish meaningful human agency and mitigate the dangers of AI concentrating power in the hands of a few people, countries, or companies.1
I’m putting together a request for $ to support my work. I'm way behind after an unexpected trip to DC, from which I returned with COVID.
I was rushing to submit an incomplete application. Then they extended the due date to 10/31. Hooray.
And so, I went slow(er) to go fast(er).2 Now I have a slew of essays I’ll publish in the next few days and then pull together into the formal proposal.
These writings may be a bit like watching a crew pour a foundation for a new home. Well planned. Messy. Compelling viewing if only because you don’t see it every day. But not pretty.
Even so, without it, plumbing and paint don’t matter.
The first question FLI asks in the funding application: What is the name of the organization receiving funds?
Short answer? I don’t qualify for the funding. They want applicants affiliated with a university, non-profit, or foundation. Perhaps someone who reads this would like to partner with me. Let’s say I figure this part out, or that it doesn’t matter, and keep going.
Next question: How much money do you want? Hmm. A few options…
$100k for 1 year to develop the theory of values; plan the projects described; and create a series of seminars, videos, a NYT bestseller (created by AI), or, and this is a stretch, a blockbuster espionage-action/dramedy (created by AI). Or…
$500k for 2 years to do #1 AND begin testing the theories, methods, and tools outlined below.3 Or…
One beeeeeeeleeeeuuuuhhhhnnn dollahs.
Next, the application asks for a Project Summary and a Detailed Project Description. The next eight essays (#16-#23 of the Dear Chris series) respond to these parts of the application. For something tidy and final to submit, I’ll ask Claude to rework my talk story into a project proposal written by someone who has a talent for writing project proposals.
And we begin…
The Situation
AI researchers say we must align AIs with human values because AIs that misbehave threaten our well-being, even our existence. But they aren’t sure how to align AIs with human values. Not nearly sure enough, given that we don’t know how powerful AIs may become, who will control them, or how much autonomy they will have. They call this the Alignment Problem.4
FLI says AI is increasing the total power available to humanity. FLI wants that power distributed equitably rather than allowing AI to concentrate it in a few nations, companies, or individuals. Seems right. I’m in.
My Intention
Present a perception of reality designed to formalize the meaning and usefulness of human values in ways that resolve the challenges of alignment and equitable power distribution.
My Assumptions
A resolution to the Alignment Problem is required to avoid an AI-driven Orwellian future and to create a better world where power is distributed equitably.
Alignment is a more difficult challenge than most anticipate.5
Alignment may be possible.
“What are Values?” is the most important question for resolving alignment and equitable power distribution.
Values are the most powerful and least understood forces on Earth.67
Resolving alignment and equitable power distribution requires a human values-oriented perception of reality.
Tomorrow, in Part 5, we have a playdate with Eratosthenes. If you don’t know him yet, he is an ancient Greek geographer and, if I’m honest, just one of those all-around fun-loving guys. Cool dude.8
https://futureoflife.org/grant-program/mitigate-ai-driven-power-concentration/
The writing for Option 2 could be a series of academic papers dedicated to the design, development, and findings of each of the methods and tools proposed. I have experience publishing peer-reviewed research and coordinating groups of researchers to publish.
https://en.wikipedia.org/wiki/AI_alignment
For one, “alignment” is a bad framing of the challenge. It suggests A to B. Mechanistic. Newtonian, perhaps. But values are features of nonlinear, high-dimensional, open systems. Coherence or Moral Coherence are better terms to consider. See George Lakoff for the importance of metaphor in understanding and influencing systems. Framing “alignment” in a way that is fit for context immediately takes it out of our mechanistic mental models and repositions the challenge in living ecological systems models. And so we have a more challenging challenge.
Whether values are “real” or useful in addressing this challenge is a choice for us to make. Some go to great lengths to argue that human values are not real and not helpful. “Values” are, in part, a human construct, so whether they are real or not depends on the fitness of our conceptualization of values to our purposes and context. I am proposing a conceptualization that may not have been as relevant, possible, or necessary prior to the Nhà, but is designed to be fit for a more systematic, purposeful evolution, an evaluative evolution, in the Nhà. https://www.lesswrong.com/posts/ngqvnWGsvTEiTASih/ai-alignment-problem-human-values-don-t-actually-exist
Here’s a common definition of values: “… desirable, trans-situational goals that vary in importance” (Schwartz, 1992). Sounds like values are “goals.” Helpful for alignment and equitable power distribution? Not much. We can do better.
https://en.wikipedia.org/wiki/Eratosthenes