On 12/7/14, Dylan and I had a conversation with Oxford philosopher Nick Bostrom about his new book Superintelligence: Paths, Dangers, Strategies. We were also joined by former podcaster Luke Muehlhauser (Conversations from the Pale Blue Dot), whose organization often works with Nick and coordinates an ongoing reading group on the book.
Nick's Future of Humanity Institute studies "existential threats" (a la our past discussion with David Brin), and he presents a model (e.g. in this 1997 paper that we discussed a bit) of philosophical work that is continuous with science but more general in scope. We previously discussed (none too charitably) one of his papers on genetically or mechanically enhancing human capacities, and Superintelligence does treat that subject briefly, but the focus is on artificial intelligence and the threat that it poses if engineered improperly.
You can hear (and watch) Nick explain the project here, but here's the run-down: There are no in-principle reasons preventing us from developing an artificial intelligence that's as smart as we are. (Whether or not it really has consciousness or any other uniquely human trait is beside the point; "intelligence" just refers to means-end strategic ability here.) Once we do that, whatever its programmed goals, it will have reason to work to enhance its own abilities, meaning that it's only a matter of time before there's an AI out there that's MUCH smarter than we are. Nick doesn't predict when this will happen, but thinks it more or less inevitable given current research programs.
What motivations will this AI have? Obviously that depends on how it's programmed, but whatever its goals, it will have an instrumental interest in enhancing itself, and also in maintaining goal integrity, i.e. preventing us from changing those goals.
Nick thinks that unless strenuous effort is put into carefully defining these goals BEFORE FULL INTELLIGENCE IS ACHIEVED, we're in trouble. Researchers trying to engineer machine intelligence would more likely than not set just any old goal, e.g. coming up with as many digits of pi as possible or building as many paper clips as possible, as their focus would be to get the machine to innovate and learn in coming up with ways to achieve that goal. But if they succeed, then goal integrity means that we're stuck with a superintelligence that will now think "outside the box" to take whatever steps it deems necessary to meet its goal, e.g. converting all the matter on earth into paper clip material or computing material to calculate digits of pi. Even if its goal is not infinite, e.g. "manufacture 100 paper clips," there's always more action it can take to increase its certainty that it has in fact achieved this goal, i.e. go ahead and make the universe into paper clips.
So the primary problem to solve here is what Nick calls "the control problem," which involves measures either to prevent a potential superintelligence from being able to ruin the world (e.g. crippling or containing it) or, more directly, to set its motivation in ways that we would find acceptable. This gives rise to many problems more familiar to philosophers: e.g. if you want to tell it to refrain from actions that aren't in our interest, then you have to both figure out what our interest is and figure out how to state this unambiguously. For example, if we tell it to maximize human happiness, it's likely to work to rewire us so that only our pleasure centers function and everything else is removed, if this proves the most efficient way to keep us happy, or better yet, kill us all and replace us with creatures that are easier to keep happy.
Moreover, Nick thinks that we can in effect outsource our thornier philosophical problems to this much greater intelligence if we can tell it to "take whatever actions we would ask you to take if we were better informed and thought about it for long enough." All this is of course connected with the kind of ethical theorizing that philosophers are familiar with.
The story told, we're faced with a number of issues: Is this something that philosophers have any business messing with, or on the contrary is this a real and pressing enough existential threat that it's much more worth our time than just about anything else? Do we buy the control problem as legitimate, or can we deny that goal integrity would be a central feature of AI motivation? Are we in any position to make the kinds of predictions and analyses Nick makes in this book?
You can hear Nick on a number of other podcasts talking both about this topic and about existential threats more generally. Here are a few from Oxford, or you can just search on his name under podcasts or iTunes U and find plenty of them. We did our best in talking to him this time to engineer an interactive conversation so he's not simply repeating what he had to say in these plentiful previous media appearances.