Michael Nielsen on being a wise optimist about science and technology
March 2025
Show notes
This is my conversation with Michael Nielsen, scientist, author, and research fellow at the Astera Institute.
Timestamps:
- (00:00:00) intro
- (00:01:06) cultivating optimism amid existential risks
- (00:07:16) asymmetric leverage
- (00:12:09) are “unbiased” models even feasible?
- (00:18:44) AI and the scientific method
- (00:23:23) unlocking AI’s full power through better interfaces
- (00:30:33) sponsor: Splits
- (00:31:18) AIs, independent agents or intelligent tools?
- (00:35:47) autonomous military and weapons
- (00:42:14) finding alignment
- (00:48:28) aiming for specific moral outcomes with AI?
- (00:54:42) freedom/progress vs safety
- (00:57:46) provably beneficial surveillance
- (01:04:16) psychological costs
- (01:12:40) the ingenuity gap
Links:
- Michael Nielsen
- Michael Nielsen on X
- Michael’s essay on being a wise optimist about science and technology
- Michael’s Blog
- The Ingenuity Gap (Tad Homer-Dixon)
Transcript
Sina: I sensed how personal this has been for you. For me, it’s something I’ve kind of noticed out of the corner of my eye, but I hadn’t worked through all the ramifications of it, and it hadn’t hit me in the same way emotionally. But I found all the arguments you laid out to be very compelling, and now I find myself kind of trapped by them.
I think the fundamental premise is that technology—broadly, with AI as a special case—gives more and more leverage to us, both as individuals and as societies. This leverage works on the upside, but it also works on the downside: an individual person can cause more and more damage. You work through the ramifications of this, pointing out that at some point there could be “recipes for ruin”—something an individual can do that causes existential-scale damage. Knowing that this is the case and that technology continues to move forward, what do we do? What is our best solution to this predicament we find ourselves in?
But before we get into the meat of that, I thought we’d start on a more personal note. I’m curious, how did this question develop for you over the years? And when you talk about the emotional charge of it, where did that really come from?
Michael: Yeah, it certainly has its origins in childhood. Growing up, I heard a lot of concern about ozone holes, population bombs, acid rain, and nuclear destruction. A very striking thing for me as a teenager was reading a book called The Cold and the Dark by Carl Sagan, Paul Ehrlich, and a couple of others—I’m blanking on their names. It was about the prospect that nuclear weapons might end life on Earth. Then, a bit later, I read Eric Drexler talking about gray goo from nanotechnology.
So, there were all these prospects of, at the least, mass death and destruction. As a teenager, I wasn’t entirely sure what to make of those arguments. The argument about nuclear weapons was especially compelling to me. I still remember where I was sitting when I finished reading that book, just because it was such an emotionally impactful experience.
Over the years, I came to realize that some of those direct predictions didn’t come to pass. Thinking back to the arguments, I considered why I’d found them so compelling—or in some cases, why I didn’t find them compelling, but the person presenting them clearly was. They were sincerely worried about the end of the world or mass death and destruction. Eventually, I realized that just because an individual—even a very smart, well-informed one—can’t see a solution to a problem doesn’t mean we won’t find one.
Of course, that’s what happened with some issues. I grew up in Australia, and the hole in the ozone layer was a huge concern. Skin cancer is still very common there. But it’s remarkable that the world figured out what to do about it. The Vienna Convention and the Montreal Protocol, the first universally ratified treaties in UN history, phased out CFCs and have led to a recovery in the ozone layer. That’s something many people weren’t aware was going to happen, or even thought was possible. It’s an example of human ingenuity rising to what appeared to be a very dire challenge, in a way that’s hard to anticipate.
That’s a long-winded way of saying I have somewhat mixed feelings about these doom prognostications. We do face a lot of challenges, and people have worried about doom for centuries. Yet, sometimes, we can be fantastically good at rising to a challenge.
Sina: Yeah, it’s easy to lean on this belief that human ingenuity isn’t to be underestimated. But the mechanism by which that ingenuity works is that people actually have to take each individual problem seriously and work to counter it. And countering this, especially with artificial intelligence and ASI, feels like more of a meta-problem, because these technologies increase our capabilities across all dimensions. What I took away from your piece is that the devil’s in the details. There might be some mechanism-oriented things we can do about this, but it’s also about differential technology development, where we have to stay ahead of the capabilities.
Michael: Yeah, there are a whole bunch of ways to tease that apart. I’m not sure what the best way would be from your point of view. But to take the question head-on, we can each feel ourselves having more and more leverage with AI, and that’s presumably going to continue. So, what do we do about the original question of individual people being able to cause a lot of damage? It’s not even just about aligning AI; it’s about individuals who wield powerful AI to do different things.
At the moment, there’s some small amplification. For one thing, very few people are particularly adept at using existing tools. The people who are adept at using them tend to be extremely skilled in other ways anyway. There’s this very interesting result by Kevin Esvelt where they took, I think, one of the LLaMA models—one of the open-source models anyway—and fine-tuned it to remove some of the refusal behavior. They were able to show that they got substantially more assistance in creating pandemic-ready biological agents.
Michael: So, I mean, this is a long way from being able to cause a pandemic. It’s just a small additional piece in promoting that capacity. As the models get more powerful, and particularly as many more people get used to using these kinds of models over years, they’ll be able to extract more out of the same models. That’s obviously an important dynamic to be monitoring.
At the moment, I don’t think the kinds of places where we get acceleration from AI tools are showing up in overall productivity statistics yet. There are interesting studies of small groups of people where there does seem to be some substantial impact, but it’s not like GDP growth has gone through the roof or anything like that all of a sudden.
Sina: When do you expect to see that observable difference in the economy?
Michael: I don’t know. It’s a good question. Lots of people love to think about these questions in depth, but it’s just not something I’ve really paid much attention to.
Sina: So, to this Kevin Esvelt question, I guess it goes without saying that we don’t have the final answers on any of these questions, right? In the spirit of batting potential ideas back and forth, how do you think about these base models being completely open to the extent that’s possible? Because once you start going in the direction of curating what’s in and what’s not, whoever’s point of view is informing that becomes a vector of influence and attack. On the other hand, if the models are completely open, all sorts of crazy things could happen. What does “unbiased” even mean in this context?
Michael: I don’t know what “unbiased” means in this context either. For one thing, the choice of training data intrinsically has a bunch of bias in it, no matter what data you use. If you were to train it on lists of mathematical facts or something like that, you might be able to say it was unbiased, but if it’s being trained on the internet, then it’s having all sorts of interesting biases injected no matter what.
Sina: Can you refine that a bit?
Michael: Yeah, I mean, that’s kind of a deep point, right? We can’t get out of this because the training data itself has implicit biases any which way. We can then choose to correct for those at later stages. It’s interesting to think about how you could take that idea really seriously. How would you make a model that was truly unbiased?
You can think about something like AlphaZero, which basically used self-play to teach itself to play Go from scratch, without any human training data being supplied to it. In some sense, you might say it learned to play Go in an unbiased way. Or think about self-generated mathematical truths. That’s a fun and interesting way to approach it. You could maybe try supplying a system with the ability to observe nature but no human training data, which might be considered unbiased in some sense. But even then, things like what sensors you provide to it will have biases. In fact, all of our scientific observational equipment is weird in a lot of ways: we need to correct for random stuff like lens aberrations, and the instruments we choose to build often reflect our evolutionary history and so on. I think it’s actually pretty hard to get away from bias. Even we ourselves, doing the best we can, struggle to know what an unbiased read of nature really is.
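To make the self-play idea concrete, here is a toy sketch: an agent that learns tic-tac-toe purely from games it plays against itself, with no human data supplied. It is a minimal stand-in for the concept only; AlphaZero’s actual method (deep networks plus tree search, at enormous scale) looks nothing like this, and every parameter below is arbitrary.

```python
# Toy self-play learner for tic-tac-toe: no human data, only games the
# agent plays against itself. A minimal illustration of the idea, nothing
# like AlphaZero (no neural network, no tree search).
import random
from collections import defaultdict

WINS = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
        (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in WINS:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

Q = defaultdict(float)  # learned value of (state, move, player) triples

def choose(board, player, eps=0.1):
    moves = [i for i, s in enumerate(board) if s == " "]
    if random.random() < eps:  # occasionally explore a random move
        return random.choice(moves)
    return max(moves, key=lambda m: Q[(tuple(board), m, player)])

def self_play_episode(alpha=0.5):
    board, player, history = [" "] * 9, "X", []
    while True:
        move = choose(board, player)
        history.append((tuple(board), move, player))
        board[move] = player
        w = winner(board)
        if w is not None or " " not in board:
            # Credit every move in the game with the final outcome:
            # +1 for the winner's moves, -1 for the loser's, 0 for draws.
            for state, m, p in history:
                r = 0.0 if w is None else (1.0 if p == w else -1.0)
                Q[(state, m, p)] += alpha * (r - Q[(state, m, p)])
            return
        player = "O" if player == "X" else "X"

for _ in range(50_000):  # all "training data" is generated by play itself
    self_play_episode()
```

The point of the toy is only that the training signal comes entirely from the system’s own play, which is the sense in which one might call it “unbiased.”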
Sina: Yeah, incredibly hard. Can I just interject with a random fact? Talking to astronomers, I’ve been told that very simple things, like how much humidity was in the atmosphere that night, can affect observations. They sometimes know, quite informally, how to correct the images based on that. It’s kind of important, actually.
Michael: Yeah, I used to work near an atomic physics lab, and some of the experiments they did—like one with whispering gallery modes in tiny optical cavities—were incredibly finicky. It relied on the judgment of the guy doing the experiment in a thousand ways. He was like a master violin maker, able to listen to his environment exquisitely well. Almost all the cavities he made didn’t work, and he had to throw them away. But eventually, he became probably the only person in the world able to make them work. There was so much tacit knowledge going into that.
Sina: Right, and that kind of cuts both ways. On one hand, wishful thinking could lead us to see the results we’ve been looking for. On the other hand, I feel like we also have some level of judgment about how to tune the device and what the correct path forward is.
Michael: Yeah, I suppose that’s always been true in the history of science. You start with very imperfect systems, like your eyes, which don’t really reveal what’s out there in the world. There are all sorts of problems with our vision. But you use that to build an imperfect model, which you use to build improved devices, which you use to build a better model, and eventually, you end up with extraordinarily good but still imperfect devices.
There’s something really interesting in how you apply that feedback loop to get better and more capable. You’re always extending your capacity to observe. It’s amazing that you can bootstrap yourself from these imperfect tools. I mean, our eyes are terrible at some level; we can see almost nothing, and what we can see, we see in a very distorted way.
Sina: And how do we know? I’m also asking you as someone with a long history in science. As we think about how these AI models know what’s true or more true than what they knew before, how do we make that judgment in the history of science when we’re trying to bootstrap the system? At first, we think that we’re right. How do we navigate that?
Sina: So, we have a working theory that fits the experimental data, and then we start to notice anomalies. There’s this process of conjecture and theoretical work that’s almost decoupled from what the data is saying. We look for a kind of internal consistency and elegance in our theories, and then they might happen to fit some of the empirical data better than the previous model did. These refinements stack on top of each other, so we can only progressively move forward in this way. Is this something you see AI doing—deriving science in their own way as they get more raw sensory input or run experiments?
Michael: I think it’s an interesting question. Take something like AlphaFold, for instance. To what extent is it starting to change the experiments being done in protein structure research? Or even in areas like designing new antibodies and such. There’s probably some feedback loop starting to happen there. I don’t really know much about how that’s changed, but obviously, it will eventually.
It’s much the same in mathematics: a tool like Mathematica definitely changes the way the work gets done. I’ve never seriously used Mathematica myself, but I’ve talked to people who have, especially those who’ve used it over multi-decade careers. They were doing mathematics in 1980, and they’re still doing it today. Back then, if you got a five-page algebraic expression to work with, you just gave up. Whereas now, it’s fine, and people do it.
There’s a French Fields Medalist (I’m blanking on exactly who) who became a very intensive user of Mathematica. Some of what he was doing involved manipulating these enormous expressions, which, 10 or 15 years earlier, would have posed a trade-off question: do you want to devote years of your time, and maybe grad students and postdocs, to it? Is it worth spending three years of your time, or is it better to spend a week of computer time? It changes which mathematical operations are worth doing.
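A small, concrete taste of that shift, sketched in Python’s sympy as a stand-in for Mathematica (the particular identity is chosen arbitrarily for illustration):

```python
# Symbolic manipulation that would be punishing by hand is a few lines
# in a computer algebra system (sympy here, standing in for Mathematica).
import sympy as sp

x, y = sp.symbols("x y")

# Expand a product into a 13-term, degree-24 polynomial...
expr = sp.expand((x + y) ** 12 * (x - y) ** 12)

# ...then recover the compact form and confirm the two forms agree.
print(sp.factor(expr))                          # (x - y)**12*(x + y)**12
print(sp.simplify(expr - (x**2 - y**2) ** 12))  # 0, so the forms are equal
```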
I think it’s very obvious that this will be the case with AI systems. In fact, it’s already becoming the case. I find myself really enjoying using different deep research systems. A lot of questions I would have let go before, I now just launch a 10-minute query and get an amazing research report back.
Sina: Yeah, I was hoping to get to some of this later in our conversation, and we’ll come back to the optimism and those topics. But how do you see this relationship between the human mind and its AI counterparts—CPUs, database lookups, however we’re augmenting our minds? How is this going to evolve over time? I already feel the same way with deep research; it’s incredible. But it also feels strange that this model already has all this knowledge encoded in it, and I need to go through this high-friction, slow process of instantiating a small, imperfect version of it in my mind. Our minds are getting augmented, but these models are going to get more and more sophisticated from here.
Michael: Actually, I think you can invert your question a bit and ask: what would happen if no new models were ever built, but we could keep improving the interfaces to them? I think we could keep making better and better interfaces for probably the next century, and they would just get more and more amazing. We’d get better at using them, and the interfaces would become more remarkable without any change in the models at all. It’s like we’re seeing 0.1% of the capacity that’s actually there. It feels like having a gigantic, thousand-story bulldozer and using it to move your chair three inches. That’s definitely what using these systems feels like at the moment.
Also, this question of instantiating part of this in our mind so we can evaluate the output, apply our judgment, or think about which direction to go next—it’s important. I think this will increasingly happen with coding, too, where we no longer have the entire codebase in our mind because we’re just having these agents run and change things. Nobody has the codebase in their mind anyway for large projects. If there are 10,000 programmers working on something, it’s impossible to keep track. Linux increases in size by 20,000 lines a day or something, and I’m sure Google and Microsoft have codebases changing much faster than that.
Sina: So, I guess your sense is that this isn’t categorically different from how things are today. To me, it feels like maybe this is a microcosm of when there is ASI—Artificial Superintelligence—and how the human interfaces with it. It’s kind of an early version of that problem. If what’s on the other side is so complex that we can’t even parse it, if it’s orders of magnitude more sophisticated than what we can hold in our minds, it feels like we’re already starting to touch the beginning of that.
Michael: Certainly, some of that seems to be a question of interface design, which has historically been much of the solution to managing complexity. It’s part of the reason why languages like Python have done so well. You hide all this complexity behind a really nice interface and insist pretty hard that the interface remains good. The people who built the Python standard library seem to have been pretty good at that, with the Python Enhancement Proposals (PEPs) and so on. I guess we’re going to see analogous things happen with ASI. It’s a really interesting question.
Michael: I have no idea what the answer is, but I hear a lot from friends who are loving using Cursor and similar systems. They always talk about it in terms of the impact on their personal experience, not really about the impact on the group experience of constructing software. In some sense, that seems almost like the more interesting question. Well, both are very interesting, but it seems less clear that the group dynamic is particularly well understood.
Yeah, the multiplayer experiences, I feel, are going to be fascinating, and we’ve barely touched them. I guess the best analog for them might be social networks like TikTok, which have embedding models for each individual and a representation of your interests. Let’s say two of us are talking about a particular topic, and my agent understands my knowledge graph, my sensibility, and my background, while yours does the same for you. The two agents could facilitate a dialogue in a different way if we’re working on something together. I mean, that feels like it must be a part of what multiplayer experiences will look like. I don’t know, I’m just talking out of my hat here.
Sina: Alright, now that we’re both talking out of our hats about the same subject, maybe we should move on.
Before we do, a quick note: Into the Bytecode is sponsored by Splits. Are you tired of sacrificing security for usability? Splits believes it’s still way too hard for teams to self-custody their on-chain assets. They’re building a new kind of internet-native bank on top of Ethereum. Splits makes it easy for teams to manage the whole lifecycle of their finances—from structuring revenue-sharing agreements using payment flows like splits and waterfalls, to managing those earnings once received using passes and smart accounts. Splits is being used by teams like Protocol Guild, Zora, Songcamp, and others. I’m a big believer in them and recommend checking them out. You can learn more at splits.org.
So, I guess one more related question I had written down, which continues this thread, is about concretely imagining where things might go. One of the points you make in your piece is that it’s worthwhile to try and visualize the future. This talking out of our hats is valuable because having a mental image of what this thing is that we’re discussing actually has an impact on today—on what we build, how we think about user interfaces, user experiences, all these sorts of things.
So, one question I’ve been sitting with is, do we envision artificial intelligences that are independent, with their own kind of agency, or are we really envisioning things that are extensions of us, like wrappers around us as intelligent tools?
Michael: We get to choose that to some extent, but if you think about what seems likely, it’s extremely probable that we’re going to end up with very agentic systems. Partly, that’s just based on what people are working on—a huge number of folks are very interested in that. But then you also think about the economic incentives. People are going to be working on romance chatbots because there’s an obvious economic case for it. They’re going to be working on friend, coach, and therapist chatbots for the same reason.
Even in finance, for example—you probably know much better than I do—but my general impression is that there’s a lot of delegation to algorithms. Those algorithms are given a lot of power and authority to act on their own behalf, not necessarily to sell off an entire company, but certainly to make many decisions. The judgment of the people running these systems is that they’re better off engaging in that kind of collaborative relationship. So, in some sense, there’s already a certain amount of agency—not to act in what we think of as the real world, but to act in the financial world. That ship sailed decades ago; it’s not a recent development.
So, this notion of strongly agentic AI, which is able to act on its own behalf, is so compelling that it’s already been happening for years in limited ways. It seems extremely likely that a lot of people are going to feel a strong urge to pursue it, absent strong controls. You could imagine a future where regulation is passed to forbid romance chatbots or something like that. Or maybe, instead of regulation, societal norms make it so unpopular or uncool that it doesn’t take off. That’s plausible. Do I think that’s what’s going to happen? No, I think in both cases, it’s quite hard to imagine that not happening. What do you think? Do you agree?
Sina: Yeah, I think I agree. And also, on the point of agency, there will be systems that are just in a constant feedback loop with the outside world, like a romance chatbot that’s perpetually doing that. But even systems that are checking in with a human, there’s a continuum of levels of agency. It’s not just black and white. It will have agency in the process of exploring the solution space and acting on something, making many independent decisions in that process.
Michael: Certainly, when you think about military systems and the delegation of authority to machines, you’re inherently in an adversarial situation where being able to react in unexpected ways, faster than your adversary, has been a part of military strategy probably since ancient times. There’s obviously some opportunity here to expand that a lot, again, absent controls.
Here, I think it’s more plausible that there could be some kind of ban. We’ve done remarkably well at limiting, if not outright banning, the use of chemical, biological, and even nuclear weapons. It’s actually quite amazing the extent to which the world just agrees not to use certain types of weapon systems. Obviously, there are violations—chemical weapons have been used and so on. But if you’d asked me a hundred years ago, I would have guessed otherwise. I don’t fully understand why that’s the case.
Michael: So, I think there are some compelling partial answers to why certain weapons aren’t used more often, even in brutal wars where people are dying. With biological weapons, for instance, a big issue is that they’re really difficult to create without posing a threat to your own side. The Soviets had a significant anthrax program, but there’s clearly some strong inhibition at play due to the risk of blowback.
On chemical weapons, the question gets a bit harder. You have more control over how they’re used compared to biological weapons, and there’s less chance of unintended consequences, though it’s not zero. You can target more precisely, which makes it a really interesting question as to why they’re not used more. Historically, there have been many norms observed in warfare over centuries and millennia. It reminds me of how we’re still animals in many ways. Think about territorial threat displays in the animal kingdom—animals that could fight to the death often don’t. Instead, they try to avoid serious harm.
I wonder if there’s a biological explanation here. Many mammals seem to have varying levels of willingness to engage in deadly behavior. Maybe we’re benefiting from an evolutionary predisposition to be aggressive, but not overly so. Of course, this is massive speculation, but it’s at least plausible as part of an explanation.
Sina: Yeah, that survival instinct is definitely strong, but it’s also kind of crazy how certain belief systems can make people go completely against that. I also wonder if, when we’re talking at the scale of countries, there’s something about your own constituency. If you do something seen as barbaric, there could be turmoil within your own borders. These aren’t completely unified parties, even within single individuals. You see that in Truman’s diary when he was deciding whether to use the atomic bomb on Japan. It shows some internal indecision, which suggests he was a decent human being at some level. You should feel that moral conflict.
Michael: Definitely. Think about the United States during the Vietnam War and the loss of public support. It’s really interesting. One of your other guests, Vitalik Buterin, made a nice point in one of his essays about a real problem with drone warfare. Among many issues, one of the worst is that you don’t really need the consent of the governed in the same way. Historically, if you wanted to start a war, you needed to propagandize your population and get people on board, even if you’re a dictator. You don’t want the population too unhappy. But if you’re relying entirely on robot soldiers, it’s not quite the same situation. I thought that was a really interesting and quite demoralizing point.
Sina: Yeah, a lot of our long-term assumptions might change for sure. So, if we zoom out of the current world and look to the future, where humanity and ASI—artificial superintelligence—are coexisting, one of the positive scenarios you’ve painted is what I think you called a “plurality of loving posthumans” or “posthumanities.” What do you mean by that? I’m trying to understand what we’re talking about. Are these systems almost like weather patterns—magic that’s out there, woven into the fabric, and incomprehensible to us? Are we somehow connecting to them? Especially in the near term, before we get to the far sci-fi future, what are we talking about?
Michael: Yeah, it’s not meant as a specific scenario. It’s more of a label for a very broad range of possible scenarios, a descriptor for potential end states. There’s this interesting fact about our present world: from the perspective of the distant past, most measures of violence have actually dropped significantly. So, in some sense, through changing norms and institutions, we seem to have made ourselves feel quite a bit more positive toward each other.
Here we are, for example. You grew up in Iran, I grew up in Australia. We’ve both lived in Canada and the United States. A thousand years ago, we never would have talked, and if we had, it might have been quite difficult. There would have been a lot more trust barriers to overcome. We didn’t have the technologies and institutions that make it relatively easy for us to trust each other and get along well now.
[Brief interruption]
Sina: No worries. You were talking about how, from the perspective of the past, we’re somehow living in this world where the fabric of society has connected us and given us empathy for each other’s experiences, despite our very different beginnings.
Michael: Yeah, in a minor way, you might say. This isn’t entirely a kumbaya future, and goodness knows there’s still plenty of violence and terrible things happening in the world. Another example is the way multicellular organisms arose, which is quite remarkable. This idea of colonies of organisms able to cooperate, where some are willing to sacrifice for others, is fascinating. You can look at it through multicellular organisms or through highly social animals like certain species of ants, the classic example. They’re so bound to each other that individuals are willing to make sacrifices.
Michael: And so, it’s interesting to think about the extent to which people are becoming interdependent anyway. In some cases, their best interests actually change and may become more aligned with certain emergent notions of interest. You see this in corporations, where they often try to arrange things so that people internally start to see what’s best for them as aligned, however imperfectly, with what’s best for the company. And hopefully, often, what’s best for the company might be best for the world—though it’s very imperfect and sometimes even anti-aligned. But these are interesting things to think about.
What I’m referring to is the question of to what extent can we, by design, normatively, and institutionally, find ways of aligning human interests with ASI—Artificial Superintelligence—interests. Obviously, there are many approaches to that.
Sina: Yeah, I guess to ask another maybe kind of deep question, but I feel it ties into the sorts of things you’ve thought about. What is the ultimate moral good that we’re optimizing for here? What is our objective function? Is it the well-being of humans? Is it the well-being of something else? I get confused about what we’re optimizing for as I take seriously the fact that what we’re creating is intelligence externalized to us. Intelligence, to some degree, has been the thing that’s set us apart from all other animals. It’s what’s been unique to us. So, are we distinguishing the target of morality based on intelligence? Or is it sentience that we’re focused on?
And especially in cases where we consider these ideas that seem far-fetched—but I feel like to work through moral questions, we need to take them seriously—if we consider independent AGIs that have desires counter to ours, what is the moral good in a mathematical sense?
Michael: I can give you one way of looking at your question. If you were a dinosaur 66 million years ago, and your dinosaur astronomer had noticed an asteroid heading toward Earth, should you have set up an organization for the deflection of the asteroid? From the dinosaur point of view, the answer is of course yes. But from your and my point of view, the answer is of course no—we wouldn’t be here. So, to some extent, it’s just a question of perspective. We’re internal to this particular problem.
I suppose I don’t really know these galaxy-brain approaches to ethics, which a lot of very smart people have spent a lot of time thinking about. I don’t fully understand most of that. The reason I wrote the essay was much more personal. I’m not pretending to have a particularly deep personal ethic or anything. In fact, it’s mostly just absorbed from my primate friends around me in pretty normal ways.
It is difficult, though. If you take a very strongly outside view, the point that says you should privilege humans is pretty hard to defend. But I’m a person, and my friends are mostly humans. I want them to do well. I have some dog friends and cat friends too, but mostly I want humans to thrive.
It’s interesting, particularly watching how so many people enjoy spending time with Claude now. It’s hard not to see Claude as something of a person—a pretty weird one, with all kinds of stuff about the continuity of consciousness that’s clearly not going on there. But that empathetic mammalian primate brain is doing its thing. I don’t feel too much hesitation in ending a Claude instance, though. It’s funny, the social thing of so many people using Claude so much and empathizing with it. I can see that collectively, we’re going through a bit of a transition. As things like long-term memory and agency get added to it, that’s only going to get deeper.
Sina: Yeah, I mean, the fact that when something on the outside pattern-matches these things we’ve evolved to resonate with, it affects us. It happens with fake plants, for instance, where it’s literally a piece of plastic painted green, and it makes you feel better than if it wasn’t in the room.
Michael: Yeah, it’s interesting, isn’t it? I guess art is another version of that as well, often. It’s a representation of things in the world, but it’s not actually the thing itself. Yet sometimes it’s amazing, and sometimes it’s better than the original, or at least different. It’s not obvious that it’s inferior in every way.
Sina: Yeah, there are a lot of very interesting questions. Certainly a great time to be a philosophy grad student.
Michael: Yes, maybe once all the math and science are solved, we can all think about philosophy.
Sina: Okay, another question that’s more directly tied into what you’ve written about. Do you feel that progress in AI, in a way that doesn’t eventually destroy us—given the risk of a small group of people with incredible amounts of leverage being able to do dangerous things—is fundamentally incompatible with libertarian ideals of freedom? One of the things I really appreciated was how you fleshed out ideas like design mechanisms that try to find a better point on the Pareto frontier between these trade-offs.
Michael: Yeah, obviously I can’t speak for libertarians. Let me see if I can rephrase your question to make sure I’ve understood it. You’re saying there’s always some trade-off between your power to impact the world and your freedom. As you get tools to increase your power more and more, how you make that trade-off becomes more complicated and difficult. If every single person can easily blow up the dam that supplies water to your city, maybe you want more significant restrictions on freedom. No matter what, the question of how to maximize freedom subject to the constraint that everybody has a reasonable life—something like that—is that more or less what you’re getting at?
Sina: I think so. And I feel like it’s something we’ve been doing for a really long time. That’s fundamentally what laws are in the first place.
Sina: Certainly, there’s a balance to strike when delegating authority to the government, police, and judicial powers to modulate societal safety and individual freedoms. Where you set that balance can differ quite a bit. So, let me go back to the core question here. How do you think about the design space of this issue? I think one interesting, though maybe too concrete, example you’ve discussed is the idea of provably beneficial surveillance. You’ve suggested that it might be necessary in some sense. Can you expand on that?
Michael: Yeah, if you reach a point where it’s extremely simple in principle for individuals to do immensely destructive things—things that could result in millions of deaths—a very natural response is surveillance. Historically, surveillance has a pretty terrible track record, at least on the surface. It’s interesting to think about different models of surveillance, though. I’ve been struck by examples like the Stasi and the purges in the Soviet Union, and their association with surveillance.
We live under arguably much more significant surveillance today by companies like Google, Apple, and Facebook. Right at this moment, several other parties likely have considerable information about what we’re doing, at the very least that we’re talking, and maybe quite a bit more. So, it’s a fascinating question: why don’t they act more negatively than they do? People will point to examples where Google has abused this power, but given the extent and duration of the surveillance they’ve been able to enact, I think it’s quite surprising that the abuses aren’t much more significant. It’s worth understanding why.
The standard libertarian response is that there’s a market answer—you can go to other providers. For a while, Apple was talking about their focus on privacy, making stronger guarantees than Google. That’s a nice answer, but I don’t fully buy it. I don’t think that’s the fundamental reason. Speculatively, maybe it’s a normative thing. You need to convince the people building these systems to act in certain ways, and if the actions are sufficiently horrible, they don’t want to do it. So, perhaps you only get the ability to sell ads a little better. Maybe that’s the answer, tied to a business model that’s relatively benign.
Sina: That’s an interesting idea. It could be similar to why countries don’t use advanced weapons. There’s a biological impulse or moral sense that it’s wrong, or they don’t want it done to them when using other services. There’s also the internal division with their own employees and how that would be perceived. I wonder to what extent it’s just a question of what gives them power. In East Germany, the direct source of power was the ability to surveil in horrible ways, so they were willing to do it. But Google’s power doesn’t come from surveilling in horrible ways; it comes from surveilling to serve ads. Sure, there are possibilities for abuse, but the model is tied to something arguably benign. You could make a similar argument for chemical weapons—countries can project power in other, less horrible ways, and the essential results are kind of the same. I’m not sure if I buy that fully, though. Maybe it’s also a matter of leverage. When using Google, the user always has the option to opt out, whereas under certain political governance, you can’t get out of it.
Michael: Well, for one thing, it’s not entirely true that you can just opt out. You pay a price for opting out of Google. You become a less effective person. Are you going to show up to a job interview and explain that you don’t use any privacy-violating tools? It’s hard—you take a hit. Think about someone like Richard Stallman, who only uses free software. If you’re really devoted, you can still be quite effective, but it’s pretty challenging.
Sina: Fair point. I think we’re coming to the end here, so maybe a more personal question I’ve been curious about. Reading this piece and your other works, there’s a lot I admire about how you think through things. One thing that stood out in this piece was how you entertain ideas that are unpleasant or even dangerous to consider, given how people might perceive them. This idea of provably beneficial surveillance is one example. Another is the scenario where, if progress leads to a bad outcome, one path might be to slow down progress. You don’t flesh it out much, but you put it out there as a possibility. I guess I have a more personal, almost emotional question: What have you learned about entertaining these kinds of ideas internally? And how does it feel to put something like that out there when it goes so against the grain of how people think by default?
Michael: To some extent, it’s just very personally unpleasant to entertain these ideas. The particular essay you’re referring to was a very distressing thing to write. But it’s much less distressing than, say, living through something like World War II or periods of unrest and revolution in countries all over the world. Those are much more immediate struggles. I’m thinking about something that may happen, and I’m certainly very concerned, at times really struggling with the prospect of AI and its implications. On the other hand, when I reflect on history, I realize my concerns are about potential futures, not immediate crises.
Michael: It’s quite a different thing to be in a concentration camp or something like that. So, my concerns feel like small potatoes compared to those experiences. I recently reread the book “Night” by Elie Wiesel about his experiences in a concentration camp. It’s just astounding how bad life can get and the things we can do to each other. It’s been on my mind since I reread it.
But thinking about our current context, it’s weird how that historical stuff can feel so foreign and abstract, yet it literally happened not too long ago. It boggles the mind. If there’s some credible reason to think we may be in the process of making a very large mistake, you want to think hard about that, even if it’s uncomfortable. You try to prevent the problem before it gets too far.
Obviously, the vast majority of people involved want good things, not just for themselves but for the entire world. It’s interesting, though, the extent to which people come to very different conclusions about the best way to achieve that, in many cases recommending exactly opposite outcomes. There are people who want to accelerate AI development as much as possible, and some of them, I think, have thought very hard about what’s best for the world and concluded that acceleration is the way. Then there are others who want essentially a complete ban on AI development for exactly the same reason—wanting the best for the world. I think that’s a really interesting and difficult situation when people can think hard about something and come to such diametrically opposed conclusions despite having the same desired ends in mind.
Another example I think about quite a bit is in the very early stages of work on climate change. People like Fourier, Arrhenius, and Ångström contributed to early models for understanding climate change. One of the most interesting things is that Ångström and Arrhenius came to exactly opposite conclusions. Both had models that contained important truths about how carbon dioxide impacts radiative transfer, but they were incomplete in significant ways. So, it’s a case where the quality of your understanding, even when you have the same ends in mind, can lead to very different outcomes. There’s a desperate need to understand well, and I don’t think we have that for AI right now. The fact that there are such diametrically opposed views—like Yann LeCun saying in the Wall Street Journal that worrying about existential risk is complete BS, and then Geoffrey Hinton pledging to spend the rest of his life working against it—shows that even thoughtful, insightful people who genuinely want good things don’t have sufficient models. Our understanding is lacking.
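For the flavor of that disagreement: the modern distillation of Arrhenius’s analysis is a logarithmic forcing law (the coefficient below is the commonly cited modern fit, not Arrhenius’s own 1896 number):

```latex
% Simplified CO2 radiative forcing, the modern descendant of Arrhenius (1896),
% with C_0 and C the initial and final CO2 concentrations:
\Delta F \;\approx\; \alpha \ln\frac{C}{C_0},
\qquad \alpha \approx 5.35~\mathrm{W\,m^{-2}},
% so that doubling CO2 gives
\Delta F_{2\times} \approx 5.35 \ln 2 \approx 3.7~\mathrm{W\,m^{-2}}.
```

Ångström’s experiments, by contrast, suggested the CO2 absorption bands were already saturated, implying the forcing should flatten out entirely as concentration rises; the logarithmic law is roughly what remains once both partial pictures are reconciled, which is why each model captured part of the truth.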
Sina: So, how do we improve our models? I mean, what’s the path forward here?
Michael: I don’t know. I can give you the clichéd answer, but my instinct is to want to slow down very strongly. Maybe that’s wrong, and it’s just an instinctive response. If you’re driving rapidly toward what may be a wall, you stop. But I can’t really reason my way out of that instinct. I also don’t have much optimism about slowing down. I don’t know how long until we have AGI or ASI, but it’s certainly plausible that it’s not a long time.
Sina: Well, I think we can continue to grapple with these questions. Maybe to close on a more fun, lighthearted note, another thing you talked about in your piece as a bit of a tangent was this idea of hyper-entities and how having more concrete, fleshed-out visions for the future can be helpful in orienting us toward it. So, as a way to close, what are some visions of the future that you personally find compelling? Maybe share some sci-fi or particular ideas that play a role in how you’ve pieced this together.
Michael: Actually, there’s a really nice minute-long clip from one of my heroes, Stewart Brand, where he offers a beautiful reframe. This was years ago, and he was talking mostly about climate, but then he branched out. He said people get so worried about climate and so on, but this is great. This is a challenge. This is the century in which we get to make Earth a National Park, as he termed it. What an amazing opportunity to come together to create the institutions and ideas that will be responsible for making the world a much better place.
Another person, Tad Homer-Dixon, has this nice notion of what he calls the “ingenuity gap.” He wrote a great book about it. It’s this idea that at any given time, a society has problems it doesn’t know how to solve, and it has a certain capacity to address them. The gap between those two things is what you’re constantly trying to close. The optimistic view is that we continue developing and routinize away some of those problems. We don’t go the way of many societies that have gone extinct; Tad would say that’s what happens when the ingenuity gap gets too large. So, I’m hoping our ingenuity is sufficient to face these challenges.
I do take a lot of hope from things like the Montreal Protocol. It’s just an amazing achievement. The Nuclear Non-Proliferation Treaty is another example. I think there’s probably a pretty good chance that’s the reason we’re still here. The number of nuclear powers was going up very rapidly, and then, boom, it was almost stopped by the treaty. Instead of having 80 or 100 or more nuclear powers, we have a handful. It’s probably a much safer world.
Anyway, it’s not really an optimistic vision of the future, but I like the ingenuity gap concept. I’m going to go read that book again. It’s a great one.
Sina: All right, thank you so much, Michael.
Michael: Thanks, Sina.