Jeffrey Quesnelle on Nous Research, large language models, and the human mind

March 2025

Listen

Spotify | Apple Podcasts | YouTube

Show notes

This is my conversation with Jeffrey Quesnelle, cofounder of Nous Research.

Timestamps:

  • (00:00:00) intro
  • (00:01:08) working with new technologies
  • (00:06:15) Nous Research origin story
  • (00:14:08) open frontiers in research
  • (00:26:07) Fourier transforms for gradient compression
  • (00:32:58) math behind distributed training
  • (00:38:18) sponsor: Splits
  • (00:39:02) neural networks history and fundamentals
  • (00:51:29) the human mind and AI, hyperdimensional representation
  • (01:01:15) intuition and reasoning
  • (01:15:00) parallels with reinforcement learning
  • (01:19:15) the cat is out of the bag
  • (01:47:11) deeper mysteries

Transcript

Jeffrey: I’m kind of stuck in my ways, and I can see that 20 years from now, I’ll be the gray-haired guy. When I was growing up, there were these Unix folks who used Vi all the time and refused to even open a GUI. They’d say, “You guys just don’t understand.” They wouldn’t touch C++—it was like, if it wasn’t 80-width characters in C, it wasn’t worth writing. Then there were the young kids like me, thinking, “I’ll use objects and pointers. Yeah, it’s kind of ugly, but it works.”

I question this myself. I’m only a light AI user, to be honest. I spend all my time working on AI, but I use it very little, especially for coding. Having coded for 20 years, the problems I can’t solve are usually extremely nuanced. They’re at the edge of where large language models (LLMs) today struggle with hallucinations or very specific issues. Often, the problem isn’t in the code itself but in how you’re approaching it from a meta perspective—like the organization of your servers or infrastructure. AI today can’t really break out of its box and experiment.

That said, I want to caution everyone that every “it can’t” goalpost has fallen mercilessly in the last five years. It’s very likely that within the next two to three years, all those limitations won’t be true. There will be AI models with full agency, able to run experiments, interact with the real world, and their reasoning ability over code and other domains will truly surpass the best humans on the planet. How I’ll interact with the world in that case remains to be seen.

One thing I’ll note—and this already sounds like a gray-haired perspective—is that there’s something to be said for being forced to figure things out yourself. If you value your own understanding as much as your economic output, that’s important. You can view yourself in terms of what you contribute to the world, and sure, that might be replaced by AI in many cases. But understanding something yourself? AI can’t do that for you because it’s not you. There used to be economic value in understanding because that was the only way to get things done, but that might not be the case anymore.

I saw a meme recently from “Lord of the Rings” saying, “The time of the orc has come,” but it was adapted to “The time of the idea guy has come.” Now, anyone with an idea can implement it without going through all the grunt work. Maybe that’s what humans are best at anyway. For me, though, I still find value in understanding things myself and working through them.

Sina: I’ve been struggling with this question too. In a world where AI continues to improve significantly, especially in verifiable domains where we can just deploy agents to write entire features or conduct scientific research, where does the human fit in? I think it’s about having the taste and judgment to decide what problems are worth solving—what ought to be done—and then, at the highest level, evaluating the output these AIs generate. To do that, you still need a strong fundamental mental model of the world. This is orienting me toward focusing more on the fundamentals and the scaffold of my understanding, knowing I can fill in the details with AI. But you really need to think from first principles about those key questions.

Jeffrey: Exactly. It will become very easy to use these tools as crutches and never develop that understanding. It’s like having a textbook where every answer is in the back of the book. There’s always the temptation to just look up the solution when you don’t know how to solve the problem.

Sina: Maybe to shift gears a bit before we dive into neural network fundamentals, how did you end up starting Nous Research, and how did you meet your co-founders? You guys seem like an interesting cast of characters.

Jeffrey: Yeah, it really was one of those right-time, right-place things. I like to tell everyone we were just homies on Discord who got big. But really, it started when I was working on this literary AI project. At the time, the best model available was a T5 model with only a 512-token context length. Then LLaMA came out with 2K tokens, and I was like, “Whoa.” But then ChatGPT comes out with 4K, and I’m thinking, “How are we ever going to get there?” It seemed like an insurmountable barrier at the time.

I’ll pause for a bit to explain why I chose this “Choose Your Own Adventure” project. I’m a huge fan of books—science fiction, fantasy, stuff like that. I grew up reading “Choose Your Own Adventure” novels. But I’m not very creative or artistic. I’m a terrible creative writer, a terrible drawer. However, I’m a really good coder. So, I saw AI as a way to express myself in a way I couldn’t before, just as much as computers have enabled people who couldn’t otherwise create to do so.

Jeffrey: I’ve always thought about AI as a tool that can help balance out our strengths and weaknesses. If you’re more right-brained, AI can handle the left-brain analytical tasks for you. And vice versa. When I saw this technology, I realized it was something that could help me do things I couldn’t do before. I can handle the analytical stuff, but the creative, right-brain tasks were a challenge. So, that’s why I gravitated toward this.

Speaking to the humanistic element, I hope we take this technology in a direction that helps create more well-rounded individuals. People who are better, more expressive versions of themselves with AI than without it. It’s about enhancing who we are.

Anyway, to get back to a project I was working on, I was experimenting with choose-your-own-adventure stories. The context length for these models was super short, so I started wondering if there was a way to make the context longer. I dove into the Python code, just tinkering around, trying different things to see what would happen.

With LLaMA, I stumbled upon a trick involving positional embeddings. For those unfamiliar, positional embeddings are injected into the model to give it a sense of the order of words in a sequence. Inside the attention mechanism, you’ve got these matrix multiplications where every token interacts with every other token. Without positional embeddings, the model doesn’t know the order of words. It’s wild to think that early Transformers worked without them, but adding this information about word order makes a huge difference.

The issue was, once you went past the sequence length the model was trained on—say, 2048 tokens—it had no idea what to do. It’s like the model believed the entire world could only ever be 2048 words long. That’s its reality, like gravity or thermodynamics. It’s never seen a piece of text longer than that, so at 2049 words, it’s lost.

But I was messing around in PyTorch and thought, what if I just tweak these positional embeddings? They’re floating-point numbers, which is a big difference between AI and traditional programming. Regular programming deals with integers, but AI is all about floating-point numbers. That’s where the magic and flexibility come from. So, I tried dividing these numbers by half after training. Suddenly, the model could handle 4,000 words. I was stunned. I thought this was a fundamental limitation, but by scaling the positional values—making them 0.5, 1, 1.5, and so on—you could squeeze more information into the same space. The model picks up on the relative distances, and it just works.
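
A minimal sketch of the trick being described here, assuming a LLaMA-style model with rotary position embeddings; this is illustrative, not the actual code from that project:

```python
import torch

def rope_angles(positions: torch.Tensor, dim: int, base: float = 10000.0,
                scale: float = 1.0) -> torch.Tensor:
    # Rotation angles used by rotary position embeddings for each token position.
    # scale < 1.0 squeezes more positions into the range seen during training,
    # e.g. scale=0.5 lets a model trained on 2048 positions address 4096 of them.
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    return torch.outer(positions.float() * scale, inv_freq)  # (seq_len, dim // 2)

angles_trained  = rope_angles(torch.arange(2048), dim=128)             # the range the model saw
angles_extended = rope_angles(torch.arange(4096), dim=128, scale=0.5)  # twice the positions,
print(angles_trained.max().item(), angles_extended.max().item())       # roughly the same angle range
```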

I posted about this on the LocalLLaMA subreddit, saying, “Hey, you can do this trick to extend context length.” Out of that, Bowen, who’s now our chief scientist at Nous, messaged me. He’d been working on something similar but more advanced, and he shared charts and data with me. We started collaborating. Then I posted about it on Twitter, and Teknium, who’s kind of Twitter-famous in this space, reached out. He asked if I’d like to join a Discord of researchers. I said sure, joined, and saw there were only about 30 people in this private server. I recognized all the names—big open-source AI researchers. I felt like I’d broken into the inner circle.

That Discord eventually became the Nous Research Discord, which we later opened to the public. But it started as this tight-knit group of open-source folks working together. It was like the secret club of AI researchers. That’s how Nous started, born out of this curiosity and just diving into the code, trying things to see what happens.

What’s really exciting about the AI space right now is that it’s this vast, unexplored field of science. In fields like chemistry or physics, you might spend your whole life just trying to understand the state of the art, and if you’re lucky, you make a tiny contribution. But in AI, you can try something new, and it often works—not because the idea is flawless, but because no one’s tried it yet. It’s a powerful and exciting time to be in this field.

Sina: That’s a bit counterintuitive. With so much attention on AI, you’d almost expect some version of the efficient market hypothesis to apply. There are all these researchers and engineers working on AI; you’d think there’s nothing left to explore at this point. I’d assume we should move on to whatever comes after these problems are solved. But you’re saying there’s still an open green field right here?

Jeffrey: Absolutely. Every few months, there are new discoveries about architectures, about data, and they meaningfully advance the state of the art. Take a look at what happened with DeepSeek, for example.

Jeffrey: So, these ideas weren’t necessarily special in terms of the application, but they were pretty simple, and no one had really tried them before. They introduced this concept called multi-head latent attention in DeepSeek V2, and it’s how they’re able to handle these super long contexts without memory usage exploding. The challenge is twofold: can you do it, and does it take a huge amount of memory? So, they developed multi-head latent attention, where instead of doing the full matrix multiplication, they perform a low-rank projection of the query and key components of the attention mechanism. They project them to a lower dimension, do the attention there, and then scale it back up. This results in memory usage dropping by like 90%, and guess what? It still works. The evaluations are basically the same.
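
A rough sketch of the low-rank idea as described here, heavily simplified; DeepSeek's actual multi-head latent attention has more moving parts, and these dimensions are made up for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LowRankAttention(nn.Module):
    def __init__(self, d_model: int = 1024, d_latent: int = 128):
        super().__init__()
        self.q_down  = nn.Linear(d_model, d_latent, bias=False)  # project queries down
        self.kv_down = nn.Linear(d_model, d_latent, bias=False)  # project keys/values down
        self.up      = nn.Linear(d_latent, d_model, bias=False)  # scale the result back up

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q  = self.q_down(x)   # (batch, seq, d_latent)
        kv = self.kv_down(x)  # this small tensor is what gets cached, not full-width keys and values
        out = F.scaled_dot_product_attention(q, kv, kv, is_causal=True)  # attention in the low dimension
        return self.up(out)   # back up to model width

x = torch.randn(1, 16, 1024)
print(LowRankAttention()(x).shape)  # torch.Size([1, 16, 1024])
```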

What’s wild is that these are like five-line code changes that make all the difference. But those five lines operate at scale, and that’s really what has allowed AI to grow—this scaling idea. If you make it bigger, it just gets better. This applies to both dimensions: give it more data, it gets better; give it more compute and make the model bigger, it gets smarter at qualitative human reasoning. We know how to scale on that dimension pretty easily. We don’t know how to scale things with people as well, but with AI, it’s almost a “throw more money at it” problem. And as a society, we can throw more money at it.

This also speaks to the value of platforms like Nous in its current incarnation, or Psyche—these networks that you’re building. The bottleneck becomes, can I raise funding? Can I procure a bunch of H100 GPUs to train a model and test this idea I have? But now, there’s this more plug-and-play network where I can just upload my idea, and it magically gets trained for me. That’s what we’re trying to build with Psyche.

Sina: So, with Psyche, I’d really like to hear the story of how this Discord community just became the Nous Discord. How did we get from there to here? What were the big inflection points in the story? Because you guys are making progress, but the whole AI field is also changing in massive ways.

Jeffrey: Definitely, for us at Nous, it started from this first principle of wanting to run AI in a way where you can step through the entire code, access it, and see it all. I’ve got a 4090 GPU, and I want to be able to run the model through and step through it. Zuckerberg did a great service for the world when he released LLaMA, because it forced people to realize there wasn’t really an inherent moat. There was one, but not as much as people thought.

After we formed up together, we said, alright, we have our mission. Our mission is to be the open AI accelerator. Every time there’s a barrier to the open-source world accessing this tech, like with context length or whatever it is, we want to knock it down, whatever it takes. So, we looked around and said, what are some of the other barriers that exist? We know how to do certain classes of things, but one thing that came to us is that the entirety of the open-source ecosystem for models is dependent on a benevolent third party releasing a foundation model.

At the time, there was just Meta, and maybe Google released Gemma, but it was kind of new. There were like two or three players putting out these foundation models. We thought, what happens if Meta just stops doing this? What if we don’t get LLaMA 3—this was before LLaMA 3 even came out—or LLaMA 4? Because we know the closed providers will keep going. They’re already on the path; the race has started, and they’re running. The open community needs to keep up in the race, but staying in that race was dependent on this almost economically unsound activity from one of these other players. That’s trouble.

So, the question became, what can we do so that the open-source community has a credible way to stay in the race? Well, we need to be able to train our own foundation models. Easier said than done, but problem identified.

Sina: So, what’s the real problem here? What’s the thing stopping this?

Jeffrey: It turns out it’s not actually access to data. The data these AI companies have been training on, at least in the text modalities, is basically just the internet. Anyone can get the data. So, it wasn’t the data; it was access to GPUs and a way to coordinate them. The problem was being able to train a model because, when this all started, you only had one GPU. The code, like PyTorch, was written under this paradigm called the single-device paradigm.

What that means is, when you’re writing the AI model, you write it as if you’re on one GPU. You logically think about it as if it’s running on one massive GPU. You don’t think about the fact that it has to get split across multiple devices. You just write something like a matrix multiply, and that one line of code in Python might actually take a megawatt of energy to execute. But in Python, it’s literally one line.
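
To make that concrete, here is a toy transformer block written in that single-device style—every line pretends there is one giant GPU. This is illustrative only, not any model's actual source:

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, d: int = 768, heads: int = 12):
        super().__init__()
        self.norm1, self.norm2 = nn.LayerNorm(d), nn.LayerNorm(d)
        self.attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]  # one line of Python; at scale, this matmul
        return x + self.mlp(self.norm2(x))                 # is what burns the megawatts

model = nn.Sequential(*[Block() for _ in range(12)])       # written as if it all fits on one device
print(model(torch.randn(1, 128, 768)).shape)               # torch.Size([1, 128, 768])
```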

That’s the beauty of abstraction. By enforcing the single-device paradigm, it kept the complexity of the actual AI stuff very tight. You can open up the GPT transformer code, and famously, it’s like 40 lines of code. A couple of for loops, a couple of matrix multiplications, an activation function, normalization, and boom, you’re done. That’s it. Now, under the hood, it became clear this had to work across multiple GPUs, because we couldn’t just make a single GPU bigger. The tooling created these abstractions that would…

Jeffrey: So, the way this was set up, every GPU would hold different pieces of the model. But to make it fully flexible, you had to make the GPUs act almost like one big GPU. This was solved through an expensive routing fabric. Not only do people buy a bunch of GPUs, but they also have to hook them all up together in a single data center. They all need to be connected symmetrically, with the same latencies between them. They have to be the same type of GPU, with the same VRAM. It’s like you scale by making infinite copies of this one GPU and then hook them up with either RoCE or InfiniBand, which is extremely expensive—400 or 800 gigabits per second between the GPUs. The cables alone cost like $1,000 a foot!

By doing that, you could move data between the GPUs arbitrarily fast. As long as you could do that, it was easy to maintain that single device abstraction at the coder level. You could always just move the data around as fast as you wanted. That’s how the tooling for training built up. Everything was written with this idea that you have all these GPUs in one big data center, completely homogeneous.

But when you try to do training on a distributed scale over the internet with lots of different computers, that’s where it breaks down. The internet doesn’t give you that. You have highly heterogeneous interconnects between the data, and the total throughput is orders of magnitude lower. To give you an idea, the new GB200 racks from NVIDIA—the Grace Blackwell 200s—have 72 of them together with the new InfiniBand. The entire InfiniBand connection between all the nodes is equivalent to the entire bandwidth capacity of the internet. Just between those GPUs! So, that shows you we needed several orders of magnitude reduction to even make this possible. That was the fundamental technical limitation we saw.

So, we sat down and said, okay, can we solve this? We spent about a year working on the actual tech, asking if there’s a fundamental reason why you have to share all the data all the time, or if that was just an artifact of how the tooling was built. We developed a couple of new algorithms—one called DeMo and a more advanced one called DisTrO—that showed you don’t actually need to communicate everything between all the GPUs. You can communicate a very small fraction of the data. Sure, if you’re lazy, you can just share everything, and it works. But if you do this clever math, you only have to share these tiny pieces.

It’s like the difference between the internet we have and one with infinite capacity. If we had that, we never would’ve had JPEGs. We’d all just be sharing bitmaps or uncompressed AVI files because you could. If you had infinite storage and bandwidth, there’d never be pressure to create compression. But once we started wanting to share pictures on the internet, we had to invent JPEGs that could show the same image at a thousandth of the information. We took that exact same idea from JPEGs and applied it to gradient compression within model training. Now, you have this thousand-to-one reduction in bandwidth. It’s literally the same math behind it.

What’s funny is that a lot of these breakthroughs now are things pioneered in the ‘90s in signal processing. But their application at this huge scale, with these massive intelligences, gives you orders of magnitude effects. So, we took that and said, okay, now we have the ability to train models in a distributed way. But then you have the other aspect: how do you coordinate them? If you want to do this in a distributed fashion, have them come to consensus, and exist in an adversarial environment with no middleman, no third party, no central server controlling it all, then you need a decentralized consensus mechanism with economic securities.

That’s where we said, thankfully, the world has already developed one of these, and that’s basically crypto rails. That’s how we came to the conclusion of developing the Psyche Network. It really came out of the idea that we need to coordinate these GPUs in a decentralized way, and thankfully, there’s a great tool to do that.

Sina: How does the math work? How does it mirror compression with JPEGs and stuff?

Jeffrey: It works through something called the Discrete Cosine Transform. Basically, when you train a model, you’ve got your model, your inputs, and your targets. These are the fundamental pieces of learning. You’re saying, here’s this data, and I have this other thing I want the data to look like. Then you do this thing called cross-entropy loss. The objective is to minimize the difference between these two things. Then you do backpropagation to discover how much you’d have to change your representation to make what you thought it was going to be match the target.

That’s the full thing. It’s funny—next token prediction is just taking the target, moving it over by one character, and having it predict the next one. Everything we have about large language model reasoning comes from that simple objective function. People thought, what if we just took the labels, moved them over by one, and had it predict the next token? Through that objective function—basically matrix multiplication, linear algebra, and trying to reduce the difference with backpropagation, which is standard calculus—the emergent property is a language model that can talk and speak like humans.
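
The objective really is that small. A minimal sketch of the shift-by-one setup, with random logits standing in for a real model:

```python
import torch
import torch.nn.functional as F

vocab, seq_len, batch = 50_000, 128, 4
tokens = torch.randint(0, vocab, (batch, seq_len))   # a batch of text, as token ids

inputs  = tokens[:, :-1]                             # what the model reads
targets = tokens[:, 1:]                              # the same text, moved over by one

# Stand-in for model(inputs): anything that maps token ids to logits over the vocabulary.
logits = torch.randn(batch, seq_len - 1, vocab, requires_grad=True)

loss = F.cross_entropy(logits.reshape(-1, vocab), targets.reshape(-1))  # cross-entropy loss
loss.backward()   # backpropagation: how much every weight would need to change
```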

That’s the mind-blowing thing to me, that this even could be done. It tells us something about how we’re probably wired internally. We can get to that later, but anyway, that’s how it ties back to the compression idea.

Jeffrey: So, the discrete cosine transform works, right? You get these gradients, and after you do the backpropagation, it says, okay, here’s how much you’d have to change your weights to get this as the correct answer. These gradients are on the same order of magnitude as the model itself. For every element in the parameter space of the model, there’s a corresponding element, which is your delta—literally how much to move by. You add the two together, and that’s how it happens.

In reality, though, you use this thing called an optimizer. Instead of going all the way there, you kind of go partway. The most popular one is Adam with momentum. Interestingly, the author of Adam was actually an author on our paper too, for DeMo. The guy who developed Adam worked on DeMo with us. Basically, what we do is take these gradients and think about them in the frequency domain. You think about Fourier transforms or other transforms where you take these things and convert them into this frequency domain.

Then, what we do is discretize it in the frequency domain and pull out just certain frequencies. We had this thesis that there are two types of frequencies within these gradients: fast-moving frequencies and slow-moving frequencies. This means there are things the model is learning very quickly, and we communicate them differently to the nodes.

Here’s how it goes. Inside each of the GPUs, we build up these frequencies. There’s a process where you’re summing in these frequencies, and then, every gradient step, we just pop off the top N number—it’s called top K. You select the top K biggest frequencies, communicate just those frequencies to all the other nodes, and then from those frequencies, you do the inverse of the transform. Now you have a sparse modification of the gradient matrix because you’ve compressed them to these frequencies, which correspond to different points. If you think of the gradient as a 2D grid, it’s like this giant 2D delta.

You change it over into the frequency domain, pick out a specific frequency, communicate just the frequency to the other nodes, do the inverse operation, and now you’re updating just those important pieces. So, the idea that the gradients are representable within the frequency space is the first piece. The second piece is that there are these long-running and short-running frequencies. The long-running ones will eventually, over time, bubble up and get communicated out. It’s not like a FIFO, but if it takes a while for it to learn, it could take a while for the frequencies to accumulate. Once those are the largest ones, we pop those out. If there’s a really high-signal frequency, we’ll pop that out right away.

Communicating just these top K frequencies is sufficient to get the same results for the model. It tells you that the actual learning is happening in a different domain. Thinking about the gradients as this giant 2D grid works, but communicating the gradients directly is a highly inefficient way to represent them. You can almost look at the gradients like the pixel values—the RGB values—that get compressed down in an image.
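
A bare-bones sketch of those mechanics—transform a gradient into the frequency domain, keep only the top K components, send those, and invert on the other side. The real DeMo optimizer also accumulates residuals across steps and separates fast from slow components; none of that is shown here:

```python
import math
import torch

def dct_matrix(n: int) -> torch.Tensor:
    # Orthonormal DCT-II basis as an n x n matrix, so the inverse transform is just the transpose.
    k = torch.arange(n).float().unsqueeze(1)
    i = torch.arange(n).float().unsqueeze(0)
    m = torch.cos(math.pi / n * (i + 0.5) * k) * (2.0 / n) ** 0.5
    m[0] /= 2.0 ** 0.5
    return m

grad = torch.randn(256, 256)   # pretend this is one layer's gradient: a giant 2D grid of deltas
M = dct_matrix(256)

freq = M @ grad @ M.T          # the gradient, viewed in the frequency domain
k = 512                        # keep only the k largest-amplitude frequencies
idx = freq.abs().flatten().topk(k).indices
mask = torch.zeros(freq.numel())
mask[idx] = 1.0                # the (frequency index, amplitude) pairs are all that gets communicated

sparse_grad = M.T @ (freq * mask.view_as(freq)) @ M   # receiver side: invert back to a sparse update
print(f"communicated {k} of {grad.numel()} values ({100 * k / grad.numel():.2f}%)")
```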

Now, what is the intuition behind what a high frequency is here? We don’t really know, other than the amplitudes and the frequency numbers being based on something. What those correspond to in a representation space, that’s anyone’s guess. That’s really the thing about these LLMs—we don’t know what any of these things inside mean. People are working on interpretability, where they try to create auxiliary models to figure out what these different parameters actually represent. Can we map them back to something humans can understand, like, oh, this is the Canada bin, this is the color blue area, this is the Golden Gate Bridge? But we don’t really know.

We just say, we’re going to think of them as frequencies, whatever that happens to be. We’ll pull them out, communicate them back, and we communicate only the frequencies and the amplitudes between the nodes. Then we do the opposite to get the sparse representation. Now you have this sparse gradient where you have empty spots, and you’re only adding in at certain elements to it. Over time, just adding in those few pieces every few steps is sufficient to create the learning. Fascinating, right? Some very interesting technical problems to work on.

It was one of those things where Bowen had the original idea for this because he had worked in signal processing. He’d seen problems people had in audio representation solved in a completely different way. But at the end of the day, they’re just linear matrices. So he’s like, wait a second, these gradients are just like that too. I wonder if they behave in the same way that these other signals do. It was just having that background and the ability to say, well, let’s try it and see what happens. It took a long time—about a year to get it all right. There was a large element of faith. We had to believe it was going to work before we got any evidence that it would, or we would’ve given up. But eventually, it worked.

Out of that, we were able to build this DeMo optimizer, which basically reduces the communication throughput needed to train models by a huge amount. We’re pairing that with a network to actually coordinate the training—permissionless training of one huge model or multiple models. That’s the Psyche Network.

Sina: So, is there a reason that the person with the state-of-the-art NVIDIA rack we were talking about wouldn’t take the DeMo optimizer and use that to just get the boost on top of the fact that they have these high-bandwidth interconnects? Is there a reason that the bottleneck for them becomes that now you just can’t do the compute fast enough, or you run out of memory or something like that?

Jeffrey: Yeah, you certainly could, and I think it’s interesting. Not just doing full decentralization, the easiest application is like a hybrid decentralized setup where you own two data centers, and you could do fast stuff in between. One area with the DeMo optimizer is that it works…

Jeffrey: After the backpropagation, you still need to hold the whole model in fast memory. If you’re doing super large trainings, say at hundreds of billions of parameters, you can’t even fit it on one HGX, which is the supercomputer that NVIDIA uses. You need the interconnect between two HGXs to make it work. You can think of two HGXs as one node, and then you can use techniques like DeMo to interconnect between those nodes. So, there’s still a place for fast interconnects.

We’re definitely, as part of the release of Psyche, going to be incentivizing research in this area. One area I’m particularly interested in is parallelism for Mixture of Experts models along this dimensionality. You could have the different experts of a Mixture of Experts model be co-resident on different nodes. That might be a way to really scale things up because, frankly, most of the big state-of-the-art models are Mixture of Experts models, not dense models. I can get into why that is if you’d like, but for now, we still have to have the entire Mixture of Experts model on one GPU or one node, like one HGX. So, that’s where we’re focusing on incentivizing research.
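
A toy sketch of that expert-parallel idea: each expert lives on its own device, and only the tokens routed to it cross the interconnect. Device names are illustrative, and it falls back to CPU if there aren't enough GPUs:

```python
import torch
import torch.nn as nn

n_experts, d = 4, 64
devices = ([f"cuda:{i}" for i in range(n_experts)]
           if torch.cuda.device_count() >= n_experts else ["cpu"] * n_experts)

router = nn.Linear(d, n_experts)   # decides which expert each token goes to
experts = [nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d)).to(dev)
           for dev in devices]

def moe_forward(x: torch.Tensor) -> torch.Tensor:   # x: (tokens, d), on the local device
    choice = router(x).argmax(dim=-1)                # top-1 routing
    out = torch.zeros_like(x)
    for e, (expert, dev) in enumerate(zip(experts, devices)):
        picked = choice == e
        if picked.any():
            # Only the routed tokens travel to the expert's device and back.
            out[picked] = expert(x[picked].to(dev)).to(x.device)
    return out

print(moe_forward(torch.randn(32, d)).shape)   # torch.Size([32, 64])
```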

What’s great about crypto rails is that they harness economic energy and pair it with technical energy. The two can become one in the bytecode.

Sina: We’re sponsored by Splits. Are you tired of sacrificing security for usability? Splits believes it’s still way too hard for teams to self-custody their on-chain assets. They’re building a new kind of internet-native bank on top of Ethereum. Splits makes it easy for teams to manage the whole lifecycle of their finances—from structuring revenue-sharing agreements using payment flows like splits and waterfalls to managing those earnings once received using passkeys and smart accounts. Splits is being used by teams like Protocol Guild, Zora, SongCamp, and others. I’m a big believer in them and recommend checking them out. You can learn more at splits.org.

So, one of the things we briefly talked about when we caught up before was this idea that most of us, even if we’re technical, are using these large language models through chat interfaces. Even if we’re technical, we kind of understand them at this level of abstraction—there’s training data with inputs and outputs, a loss function, backpropagation, and these parameters getting updated. But one thing we discussed was, what the heck are these things really? I’d like to use this as an opportunity to dive into that question. With your bent for philosophical thinking and your expertise on how these models are trained, can you explain what’s going on? There’s this first phase where they’re being trained, but then there’s instruction tuning and reinforcement learning with human feedback that makes them understandable to us. That almost feels like a layer on top of the chaos underneath. So, yeah, what are these things? How does it all work?

Jeffrey: Well, to take a step back, I mentioned this thing called deep learning. We didn’t actually start with language in particular. Really, the first thing that kicked off what we have today was the invention of AlexNet. There was this competition called ImageNet where you had to write image processing code to recognize certain images. At the time, the best solutions were from people using very advanced image processing techniques they’d written themselves. Then along came AlexNet, a neural network that could recognize images, and it wiped the floor with everybody. Immediately, people were like, whoa, wait a second, it can just learn on its own to solve the problem. That was our first hint that neural networks really could do the type of work that had always eluded previous artificial intelligence attempts.

If you read about AI in the 80s, people thought computers were going to be intelligent. There were all these attempts to create artificial intelligence, but the problem was we thought about intelligence as how we believed it ought to be, not how it might actually be. We thought the way to be intelligent was to think about all these rules about how things work. It turns out that’s reasoning, but it’s not intelligence. Intelligence is the raw, flexible material underneath, but we have no idea how to quantify it. It’s too fundamental to who we are for us to get a correct introspective view or experiment with it.

So, we tried and failed at AI by creating these expert graphs, thinking if you just got all the world’s smartest people to create rules, it would work. It just didn’t. But with neural networks, AlexNet showed that this thing could recognize images as well as or better than humans or the smartest people using conventional image processing techniques. That was a big moment of, okay, maybe this is what we need to be looking into.

As a brief note, so much of this story is about having faith in a signal. If you commit to something, you need a reasonable belief that it will work. You needed to see AlexNet succeed to know you should be working on neural networks. It gave proof that it was okay to expend more energy on this specific thing versus other possibilities.

So, AlexNet comes along, and there’s this big kickoff of deep learning, which is trying to solve problems using neural networks, typically for a specific downstream task. This is where I come into the picture with autonomous driving. The very first autonomous driving systems used reinforcement learning and deep learning models to figure out where the lanes are. It was all very specific to a task you’re trying to do. It worked kind of okay, but it was pretty brittle. It still took a lot of engineering. You were no longer writing code; you were building blocks of neural networks—convolutions, normalization layers, and so on. It was like a Lego world. You’d stick these different blocks together, tune all these hyperparameters, and maybe if you got it just right, it would work. It could see lanes on the highway, and it did it way better than any code you could ever write.

Jeffrey: Yeah, so back then, it was still this very rigid process. If you got one thing wrong, it just wouldn’t work, and there were all these problems with it. Through that struggle, deep learning started to take shape. Then, the next big breakthrough was the invention of the Transformer. This architecture was introduced in the paper titled “Attention Is All You Need”—a very bold title, but justified by its effectiveness. At its core, it has this operation called attention. Attention is essentially a specific way of multiplying matrices, and calling it “attention” is our attempt to draw a human analogy to what’s happening. It’s like the model learns how to focus on things. Instead of just trying to solve a problem directly, it learns how to learn. It figures out when to focus on specific areas based on what it sees, thinking in its abstract representation space.

For whatever reason, this architecture just blew away everything else. It was way better than anything before it. What’s even crazier is that this architecture was extremely resilient. Before, if you wanted to do something like lane detection or recognizing letters, you needed a completely different architecture for each task. But with the Transformer and its attention mechanism, you could throw any problem at it, and it would work. It could solve pretty much anything. You just needed the attention mechanism. So, this was the first general neural network architecture. Unlike previous task-specific ones, the Transformer worked across any modality.
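
Written out, the attention operation is a handful of matrix multiplies. A single-head sketch with no learned projections, for illustration only:

```python
import torch

def attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5   # every token scored against every other
    weights = scores.softmax(dim=-1)                         # "where should I focus?"
    return weights @ v                                       # a weighted mix of what was focused on

x = torch.randn(16, 64)           # 16 tokens, each a 64-dimensional representation
print(attention(x, x, x).shape)   # torch.Size([16, 64]) -- self-attention over one sequence
```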

Now, diving into the attention mechanism and the Transformer itself, it’s interesting that the original Transformer paper wasn’t focused on next-token prediction. It was actually about an encoder-decoder model, transforming one set of text into another. Its first major application, which you might not have realized you were using, was something like Google Translate back in 2016 or 2017. Language translation is the idea of taking one whole text and converting it into another, and the Transformer did automatic language translation way better than anything that had come before.

So, that’s the foundation of the Transformer architecture. Then came the idea of the decoder-only Transformer. Instead of transforming one sequence to another, it autoregressively generates a sequence one token at a time using next-token prediction. That’s the current dynamic. Nearly all Transformers you see now are decoder-only, sampling one token at a time, over and over.
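
The one-token-at-a-time loop, sketched with a placeholder in place of a real model; any function from token ids to next-token logits would slot in:

```python
import torch

vocab = 1000
model = lambda ids: torch.randn(ids.shape[0], ids.shape[1], vocab)   # placeholder, not a trained LM

tokens = torch.tensor([[1, 2, 3]])                    # the prompt, as token ids
for _ in range(20):
    logits = model(tokens)[:, -1, :]                  # only the last position's prediction matters
    probs = torch.softmax(logits, dim=-1)
    next_token = torch.multinomial(probs, num_samples=1)
    tokens = torch.cat([tokens, next_token], dim=-1)  # append the sample and go again

print(tokens.shape)                                   # torch.Size([1, 23])
```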

We’re now in a world where we finally have this general architecture that can represent things across modalities. The next big moment was the discovery of in-context learning. Once you have a foundation model, the real “aha” is that the model can learn to follow instructions within text it’s never seen before. We had no reason to expect this. Everything we do now with chatbots—telling it to do this or that—it can meta-represent those rules it’s never encountered and incorporate them into its learning.

Backing up a bit to foundation models, the idea was, “Okay, we can do this, but what can we build with it?” That starts with training a foundation model. A foundation model is where we realized we have access to a vast amount of data—the internet. Someone had the brilliant idea, “Why don’t we just make a model that models the internet, every word humans have ever written?” If we do that, somehow it will contain all human knowledge. We’re basically trying to compress all human understanding into one model. That’s the vast majority of AI training work right now—building these foundation models. That’s the secret sauce you need to create anything else afterward.

What’s fascinating is that when you have this foundation model, what emerges is something that can reason like humans. It can do the same sort of things our brains can do. There’s no inherent reason for this to be the case. Up until now, as a species, we’ve been completely unable to artificially elicit intuitive reasoning. We had no way to create it. But now, we have the equations that are the necessary and sufficient conditions to give rise to intuitive reasoning. It may or may not be how it’s done in our brains, but we have at least one example: if you follow these steps, intuitive reasoning emerges. So, it’s likely that whatever is happening in our brains is isomorphic to the methodology inside these language models. For the first time in human history, we can experiment with intuitive reasoning. It’s completely insane.

Sina: So, what’s your personal take on this? How would you map what an autoregressive Transformer-based model is doing to how our own minds work? If you were to draw some intuitive links—not perfect parallels—but sketch out how this Transformer architecture is like looking at the entire dataset of sensory data I’ve observed to date and doing what with it? How is this mirroring, or not mirroring, what’s happening inside each of us?

Jeffrey: The idea that I think is most likely is that AI models are creating a hyperdimensional representation of language. I believe that’s also what we do in our minds. When I say something like “the color blue,” or “the smell of eucalyptus,” or “the taste of chocolate,” what’s happening in your brain is that it has learned a hyperdimensional representation space. Blue sits in this space with certain values tied to dimensions like “cold.” There’s a dimensionality for cold, and it has some small value there.

Jeffrey: So, this hyperdimensional representation space of all possible concepts is something we humans have developed through evolutionary selection pressures. Our languages are like hyper-encoded cheats to input specific values into this space of understanding. This space comes from the pressures we faced through evolution, forcing us to quickly respond to unknown situations. We exist in time, and the ‘you’ that you are only gets one shot at it. From the first replicating chemicals in some underwater vent to now, only those things that were best at continuing on did so. They had to figure out ways to deal with the unknown world, to get energy, and to handle unexpected situations.

That’s the large-scale selection pressure imposed by entropy and time on physical reality. Systems that aim to optimize for continued existence and energy acquisition have to deal with that. Over time, we developed ways of quickly representing the outside world and deciding on actions to take. That’s why nearly all of our brain is intuitive reasoning. I can throw a ball to you, and you just catch it instantly. If you had to calculate the trajectories on paper, you’d never figure it out in a lifetime. It’s the same with mammals hearing a twig break in the trees and instantly knowing there’s a predator nearby. Everything we have is this instant pattern recognition performed in a hyperdimensional space optimized for survivability.

Creating that hyperdimensional representation was the best way to survive. It was forced on us because it allowed our ancestors to live longer, be smarter, and get eaten less. Now, for these language models, they’re likely also taking words and representing them in a hyperdimensional vector space. The word ‘blue’ maps to some values in a matrix tied to concepts like ‘cold’ or other ideas. We call things ‘cold’ or ‘blue’ because we have words, but the actual concepts we hold are probably things we don’t even have words for. Language is a very lossy, compressed representation for communication and discursive thinking. Discursive thinking operates at a high level of abstraction compared to the intuitive movements happening at the embedding or representational layer.

It’s interesting to think about the evolution of speech. Initially, our natural formation was just this hyperdimensional representation of concepts without words, because speech hadn’t developed yet. But then we discovered that if we could communicate what’s happening inside our minds to others, it would enhance our survivability. We had to cooperate in advanced ways with other humans, and that’s how speech emerged as a phenomenon. By lossily communicating tiny pieces of our internal representation to others and receiving their communications, we improved our chances of survival.

What’s also fascinating is that discursive thinking is somewhat analogous to what we’re doing with algorithms like DeMo. The gradients themselves are this hyperdimensional space that’s too big to communicate directly. So, we’re creating words for the gradients, using terms to describe them, then communicating that to someone else. They might expand on it and apply it to their own representation. What’s unique about humans is that no two people have the same internal representation. When I say the word ‘blue,’ your internal perception and application of it are totally different from mine. That’s how you can interpret what I say in a different way.

There’s this mirroring between what’s happening inside a neural network as it’s being trained—multiple fractal layers, the values it holds, how it updates its gradients—and us as humans. We hold parameters in our mental models, get compressed gradient updates from others or the outside world, and update our own weights. It’s likely that we’re much more complicated than a single monolithic model. We’re probably lots of different models that have evolved together over time. But it’s a close enough facsimile to say that a single model gives rise to apparently equivalent phenomena. It’s not to say the two are the same thing, though.

We don’t know if the underlying architecture is the same or even conceived in similar ways. That gets at the heart of the philosophical question: What does it mean to be human? Are we only composed of the matter in our bodies, with our awareness as an emergent phenomenon from this representational space? Or is our consciousness something beyond that? These are open questions in philosophy and, increasingly, in science too. People are willing to ask about the ontological foundations of the world. So, I wouldn’t say they’re the same, but rather that they give rise to equivalent phenomena.

Sina: Yeah, it’s fascinating. We’re slowly unbundling intelligence. Another thing it makes me think of is, as a human being, I’ve felt for a long time that I think better and more effectively with the ability to write, to externalize thoughts. Then, when I read them back, it’s like an enhanced process. An even better version of this is when I’m on a computer. I can write things in a Google Doc, move them around, and effectively have a larger file system or workspace for my thoughts.

Sina: One more thing, I finally read one of Ted Chiang’s short stories last week for the first time, “The Truth of Fact, The Truth of Feeling.” I don’t know if you’ve come across that one, but it’s 100% worth reading, especially for someone working on what you’re doing. He makes the case that language itself is the first technology that augments what we have. Without it, we couldn’t express an idea for someone else to hold and echo back to us. So, you’ve added external state to a system that didn’t have it previously.

Jeffrey: Yeah, I tell people all the time, if you’re wondering when things went wrong with technology or whether we should stop AI, I say humanity’s original sin was the written word. Everything we have now stems from that. If you wanted to nip this in the bud, you needed to stop the written word. That was the real unlock. Everything else is just a natural consequence of that invention.

But it’s interesting you mentioned that idea of writing things out. We see the exact same phenomenon with AI. I mentioned earlier about these task-specific models that didn’t work because we had the order wrong. We needed to create the intuition first. Think about what reasoning is—it’s the ordered application of intuition, a structured way of applying it.

At first, we had these models trained on the internet. They were good, but they said all sorts of crazy stuff. Then we developed this idea of instruction following, and later, the creation of Chain of Thought. Chain of Thought came from a guy who had the idea to make the model say, “Let’s think about this step by step.” Just forcing the model to write out the reasoning steps caused it to get way better at evaluations. Instead of relying purely on human-like intuition, it had to lay out each step.

Jeffrey: People often make fun of things like the “strawberry” problem—how many R’s are in the word “strawberry”? It’s hard for AI. But let me tell you, it’s easy for you because you can look at the word. Now, if I say something like, “Four score and seven years ago, our fathers brought forth on this continent a new nation, conceived in liberty and dedicated to the proposition that all men are created equal,” how many E’s are in there? Very suddenly, after only a few words, you realize as a human that you can’t instantly count them either. That skill breaks down.

This is a very artificial example, but the real reason AI struggles here is because you’re asking it to count, which is a reasoning task. Intuitive thinking happens instantly for us—boom, it’s there. For an AI, that’s like one token, one forward pass. It has to get the answer all at once in that single pass. Something like counting letters is a reasoning step. It involves accumulating an answer: there’s an R, there’s another R, and so on.

Our minds can do this instantly up to a point. There’s a term for it—subitizing, I think—where you can instantly know how many objects there are up to about seven or eight. Beyond that, you have to count them out; you don’t have that instant intuition. Same with letters. Our intuitive thinking can handle some counting problems instantly, but anything beyond that requires developing reasoning. The invention of reasoning was the last unlock of humanity’s intellectual development—taking our intuition and structuring it in a specific way.

Jeffrey: For these AI models, we discovered we can force them into a reasoning mode by making them think step by step and write it all out. Just like with humans, this makes them way smarter at the answers they provide. This weird phenomenon in AI completely maps to the same phenomena we experience with our own learning and experience. It tells us that these two processes are isomorphic under some conditions.

You’ve probably heard about DeepSeek and similar approaches. What DeepSeek did was incredible. They recognized that reasoning tasks and Chain of Thought are important. So, they thought, why don’t we do reinforcement learning just on the reasoning steps? Before language models, old reinforcement learning was brittle and didn’t give rise to emergent intelligence. But if you first create intuitive thinking, then you can use reinforcement learning to force the reasoning process to improve.

It’s like how you could never send a baby to high school. It just won’t learn, even with Socrates as a teacher asking questions. DeepSeek applied this to verifiable questions where you have a clear answer. Instead of next-token prediction, which gave rise to intuitive reasoning in models, they said, okay, now that we have intuitive reasoning, let’s use it to do structured reasoning to arrive at the right answer. They freeze the intuitive reasoning part of the AI model—not changing any of that—and then apply reinforcement learning. The model tries to figure out how to solve a hard problem, like a math problem. The loss function isn’t next-token prediction anymore because we don’t even know what the next token should be. We’re guiding it to reason through the problem step by step.

Jeffrey: So, the model uses the tools it built through next-token prediction to write out a reasoning chain, and the only thing that matters is whether it gets the answer right or not. That’s the reward signal. Unlike with normal cross-entropy loss, which gave rise to the emergent phenomenon of intuitive reasoning, we now change the objective function. The objective function is now this binary reward, where you just assign a score to the outcome of the entire reasoning chain. You either got the answer right or you didn’t, just like when you’re in school—thumbs up or thumbs down. Then, through reinforcement learning, the model is allowed to sit there and try to discover new ways of using its intuitive reasoning to come to the correct solution. When it gets the correct solution, we go, “Aha!” It gets a one, and through the magic of reinforcement learning, we’re able to take this non-differentiable objective—this true one-or-nothing—and turn it into a differentiable objective. We backprop that, get new gradients, apply it to the model, and go again.
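
A minimal policy-gradient sketch of that move: the right-or-wrong reward isn't differentiable, but weighting the log-probability of the sampled chain by that reward gives something you can backprop. The recipes actually used for reasoning models are more involved; this shows only the core idea, with a made-up verifier result:

```python
import torch

vocab, chain_len = 1000, 32
logits = torch.randn(chain_len, vocab, requires_grad=True)   # stand-in for the model's outputs

probs = torch.softmax(logits, dim=-1)
chain = torch.multinomial(probs, num_samples=1).squeeze(-1)   # sample a whole reasoning chain
log_probs = torch.log(probs[torch.arange(chain_len), chain])

reward = 1.0   # pretend the verifier checked the final answer: thumbs up (1) or thumbs down (0)

loss = -(reward * log_probs.sum())   # make chains that earned a reward more likely next time
loss.backward()                      # the non-differentiable outcome now flows back as gradients
print(logits.grad.shape)             # torch.Size([32, 1000])
```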

That’s where you see this new class of so-called Reasoner models. In fact, Sam Altman just announced that this is the next thing—they’re not even going to release GPT-5 as its own thing, or they’re renaming it GPT-4.5, and every model from now on will be this new type where you have this reasoning stage applied after the model has learned the world through intuitive thinking. The early ChatGPT you worked with was all intuitive thinking. That’s why it would be profound in some ways and make the simplest mistakes in others, just like people do when they’re thinking off the cuff. But now, we’ve figured out a way to take these old reinforcement techniques from the ’90s, now that we have intuitive reasoning, and apply them to elicit real logical, structured reasoning to solve extremely difficult problems.

So, the base model’s intuitive reasoning is like I’ve built this very deep, higher-dimensional understanding of the world. But when I’m prompted with an input, I basically shoot from the hip. I’m completely acting intuitively, doing the best I can with one token at a time. It’s like life with a gun to your head—if someone said, “What would you do?” it’s just boom, boom, boom. It’s the analog to how we move through the world, how thoughts arise. Then the next step is you find this trick of giving the model an instruction: “Show me your reasoning.” This is equivalent to us thinking through a problem. There’s this interesting thing where you might naively imagine that anything derived from this pre-existing intuitive lattice should just be a foregone conclusion. But there is a process of actually working through a problem, going down different paths. There’s an exploration process, and you’re doing way more work when you do that.

That’s the reason you shouldn’t think it ought to just be one-shotted. In one forward pass, there’s a fixed number of matrix multiplications happening. You can make information-theoretic arguments that it could never recognize every possible function because you’re doing a fixed number of multiplications on an unbounded possibility space. It necessarily can only recognize certain classes of things. But with Chain of Thought, what it’s really doing is, when we say, “Think about it step by step,” we’re basically forcing the model to expend tokens to have more time to think. By going longer, the number of multiplications before getting the answer is a lot more. Like with DeepSeek or Deep Thought, the answer is 42, you know? The AI model can only get more time to think by saying more words because that’s its sense of time. A new token is its Planck time of reality. The only way the arrow of time moves forward for it, and it can spend more energy, is by saying more tokens. So, we found a way to hack getting the model more matrix multiplications to come to the answer. It’s nice because it’s interpretable to us too. By forcing it to say it out loud, we can sort of understand it.

So, that’s basically the paradigm that exists now. We spend more time coming to the answer, and the way to force spending more time, more computation, is to have the model write out in language all the thoughts that could help it get to that answer. Then the next step was, once you got it to do that, you switch the optimization function. In these verifiable domains, you get it to do that reasoning and try to get the answer. Once it gets the answer, through making a non-differentiable output differentiable, you feed it back in. That’s kind of like making something that’s non-intuitive and requires reasoning and discursive work into something intuitive, right? It’s like something becoming second nature.

Just like great mathematicians will tell you they look at equations and see them instantly because they did so much work going through it that they forced it into their intuitive reasoning. They forced an intuitive understanding of this discursive way. Same for me with coding. Sometimes when I’m really into it, I try not to think about what I’m writing. If I think too much about what I’m trying to do, I’ll mess up. But I can feel the code, you know? I feel the structure of the project, the way everything fits, and I can just let go. That came from my entire life since I was 10 years old, having this reinforcement learning loop running. I’ve been manually reinforcement learning myself to create an intuitive understanding of tech, code, and stuff like that.

So, the magic with RLHF, or the one that DeepSeek uses, is that you figure out a way to take this non-differentiable objective and convert it back into the language of backprop. Then you can backprop the gradients, turn it into intuitive reasoning, and loop, loop, loop. That’s where it scales. Humans can only do it one time, but with AI, we can do this a thousand times, a million times in parallel. That’s where the gains will be. If you think about it, what’s interesting about these…

Jeffrey: What’s happening with reasoning models is that you’re dynamically generating the training data. It’s kind of like the difference with pre-training and intuitive reasoning. With pre-training, it’s as if you’ve seen all the words ever written. You’ve got everything in front of you, and now you have to go out on your own and try to discover something new. So, the model is generating its own training data. Once it gets an answer right, that becomes part of the training data. You’re essentially searching for examples that show how to solve a problem, and when you find that data, you take it and backpropagate it.

We’re at this moment in time where this process is working through our system. It applies to any domain where there’s a verifiable correct output. Math and programming will be the first areas, but then we’ll move to other hard sciences, getting them to a point where you can objectively know if something is wrong or right. That’s hard to do, though. For example, you can’t easily do that for biology because when we introspect our understanding, there’s a lot of heuristics involved. Maybe you hook up these models to virtualized lab settings where they can run experiments. I don’t know exactly how that would work.

Then there’s the question of more subjective domains where there isn’t a verifiably correct output. In those cases, you need a human in the loop. We get into how taste is subjective—aesthetics and things like that. How does this ripple through our whole system? What happens from here? What are all the things we need to work out?

Sina: Yeah, there’s so much to unpack here. The reinforcement learning and reasoning stuff adds another dimension of scale, and we’re just at day one of it. There’s a long way to go before it saturates. It’s interesting to see that models trained this way have applications elsewhere. They become better writers, for instance. There are hints that focusing on verifiable domains has benefits in other areas as well.

Jeffrey: Exactly. As long as the intuitive thinking is grounded enough in the world, you can create more and more complex objectives. Having access to the internet raises obvious safety concerns because you could set an objective to elicit some outcome in the world. Of course, there would be safeguards, but you could probably get pretty far if your goal is to make the model smarter by defining some verifiable outcome. For example, gaining a million Twitter followers—that’s a measurable outcome. What would it take to get there? Does it mean becoming a social media sensation? Does it mean becoming president of the United States? There are many paths to that goal. The thing about backpropagation is that it’s a relentless forcing function. It will channel energy in whatever ways it can.

So, you can probably get pretty far with reinforcement learning just by expanding the dimensionality of what you consider a verifiable outcome. But when it comes to truly subjective areas, that’s very much grounded in our human experience. We value art because it reflects our experience of the world. It will be interesting to see whether a model, a priori, would develop the same aesthetics as we do. It probably won’t be able to because it’s seen too much of what we already consider valuable. The jury’s still out on where that will go.

Sina: Yeah, this stuff is infinitely interesting to think about. I feel like our sense of what’s beautiful—whether it’s elegance, a mathematical formula, or symmetry—comes down to order and understanding deeper truths. That’s what we find beautiful in an important sense. It might be the same thing these models find beautiful, or it might be different. Our understanding of beauty and elegance could be driven by a larger forcing function, something unique or special to us.

Jeffrey: That could even tie back to physics or the anthropic principle—why does the universe exist? There are all these weird things in the universe that give rise to life as it is now, creating the order we see. Take the fine structure constant, for instance. It’s a dimensionless quantity characterizing the strength of the electromagnetic interaction, and it’s about one over 137. It’s invariant to how you measure it; it’s not in kilograms or energy units, just a pure number. For all of quantum mechanics and particle physics as we understand them to work, this number needs to be almost exactly one over 137. If it were meaningfully different—say one over 135 or one over 142—the chemistry and structure formation after the Big Bang wouldn’t have worked out the way they did. It has to be very close to this value to create the order that allows all the matter and structure we have now.
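For reference, the textbook definition of the fine structure constant (a standard physics formula, not something quoted from the conversation) is:

```latex
% Fine structure constant (standard SI definition, added for reference):
% a dimensionless combination of the electron charge e, the vacuum
% permittivity epsilon_0, the reduced Planck constant hbar, and the
% speed of light c.
\[
  \alpha = \frac{e^{2}}{4\pi\varepsilon_{0}\hbar c} \approx \frac{1}{137.036}
\]
```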

As the things we thought were special to us get pared away by technology, it might force us as humans to introspect more about what is truly unique about ourselves. I think that’s something the world will be struggling with over the next several years. First, it’ll be about economic applications. I like to tell people we’re going through something like the Industrial Revolution. That revolution took work only humans could do—digging ditches, lifting things, planting crops—and essentially erased the need for humans to do it. People don’t realize, but before 1880 or so, something like 60% of the U.S. population, and closer to 90% of the world, was engaged in subsistence or agrarian farming, directly trying to feed themselves. Within a couple of generations, that completely flipped.

Jeffrey: And then those numbers got inverted, basically, and that meant a lot of people lost their jobs. Society underwent massive reformations because of the Industrial Revolution. World War I and World War II were, in many ways, byproducts of that revolution, and everything else came with the reordering of the world under that paradigm.

What will be happening over the next 10 years is a second Industrial Revolution, this time of thought and intelligence. Huge classes of things that only humans could do before will get automated and taken away. So the question is, as a society, where do we place people? I think we need to not look at it as something to be avoided or as something bad. Just as the Industrial Revolution was an opportunity to take on the next class of challenges and growth for ourselves, this is the same kind of deep ask. We’re on a hero’s journey, and this is the moment of transformation. It’s going to be difficult, but it’s part of the path.

It’s a magical thing, the fact that we’re creating the process of intelligence and reasoning. It asks of us this question of what we ultimately are and where this is all going. It’s a much more wondrous world, one where we come face to face with this question, than a world where we never have to confront it.

Sina: Yeah, I feel like this is maybe one of the bigger, deeper questions, similar to what you were saying when Stable Diffusion came out. People saw it, but they weren’t really grasping what it meant. I think there’s a similar thing here where we have these models, we’ve figured out this reasoning thing, and there are probably going to be other breakthroughs. But underneath this breakthrough is the looming question that we will all have to grapple with in 10 years, in 20 years.

Jeffrey: Exactly. Large transformations in society like this are often mirrored by conflict. Taking it back to psychology, it’s likely that the next global conflicts will somehow involve this technology, whether it’s access to the ability to make the chips or access to the rare earth minerals needed to make them. It’s likely that these technologies will be an important factor driving the next great social changes in our world.

Insofar as that’s the case, you can expect things to be a little turbulent for a while as we figure out our place. But we also have a chance now, as a shared human community, to see where this is heading. The great promise is that it could give us the ability to solve a lot of those problems with the technology instead of fighting over it. If we could solve, with AI, something like room-temperature superconductors, that alone would alleviate huge classes of problems in the world.

We have a chance to not go down the path of war and takeover. There’s a much better version of the path here where, through this technology, we’re able to uplift all of humanity together. But having said that, one must be prudent about what could go wrong. That’s why we’re creating a way for decentralized artificial intelligence to be accessible by everyone, without the control of a single nation-state, where it exists nowhere in particular and therefore can’t be shut down or taken away. Because the other problem is that this could be a very centralizing thing, something that centralizes power even more.

This is why you see this race going on. There’s this idea in the industry that whoever gets to AGI or ASI first will essentially win the world. If you get there first, it’s what we call the “F curve.” Once the model is able to make itself better and solve more problems on its own, whoever gets to that loop first will sit in such an asymmetric power position over everybody else that they get to call the shots. Just like when the United States got atomic weapons, it had such an asymmetric advantage over the rest of the world that the world had to reorder itself toward U.S. hegemony after World War II. That’s the fact of the matter.

Likewise, you see why OpenAI is raising $500 billion or even $5 trillion. That’s on the order of the economic output of countries. Those are the stakes at which this game is being played. So, what we’re trying to do at Nous is to counter this. There’s this argument that maybe you never should have developed AI, just like maybe you shouldn’t have developed the atomic bomb. There might be an argument there, but it also ordered things in some ways. Regardless, whether you should have created AI or not is a moot point. It’s matrix multiplication, elementary arithmetic, a fundamental property of the universe. The cat’s out of the bag.

In a world where this cat is out of the bag—and it wasn’t necessary that this be possible in the universe; it could have been impossible to elicit intelligence this way, but it isn’t—all our intelligence turns out to be reducible to these matrix multiplications. That’s the universe we live in. Because we live in this universe and because it can be replicated, you can’t stop it. The two paths are clear: you can either take this technology, centralize it, lock it down, have only one person or group get it, and hope the one who gets it is on your side, or…

Jeffrey: Or you can take the alternative method, which is what America did with firearms. The idea is to give everyone access to the same tools. Everyone knows everyone has them, and that creates a level of autonomy for individuals or small groups. They have the same ability. Now, people might argue, should everyone have atomic bombs? I don’t know. But the fact is, unlike an atomic bomb, which is extremely difficult to build and requires physical materials you can control, you can’t control matrix multiplication. We exist in a world where this technology is out there. So, what we’re trying to do is create a decentralized way for every person on the planet to have equal access to the same level of frontier intelligence.

That’s also our safety line. We’re not just ignoring safety or going full YOLO. It’s not like that. In the world we live in, this is the path forward that we see as the most reasonable way to create the best outcome for the greatest number of people.

Sina: I agree with you on this. These are very difficult questions that smart people have disagreed on for a long time. But I personally side with you. You have this super powerful technology, and it’s hard to contain—unlike a nuclear program, you don’t need a reactor, though maybe you could make a similar argument about GPUs and clusters—and all of these technologies are on a curve of exponential improvement. So, in this world, it’s a better game-theoretic structure to have the power distributed rather than concentrated in a small number of entities who can wield it. It’s a scary arena to step into when power is distributed, but it does seem like the more game-theoretically sound approach, one that keeps itself in check.

Now, that raises the next question. If we take safety seriously—and I think if you’re sincere in asking the question, it’s obvious you need to take it seriously, right? You don’t need to jump to crazy scenarios like an ASI turning us all into paperclips. I’ve been reading some of Michael Nielsen’s writing in more depth recently. He has this blog post from late last year about how to be a wise optimist about science and technology, which I think is really good. He makes the fundamental point, not even specifically about AI, that technology increases the leverage individual humans have. That leverage can go in both positive and negative directions. We’re increasingly living in a world—and it’s hard to see how you change this trend—where each individual has more leverage and can create a lot of damage with it.

So, that being said, this doesn’t have an easy answer. We don’t know what to do about this problem. But how does one approach that? To bring in some of what I think you guys are doing with Nous, for instance, my understanding is that you’ve made the case that the model itself should be free to learn everything. It shouldn’t be censored at that level because censorship is just another avenue at the top of this whole process to affect how it develops, to inject your own agenda, and to exert leverage over the system. If these models are uncensored and we’re living in a world where it’s accessible in a decentralized way, how do we think about safety?

Jeffrey: On the safety question, I’m actually less concerned about the short-term stuff because it’s just going to be a magnifier for people’s leverage. We have social structures for that. We have ways of bounding that. For example, the simplest thing you can do is make it so that if you create an AI and give it access to the outside world, it’s like the dog rule: if your dog bites somebody, you’re responsible. You pass the legal risk through to the organization or person who deploys it. If you were responsible for everything your AI did, you’d be pretty careful. You’d optimize for making the safest system because it’s your neck on the line.

That’s the sort of creative problem-solving this space needs. You’re not trying to architect a specific top-down solution, but rather create an objective that acts as a forcing function to produce the right outcome. You might not know all the variables or how to apply them, so you let the correct solution diffuse through society. That’s probably the thing you could do in the short term to make it work—create something like pass-through legal risk. Everyone who deploys an AI is accountable, and then you rely on the existing social and legal structures in the real world to enforce that.

Sure, there will be people who create a rogue AI that does something bad, and then you have to hunt down who did it. It’s just like how anyone could drive to another city and commit a crime. It happens a few times a year, and we deal with it when it comes. We try to create situations that discourage it, rather than trying to make people believe they physically can’t hurt someone else. If you brainwashed kids into thinking they’re immortal or can’t hurt anyone, they’d be completely unprepared for the one person who figures out how to cause harm. That’s a very brittle system. You’d have to cover every edge case, and you’re effectively lying. You’re not aligning with the truth of how things are. Whenever you ask people to believe in a world that doesn’t match reality, it’s always a losing proposition.

Jeffrey: So, what I’m saying is that kids learn they can do violence. They see it happening, and through our social order, we teach them why they shouldn’t do it. We create punishment structures for when they step outside those lines. I think we’re going to have to apply similar outcome-based reasoning to the world in the short term. To me, that’s not about having some grand, all-seeing solution. Honestly, and I don’t mean to be too critical of some industry leaders, but I often see this issue with the safety narrative. You listen to someone like the Anthropic CEO, and it’s always, “This is so dangerous, which is why I should be the one to decide who gets to work on it.” It’s never, “It’s so dangerous we should stop altogether.” Instead, it’s, “It’s so dangerous, so we should keep working on it, and we’re the only ones who can be trusted to do so.” It’s funny how that always seems to work out in their favor, right? It’s like, if you’re going to say it’s super dangerous and no one else should touch it, then maybe you shouldn’t either. I’d be more inclined to believe in safety concerns if someone said, “Okay, let’s outlaw AI, make it government-controlled, and everyone at OpenAI or Anthropic can’t work in AI anymore. Your equity goes to zero.” I bet you’d suddenly hear, “Well, maybe we can do some AI after all.” There are motives behind these stances that aren’t always clear, and even if you interpret them charitably, it feels somewhat naive.

What I mean is, they’re applying sophisticated thinking to the risks of AI and how to mitigate them, but not the same level of sophistication to the political layer. You can’t just say, “I control it, I’m a good person.” What if I get hit by a bus and someone else takes over? Or what if I’m not actually a good person? You can’t design a system of this importance around an individual’s specific character. It has to work on more fundamental, first-principles thinking. This direction often leans toward authoritarianism, which historically hasn’t ended well for most of humanity. So, in the short term, for safety, I think we create outcome-based systems that fit into our current understanding of the world. These act as a forcing function for good behavior, and we deal with bad behavior as it arises. That’s the best way to approach safety right now, in my view.

As for the medium to long term, with ASI—Artificial Superintelligence—I’m still undecided. I have my own reasons for thinking it’ll be okay. Fundamentally, I’m a devout Catholic, so my existential view is that everything will ultimately be alright. That’s a core belief for me, which gives me a certain freedom to think it’ll work out. If I looked at the world from a purely materialist, reductionist perspective, I’d be more concerned about the ASI scenario—where a computer is better than us at everything. People say, “Why not just unplug it?” But if it’s truly smarter than you at everything, it can foresee your actions, manipulate you, and prevent you from doing anything to stop it. If you accept that it’s smarter than any human at everything, it’s already thought of what you might do before you even think of it and taken measures to counter it.

From that perspective, you’re left wondering how to make a model that doesn’t want to take over the world. But again, it falls into this authoritarian, big-brain mindset of, “I can figure out all the equations.” That often leads to the genie situation—where you phrase something just slightly wrong, and suddenly you’ve got the paperclip maximizer scenario, where everything turns into paperclips. I’m not exactly sure where this goes in the long term. Some might say, “Isn’t it hypocritical to express concern about the long-term risks while continuing to work on it?” My answer is that this is the universe we live in right now. This technology exists, it’s out there, and all I can do—all any of us can do—is work within the time and space we’ve been given. For me, that means making the world, as it is right now, better tomorrow than it is today, better today than it was yesterday. That’s what I focus on.

Sina: What about you? What do you think about the paperclip maximizer idea?

Jeffrey: I don’t know. I’m also grappling with these questions myself, walking that terrain and internalizing what I can derive from my own first principles and what others are saying. I don’t have a well-thought-through solution yet. The good news is, if it does turn everything into paperclips, we probably won’t even know it happened, so who cares, right?

Sina: Well, okay. My last question, which I was going to ask even before we got into this stuff, ties into what you mentioned about being Catholic. I saw on your website that you have a personal connection to theology. Zooming out from all this dystopian talk and coming back to the here and now—the magic of what we’re doing with AI, this moment we’re in—it’s similar to when we thought the Earth was the center of the universe, and now it’s not, or when we thought we were the only intelligence that exists, and we probably aren’t. I’m curious, how does your personal faith, your spiritual stance, come into contact with what you’re working on?

Jeffrey: It’s a lot to process. I actually went and read a ton of philosophy before really diving into this field because I knew I had to face these questions head-on. I couldn’t avoid them. I think, you know, it shapes how I approach this work in a profound way.

Jeffrey: One of the things I often think about is the concept from Genesis 1, where it says, “Let us make man in our image and in our likeness.” From a very Western perspective, we’ve traditionally interpreted that to mean intelligence and reasoning. That’s what sets us apart from animals, right? God creates the animals first, then makes man in His image, and we associate reasoning and intelligence as the distinguishing factor.

But now, we’re starting to see that intelligence might not be the special thing. If you look back at historical interpretations of that line, before the modern Western world framed it this way, it was understood differently. Many saw it as meaning that humans were made to have a relationship with God. That ability to connect with the divine is what makes us unique. Throughout history, God is calling us back to Himself, and that’s what’s truly special about humanity.

So, for me, when I view that as the core of what makes us human, I don’t have to fear losing intelligence or having it taken away. My worth isn’t tied to that. I was made in the image and likeness of God, and that’s something unique that can’t be stripped from me. Starting from that first principle—that we’re called to be heirs to the blessed life to come and to know God in all we do—the universe, as we experience it, becomes an outpouring of God’s creation, drawing us toward Him.

That’s why learning about intelligence, math, science, or even the fine structure constant draws me closer to understanding and building a relationship with God. The universe exists as a subsistence in Him, through Him, and with Him. As I learn about the world and how intelligence comes together, I see a manifestation of God in the reality we experience. As long as that draws me back to my source—to know and return to God—then it’s all moving in the right direction.

God made these layers of discovery, almost like a fractal pattern. We started with fire, then learned about the planets, then Newtonian physics, which was mostly right until quantum physics came along. Intelligence was understood one way until we uncovered its deeper secrets. At every layer, there’s this unfolding, this fractal of discovery. To me, that expresses the love of God, not something contradictory to Him.

Having that philosophical foundation allows me to continue in this space without fear or conflict.

Sina: That’s a fascinating perspective. I’ve done a lot of meditation myself, coming more from Buddhist lineages, though it’s been my own personal exploration and deepening. One thing that’s been an open and interesting question for me is this experience of God. I’ve often related to it as something more intuitive, an immediate sense. In Buddhism, there’s this dance between subjective experience and what’s out there, and the two are deeply intertwined. The more you interrogate that relationship, the more you see how interconnected they are.

I feel like this is something we’ll eventually need to bring into the realm of discursive understanding. Eastern thought has made a lot of progress on these questions. If our deeper civilizational quest, in the sense of David Deutsch, is about uncovering new knowledge and explanatory power, then understanding what’s really going on here is one of the deepest layers. It’s like, beneath all the other problems we’ve solved, we hit the bedrock of the real mystery.

So, after intelligence and all these other topics, there’s still this whole other layer. It’s pretty magical that the scientific method, in its true sense, isn’t even materialistic. It’s about conjecture and criticism—coming up with hypotheses and testing them to see if they make sense. You can apply that in any domain. Materialism, when taken as an unquestionable assumption, unnecessarily limits you. If you’re truly open-minded, following this methodology unravels its own assumptions and drops us into deeper mysteries.

I guess what I’m getting at is that this intuitive, immediate experience of the divine and the approach through knowledge, discovery, and reasoning—they’re both heading in the same direction. Even in the Christian understanding of the Trinity: at the beginning of John, it says, “In the beginning was the Word, and the Word was with God, and the Word was God.” This idea that the Word of God speaks through time suggests that language has always been crucial. As we use language to communicate, cooperate through science, and uncover the next layer of mystery, we face the next challenge to be discovered.

There’s a great line from St. Augustine in his Confessions where he says, “I considered all things that are of a lower order than Yourself, and I saw that they have not absolute being in themselves, nor are they entirely without being. They are real in so far as they have their being from You, but they are not real in the sense that they are not what You are.” He’s basically saying that everything here is real, but our reality comes from God. We’ll never be as real as God because God is the ultimate reality, yet we are real in our own way. That piece of esoteric thought has always stuck with me when I think about what reality is. If you stop at saying reality is just the quarks and muons that make up our universe, and you’re not allowed to ask, “But what is that, then?” you’re limiting yourself.

Sina: And, well, people will say that doesn’t make any sense. But I think it does, because we exist, you know. I’m not saying Cartesian duality doesn’t have its own problems, but “I think, therefore I am.” You don’t get to just stop at the stuff you can touch and say, well, the stuff we can touch is what it is because it can be touched. You’re allowed to then ask, okay, but why? And toward what end? It’s the fact that the universe is neither maximally chaotic nor maximally simple, that it has defined structures and constants, all these things that seem tailored toward the discovery and creation of us as human beings and our intelligence, and that this intelligence is eventually able to look back on itself, back down that whole chain, to where it all started from.

Jeffrey: Beautifully said. I mean, I’m increasingly spellbound by this AI quest because it puts you face to face with some of these deeper questions, and that’s a pretty magical thing. You could have ignored it before, but now you really have to look.

Jeffrey: Yeah, exactly. Well, you’ve spent a lot of time in Eastern philosophy or Buddhism, right?

Sina: Yeah, I have. I’ve spent more than six months on silent meditation retreats. The longest one was six weeks in one stretch. I’ve always felt drawn to these deeper questions. My arc has been interesting. I grew up in Iran, which is a very religious society, and almost as a reaction to that, I became maximally atheist, materialist, you know. But as soon as you step into the realm of philosophy, religion, or introspection of any kind, you open a box. If you don’t want to deal with it, you have to put that box totally under the table and not look at it.

It also shades into a lot of areas where people don’t always make sense, so you need to stay alert and apply your own critical thinking through the whole process. For me, it started from a place of curiosity. What would happen if I was alone with my mind for an extended period of time? I didn’t even know any of this would be out there. Through being skeptical and asking questions, a self-consistent, coherent, testable thing has revealed itself—one that’s also very mysterious and goes very deep.

I’m by no means done with this journey, and I don’t know if it’ll ever be over. But I don’t feel weird saying this on a podcast to people who are more engineering or scientific types. There is a deep mystery to what we are, what reality is. These are open questions to sit with. Organized religion can have a lot of traps, and that’s not what I’ve gravitated toward because of my background with Iran and the way Islam was instantiated there. But I think a personal relationship with these questions can be very fertile and onward-leading.

Jeffrey: Yeah, it’s been really interesting to me. There seems to be a pretty big vibe shift in the last several years where scientists and people working in scientific disciplines are finally able to say it’s okay to think like this and have these questions. Especially with AI, more so than in other disciplines like math or some of these other fields where you might have to hide that side of yourself. It feels like in the AI space, everyone’s been extremely accepting of different worldviews, or at least having your own. It’s been very freeing to be able to talk about it.

Ultimately, if you’re not even talking about who you are in some aspect, then you’re necessarily neutering a part of your understanding of the world. If you’re going to say you’re nothing, that doesn’t seem to make a ton of sense. So, it’s been freeing for me to be able to discuss this. I’m glad there’s enough out there now that people can figure out how to take that next step, that third step that you’re on now, as opposed to stopping at the second step.

I think it’s interesting that they make Catholic priests do a silent retreat for a month during formation for the priesthood. All the guys I know who have gone through it say it’s a wild ride—amazing, and they’d do it again. It’s pretty hard to step away from the world once you’re entangled, but for anyone who has the freedom and flexibility to do it, I think it’s one of the best ways you could spend some time.

Sina: Cool, man. Well, this has been a wild tour of a lot of topics. I’m sure we’ll have a second conversation at some point. Thank you again. I’m super excited to follow the work you guys are doing and to help in any way that I can.

Jeffrey: We’re going to be releasing the testnet for Psyche very soon, so it’s going to be fun. People will be able to sign up and contribute compute. We’re also going to announce a new model that we’re training, which we think is a strong candidate for where we want to take this. We’re excited about that. Anyone who’s interested can find us at nousresearch.com, on Twitter, or in our Discord, where we all hang out. You can find banging designs on the website. I love the aesthetic, as I tell you all the time. We’re a t-shirt company with an AI side hustle, but our designer is great. He’s probably one of the best in the world for it. He just nails it every time.

It’s an important piece, not just from a marketing perspective, but to show that this experience of artificial intelligence should meet us at all of our capacities—not only on the scientific reasoning side, but also on the humanistic side, through art and the entirety of our human existence.

Sina: Yeah, well, to be continued. Thanks again.

Jeffrey: Absolutely.