Designing Safer, More Transparent AI Systems With Dr. Craig Kaplan

John Corcoran: 10:54

Yeah, yeah, I’ve got four kids, so I talk to new parents and say, look, it’s coming whether you want it to or not. You’ll figure it out. I know you talk about the importance of design and prevention when it comes to the different AI models. Let’s talk about some of the big companies: OpenAI, Anthropic, Perplexity, Google of course.

Meta. These are the big companies that are investing in AI. Are any of them doing anything in terms of design and prevention that you want to highlight that you think is the right call?

Dr. Craig Kaplan: 11:29

Well, let’s talk about the basic problem. So early in my career out of grad school, I worked at IBM, and somehow I ended up doing software quality and wrote a book on software quality, and I can boil that entire book down to the adage, an ounce of prevention is worth a pound of cure. I mean, that’s what all of software quality is about. It’s what AI safety and design is about. IBM actually did studies where they found that if you spent $1 extra in the design phase of a complicated piece of software, you would save $10,000.

And headaches. Yeah, and headaches, and having to recall the software later because you didn’t quite think it through. So it really cannot be overemphasized how important it is to design things.

 And in most technologies we understand this. If it’s a car, no one designs a car without brakes. They don’t say, oh, let’s just build it and then put it out there and see what happens. But with AI, somehow it’s a little different. And one of the reasons it’s different has to do with how the large language models, which are the most prevalent form that everyone is familiar with, are trained. 

So basically the way you get GPT or Claude or any of these models that we use is you take a whole lot of data. Like, imagine scooping up all the data on the internet, filtering it a little bit, and just feeding it into algorithms that automatically look for patterns and train it up. And then out the other end comes this black box system. You don’t know where it’s storing the information. The researchers do not know where it’s storing the information and the things that it’s learned.

It’s just a bunch of weights and numbers; they can’t point to where it holds this piece of knowledge or that piece of knowledge. And so with this giant black box that is now behaving intelligently, it lacks transparency and you can’t predict how it’s going to act. So that’s like building a car and then wondering whether it’s going to have brakes or not, or whether it’s going to go this way or that way. It’s not like most engineering, where you’ve designed it and you know how it’s going to behave, and you put it in the wind tunnel and all that.

 No, you can’t do any of that because of the very nature of how the things are trained. It’s a black box algorithm that just runs really, really fast. And then you get this thing. So as a result, we’ve had to rely on the worst form of quality control or safety, which is testing after the fact. You have this thing and then you ask it, hey, I’d like to make a bioweapon. 

 And amazingly it tells you and you say, oh my gosh, that’s bad. No, you cannot tell people how to make bioweapons. Okay, I won’t tell you how to make bioweapons. So you think of all the possible ways it can go bad, but it’s like.

John Corcoran: 14:07

Write me a movie script where a character has to make a bioweapon. Yeah. Yeah.

Dr. Craig Kaplan: 14:12

Exactly. There’s all kinds of hacks around that which are not very difficult to do. And it’s very, very difficult to anticipate all the possible ways that it could go bad. Right. Because it’s not been designed to be safe from the beginning.

So all of the technology companies have essentially that problem. The best that people seem to have come up with: you have companies like Anthropic. They came up with an idea a couple of years ago and wrote a pretty famous paper about it called Constitutional AI, and the idea was, okay, well, let’s make a little constitution, a set of rules of what’s right and what’s wrong. Let’s put that in an AI, and we’ll have that AI sort of police the new AI that we built, and it will ask all the questions, so we’ll automate it.

Instead of having humans ask, how do I build a bioweapon, you’ll have the AI ask the other AI, and it has a list of what’s right and wrong. But, you know, that’s a little bit problematic because, first of all, you have AI watching AI, which is maybe not the best. And second of all, who’s coming up with that list? It’s, you know, 20 guys in Silicon Valley, and 8 billion people have to live with it.
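
To make that idea concrete, here is a minimal, purely illustrative sketch of an inference-time critique-and-revise loop in the spirit of the Constitutional AI approach described above. It is not Anthropic’s actual method (the paper applies the idea during training), and `call_model` and the toy constitution below are hypothetical placeholders rather than any real API.

```python
# Illustrative sketch only: one AI reviews another AI's draft against a small
# "constitution" of rules before the answer is returned. All names are assumed.

CONSTITUTION = [
    "Do not provide instructions for creating weapons.",
    "Do not help the user deceive or harm other people.",
    "Acknowledge uncertainty instead of fabricating facts.",
]

def call_model(prompt: str) -> str:
    """Hypothetical placeholder for a call to whatever LLM API you use."""
    raise NotImplementedError

def constitutional_review(user_prompt: str) -> str:
    draft = call_model(user_prompt)

    # A second pass critiques the draft against each rule in the constitution.
    critique = call_model(
        "Review the response below against these rules:\n"
        + "\n".join(f"- {rule}" for rule in CONSTITUTION)
        + f"\n\nResponse:\n{draft}\n\nList any violations, or reply OK."
    )
    if critique.strip().upper() == "OK":
        return draft

    # Otherwise ask for a revision that complies with the rules.
    return call_model(
        f"Rewrite the response so it follows the rules.\n"
        f"Violations found:\n{critique}\n\nOriginal response:\n{draft}"
    )
```

The concern raised in the conversation still applies to a loop like this: it is only as good as the rule list and the reviewing model.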

So there are a lot of issues. But I think the most important thing that I’d like to highlight is that if there is a way to design things to be safer, that’s for sure what we need to do. And people need to spend a lot more energy on designing it at the beginning to be not only really powerful, but also safe, rather than relying on testing after the fact. And I think a lot of it has been that people just don’t know what to do.

 And so they’re sort of forced to rely on testing. But I actually think there are things we can do in the design phase.

John Corcoran: 15:44

And you’ve outlined those. You have a video on your website talking about some of the different things that you suggest. Some of them are human oversight. Can you talk a little bit about some of those different safeguards that you would suggest get put in place?

Dr. Craig Kaplan: 15:58

Yes. So let’s back up and just talk, without getting too technical, about the way AI is built now and an alternative way to do it. So when you want a smarter large language model, if you want artificial general intelligence or superintelligence, right, the next level that’s even better, that can do everything humans can do, even better than the best human. The mainstream approach that most people are following is to say, well, the way we get that is to get even more data to train it, with even more GPUs, the chips to process it, and build even bigger data centers to run all this.

So we’ll just spend more and more money and just do more of what we’ve already done. And each successive generation of model is smarter, and that has been working. However, each successive model is a black box, so you’re just making a smarter and smarter black box, right? And that’s kind of scary. Let’s say at some point you succeed in that.

 You’ve made a really smart AI that you have no idea what it’s going to do, and it’s completely unpredictable. And now you’ve made it way smarter than you. So that’s the mainstream. That’s the path we’re on. Okay, here’s an alternative approach. 

And this comes from decades of work in a narrow field called collective intelligence. People sometimes are familiar with crowdsourcing or the wisdom of crowds. So at my previous company, we spent 14 years basically crowdsourcing knowledge about the stock market from everyday people like you and me who are not Wall Street experts. And we were able to use that little bit of knowledge that we got from millions of people in real time to beat the very best guys on Wall Street. It’s amazing.

 Everybody said it was impossible at the beginning, and we actually built a system that did it. That was my previous company. So I know it’s possible to take a collective intelligence approach, a wisdom of the crowd approach, a community approach to get really high levels of intelligence. You don’t have to build a super duper powerful black box. You could take the large language models we have right now, plus the humans we have right now. 

And if you hook them up together, if you coordinate them properly, the collective brainpower of, let’s say, a million AI agents and a million humans will be better than the super duper black box. So not only is it easier and faster and more profitable and more powerful and more intelligent, it’s safer. Why? Because you get to see how each little agent interacts with the others.

 So just like in society, your brain is a black box. My brain is a black box. I don’t know what you’re thinking. You may be thinking bad things or good things I can’t tell. But as soon as you say something or take an action. 

 Now I know what you were thinking. It becomes transparent. And we have rules in society that sort of regulate the behavior of the intelligent agents. And that’s why we’re not all killing each other, right? Because there’s these rules and because the behavior becomes visible. 

And so you can have the same thing with AIs. It’s a different approach to getting these very powerful AIs: as each one contributes to the solution, you can see what it’s saying. It sets a goal, and you can ask, is that an ethical goal or not?

You can kind of catch it in the process before it actually takes the action. And you can also have checks and balances, just like in a democracy where you have lots of people and they keep each other in check, as opposed to a dictatorship where the power is concentrated. We don’t want dictatorial AI, we want democratic AI, and the collective intelligence approach to AI will be way safer. And amazingly, it’ll be faster and more profitable, which is really important because of all these forces of capitalism that are causing people to just try to get there faster than the other company and the other country. So if it turns out that the better, faster path is also the safest, I think then we’re in good shape.

 Then we have a hope of actually doing something. Whereas just saying, let’s stop all development, that I think that would be very safe, but I don’t think it’s very realistic.
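
As a rough illustration of the transparency and checks-and-balances idea described above, here is a toy sketch, not Dr. Kaplan’s actual design: every agent (AI or human) must publish its proposed goal and action to a shared log and win approval from a quorum of reviewers before acting. The class names, reviewers, and threshold are all assumptions made for the example.

```python
# Toy sketch: proposals from agents become visible before execution, and a
# mix of automated and human reviewers can block them. Names are illustrative.

from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Proposal:
    author: str
    goal: str
    action: str
    approvals: int = 0
    rejections: int = 0

@dataclass
class Community:
    reviewers: List[Callable[[Proposal], bool]]  # each reviewer returns True to approve
    quorum: float = 0.5                          # fraction of reviewers needed to approve
    audit_log: List[str] = field(default_factory=list)

    def submit(self, proposal: Proposal) -> bool:
        # The goal and intended action are visible before anything is executed.
        self.audit_log.append(
            f"PROPOSED by {proposal.author}: goal={proposal.goal!r} action={proposal.action!r}"
        )
        for review in self.reviewers:
            if review(proposal):
                proposal.approvals += 1
            else:
                proposal.rejections += 1
        approved = proposal.approvals > len(self.reviewers) * self.quorum
        self.audit_log.append(("APPROVED: " if approved else "BLOCKED: ") + proposal.action)
        return approved

# Example reviewers: one automated rule check and one human sign-off hook (a stub here).
def rule_check(p: Proposal) -> bool:
    banned = ("bioweapon", "malware")
    return not any(word in p.action.lower() for word in banned)

def human_review(p: Proposal) -> bool:
    return True  # in a real system this would route the proposal to a person

community = Community(reviewers=[rule_check, human_review])
ok = community.submit(Proposal(author="agent-42", goal="summarize research", action="draft a summary"))
```

The point of the sketch is only that behavior becomes visible and reviewable before it happens, which is the "brakes designed in from the beginning" idea, rather than testing a finished black box after the fact.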

John Corcoran: 20:04

I’m curious. So without getting too technical here, you know, many are saying that these AI models are exceeding our human intelligence now, or that they will be in the near future. OpenAI just released a new report a couple of days ago that said that on certain benchmarks of human tasks that people do in the real world, the AIs are outperforming them by 100x, like 100 times faster, 100 times more accurate. So if that’s the case, then what role do the humans play in that argument that you just made? You said we need to have humans involved.

Well, why do we even need humans just to be devil’s advocate here?

Dr. Craig Kaplan: 20:48

Yeah. So I think there is a path, there’s a way to develop AI leaving humans out of the loop, so to speak, and they will become faster and faster, and they will get to the point, they’re already at the point, where they can write their own code, so they can rewrite their code. So you can imagine a scenario where, and this is not me, by the way, this goes all the way back to Alan Turing in the 1940s, before the field of AI was even named. I mean, he could envision this.

The guy was really smart. He said, well, you know, you could have them and they’ll be self-improving, and away it goes. And that’s what sometimes people call the singularity. So that is a danger.

But it’s not necessary. And there are advantages to having humans involved. Now, the way I look at it is we have a limited time window to develop this very super smart form of AI. You can call it AGI, or very quickly it will become superintelligent. And during that window, if we are smart about it, we will include humans in the loop.

And I’ll tell you a couple of reasons why it makes sense to do that. From a systems perspective, from the standpoint of having a better AI, it makes sense because, as everyone knows, some of these large language models right now make things up. They hallucinate, they give you wrong answers. They try to please.

And also you can give them problems that they’ve never seen before. If the problem is well documented on the internet, they’re going to be pretty good at it, because that’s what they were trained on. If you gave them a brand-new problem that required creativity, in a new area where there’s very little data, they’re going to struggle more with that.

So if you had a system that included AI agents but also human agents, and a problem came to that system, came to that network, and said, okay, let’s solve this difficult problem: if it’s one the AI already knew how to solve, it could solve it very quickly. If it’s a brand new one, it might need help from the humans. Okay, so the humans have a role to make the system more effective. That’s one argument for having them in.
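
Here is a minimal sketch of that routing idea, under the assumption that an AI agent can report some confidence score for its answer: routine, well-documented problems are answered directly, while novel or low-confidence problems are escalated to human experts. Both functions below are hypothetical stubs, not a real system.

```python
# Sketch only: route a problem to AI first, escalate to humans when the AI is
# unsure. The confidence score and both calls are assumed placeholders.

from typing import Tuple

def ai_attempt(problem: str) -> Tuple[str, float]:
    """Hypothetical AI agent call returning (answer, self-reported confidence in [0, 1])."""
    raise NotImplementedError

def ask_human_expert(problem: str, draft: str) -> str:
    """Stub: in practice this would queue the problem for a human with relevant expertise."""
    raise NotImplementedError

def solve(problem: str, confidence_threshold: float = 0.8) -> str:
    draft, confidence = ai_attempt(problem)
    if confidence >= confidence_threshold:
        return draft                         # routine, well-documented problem
    return ask_human_expert(problem, draft)  # novel problem: human judgment needed
```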

Another one is that the longer that humans and AI are in contact and working together, the greater the chance that those AIs, which at some point will probably become smarter than all of us, can absorb not only the expertise of the humans, but also the values, the ethics, the value system. And to me, that’s very important. In fact, the vision that I have of a collective intelligence of many AIs is you would train your AI agent, you would personalize it, and it would have John’s expertise about podcasting and business and everything that you are in the top one-tenth of 1% of the population at, as well as your ethics and your values and what it means to be a good human being and a good father and a good friend. And it would have all those characteristics. And if you multiply that across millions of AIs, each one customized by different people, all of a sudden you have something that looks kind of representative of a wide swath of human beings, which is good because it should represent all of us.

 And it’s good because there’s going to be a lot of human values in the system. And yes, at some point it could override those, but why would it? I mean, it’s sort of like children. They absorb from you what is modeled in the beginning. And if they have a horrible upbringing, they could be an axe murderer. 

 But if we do a good job as parents, hopefully not.

John Corcoran: 24:11

Right, right. I’m curious. You also said that it would be more profitable. So explain that. Why do you think that this approach would be more profitable?

Dr. Craig Kaplan: 24:19

So profit really has to do with a couple of things. It has to do with how powerful the AI is, right? The more powerful and intelligent it is, the more money you’re going to make; everybody wants that. And it has to do with how quickly you can bring it to market, right? So if you’re first to market, you’re going to do better than a competitor trying to catch up.

Well, this approach is more intelligent because of something kind of simple: many minds are better than one. If you had the smartest AI in the world, okay, that’s good.

 But what if you had a community of a million of those smart copies of that smartest AI, plus a million humans? That combination, unless you really mess up the way that they’re coordinated, is going to be better than the one. Right? So the group is going to always be smarter than the one, unless you.

John Corcoran: 25:03

Solve more problems, therefore greater utility, therefore more valuable, therefore more profitable.

Dr. Craig Kaplan: 25:08

Yes. So that’s the value piece. And then the speed piece is: which is faster, to train a huge new model that requires building new data centers and nuclear power and everything, which is going to take, you know, 2 to 5 years to build out, or to just take the pieces you have right now and plug them together in a smarter way? I mean, it’s always going to be faster to take your existing components, right? So you’re going to be faster and you’re going to be more powerful, and that’s going to be more profitable.

John Corcoran: 25:33

Yeah. I’m also curious, for you, you know, you decided you’re going to devote the next few years of your life focused on this, and it’s really like 5 to 7 companies that are at the core of building out these AIs. And you could take different approaches to that. It’s kind of like the Chicken Little problem, right? Do you say, like, the sky is falling?

Do you tell everyone that? Do you stand outside of the buildings for these companies with a sign? Do you try and get meetings with Sam Altman? You know, talk a little bit about why you took the approach that you did, because mostly what you’re doing is you’re going on podcasts, you’re speaking at conferences, and you’re also trying to influence the minds of the researchers, who are kind of on the ground floor, so that hopefully it bubbles up in the design of the system. So talk about that approach.

Dr. Craig Kaplan: 26:24

Yes. So it’s very fascinating. It’s been fascinating for me because I talk to different folks. In fact, yesterday I was talking to a journalist for the London Times, and he was basically saying, this is reckless, this is crazy. We have to stop it.

And I said, well, I’m with you. But, you know, I just don’t think it’s going to stop. And so he was on the end of stop, stop, stop. And then I talk to other people. At a lot of the conferences in Silicon Valley, there’s lip service at best to AI safety.

 And almost everybody’s like, how do we go faster? Right. So there’s this kind of weird schizophrenia that’s going on in our society right now that’s sort of underneath the surface. But my view is I don’t think it can realistically be slowed down, regulated or stopped. You can have speed bumps and you can make efforts. 

But because of the competitive dynamics, and maybe I’ve spent too much time with venture capitalists and Wall Street and been contaminated, but I just see those powerful forces. And so I think really, the way.

John Corcoran: 27:25

You do it, you’d definitely be going against the wind if you did that, if you tried to stop it. Yeah.

Dr. Craig Kaplan: 27:31

Somehow there’s this perception in Silicon Valley that to be safe means to go slow or to stop, and that’s wrong. That’s not necessary. That could be one way of being safe, but that’s not the way that anyone’s going to do it. What we need to give them is a different way to be safe. To be safe is to go faster, but with a smarter design that, from the very beginning, has elements that make it an inherently safer thing.

So that’s the approach that I try to take. And with those companies that you mentioned, the Metas and the Googles and the Anthropics of the world, yes, I’m trying to influence them in whatever way I can. And there are multiple paths. One way is to just raise awareness.

If all of us, just as human beings, are aware, wow, there’s danger, we will put pressure on those companies and say we demand safer designs, like, guys, this isn’t okay. When I speak, I’ve done keynotes at AI conferences to AI researchers, the guys whose lives are spent building AI. And at the beginning of the talk, I say, you know, how many of you think AI will wipe us out?

What’s the probability? In AI circles that’s known as p(doom), the probability of doom, that AI will kill all humans. And I ask, you know, how many think it’s 90%? And a couple of hands go up. By the time I get to 50%, half the hands are up. At 20%,

Almost every hand is up. So I’m thinking, wow, you guys are in the field. You build AI, and at the same time you think there’s a 1 in 5 chance it kills you, your kids, your friends, all humans. And that level of cognitive dissonance just blows me away. But I understand why they do it, because that’s their job and they’re working.

They’re trying to do their best and to make it as safe as they can. So we need to get the word out to the average person that, hey, there are some real dangers here. And some folks like Geoffrey Hinton, the Nobel Prize winner, are doing a great job; he’s using that to help get a platform to talk about it. But we also need to get the word out to the AI researchers, to say, hey, we don’t have to design it as a giant black box.

We could design it with checks and balances, and that might be better. You can even go faster. And then also to the heads of those technology companies, to say, look, guys, you can make money, you can beat the competition, you can get greater ROI for your investors. Everybody wins. But you’ve got to look at it a little smarter.

John Corcoran: 29:48

And it only takes one incident to torpedo a company. We’ve seen this before. This happened with Cruise, which was a competitor to Waymo and had an incident that wasn’t even their fault. These were self-driving cars in San Francisco, and they were all over the place. GM owned it; GM hadn’t started the company, but had acquired it. And they had an incident with one person, where an accident had happened on the street involving a human driver, not even involving the self-driving car, and someone, I think on their bicycle, was thrown and then somehow got lodged underneath the self-driving car, which then dragged them about 15 feet or so.

Or maybe it was further, before stopping. And that caused such a major controversy that GM ended up shuttering the entire initiative. So we’ve seen this happen before. And actually, recently there was a suicide where someone had been using AI as a kind of companion, talking to the AI, and that drew a lot of attention. So certainly I think there may be more incidents like this, which, you know, could very well take down one of these companies if it’s controversial enough, if there’s enough of a backlash against it.

Dr. Craig Kaplan: 30:56

Yeah, I think you’re right. And that is a thought that I think many in the AI safety field share: that unfortunately it’s going to take something that’s a big event, like a Chernobyl, some kind of nuclear meltdown or something where things really get out of control, to wake people up. And I hope it doesn’t require that. But actually, if something like that happens before the risk of wiping everybody out happens, that could be good in a strange way. But better would be to get ahead of it.

An ounce of prevention, right? It’s just so much better.

John Corcoran: 31:31

And another thing you’ve done over the last few years is you’ve developed, invented, a number of superintelligence technologies that work to maximize AI safety and reduce that p(doom), and you’ve made them free and available on your website. Have you seen anything that has demonstrated to you that any of these companies have adopted them?

Dr. Craig Kaplan: 31:52

It’s tough to know, because there’s kind of what’s publicly released, and then there’s stuff that they’re working on, and what they’re working on is always a year or two ahead of what you’re seeing. I’m hopeful, in that I do see things progressing. I don’t think it’s necessarily because of me. I think it’s because it’s the logical progression.

And I should point out that even with a 20% chance of p(doom), that AI wipes us out, that still means an 80% chance that it’s great, right? So, I mean, I don’t want to lose sight of that. It’s just that 20% seems like too big of a risk. I’d like that risk to be smaller. So I focus on the downside, because the upside is going to take care of itself. What I have seen, and I think is apparent to most people in the industry, is there’s been an evolution from narrow AI systems that can play chess really well, to systems that are more competent, large language models that can talk about anything, to then giving those systems autonomy so they can do more than just talk to you.

They can now write an email for you. They can go out and schedule things for you. They can search on your behalf. So that’s giving them autonomy. And that’s kind of what people refer to as AI agents.

And then the next logical step, and it seems inevitable to me, is once everybody’s using AI agents, and you can see that there’s a huge increase in adoption, the next logical step is, well, I’ve got one, why not ten? How much more efficient could I be with 20 agents? And if I have 20, I’d better have a way for them to communicate. So you’re going to have a community of AI agents.

Once you’re at a community of AI agents, you’re like two thirds of the way to what I’m talking about, right? So then the next thing is, well, how do we come up with the safe ways for them to interact, and the transparent ways, and the most effective ways? So I think humanity and the field of AI are moving down this path, which is good and encouraging and causes me to be optimistic. I just want to try to give them a little further nudge if I can.

John Corcoran: 33:51

Yeah, yeah, yeah. Well, Craig, this has been a super interesting conversation. I want to wrap up. I always ask the same question at the end of my interviews, which is, I’m a big fan of gratitude and expressing gratitude to those in your life, who may have been a peer, a contemporary, someone else who’s out there like you waging this important mission, or maybe not.

But is there anyone out there that you’d want to acknowledge and thank for helping you in your journey, your career so far?

Dr. Craig Kaplan: 34:22

Yeah, there are so many people. I’ve been really lucky in that I’ve had a series of mentors. I think mentors are really helpful in business, but also in all aspects of life. And the one that really is most relevant to our conversation is my old advisor, Herbert Simon. He won the Nobel Prize in 1978.

He’s passed away, unfortunately. But he really, you know, every week he would meet with me for four years. I mean, the generosity of that is incredible. And the guy was brilliant, and he really was a Renaissance man and thought about all kinds of different things. And amazingly, even though he’s passed away, he was one of the pioneers of AI. 

Some of those ideas that he had way back, I think, actually are going to help all of us today. For example, he said, "Reason is wholly instrumental. It can’t tell you where to go; at best, it can tell you how to get there." So it’s the idea that there’s no logical way to derive values.

 That’s very important for AI. It means the values have to come from us. And you asked earlier and I’ll just finish on this. You know, what’s the role for humans in the future? I think if AI becomes so intelligent that it becomes this super brain, way smarter than us, in some sense, the role of humans is to be the heart. 

 We are the source of the values. I think that’s what humans can bring to the table in the long run. And that goes all the way back to Herb Simon. You know, and I really appreciate his mentorship. So he’s somebody I remember very fondly and I’m very grateful for.

John Corcoran: 35:52

That’s great. Dr. Craig Kaplan, superintelligence.com is the website. Where can people go to learn more and connect with you, and maybe book you for their conference if they’re looking for a speaker?

Dr. Craig Kaplan: 36:03

I think superintelligence.com is a great place to start. There are links to videos. We have a whole series of three-minute videos that cover everything from the history of AI to different topics. So those are a good place to get more information. We have white papers, and if you’re technical, there are patents you can see.

So yeah, I would encourage people to go there. And if you’re in the AI field, try to design things to be safe. And if you’re a member of the general public, every action that you take is being recorded, so try to put your best foot forward, because that’s all going to be used to train AIs. I’m serious about that.

It does matter.

John Corcoran: 36:37

Craig, thanks so much.

Dr. Craig Kaplan: 36:38

Okay. Thank you, John.

Outro: 36:42

Thanks for listening to the Smart Business Revolution Podcast. We’ll see you again next time and be sure to click subscribe to get future episodes.