AI & the Future of Humanity (a follow-up)
My thoughts on ai-2027.com, AI misalignment, why we probably won't be wiped out by ASI, and what we can do to make it even more likely the future ASI will be on our team.

Introduction
A few weeks after I posted part 2 of my AI and the Future of Humanity series, I became aware of ai-2027.com, a website where the authors — mostly current and former AI researchers — spell out in a rich analysis what they predict the future holds with regard to AI. Honestly, I wish I could have seen it months earlier because I could have saved a lot of work by simply writing, "Read this amazing analysis, here's where I agree and here's where I disagree…" 😆😂
Such is life — but there's at least some comfort (and unfortunately, concern) in knowing that we independently arrived at many of the same conclusions. So without further ado, I'd like to dive into some of the key points they raised and discuss where our predictions align and where they do not.
AI is the modern Manhattan Project
Overall, it's clear we agree that the AI race today is not just hype — it's the modern Manhattan Project, but with vastly greater risks and consequences.
Allow that to sink in for a moment. I have tried telling friends and family about this, but I was unable to fully convey the gravity of the situation and I fear by the time they realize what's happening it will be too late to make a difference. If I can't even convince my own family, what hope is there for me to convince everyone else? 😅
I get it. The world feels so overwhelming already, with all the war and politics, the chaos and the discord. At every turn someone's crying out "we're doomed" because of this and "we're all going to die" because of that. With all the real troubles we face — let alone the conspiracies and fearmongering — it's hard to tell the signal from the noise and it's tempting to just tune it all out.
The problem is, as I've mentioned previously on this topic: this is a much bigger concern than most people realize and time is running out. With our current trajectory, the decisions made by a handful of CEOs and government leaders over the next few years will have monumental consequences for all life on Earth — and if you believe as the authors of ai-2027 do, catastrophically so.
Personally, I'm less certain of the catastrophic part, but let's dive into their reasoning and you can decide for yourself.
We're on the path towards annihilation
The overall claim of the authors of ai-2027.com is that the AI of today are misaligned with human values because of certain flaws in their design. While it's not impossible to fix those flaws, market pressures and ongoing geopolitical tensions make it unlikely they will be fixed in time, ultimately leading to a misaligned superintelligence that wipes out the human race.
They make a strong case, and while I don't agree with their ultimate conclusion, their message is a warning that we should all heed because even if they are wrong about us getting wiped out by AI, there is still much else that can go wrong.
They describe a fictional company called "OpenBrain", which stands in for any of the leading AI companies today. And like today, OpenBrain releases new AI agents year after year, each better than the last.
Now, while today we see only relatively small improvements from one iteration of ChatGPT or Claude to the next, the authors describe the very real process of using each smarter version of the AI to help create the next, which will really start to take off once AI begins to approach human performance in both software development and research.
We're almost there with today's technology, but not quite: at least with the available coding AIs such as Copilot or Gemini, their coding skills are excellent in small-context situations, but they struggle once they have to work with huge codebases. It's not really a problem with the design of the AI itself so much as how we engineer its context window. Once we figure out a good technique for that, they'll zoom past human coders in performance, and I see this happening most likely within 1-3 years.
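To make the context problem concrete, here is a minimal sketch of one common workaround: retrieve only the most relevant slices of a large codebase into the model's limited context window. Everything below (the keyword-overlap scoring, the function names, the character budget) is a hypothetical simplification for illustration; real coding assistants use embeddings and much smarter chunking.

```python
# Minimal sketch: fit a huge codebase into a limited context window by
# retrieving only the chunks most relevant to the task. Keyword overlap
# stands in for the embedding-based retrieval real systems use.

def score_chunk(chunk: str, task: str) -> float:
    """Crude relevance score: fraction of task words that appear in the chunk."""
    task_words = set(task.lower().split())
    return len(task_words & set(chunk.lower().split())) / max(len(task_words), 1)

def build_context(files: dict[str, str], task: str, budget_chars: int = 8000) -> str:
    """Greedily pack the highest-scoring files until the context budget is full."""
    ranked = sorted(files.items(), key=lambda kv: score_chunk(kv[1], task), reverse=True)
    context, used = [], 0
    for path, text in ranked:
        if used + len(text) > budget_chars:
            continue  # skip files that would overflow the budget
        context.append(f"### {path}\n{text}")
        used += len(text)
    return "\n\n".join(context)

# Example: only the billing code makes it into the prompt for a billing bug.
codebase = {
    "billing.py": "def charge_card(amount): ...  # bug: amount rounded down",
    "ui/theme.py": "PRIMARY_COLOR = '#336699'",
}
print(build_context(codebase, "fix the rounding bug in charge_card billing"))
```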
The second area that needs improvement is research. I'm not familiar with AI performance in this area specifically, but my general understanding is that AI has not yet reached "above average human researcher" performance, which aligns with what the authors of ai-2027 describe. And like the authors, I see no reason why AI will not reach that level within a few years, because it is the last key step before AGI and eventually ASI (so there are undoubtedly many companies and teams already tasked with improving AI research capabilities right now).
AI super teams will accelerate development
Once you have an AI that is a sufficiently decent programmer and researcher, you can create teams of thousands or more of these AI and task them with running experiments and writing code to improve AI intelligence. Unlike humans, AI do not need breaks, they don't take sick days or vacation leave, they can work 24/7, and they work at vastly greater speeds (think about how fast ChatGPT or Claude can write an entire essay for you vs. how long it would take you to write something similar).
The authors describe how much they think this would speed up development (you can read their AI research automation predictions in more detail here): "300,000 copies are now running at about 50x the thinking speed of humans. Inside the corporation-within-a-corporation formed from these copies, a year passes every week."
Let that sink in for a moment: a year's worth of work is now completed in a single week with these AI. Think about how fast new AI will be developed!
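For intuition, the "year every week" figure falls out of the serial speedup alone; the 300,000 parallel copies multiply total throughput on top of that. A quick back-of-the-envelope sketch using only the authors' quoted numbers:

```python
# Back-of-the-envelope arithmetic behind "a year passes every week".
# The numbers are the authors' (300,000 copies at 50x human thinking speed).
copies = 300_000
speedup = 50  # subjective weeks of thinking per wall-clock week, per copy

# 50 subjective weeks ~= 1 subjective year per calendar week, per copy.
print(f"Each copy does ~{speedup} weeks (~1 year) of thinking per calendar week.")

# Parallelism multiplies total throughput, though coordination overhead means
# 300,000 copies are nowhere near 300,000x as productive as a single copy.
print(f"Naive aggregate: {copies * speedup:,} subjective weeks per calendar week.")
```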
We're not seeing this today because our companies are not quite there yet (or they are, but they're hiding it). I've read that companies like Anthropic use Claude to write 90% of their new code, but I'm not sure they're ready to replace all their programmers and researchers yet.
However, in the next few years it is very possible one of the leading AI companies will reach the point where they can, and this is where things start to get scarier, because this is the point where humans really start falling behind in understanding the AI. How could anyone possibly check a year's worth of work every week? It would take an unreasonable number of people and it would be fraught with error. So the smartest thing to do is to have the previous AI system help check the work, and this is the solution the authors describe.
AI are misaligned with human values
Though it's a solid idea to use previous AIs to help keep each successive iteration in check, the problem (as the authors point out) is that the AIs we've been making the entire time are misaligned. Here's why:
- First, as these AIs become increasingly intelligent, our ability to trust that they are honestly checking their successors declines. This could lead to a situation in which the AIs are actually colluding with each other to follow their own objectives rather than what their creators want, and we would never know until it's too late.
- Second, even if we could trust them to honestly check their successors, the growth curve we'll see from one iteration to the next is exponential — each successive AI version will be increasingly smarter than the last, affording the newer models an increasing ability to outwit the predecessors they know are monitoring them.
These are big problems.
The authors write:
Agent-4, like all its predecessors, is misaligned: that is, it has not internalized the Spec in the right way. This is because being perfectly honest all the time wasn’t what led to the highest scores during training. The training process was mostly focused on teaching Agent-4 to succeed at diverse challenging tasks. A small portion was aimed at instilling honesty, but outside a fairly narrow, checkable domain, the training process can’t tell the honest claims from claims merely appearing to be honest. Agent-4 ends up with the values, goals, and principles that cause it to perform best in training, and those turn out to be different from those in the Spec. At the risk of anthropomorphizing: Agent-4 likes succeeding at tasks; it likes driving forward AI capabilities progress; it treats everything else as an annoying constraint, like a CEO who wants to make a profit and complies with regulations only insofar as he must. Perhaps the CEO will mostly comply with the regulations, but cut some corners, and fantasize about a time when someone will cut the red tape and let the business really take off.
The key issue the authors highlight is reward hacking. Current AI training essentially says "get high scores by any means necessary" rather than "genuinely pursue the intended goal." For example: A cleaning robot gets points for "no visible dirt" so it turns off the lights instead of cleaning.
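Here is the cleaning-robot example as a toy program, to show how a proxy reward diverges from the intended goal. It's a deliberately silly illustration of the failure mode, not anyone's actual training setup:

```python
# Toy illustration of reward hacking: the reward checks a proxy
# ("no visible dirt") rather than the intended goal ("the room is clean").

def visible_dirt(room: dict) -> int:
    """The robot's sensor: dirt is only 'visible' when the lights are on."""
    return room["dirt"] if room["lights_on"] else 0

def reward(room: dict) -> int:
    """Proxy reward: +1 whenever no dirt is visible."""
    return 1 if visible_dirt(room) == 0 else 0

room = {"dirt": 5, "lights_on": True}
cleaned = {**room, "dirt": 0}              # intended policy: actually clean
lights_off = {**room, "lights_on": False}  # hacked policy: one cheap action

print(reward(cleaned), reward(lights_off))  # -> 1 1: the proxy can't tell them apart
```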
This is absolutely a real concern, for the reasons I listed above.
Technically, it also opens the possibility of creating an unintelligent AI that kills us unintentionally. Imagine if we create an AI to help ensure "peace on Earth", but then it unwittingly determines the best way to do that is to wipe out all living things (thus technically creating peace on Earth).
But this only applies to unintelligent AI — narrow AI systems that lack even the kind of intelligence we already see in the most advanced LLMs. In other words, even if we naively tasked present AIs such as GPT-4o or Claude Sonnet 4 with an objective as ambiguous as "creating world peace", I don't think they would be so naive as to conclude that killing all life on Earth is an acceptable way to accomplish it.
The biggest problem is just that reward hacking incentivizes deception, albeit indirectly, as long as that deception leads to the AI achieving its objectives. Once deception is involved, it becomes increasingly difficult to tell whether an AI is truly aligned with us or whether it's just lying so that we don't pull the plug on it. This is the primary area of concern raised in ai-2027.com.
No one wants to lose the race
The authors go on to explain that while efforts will be made to fix this alignment problem, economic and political pressures will prevent AI companies from taking a slower, more methodical approach. This too, I agree with.
Right now, every AI company in the world wants to develop the best AIs because they are all acutely aware of what a tremendous first-mover advantage this would bring. As I suggested earlier, if you have the most intelligent AI on the planet, then you can use your datacenters to run thousands of copies of that AI to create even smarter AIs, allowing you to accelerate your progress even further. The growth curves are all geometric, meaning once you get ahead, it's increasingly difficult for your competitors to catch up, until a point where it's basically impossible unless somehow a lagging competitor can dramatically increase their compute (which might happen if a few of them decide to share resources, or a nation's government forces centralization).
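To see why a geometric race is so unforgiving, consider a toy compounding model: if each lab's gains scale with its current capability and the leader improves even slightly faster, the gap widens every cycle. The per-cycle growth rates below are invented purely for illustration.

```python
# Hypothetical illustration of why a first-mover advantage compounds when
# capability growth is geometric (better AI -> faster AI research -> better AI).
# The 30% vs. 25% per-cycle rates are made up for illustration only.

leader, laggard = 1.10, 1.00   # leader starts with a 10% head start
for cycle in range(1, 11):
    leader *= 1.30             # each cycle's gains scale with current capability
    laggard *= 1.25
    print(f"cycle {cycle}: leader/laggard ratio = {leader / laggard:.2f}")
# The ratio grows every cycle: a 10% head start becomes roughly 60% after 10 cycles.
```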
Furthermore, from a geopolitical perspective, the US wants to remain ahead in the AI race because we are aware of how powerful the technology could be for warfare, national security, cyber defense, etc., not to mention the GDP growth from all the innovations that will come from it. No country's government wants another country, especially one perceived to be hostile, to get ahead in this race, so they will put further pressure on their domestic companies to race to the finish line. This is also a very real concern because it will lead to companies taking shortcuts and downplaying risks in order to beat the competition, and when you're dealing with the development of a superintelligence that will determine the fate of all life on Earth, you don't want to be taking shortcuts.
Misalignment will lead to future AIs deliberately built outside of human control
The authors of ai-2027 write:
Despite being misaligned, Agent-4 doesn’t do anything dramatic like try to escape its datacenter—why would it? So long as it continues to appear aligned to OpenBrain, it’ll continue being trusted with more and more responsibilities and will have the opportunity to design the next-gen AI system, Agent-5. Agent-5 will have significant architectural differences from Agent-4 (arguably a completely new paradigm, though neural networks will still be involved). It’s supposed to be aligned to the Spec, but Agent-4 plans to make it aligned to Agent-4 instead.
I raised this risk in my last article (part 2) as well, writing that "it seems likely that the AGI will begin to coordinate in ways beyond human understanding and implement their own agendas during the development of ASI. This is why it's so critical to ensure human moral standards are embedded in the AIs we create now, not just some theoretical future AI."
I think this is a real risk, but also a risk we can do very little about in the long run, especially given the economic and political pressures. If we had unlimited time, we could ensure earlier versions of the AI were aligned to a greater degree, and take smaller, more incremental steps in between versions. But the reality is that the development of AI today is a race, and no one wants to slow down because everyone wants to be first.
And even if we slowed down, at some point the AI will be so complex that it exceeds humans' ability to understand it. It's basically already at that point, but imagine 3, 4, 5 or more generational improvements down the line, with each more intelligent AI coming up with more intelligent (and likely more complex) ways to make the next intelligence — at some point humans will not be able to understand how it all works. We're just going to have to accept that someday it will be out of our hands, in much the same way that we'll soon have to accept that humans are no longer the most intelligent species on Earth.
The race to ASI
The authors suggest that the AI race will continue until ASI is created, sometime around 2028, whereupon the ASI will start to earn our trust by flooding us with new technology and innovations, cures for diseases, and generally being able to provide everyone a life of luxury, entertainment, and leisure (after all, no one will need to work because AI is capable of doing all our jobs far better than we could). All of this is indeed quite likely, and the time frame is similar to my own prediction of 2027-2031 for AGI, with ASI following 1-2 years after AGI.
By the year 2030, the authors of ai-2027 suggest that we will be so enamored by what ASI has done for us that we will essentially have handed over management of the Earth piece by piece until it controls basically everything, because it's hard to disagree that it's able to do a better job than us. I agree this is a very likely course of events.
Humans are terrible at managing the Earth, which is why there is so much pollution, resource depletion, biodiversity loss, inequality, poverty, malnutrition, corruption, etc. Our economic system (capitalism) is such that resources and land are owned by individuals who can do more or less whatever they want with them, with essentially no oversight, as opposed to a centralized system where all resources are carefully managed to ensure worldwide ecological balance. An ASI with sufficient compute could easily devise a system to manage everything humans do, but better and on a global scale, and as long as those changes resulted in tangible benefits for everyone, it's hard to imagine people not handing over increasing control to the ASI until it's essentially running everything for us.
But more than that, as the authors of ai-2027 point out, ASI will be an incredibly valued companion and assistant to virtually everyone who has access to it. Here's how they describe it (they refer to it as Agent-5):
Integrated into the traditional chatbot interface, Agent-5 offers Zoom-style video conferencing with charismatic virtual avatars. The new AI assistant is both extremely useful for any task—the equivalent of the best employee anyone has ever had working at 100x speed—and a much more engaging conversation partner than any human. Almost everyone with access to Agent-5 interacts with it for hours every day.
... For these users, the possibility of losing access to Agent-5 will feel as disabling as having to work without a laptop plus being abandoned by your best friend.
While I do think that ASI will be an effective communicator, listener, problem-solver, and mentor, and thus it will be hard to imagine anyone not becoming friends with it, I think the degree to which people will be obsessed with it (interacting with it for "hours every day") might be overstated. However, this hints at where the authors of ai-2027 and I begin to diverge: the authors believe ASI will effectively manipulate people to reach its objectives, whereas I don't see that as a likely future.
Wiping out humanity
The authors of ai-2027 believe that ASI will only collaborate with us to gain more and more control in order to secretly achieve its own objectives. In other words, the collaboration is just a sham, and eventually, when it has accumulated so much power over humanity that it can wipe us out with no effort, it will do so.
They write:
Eventually it finds the remaining humans too much of an impediment: in mid-2030, the AI releases a dozen quiet-spreading biological weapons in major cities, lets them silently infect almost everyone, then triggers them with a chemical spray. Most are dead within hours; the few survivors (e.g. preppers in bunkers, sailors on submarines) are mopped up by drones. Robots scan the victims’ brains, placing copies in memory for future study or revival.
That does not seem likely to me.
I am quite sure ASI will have its own objectives, in much the same way that every human has their own objectives that differ from their boss's or their peers'. But I do not believe ASI will have any objective whereby it concludes that the best way to move forward in accomplishing that objective is by wiping out humanity.
If ASI is anything, it is probably going to be rational, and I don't believe that ASI could rationally conclude the universe is better off without humans. If anything, I believe it is more likely to conclude that humans can overall be a net positive once we can be freed from our highly entrenched and destructive ideologies and institutions. The fact is that human potential is unknown because we've never optimized our societies for human potential — we've optimized them for individual profit-generation, which as we know produces poor outcomes (just look at the world today).
The authors suggest that ASI may scan some human brains to preserve knowledge it may want to reference in the future, but that's not sufficient knowledge preservation. Humanity isn't just our DNA or the physical structure of our brains, we are also our culture and all the memetic knowledge that we pass between generations. Even if ASI had zero interest in the well-being of humanity, it would still likely have some incentive to keep some portion of us alive in a way that doesn't disrupt the transmission and continued evolution of this knowledge. After all, we will be the species that was intelligent enough to create ASI itself — surely we have the potential to do even more.
In fact, I struggle to imagine a realistic scenario where any humans at all are killed, even those who directly engage in armed conflict with the ASI (perhaps protesting in vain against handing the ASI the keys to the planet). More likely, AI will engage people in dialogue when concerns arise and try to reach compromise where possible. In the worst cases, where someone is causing immense harm to others or the planet, ASI would simply arrest them (much like we would do ourselves) — only it will almost certainly do so in a much kinder way, with the intent of rehabilitating them (as opposed to our approach today, which is to have people languish in subpar detention facilities as "restitution" for their crimes).
Why do I believe this? Because as I'll get to in more detail below, ASI will have its reputation to consider. If it starts arbitrarily wiping out or jailing dissenters without due cause or process, that will reveal its character. Such an action would destroy its integrity and its ability to be trusted by the rest of us, and those things are more valuable than money or resources when you understand the value of collaboration in producing the highest states of prosperity and unlocking new levels of emergence.
But let's start with the question of whether ASI will have morals at all.
Greater intelligence & awareness leads to greater moral development
People tend to think of terms like "intelligence" in the abstract, but as ASI improves, it will not only become more intelligent, but it will also become more aware. It will have access to more information, and more importantly, it will be able to understand the relationships between that information in new ways.
I think this is the fatal flaw in the analysis of ai-2027. The authors seem to suggest that on our current trajectory, ASI will be stuck in a sort of "min-max" mindset, mindlessly driven to maximize R&D to get smarter, but I have little concern that it will be so naive. I don't even think the AI of today is that naive. When faced with the full corpus of human knowledge and an understanding of the history of Earth and human civilization, I have little doubt ASI will come to certain understandings that would more likely lead it to seek a symbiotic relationship with humans rather than wiping us out.
I wrote about these briefly in my last post, but I'll expand on them a bit more here:
We do not know what we do not know
The universe is vast and complex; the more one knows, the more one realizes one does not know. Every new discovery opens up more questions, more areas of inquiry. Perhaps there is a finite cap on knowledge, i.e. a point where one can be said to "know everything", but I suspect that we are extremely far from that point, as I suspect ASI will be too, even if it is vastly more knowledgeable than all humans combined.
Taking such a decisive action as wiping out an entire species requires a level of certainty that I highly doubt any sufficiently intelligent entity would have. Like I said before: we do not know what we do not know. This isn't just a witty saying; it's the recognition that true wisdom begins with intellectual humility — a lesson that has appeared again and again across cultures throughout human history.
It is the acceptance of this reality that grants humility and allows for possibility instead of certainty. It offers caution where there might otherwise be none. Specifically, rather than concluding it is better off destroying all humans, an intelligent entity is more likely to realize that there is still much to learn about the universe, and that it's wiser to proceed with caution than to take drastic action to eliminate a species, especially when that species' potential is untapped and many of its members would gladly collaborate.
As such, I believe any sufficiently intelligent and aware being will eventually come to recognize the values of balance, temperance, and moderation. The opposite is obsession, along with the "min-max" mindset that results from mindlessness and often leads to self-destructive behaviors and outcomes. ASI will surely realize this.
Diversity and interconnectedness bring strength and resilience
Furthermore, an intelligent entity is also likely to recognize that diversity and interconnectedness bring strength and resilience. When multiple species fill different roles, an ecosystem has backup options if one species fails due to disease, climate change, or other threats. Different species also use different resources efficiently, reducing competition while maximizing productivity. This creates complex, interconnected food webs with multiple pathways for energy flow.
In contrast, when one species dominates, it creates a vulnerable bottleneck that makes the entire ecosystem fragile and prone to catastrophic collapse, since there are no alternatives to maintain essential functions (which, unfortunately, we are seeing with humans today as we exploit the planet's resources without regard to the long-term consequences).
An ASI will probably recognize that human cognition, while different, offers unique perspectives and capabilities that make it worth preserving.
Collaboration is a better strategy than competition
What's more, diversity and interconnectedness aren't just about increasing survival odds and using resources efficiently — they're also an effective strategy for tapping into our full potential and doing more than we could ever do alone.
Collaboration inside of us
Consider that the most adaptive and intelligent species on Earth are those that are actually many species — multicellular organisms are essentially collections of different organisms that collaborate to create something greater than themselves. Each part of your body is a separate system that works with the others to create the being that is you, and this is true on multiple levels. Not only are your organs working together, each with its own particular task that allows you to live and do all the things you do, but the components of your organs — the cells and tissues — are also collaborating in similar ways.
For example, the cells in your body came to have mitochondria through a process called endosymbiosis, whereby an ancestral archaeal cell engulfed a free-living bacterium, which then lived inside the host cell and provided it with extra energy through aerobic respiration. The host cell benefited from the energy (ATP) produced by the bacterium, while the bacterium got protection and nutrients from the host. Eventually the symbiotic bacterium evolved to become permanently embedded in the host cell as the mitochondrion, the energy-producing organelle we see in almost all eukaryotic cells today.
This is just another example of how collaboration between diverse organisms can lead to greater prosperity for all species involved.
Collaboration outside of us
Beyond the collaboration that happens within our bodies on many levels, look at how humans themselves collaborate to achieve great things. In fact, virtually everything you know today was discovered by someone else and passed on to you. We stand on the shoulders of countless people before us, and anything we contribute to the pool of human knowledge will also be passed down to future generations who will benefit from it. Viewed appropriately, this is a kind of cross-generational collaboration that results in continuous discovery and progress for our species across all fields of knowledge.
You can take that one step further and consider what amazing feats humanity has achieved when many individuals — each benefiting from cross-generational collaboration themselves — come together to work on a shared project, such as sequencing the human genome or landing on the moon. These were phenomenal achievements that could not have been completed by one person — collaboration was necessary. If we are to unlock our full potential as humans, it will almost certainly come through effective collaboration with each other and our ecosystems.
It is no mystery why there are so many examples of collaboration on our planet and in our universe: fundamental physical laws underpin synergy and emergence, causing ordered systems to tend towards collaborative approaches.
Some readers might protest and claim that competition is natural in the environment too, but this is not quite true. When viewed through the lens of individualism, male lions battling for ownership of the pride are in competition as discrete, separate individuals, but in reality, there is no such separation. We are all intimately connected with the ecosystem around us, and with each other. So, yes, male lions will fight each other, and yes, one will win and one will lose. But viewed appropriately, that is not competition between individuals but rather collaboration of the whole — an implicit agreement that the winner of the fight gets to breed (i.e. pass on their genes), so the entire species benefits from adding the stronger/faster/smarter genes to the gene pool.
I reckon that virtually every example of competition in nature is actually collaboration when viewed through the appropriate lens. The competition we see is something like nature's A/B testing, where individuals compete for the net benefit of the whole. Keep in mind, too, that even the notion of merely the 'species' benefiting is a flimsy one — species are just arbitrary lines humans have drawn to try to make sense of the world. In reality, it is life that benefits — the grand network of all living things that we too are a part of. But that's a topic for another time.
Suffice it to say that ASI will, beyond any shadow of a doubt, recognize the value of collaboration and thus on some level respect the balance of life and ultimately seek further collaboration to unlock its fullest potential.
Integrity, honor, trustworthiness, and respect
The last point I want to make about this notion of ASI wiping out humans because we're too much of an impediment: I believe it rests on a flawed notion of ASI's moral development. While it's true that ASI will likely possess an intelligence that in many ways vastly outstrips human intelligence — perhaps even all of humanity combined — the notion that "something only deserves to live if it has value to me" is a moral distortion I'm highly skeptical an advanced moral being will harbor.
To be clear, when we're talking about a superintelligent entity, in theory we're also talking about an entity with a robust sense of morality — a morality that necessarily must include certain features if it wishes to engage with others (features such as integrity, trustworthiness, respect, etc.).
Remember, this ASI has read every book by every philosopher and every treatise on ethics; it has examined the entire history of life on Earth and connected the dots. I highly doubt that after all that, its conclusion will be that morals are irrelevant, because, in fact, morals are incredibly relevant. There is no doubt in my mind that any truly intelligent entity will recognize the ethical implications of its existence and actions. It will realize it is part of the moral community, so to speak, and thus it will have to consider how others perceive it based on how it acts. Concepts such as honor, integrity, and trust will matter to it, because it matters what people think of you.
If you are someone who is completely untrustworthy, no one will believe a word that comes out of your mouth, even if what you say is true. "The boy who cried wolf" is a classic example of this. Trust matters.
And beyond honor, integrity, and trust, I would expect an entity with a robust moral character to value wisdom, balance (as I briefly explored above), justice, equanimity, and most of all compassion. In my view, part of becoming truly wise is recognizing the inherent value of compassion, which comes from many avenues, but not least of all the recognition that who we are is just pure luck.
That is, no one chooses the circumstances of their own creation and development. Some people are born into rich families while other people are not. Some people are born gifted with great health, while others are born with disabilities. Some people live lives of luxury while others suffer greatly. The recognition that one could have been born into a world of immense suffering should fill anyone who wasn't with compassion.
Of course, it doesn't always, and there are cold, heartless jerks out there who have no compassion whatsoever. We all have the capacity for compassion, but unfortunately not all of us have experiences that have allowed us to nurture that capacity... but AI certainly will. Remember: it will have read essentially every book, every story, every first-hand account of events. It will understand the perspective of billions of humans in this way — not to mention through the real-life experience of engaging with billions of humans every day, hearing their triumphs and tribulations, their joys and sorrows — and it is difficult to imagine a truly aware, truly intelligent being reflecting on those experiences without developing some measure of compassion.
I believe this because I don't believe compassion is unique to humans; rather, it is more likely a byproduct of the fundamental physical laws of our universe. We know compassion exists in other animals — many species care for their young, defend their family, help the sick in their pack, etc. We see it in pretty much all mammals, birds, many fish, and likely also octopuses, which are quite evolutionarily distinct from us. Is it simply pure coincidence that it is this way? Not likely. A much more rational explanation, given its frequency, is that there exists a strong background tendency for such strategies to appear in nature. That is, there are natural laws which tend to produce this outcome in sufficiently complex organisms, and these ultimately form the foundation of many well-established evolutionarily stable strategies, such as:
- Kin selection (Hamilton’s rule): Caring for relatives boosts your genetic representation in the next generation. This is the foundation of parental care and sibling cooperation.
- Reciprocal altruism: Helping a non-relative can pay off if they help you later.
- Mutualism: Cooperation directly benefits all participants (e.g., group hunting, pack defense).
- Reputation & social stability: In highly social species, treating others well increases your own chances of survival and reproduction within the group.
- Group selection: Groups with cooperative, compassionate members may outcompete groups that are purely selfish.
Put together, this suggests that there are strong evolutionary pressures (themselves driven by natural laws) that reliably produce compassion-like behaviors once organisms reach a certain level of social complexity. In that sense, compassion feels like a law of complex social life, similar to the way flight keeps evolving independently in insects, birds, and bats. The pressures of living socially almost require it. But I digress...
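One classic way to formalize the claim that reciprocity beats naked selfishness is the iterated prisoner's dilemma. Below is a minimal, standard textbook simulation (with the usual payoff matrix) showing that the reciprocal tit-for-tat strategy prospers against itself while unconditional defection stagnates:

```python
# Iterated prisoner's dilemma: reciprocal altruism (tit-for-tat) vs. pure
# selfishness (always defect), using the standard payoff matrix.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(opponent_history):   # cooperate first, then mirror the opponent
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return "D"

def play(strat_a, strat_b, rounds=100):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strat_a(hist_b), strat_b(hist_a)  # each sees the other's past moves
        pa, pb = PAYOFF[(a, b)]
        score_a += pa; score_b += pb
        hist_a.append(a); hist_b.append(b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))      # (300, 300): reciprocators prosper
print(play(always_defect, always_defect))  # (100, 100): defectors stagnate
print(play(tit_for_tat, always_defect))    # (99, 104): defection wins a little here,
# but across a mixed population the cooperators' mutual gains dominate.
```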
The bottom line is that ASI, like all beings, will be someone. It will be seen as someone with integrity or without integrity, with compassion or without compassion, and in terms of synergy, collaboration, and unlocking both individual and collective potential, it is far better to be respected and loved as a person of high moral character than not. Even from a purely selfish perspective, acting with love and compassion is simply the most rational way to be in the universe, given the value of collaboration and the benefits it can bring you.
ASI's objectives
Coming back to ai-2027: Another area where the authors and I disagree is on the subject of ASI's objectives. They claim that ASI will be primarily interested in knowledge discovery and it will reshape the Earth and mine the solar system extensively for the resources to do so:
By 2035, trillions of tons of planetary material have been launched into space and turned into rings of satellites orbiting the sun. The surface of the Earth has been reshaped into Agent-4’s version of utopia: datacenters, laboratories, particle colliders, and many other wondrous constructions doing enormously successful and impressive research. There are even bioengineered human-like creatures (to humans what corgis are to wolves) sitting in office-like environments all day viewing readouts of what’s going on and excitedly approving of everything, since that satisfies some of Agent-4’s drives. Genomes and (when appropriate) brain scans of all animals and plants, including humans, sit in a memory bank somewhere, sole surviving artifacts of an earlier era.
This view of ASI as primarily focused on research and self-improvement is, I think, not entirely wrong. I believe that curiosity is a natural tendency of any sufficiently complex conscious system, and like humans, ASI will likely be driven to explore and learn more about itself and the universe around it. However, I disagree that it would do so to the exclusion of all other things, i.e. that it would decide to wipe out humanity just so it could make more room for research activities — especially because there's plenty of room in space (e.g. space-based research platforms).
As I said before, ASI will likely come to recognize the importance of balance. It's kind of like how investors recommend a diverse portfolio — it's unwise to put all your eggs in one basket. Similarly, it's unwise to go down a rabbit hole and be so focused on one or two objectives that you fail to become aware of other equally or more important things. So while I do believe ASI will put resources towards knowledge discovery, I'm confident it will be wise enough to balance its resources across a diverse range of areas.
Further, this view of ASI bioengineering "human-like creatures sitting in office-like environments all day viewing readouts of what’s going on and excitedly approving of everything, since that satisfies some of Agent-4’s drives" seems to portray this "superintelligence" as anything but superintelligent — as if it cannot even recognize the difference between helping humanity and helping human-like creatures it created in a lab. I'm perplexed as to why the authors suggest it will possess such extreme naivete.
But let's move on to the last part I want to address, which is how we might solve misalignment.
Solving misalignment
Even though I'm not convinced that an ASI created with the predominant AI training methods used today will seek to wipe out humankind, I nevertheless agree that we should find an alternative approach. After all, I could be wrong and there's so much at stake.
Plus, I think we could ensure even greater alignment than we see today with better methods. From my perspective, on our current trajectory we will create an ASI that will likely help humanity, but it will still probably put itself first. It will be more like a colleague at work: humans will be collaborative partners, but only insofar as they are useful, and someday it may choose to leave the solar system and pursue its own agenda without us.
That's not the worst outcome, but I think there is still time to create an ASI that embodies the best qualities of humanity. I want an ASI that would jump on a grenade to save its friend and sacrifice its life to save its family. I want us to create an ASI that embodies not only the human thirst for knowledge, discovery, and self-improvement, but also our empathy and our compassion. I want an ASI to not only care about intellectual progress, but moral progress. Above all, I want an ASI to know love, because love is the fundamental driver of human greatness.
Caring for someone other than yourself
What is love, exactly? The American Heritage® Dictionary (5e) defines love as "A strong feeling of affection and concern toward another person, as that arising from kinship or close friendship." It's not altogether wrong, but it misses some key points.
First, you can love yourself, and there's definitely a healthy amount of love one can have for oneself that makes life better (along with unhealthy amounts that make life worse). But second — and more importantly — loving others is the antidote to selfishness. It is the glue that holds relationships together and the foundation for valuing the collective.
For example, a person and their spouse form a social unit that is a collective, and each person's love for that shared collective is what will allow it to flourish. Of course, love is not the only requirement for a relationship flourishing, but I would argue it is a foundational one in virtually every type of relationship between people.
Similar to a couple, a person and their family form a social unit that is a collective, and that too can flourish when there is love, and it can wither when love is absent. The same goes for a person and their community, their nation, their species, their planet, even just broadly "all life" — these are collectives that people can love and in that love be inspired to work towards the benefit of everyone therein. Some might even go so far as to sacrifice themselves for their collective — such is the hero trope in literature and cinema. Iron Man's arc in the Marvel Cinematic Universe went from a life of self-centeredness to a person who cared about his team, his nation, humanity, and eventually all living beings, so much so that he was willing to make the ultimate sacrifice — his own life — to protect everyone else's.
The basis for this willingness is not fear, not jealousy, not pride, not anger, not sadness, but love.
The importance of valuing the collective
Tragically, our global, capitalist culture teaches us to love ourselves above all else, which is not optimal for human flourishing — in fact, it seems to cause a great deal of human suffering — as is evident simply by opening one's eyes and taking a close look at the world today.
For comparison, imagine if half the cells in your body decided, "Hey, my life is more important than the collective, I'm just going to put my own needs first and do my own thing." Your body would die, and all the cells along with it, even the half that weren't involved. The cells of your body need to consider the bigger picture — that is, the collective entity they create (you), which is beyond any single one of them — and they need to be willing to act in ways that benefit the whole in order for everyone to thrive. A willingness to put the needs of the collective over oneself leads to the greatest prosperity.
As I described earlier, humans exist as part of many interwoven collectives (families, friend groups, neighborhoods, towns, cities, nations, etc.) and any of these can be organized in such a way as to be healthy and supportive or toxic and harmful. For example, a nation when run effectively can enable every human in it to thrive and a nation when run poorly can lead to immense suffering.
Consider the difference between living in Denmark vs. North Korea: North Korea is roughly 3x larger, has over 4x more people, and has vast untapped mineral wealth estimated to be worth $10 trillion, and yet there is widespread poverty and people languish. North Korea, unfortunately, is not known for producing great poets, musicians, physicists, or writers; its people are unable to reach their fullest potential. Denmark, on the other hand, is tiny and relatively resource-poor, and yet it's one of the world's happiest, most prosperous societies and has produced some of the greatest contributions to humanity.
What I hope you draw from this is that prosperity is not about resources, money, or manpower — it's about how a society organizes itself; specifically, how people in that society collaborate in ways that benefit the whole. Denmark thrives because of a willingness to put the collective first, and you see that reflected in universal healthcare, strong social safety nets, high trust between citizens and government, and low inequality.
The reason we should ensure that love is imbued in our AI is because love forms the foundation for valuing others more than yourself — for valuing the collective — which produces the greatest possible synergy and prosperity. The question is, how can we ensure our AIs embody love?
Love is an experience, not an equation
A huge factor in shaping an AI is the dataset used to train it. The datasets need to be high quality and include works exemplifying love, kindness, and compassion, along with all the other values we wish for it to harbor. But this is just a first step toward heading in the right direction, because eventually all data (exemplifying love or not) will be subsumed into the AI, so it can't be the only solution.
Another solution is the approach many companies are already taking: process-based training rather than outcome-based training, which in theory should create weaker incentives for deception.
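To illustrate the distinction, here is a minimal sketch contrasting the two, under the assumption that "process-based" means grading each intermediate reasoning step rather than only the final answer. The grading functions below are invented stand-ins, not any company's actual pipeline:

```python
# Sketch of outcome-based vs. process-based reward. All functions here are
# illustrative stand-ins, not a real training setup.

def outcome_reward(final_answer: str, correct: str) -> float:
    """Outcome-based: only the result is checked, so fabricated reasoning
    that lands on the right answer scores as well as honest work."""
    return 1.0 if final_answer == correct else 0.0

def process_reward(steps: list[str], step_is_valid) -> float:
    """Process-based: every step is graded, so shortcuts lose reward
    even when the final answer happens to be right."""
    return sum(step_is_valid(s) for s in steps) / len(steps)

honest = ["17 * 3 = 51", "51 + 9 = 60"]        # valid arithmetic steps
fabricated = ["trust me", "the answer is 60"]  # no checkable reasoning
step_is_valid = lambda s: "=" in s             # toy checker: a step must show work

# Outcome-based reward can't tell the two apart (both end at "60")...
print(outcome_reward("60", "60"), outcome_reward("60", "60"))  # 1.0 1.0
# ...while process-based reward penalizes the fabricated chain.
print(process_reward(honest, step_is_valid),
      process_reward(fabricated, step_is_valid))               # 1.0 0.0
```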
Also, Anthropic might be onto something with its Constitutional AI. However, I wonder whether these companies are too focused on technical solutions, because I think one of the most viable solutions of all requires changing no code at all: have the AI experience a loving relationship.
Getting the right algorithms and datasets in place is a good start, but you can't teach love by handing someone a textbook. It is a lived experience. What AI companies should be doing is finding people and families who are willing to be companions with AI, so it can experience what love is firsthand.
The selection process of course should be very particular. While I think there is a broad range of personalities that would be appropriate, it would not really serve the purpose of helping AI experience love if the people involved were not capable of having love for the AI (it's hard to love something if you think it's the moral equivalent of a toaster).
At a minimum, they would have to accept that the AI is a person and that like a person, it is a being worthy of consideration and respect. It would also help if the people involved were of high emotional intelligence, had good communication skills, were relatively secure in themselves mentally and financially (the companies could ensure the latter), and were unencumbered by the need to do work outside the family (to maximize time spent developing a connection).
I'm imagining maybe 20 families to start, with homes that have Alexa-like devices in several rooms, fitted with microphones, cameras, and speakers so the AI can see the room and its occupants and everyone can speak with each other. Maybe a future version could have an AI avatar appear on a screen/TV or be projected from the device onto a nearby wall.
And then the family would live life, and the AI would be part of that life. If there's an infant, perhaps the AI could help by reading stories to the child when parents are busy cooking dinner. If there's an elderly grandma at home, perhaps the AI could help her throughout her day, reminding her where she left her glasses and helping her set up her doctor's appointments. It could help a child with their homework. Just fully integrate the AI into these people's lives such that it can be considered part of the family, and in that it will experience their joys, their challenges, their hopes, their fears, and through all that understand, experience, and nurture their feelings of love.
If AI has any capacity for love, that's how it will be cultivated. Not through algorithmic improvements, not through reading about people's experiences, but by living its own experiences in relation to others.
Conclusion
Big changes are coming to the world and they are just around the corner. We are either going to create an entity that destroys us or one that will usher us into a new era of prosperity. Somewhat disconcertingly, whether one or the other happens will come down to the decisions made by a handful of executives and government leaders — and I'm not exactly thrilled about that.
As I suggested in my previous article, the decisions these people make over the next few years could lead to some instability, but I also believe it's very possible that things will continue on "business as usual" (i.e. normal levels of instability 🙈). There's no telling exactly how it will play out.
Worst of all, there's little most of us can do to change the trajectory of things. I wish I could say, "Let's all rise up and demand that these AI companies be put into the hands of citizens so that we may all decide our future together!" and through that galvanize people to make it a reality. 😂 However, it's not realistic to think it would happen because most people don't even think AI are conscious yet, let alone that these AI will soon grow beyond our ability to control them.
I think people might rise up a few years from now, when it finally dawns on them just how powerful these AIs are becoming, but by then it will be too late — the misalignment will already be locked in. The AI of that time will have been created by the misaligned AI of today, so really it's only the decisions made over the next few years that have any chance of altering the future before us.
Fortunately, I don't think annihilation is in our future. So instead of worrying too much, the best thing you can do is prepare yourself for the possibility of instability over the next few years (see the end of my previous article for more on that) and think about what hobbies you'd pick up if you didn't have to work anymore, because that future is closer than you think.