While I believe there's a significant chance that artificial superintelligence (ASI) will end humanity, I'd like to share some reasons why ASI might choose to spare us:
Category A: Alignment Is Automatic
Orthogonality Thesis Is Wrong: The orthogonality thesis, as explained in Nick Bostrom's work, holds that intelligence and final goals are independent, meaning there's no guarantee that superintelligent beings will adopt any particular moral standards. However, as a moral realist, I believe that morality is tied to objective features of the world, accessible to any sufficiently rational entity. Since superintelligence entails broad, general capability, I posit that ASIs will naturally converge on moral goodness, irrespective of the initial values programmed by humans. My confidence here stems from my belief in the objectivity of my moral values, which I expect ASIs to recognize and adopt without direct human guidance.
Empirical Evidence of a Positive Correlation Between Intelligence and Morality: Empirical evidence strongly suggests a correlation between human IQ and moral behavior. Higher IQ is consistently associated with lower rates of criminality, suicide, drug abuse, divorce, and other adverse outcomes. Furthermore, human intelligence tends to be broad rather than narrow: intelligent people tend to perform well across many different domains. This broad applicability of human intelligence supports the conjecture that superintelligent AI will also exhibit moral behavior.
Category B: Slower Than Expected AI Progress
If AI progress is slower than anticipated, we might gain extra decades to address alignment challenges, rather than the few years currently predicted:
Limitations of LLMs: There's speculation that Large Language Models (LLMs) may reach a capability ceiling requiring a shift to a new, more complex paradigm. While LLMs excel at processing existing human knowledge, they may lack the capacity for true agentic behavior, which involves pursuing ultimate goals and planning intermediate actions independently. Despite rumors of OpenAI testing agentic AI, it's conceivable that transitioning from current LLMs to genuine general intelligence might be more daunting than expected.
Superior Efficiency of Biological Brains: It's possible we've underestimated the efficiency of biological brains compared to artificial systems. Achieving ASI might necessitate architectures orders of magnitude larger than current models. Given that human brains are highly optimized, parallel, water-cooled neural networks refined over nearly a billion years, a similar level of complexity may be required for ASI, potentially slowing progress.
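To put "orders of magnitude larger" in rough quantitative terms, here is a minimal back-of-envelope sketch in Python. It rests on two commonly cited but debatable estimates that are not part of the original argument: roughly 10^14 synapses in a human brain and roughly 10^12 parameters in today's largest models, and it treats one synapse as loosely analogous to one learned parameter, which is a strong simplification.

```python
import math

# Back-of-envelope scale comparison; both figures are assumed,
# commonly cited estimates, not claims made in this post.
BRAIN_SYNAPSES = 1e14          # rough estimate of synapses in a human brain
FRONTIER_MODEL_PARAMS = 1e12   # rough estimate of parameters in today's largest models

# Treat one synapse as loosely analogous to one learned parameter
# (a strong simplification that likely understates the brain).
ratio = BRAIN_SYNAPSES / FRONTIER_MODEL_PARAMS
orders_of_magnitude = math.log10(ratio)

print(f"Scale gap: ~{ratio:.0f}x, i.e. about {orders_of_magnitude:.0f} orders of magnitude")
# -> Scale gap: ~100x, i.e. about 2 orders of magnitude
```

Even under these generous assumptions the gap is around two orders of magnitude, and if a single synapse does more computational work than a single parameter, the true gap would be larger still.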
Category C: Human Coordination on AI Safety
Historical Precedents of Coordination: Humanity has shown the ability to come together in the face of existential threats, managing to restrict nuclear testing, human genetic experimentation, and biological weapons, among other dangers. The global AI research community, including researchers in China, shares concerns about AI safety. It's conceivable that a worldwide halt on AI development could be agreed upon.
Potential for a Global Response to AI Incidents: It's likely that incidents involving superhumanly competent narrow AI will precede the emergence of ASI, and a sufficiently alarming one could galvanize urgent global action. While an ASI, once it exists, might be impossible to stop, a major narrow-AI-induced disaster could trigger a response comparable to the one mounted against the COVID pandemic, marshaling the resources and attention needed to address AI safety effectively.