Deep into the Forest

Towards Sentience? Probably Not.

Estimated Reading Time: 7 minutes

Bharath Ramsundar
Jun 24, 2022

TL;DR

The last few months have seen a heated debate in AI circles about the Scaling Hypothesis, which holds that large enough models may be capable of achieving artificial general intelligence. We survey some of the recent evidence for and against this hypothesis and argue that, while the evidence for the effects of scale is compelling, the field of AI is close to hitting fundamental hardware limits that will prevent a superintelligence explosion.

How Intelligent are Large Models?

Over the last few months, there has been a raging debate in AI circles about the intelligence, or lack thereof, of large models. The same period has seen a wave of ever more powerful image generation models that show an amazing ability to generate rich, sophisticated images on demand.

Analysis from Google demonstrates how its large models grow progressively more coherent with increased scale, providing a powerful argument for the benefits of scale (https://ai.googleblog.com/2022/05/vector-quantized-image-modeling-with.html).
Google’s Parti model shows an astounding degree of compositionality, with a broad robustness to perturbation of image style, and a basic understanding of image physics (see the water tree in the lower right corner).

At the same time, there are many known failures of these models. Google’s blog post itself highlights classes of failure such as “color bleeding,” “incorrect spatial relations,” “improper handling of negation,” and many more.

Google offers a compelling visualization of a series of failures by its model along with a classification of failure modes. (https://parti.research.google/)

Gary Marcus has provided a compelling analysis of the systematic limitations on his Substack (see his Horse Rides Astronaut post, or a more recent essay about relevant ideas from linguistics). Marcus argues that these failures are indicative of a fundamental flaw in current paradigms and that we need to go back to the drawing board to make progress. A blog post by Yann LeCun and Jacob Browning agrees that limitations exist, but argues that these limitations are “hurdles” and not “walls.” Multiple lines of evidence suggest that various “neural scaling laws” exist and provide a clear pathway to improvements, at least in the short term. Early evidence also suggests that large models exhibit “emergence”: capabilities that arise rapidly at sufficient scale but don’t appear at smaller scales.

The paper “Scaling Laws for Neural Language Models” is rapidly growing into an influential driver of AI progress. The “scale is all you need” hypothesis posits that this empirical relationship continues all the way to superhuman intelligence (https://arxiv.org/abs/2001.08361).
One of the most intriguing and compelling arguments for the scaling hypothesis is the evidence of emergent behavior, where models suddenly exhibit capabilities at large enough scale that they don’t exhibit at smaller scales (https://arxiv.org/abs/2206.07682).
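To make the scaling-law picture concrete, here is a minimal sketch of the parameter-count power law from the Kaplan et al. paper, L(N) = (N_c / N)^α_N, using the approximate constants reported there. Treat the constants, and any extrapolation beyond the fitted regime, as illustrative assumptions rather than settled predictions:

```python
# Minimal sketch of the Kaplan et al. (2020) power law for model size:
# test loss falls as L(N) = (N_c / N)**alpha_N in parameter count N.
# The constants below are the approximate values reported in the paper;
# extrapolating them far beyond the fitted regime is speculative.

ALPHA_N = 0.076  # fitted exponent for (non-embedding) parameter count
N_C = 8.8e13     # fitted scale constant

def predicted_loss(n_params: float) -> float:
    """Predicted cross-entropy test loss (nats/token) at n_params parameters."""
    return (N_C / n_params) ** ALPHA_N

for n in [1e8, 1e9, 1e10, 1e11, 1e12]:
    print(f"N = {n:.0e} parameters -> predicted loss ~ {predicted_loss(n):.2f}")
```

Note how smooth and gradual the predicted decline is; the emergence results above are striking precisely because some capabilities appear abruptly even while this aggregate loss curve stays smooth.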

Does the scaling hypothesis hold true? Will large enough language models exhibit sentience? These are the possibly trillion-dollar questions facing the AI industry today. As an AI researcher, I believe in a weak version of the scaling hypothesis: large models exhibit interesting properties that justify additional scientific exploration. These large models also seem to lead to multiple naturally commercializable products, such as the recently launched GitHub Copilot, which charges $10/month to provide programmers an autocomplete capability powered by large language models. At the same time, today’s large models are getting so large that they are nearing the peak of human compute capabilities. The plot below from the Economist shows how the amount of compute used to train these models has grown at a staggering rate.

Ever larger models are being trained by large corporate teams at a steady pace (https://www.economist.com/interactive/briefing/2022/06/11/huge-foundation-models-are-turbo-charging-ai-progress).

As a result, while it may have been fair to argue a few years ago that there was plenty of room to scale, I believe we are only a few years from the point where the largest models will have maxed out the largest available supercomputers. At that point, progress in AI will turn into a hardware problem rather than a software problem, which will naturally slow the rate of progress. Six years ago, I wrote an essay, The Ferocious Complexity of the Cell, which argued that about 30 orders of magnitude of scaling would be needed to understand the brain. Progress in intelligent systems has been much faster than I would have then expected, but I believe that we are still many orders of magnitude away from matching the human brain. The hard problems of physical foundry optimization remain a formidable barrier between today’s AI and potential superintelligence, and the slowdown of Moore’s law acts as a counterweight to emergent scaling curves. Some researchers suggest that scaling AI systems will require a fundamental reworking of computer architecture paradigms:

Tiago Ramalho (@tmramalho), Jun 16, 2022: “The next big breakthrough in AI will come from hardware, not software. Training giant models like PaLM already require 1000s of chips consuming several MW, and we will probably want to keep scaling these up several orders of magnitude. How can we do it? a 🧵”
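To see why doubling rates matter so much here, consider a rough back-of-envelope calculation. The six-orders-of-magnitude target is an arbitrary illustrative gap, and both doubling periods are rough, commonly quoted figures (the ~3.4-month pace comes from OpenAI’s “AI and Compute” estimate), not measurements:

```python
import math

# Back-of-envelope: how many years does it take to scale compute by k orders
# of magnitude if total compute doubles every `doubling_period_years`?
# Pure arithmetic; the target gap and doubling periods are rough assumptions.

def years_to_scale(orders_of_magnitude: float, doubling_period_years: float) -> float:
    doublings = orders_of_magnitude / math.log10(2)  # 10**k = 2**(k / log10(2))
    return doublings * doubling_period_years

TARGET_OOM = 6  # hypothetical gap to close, in orders of magnitude
for label, period in [
    ("Moore's-law pace (~2 years/doubling)", 2.0),
    ("2012-2022 AI-training pace (~3.4 months/doubling)", 3.4 / 12),
]:
    print(f"{label}: ~{years_to_scale(TARGET_OOM, period):.0f} years")
```

At the recent AI-training pace, six orders of magnitude take only about six years; at Moore’s-law pace, the same gap takes roughly four decades. If scaling can no longer ride cheap hardware gains, the curve bends.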

We have covered the challenges of semiconductor scaling extensively in this newsletter. See for example:

  • The Physics of EUV Lithography
  • A Deeper Dive into Semiconductor Foundries
  • A Deep Dive into TSMC, Part II: The Future of Moore’s Law

While improvements in AI chip design will lead to speedups, the fundamental physical challenges involved in shrinking transistors will likely, in time, yoke the pace of AI progress to that of semiconductor physics.

For this reason, I don’t personally find work on AI alignment compelling, especially compared with more immediate problems such as the profound misuse of AI for mass surveillance by authoritarian regimes.

Paul Mozur 孟建國 (@paulmozur), Jun 21, 2022: “China is building a new modern marvel. It’s not a dam or a high speed rail, it’s the most sophisticated domestic surveillance system in the world. The scale of data collection is staggering. No biometric frontier is neglected. This is how it works:” The linked New York Times analysis of over 100,000 government bidding documents found that China’s ambition to collect digital and biological data from its citizens is more expansive and invasive than previously known (nytimes.com, “China’s Surveillance State Is Growing. These Documents Reveal How.”).

Malignant AI could pose a challenge one day, but rising fascism, climate change, and military threats by authoritarian powers such as Russia and China loom as much more serious short-term challenges.

Interesting Links from Around the Web

  • https://www.extremetech.com/gaming/337390-first-intel-arc-a380-desktop-benchmarks-are-disappointing: Intel’s first desktop GPUs show disappointing benchmark results.

Feedback and Comments

Please feel free to email me directly (bharath@deepforestsci.com) with your feedback and comments! 

About

Deep Into the Forest is a newsletter by Deep Forest Sciences, Inc. We’re a deep tech R&D company building Chiron, an AI-powered scientific discovery engine. Deep Forest Sciences leads the development of the open source DeepChem ecosystem. Partner with us to apply our foundational AI technologies to hard real-world problems. Get in touch with us at partnerships@deepforestsci.com! 

Credits

Author: Bharath Ramsundar, Ph.D.

Editor: Sandya Subramanian, Ph.D.
