Intelligence is not enough: The humanity of engineering

Bryan Cantrill
Oxide Computer Company

OXIDE
It always starts with a tweet…

It always starts with a tweet being trolled…

“Serious”?
• This tweet used the word “serious” three times, mainly to deride others
• Not clear what “serious” means in the context of an argument that
equates a computer program with nuclear weapons?
• Or accuses anyone who disagrees with this assessment of “just vibes”?
• Or one that puts the risk of human extinction at the (metaphorical!)
hands of a computer program at 5%, with zero methodology?
• So, a serious question: why treat this seriously at all?

Reasons to treat this seriously
• Fear of technology isn’t new – and isn’t always poorly founded!
• New technologies often have unintended consequences and
externalities that merit consideration and discussion
• But in those who believe in AI-based extinction risk, the fear itself is
alarming – in part because of the actions that it would justify
• The “AI pause” – if implemented – would be brazenly authoritarian
• The accompanying rhetoric is often disturbingly violent

Concrete extinction risk
• Most AGI-based extinction risk fears – when made concrete – hinge on:
○ A computer program getting ahold of nuclear weapons
○ A computer program making a novel bioweapon
○ A computer program developing novel molecular nanotechnology
• We are going to leave aside nuclear weapons, as indisputably serious
people have been thinking about them since the dawn of the atomic age
• But the latter two have something important in common…

Superintelligent engineering?
• Whether stated explicitly or not, when we talk about the fear of a
superintelligent AI actively killing not just some humans but all of them,
we are talking about AI making weapons
• Let us leave aside many questions about such scenarios (e.g., AI’s
alignment, motivation, or means of production – and human adaptability,
countermeasures, and resilience), and focus on one pillar…
• It depends on AI applying the constraints of physical and
mathematical reality to make new stuff – which is to say, engineering

Engineering and intelligence
• If our very existence is threatened by a superintelligence engaged in
engineering, it prompts an important question…
• Is engineering an act of intelligence alone?
• I can’t speak to building novel bioweapons or the significant challenges
in reviving otherwise moribund molecular nanotechnology…
• …but we do have a bunch of recent experience building something big
and new that is surely simpler than these domains

What we built!

Building a computer
• In case it needs to be said: building a new computer + new network
switch + high-speed backplane + all software from lowest levels of
firmware to highest levels of control plane is hard and complicated
• It is still, however, engineering, not science
• Engineering is the act of learning from failure: even when building anew,
there will be many occasions when the system does not, in fact, work!
• It is worth exploring a tiny fraction of the failures that we endured in
building, as they are instructive as to the nature of engineering…

Failure to bring CPU out of reset
• Despite following the documented power sequencing to the CPU (AMD
Milan), it was refusing to come out of reset, simply reinitiating the
power-on sequence after 1.25 seconds of inactivity
• Natural assumption was that power was marginal – but the power
looked good (and making it extraordinary didn’t change anything)
• Went down any number of blind alleys, performing directed experiments
with respect to non-connected pins that shouldn’t make any difference
• These experiments weren’t easy!

• After several weeks of debugging, we discovered that our voltage
regulator had a firmware bug: it adjusted voltage as requested by the
CPU via SVI2 – but never sent a completion (VOTF Complete)
• The CPU had no way of knowing that the power was in fact correct
• AMD’s tool for verifying power (SDLE) did not check for this packet
• Corrected regulator firmware resulted in the CPU coming out of reset!
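
The failure mode above can be sketched as a toy model: a requester that trusts only a completion message, and a regulator that sets the rail correctly but stays silent. This is an illustration only, not AMD's SVI2 implementation or Oxide's firmware. The class, the function names, the attempt limit, and the 1.1 V target are all hypothetical; only the 1.25 second window comes from the slides.

```python
from dataclasses import dataclass

# Hypothetical toy model: names and structure are illustrative assumptions.
TIMEOUT_S = 1.25  # inactivity window quoted on the slide


@dataclass
class Regulator:
    sends_completion: bool  # False models the firmware bug
    volts: float = 0.0

    def request_voltage(self, target: float) -> bool:
        """Apply the requested voltage; return True only if a completion is sent."""
        self.volts = target  # the rail really does reach the target
        return self.sends_completion


def power_on(regulator: Regulator, target: float, max_attempts: int = 3) -> str:
    """Model a requester that trusts only the completion, never the rail itself."""
    for attempt in range(1, max_attempts + 1):
        if regulator.request_voltage(target):
            return f"out of reset on attempt {attempt}"
        # No completion arrived: after TIMEOUT_S the sequence simply restarts.
    return f"still in reset after {max_attempts} attempts ({TIMEOUT_S}s timeout each)"


buggy = Regulator(sends_completion=False)
fixed = Regulator(sends_completion=True)
print(power_on(buggy, 1.1), "| rail at", buggy.volts, "V")  # 1.1 V is an assumed target
print(power_on(fixed, 1.1), "| rail at", fixed.volts, "V")
```

In the buggy case the rail sits at the requested voltage the whole time, which is why measuring power (or making it "extraordinary") could not reveal the problem.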

Failure to bring NIC out of reset
• We could not get the Chelsio NIC to come out of reset
• Extensive validation did not reveal any signal that was out of spec
• Attempting to take a working add-in card (AIC) and destroy it revealed
that one of the pinstrap resistors (to select the clock source) was
incorrectly specified
• We had a 1K ohm pull-down resistor, but this was in fact too weak –
and a 499 ohm resistor was required to overcome an internal pull-up
• Reworking with the correct resistor resulted in the NIC correctly starting!
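
Why a 1K pull-down can lose to an internal pull-up is voltage-divider arithmetic, roughly sketched below. The slides give only the two pull-down values. The supply voltage, the internal pull-up resistance, and the logic-low threshold here are hypothetical numbers chosen purely for illustration, not figures from the Chelsio part.

```python
# All electrical values except the two pull-down sizes are hypothetical.
VDD = 3.3            # assumed supply voltage (V)
R_PULLUP = 2_000.0   # assumed internal pull-up (ohms); the slides give no value
V_IL_MAX = 0.8       # assumed maximum voltage still read as logic low (V)


def strap_voltage(r_pulldown: float) -> float:
    """Voltage at the strap pin: a divider between the pull-up and the pull-down."""
    return VDD * r_pulldown / (r_pulldown + R_PULLUP)


def reads_low(r_pulldown: float) -> bool:
    """True if the pin settles at or below the assumed logic-low threshold."""
    return strap_voltage(r_pulldown) <= V_IL_MAX


for r in (1_000.0, 499.0):
    print(f"{r:6.0f} ohm -> {strap_voltage(r):.2f} V, reads low: {reads_low(r)}")
```

Under these assumed values the 1K resistor leaves the pin near 1.1 V, above the threshold, while 499 ohms pulls it to about 0.66 V. The real margins depend on the part's actual pull-up and thresholds.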

NIC transiently failing to train all PCIe lanes
• We have our own platform enablement layer (i.e., no BIOS); we are
responsible for initializing devices at the lowest layer
• With disconcerting frequency, some number of Chelsio NIC links did not
train correctly for some of their lanes on boot
• Decoding the Link Training and Status State Machine (LTSSM) on the
CPU allowed us to better understand where it was failing, but not why
• Discovered that a second PERST resulted in correct training – and
moreover that this second PERST is present on legacy firmware!

Failure to connect to U.2 NVMe drives
• In a revision of our PCIe-to-U.2 passthrough card (Sharkfin), we had I2C
connectivity – but no PCIe connectivity whatsoever
• A previous version of this card had worked, but little had changed in the
schematic and the layout – why were the new ones broken?!
• Physical inspection revealed that one of the parts was simply wrong!
• The wrong reel of parts had been loaded into a pick-and-place machine,
and an inverter had been laid down instead of an AND gate (!)
• Reworked ~1200 cards in ~96 hours!

Random data corruption on software install
• When installing OS boot images, sporadic (!) corruption was seen
• Adding checksums to these images revealed corruption was rampant (!!)
• Microprocessor was speculatively loading through a stowaway mapping
from early boot, which was allocating in the TLB
• If application address conflicted with address of stowaway mapping,
kernel would incorrectly copy data from the wire to the wrong location
• Eliminating stowaway mapping eliminated the corruption – but
highlighted divergent perspectives on side-effects of speculative loads
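
The step of adding checksums to images can be sketched minimally as follows. The slides do not say which checksum was used, so SHA-256 via Python's standard `hashlib` is an assumption, and the image contents and the flipped bit are stand-ins.

```python
import hashlib


def digest(image: bytes) -> str:
    """Hex SHA-256 of an image (the algorithm choice is an assumption)."""
    return hashlib.sha256(image).hexdigest()


def verify(image: bytes, expected: str) -> bool:
    """True if the received image matches the digest recorded at build time."""
    return digest(image) == expected


original = bytes(range(256)) * 16   # stand-in for a boot image
expected = digest(original)         # recorded where the image is built

received = bytearray(original)
received[1234] ^= 0x01              # simulate a single flipped bit in transit

print("intact image verifies:   ", verify(original, expected))
print("corrupted image verifies:", verify(bytes(received), expected))
```

A check like this turns silent corruption into a counted failure at every install, which is what makes "sporadic" measurable as "rampant".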

What do these have in common?
• Each posed an existential risk for the artifact: without solving them, we
wouldn’t have something that’s impaired – we would have nothing
• Each revealed an emergent property, often at an interface boundary
• The breakthrough was often something that “shouldn’t” have worked
• Intelligence alone does not solve problems like this
• In all cases, we summoned other elements of our character: our
resilience, our teamwork, our rigor, our optimism, our curiosity

Values in engineering
• These extra-intelligence values are so important to us that we have
codified them – and use them very explicitly as a lens for hiring
• To be clear, we are certainly seeking capable, intelligent people – but
that intelligence is useless without these shared (human!) values
• We may be more explicit about it than others, but many engineering
teams are also implicitly hiring for shared values
• Viz.: It is comical to think of an engineering team hiring based only on
the results of a test – or any other linear measure of intelligence!

The humanity in engineering
• This humanity necessary to understand and resolve failure – so essential
in designing and building – is hidden in the final artifact
• This is the soul in Tracy Kidder’s Soul of a New Machine – and the
perspiration in Edison’s proverbial 99% perspiration
• Computer programs lack this humanity: they do not have willpower,
desire, or drive – let alone the deeper human qualities required
• Which doesn’t mean that AI can’t be useful to engineers, merely that it
cannot engineer autonomously

So, should we worry about AI?
• Extinction risk due to AGI is de minimis – but we must not falsely
dichotomize AI into posing existential risk or no risk whatsoever!
• The risk that AI does pose may feel mundane – but it lies much more in
how it will be abused (deliberately or accidentally) by existing structures
• AI ethics is exceedingly important, especially when it is being used to
inform decisions that affect people’s lives!
• By acknowledging that AI is and will be an important tool, we can move
beyond fear to focus on enforcing existing regulatory regimes

Further information (wells to fall down)
• Richard Smalley/K. Eric Drexler debate on molecular nanotechnology
• Lex Fridman interview with Marc Andreessen
• Logan Bartlett interview with Eliezer Yudkowsky
• Oxide and Friends podcast, especially Okay Doomer, Tales From the
Bringup Lab and More Tales from the Bringup Lab
