Minor frictions 2: attention is all you need

Categories: research notes

– things I’ve read this week

Attention is all you need

“In deep learning, nothing is ever just about the equations. It is how you . . . put them on the hardware, it’s a giant bag of black magic tricks that only very few people have truly mastered,” Uszkoreit says. Once these were applied, primarily by Shazeer whom one of his co-authors calls “a wizard”, the transformer began to improve every task it was thrown at, in leaps and bounds.

Madhumita Murgia in the FT

A really nice piece in the FT on the rise of Large Language Models and the Google team behind the Attention is all you need paper that introduced the transformer.

There are lots of lessons to take from the story, but for how we think about innovation in LLMs, an important one is this: the story can't be told without Nvidia chips, and yet it's not just the chips: it's about the magic that happens where software code is configured around the physical properties of the circuit.

And maybe along with that lesson comes a caution: so much of today's AI hype is built on assumptions of ongoing growth in computing power and in domains of application. But just because the transformer gave rise to a big bang doesn't mean we can take for granted an accelerating, or even steady, rate of increase. Growth is not inevitable.


On the launch of Worldcoin

A reminder that AI technologies don't exist without hidden data, and without someone paying the price for that data.

The BBC reports that Sam Altman has launched his eyeball-scanning crypto coin, Worldcoin, in London. Which is not strictly true. Eileen Guo and Adi Renaldi got their version of events out more than a year ago in the MIT Technology Review, in a vital takedown: Deception, exploited workers, and cash handouts: How Worldcoin recruited its first half a million test users.


What should governments do about AI? Turns out they already know

Two plus one takes on this question:

  • The White House Already Knows How to Make AI Safer (Suresh Venkatasubramanian in Wired)
  • The Senate doesn’t need to start from scratch on AI legislation (Janet Haven and Sorelle Friedler in The Hill)
  • Plus one: Nobody elected big tech to govern anything, let alone the entire digital world (Lindsey Graham and Elizabeth Warren in the NY Times)

Four links

  1. Are Large Language Models a Threat to Digital Public Goods? Evidence from Activity on Stack Overflow. Yes.
  2. As technological advances enhance human capabilities, our need for human guides will only grow ($)
  3. Rising interest rates are forcing care homes in the UK to close ($)
  4. Global governance for AI without civil society should be a non-starter. A call back to the UN’s 1992 Rio Conference on Environment and Development. Have we forgotten those lessons already? Did we learn them in the first place?