<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Oly on AI]]></title><description><![CDATA[Which ideas give me an edge in predicting - and shaping - tech and AI progress? I share some of them with you here. AI is *the* frontier technology, and it's set to reshape our world. Let's make it for the better!]]></description><link>https://www.oliversourbut.net</link><image><url>https://substackcdn.com/image/fetch/$s_!eqDq!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6faaaa8-072a-4eb5-9ce9-b6b378c8044a_1098x1098.jpeg</url><title>Oly on AI</title><link>https://www.oliversourbut.net</link></image><generator>Substack</generator><lastBuildDate>Sun, 14 Jun 2026 16:08:21 GMT</lastBuildDate><atom:link href="https://www.oliversourbut.net/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Oliver Sourbut]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[oliversourbut@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[oliversourbut@substack.com]]></itunes:email><itunes:name><![CDATA[Oliver Sourbut]]></itunes:name></itunes:owner><itunes:author><![CDATA[Oliver Sourbut]]></itunes:author><googleplay:owner><![CDATA[oliversourbut@substack.com]]></googleplay:owner><googleplay:email><![CDATA[oliversourbut@substack.com]]></googleplay:email><googleplay:author><![CDATA[Oliver Sourbut]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Citations Needed]]></title><description><![CDATA[Magic Encyclopedias to Save the World]]></description><link>https://www.oliversourbut.net/p/citations-needed</link><guid 
isPermaLink="false">https://www.oliversourbut.net/p/citations-needed</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Fri, 12 Jun 2026 15:31:37 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7MwY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca3328a-d8e8-44e7-bf5d-8c889e3d5c3d_1024x565.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Last week <a href="http://flf.org">FLF</a> launched <a href="https://www.lesswrong.com/posts/frizRHnA6AZpJSDqw/lab-leaks-black-holes-and-eggs-epistemic-case-study">a competition</a> &#8220;<strong>to find the best workflows and methodologies for using AI to produce reliable, trustworthy knowledge bases</strong>&#8221;. I had (and have ongoing) a substantial role in that effort. Why do I think it&#8217;s so important? It&#8217;s a lot of reasons actually! I&#8217;ll gesture at a few here.</p><h2>Conjuring a magic encyclopedia</h2><p><strong>For now, assume with me that it </strong><em><strong>can be done</strong></em><strong>. Wish away with me the various technical and financial challenges. Great!</strong> Now we can rapidly conjure up a <em>deeply, fully researched </em>knowledge base on <em>any </em>topic. All claims point back to who&#8217;s said them, in what context, and (importantly) with what <em>justifications</em> and evidence (if any). Any quibbles or nuances which have been expressed on a point are similarly readily available. It&#8217;s not opinionated: all competing viewpoints with their associated justifications are associated and comparable.</p><p>That&#8217;s <em>way, way</em> too much information! Imagine trying to read everything ever about diet or shipping or taxes or microbes. <a href="https://en.wikipedia.org/wiki/Ain't_Nobody_Got_Time_for_That">It&#8217;s not happening</a>. 
So as well as this, we now magically have tools which gather similar points together, summarise, and can make a decent stab at which points we&#8217;ll consider most or least <em>relevant</em>. We can dig deeper (or send AI agents to scout deeper) as desired. And when <em>new</em> interesting and informative content arises, or in contexts where nuance and clarification are helpful, it can be bubbled up to our attention.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wefL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba200c26-2a0f-4fa1-89ab-34714f4e27a2_1600x883.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wefL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba200c26-2a0f-4fa1-89ab-34714f4e27a2_1600x883.png 424w, https://substackcdn.com/image/fetch/$s_!wefL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba200c26-2a0f-4fa1-89ab-34714f4e27a2_1600x883.png 848w, https://substackcdn.com/image/fetch/$s_!wefL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba200c26-2a0f-4fa1-89ab-34714f4e27a2_1600x883.png 1272w, https://substackcdn.com/image/fetch/$s_!wefL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba200c26-2a0f-4fa1-89ab-34714f4e27a2_1600x883.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!wefL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba200c26-2a0f-4fa1-89ab-34714f4e27a2_1600x883.png" width="1456" height="804" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ba200c26-2a0f-4fa1-89ab-34714f4e27a2_1600x883.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:804,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wefL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba200c26-2a0f-4fa1-89ab-34714f4e27a2_1600x883.png 424w, https://substackcdn.com/image/fetch/$s_!wefL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba200c26-2a0f-4fa1-89ab-34714f4e27a2_1600x883.png 848w, https://substackcdn.com/image/fetch/$s_!wefL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba200c26-2a0f-4fa1-89ab-34714f4e27a2_1600x883.png 1272w, https://substackcdn.com/image/fetch/$s_!wefL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba200c26-2a0f-4fa1-89ab-34714f4e27a2_1600x883.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>All this is doable today: enough web searches, enough cross-referencing of tweets, articles, journals, following of citation chains, gathering and comparing of hypotheses and points of view, etc. will make progress. But it&#8217;s exhausting. When someone <em>does</em> go to those lengths, their partial &#8212; but heroic &#8212; efforts to map out what&#8217;s been said often languish either unpublished or unrecognised.</p><p>Don&#8217;t we already have this? A shining example is Wikipedia, where the collective curatorial effort of a wide range of editors gradually maps out an expanding core of topics and commentary. But Wikipedia has lags, biases, and (perhaps most importantly) <em>huge gaps</em>, especially on important frontier questions. 
(Let&#8217;s not talk about <a href="https://en.wikipedia.org/wiki/Grokipedia">Grokipedia</a>.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>)</p><p>Meanwhile, the tech to &#8216;smartly browse&#8217; and bubble up informative pieces is nascent too, in bits and pieces like AI chatbots and community notes: already useful in their ways, but faltering and unreliable.</p><p><strong>It&#8217;s these comparison points and the early progress I see that give me some excitement that the grander vision is viable and that we can take steps towards it now.</strong></p><h2>Who cares?</h2><p>I&#8217;m not naive. I know that many (most?) humans a lot of the time <a href="https://www.conspicuouscognition.com/p/misinformation-is-often-the-symptom">aren&#8217;t actually interested</a> in finding out or sharing what&#8217;s true; mainly they want to say what makes them and their friends seem popular and cool&#8230; and their enemies seem dastardly and disgusting. We all have these impulses to a greater or lesser extent. Yes, you too! Sometimes those impulses seem deranged (they&#8217;re not designed for the modern world); other times they might even make sense, at least selfishly.</p><p>Nevertheless (and <a href="https://www.conspicuouscognition.com/p/why-do-people-believe-true-things">perhaps mysteriously</a>!), a lot of the time, some people actually want to find out true things and share them. (I do! Do you?) Hence journalism, science&#8230; even hearsay and rumour (at their &#8212; perhaps rare &#8212; finest). We recognise that, when they&#8217;re <em>actually anchored</em> and doing their best to be right (or at least <a href="https://www.lesswrong.com/">less wrong</a>), those are absolutely foundational to wellbeing and prosperity in a modern society. <strong>Without (good) journalism, politics runs astray and tyrants abound. 
Without (grounded) science and technology, public health suffers, food supply and shelter and infrastructure decay, and progress falters.</strong></p><p>As <a href="https://www.oliversourbut.net/p/a-full-epistemic-stack">Ben Goldhaber and I previously wrote</a>:</p><blockquote><p>Knowledge is integral to living life well, at all scales:</p><ul><li><p><strong>Individuals manage their life choices</strong>: health, career, investment, and others on the basis of what they understand about themselves and their environments.</p></li><li><p><strong>Institutions and governments (ideally) regulate</strong> economies, provide security, and uphold the conditions for flourishing under their jurisdictions, only if they can make requisite sense of the systems involved.</p></li><li><p><strong>Technologists and scientists push the boundaries</strong> of the known, generating insights and techniques judged valuable by combining a vision for what is possible with a conception of what is desirable (or as proxy, demanded).</p></li><li><p>More broadly, <strong>societies negotiate their paths forward</strong> through discourse which rests on some reliable, broadly shared access to a body of knowledge and situational awareness about the biggest stakes, people&#8217;s varied interests in them, and our shared prospects.</p><ul><li><p>(We&#8217;re especially interested in how societies and humanity as a whole can navigate the many challenges of the 21st century, most immediately AI, automation, and biotechnology.)</p></li></ul></li></ul></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7MwY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca3328a-d8e8-44e7-bf5d-8c889e3d5c3d_1024x565.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!7MwY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca3328a-d8e8-44e7-bf5d-8c889e3d5c3d_1024x565.png 424w, https://substackcdn.com/image/fetch/$s_!7MwY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca3328a-d8e8-44e7-bf5d-8c889e3d5c3d_1024x565.png 848w, https://substackcdn.com/image/fetch/$s_!7MwY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca3328a-d8e8-44e7-bf5d-8c889e3d5c3d_1024x565.png 1272w, https://substackcdn.com/image/fetch/$s_!7MwY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca3328a-d8e8-44e7-bf5d-8c889e3d5c3d_1024x565.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7MwY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca3328a-d8e8-44e7-bf5d-8c889e3d5c3d_1024x565.png" width="1024" height="565" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bca3328a-d8e8-44e7-bf5d-8c889e3d5c3d_1024x565.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:565,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!7MwY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca3328a-d8e8-44e7-bf5d-8c889e3d5c3d_1024x565.png 424w, https://substackcdn.com/image/fetch/$s_!7MwY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca3328a-d8e8-44e7-bf5d-8c889e3d5c3d_1024x565.png 848w, https://substackcdn.com/image/fetch/$s_!7MwY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca3328a-d8e8-44e7-bf5d-8c889e3d5c3d_1024x565.png 1272w, https://substackcdn.com/image/fetch/$s_!7MwY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbca3328a-d8e8-44e7-bf5d-8c889e3d5c3d_1024x565.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>But our knowledge-producing institutions are plagued by <a href="https://en.wikipedia.org/wiki/Publish_or_perish">publish-or-perish</a> and <a href="https://en.wikipedia.org/wiki/Clickbait">clickbait</a> incentives alike<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> &#8212; and the social media landscape is even worse, riddled with misinformation and brainrot from <a href="https://journals.sagepub.com/doi/10.1177/19401612241311886">all</a> <a href="https://www.conspicuouscognition.com/p/the-decline-of-legacy-media-rise">political</a> <a href="https://www.conspicuouscognition.com/p/on-highbrow-misinformation">quarters</a>. <strong>I care about this. So do you, I daresay.</strong></p><p>I <em>especially</em> care now, as society is <a href="https://www.oliversourbut.net/p/the-first-type-of-transformative">poised before a series of important decisions</a> about our future relationship with technology, especially AI. It could be ruinous, with tyranny, neo-feudalism, or extinction real prospects. Or it could be fantastic. 
<strong>Just </strong><em><strong>wanting</strong></em><strong> it to be OK isn&#8217;t enough</strong>: we have to seek, generate, share, and defend important knowledge &#8212; about developments in technology, as well as about trends in politics and power &#8212; and act on it.</p><h2>How do we actually help?</h2><p>There&#8217;s no single path or silver bullet. But the incredibly high-level picture is: better communication of knowledge is usually good. It helps people be more informed and make better decisions according to their needs. A better <em>shared</em> understanding makes it easier for people to work <em>together</em> toward shared goals (even if they don&#8217;t agree on all priorities). On average if people make better decisions and can work together better, we&#8217;ll get <em>more flourishing</em> and <em>less catastrophic risk</em>.</p><p>We&#8217;re trying to stimulate one piece of this picture with the knowledge-base direction. 
Heavy-handedly adjudicating what&#8217;s true rarely works.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a> Instead, equip people with the fullest picture possible, as accessibly as possible, and we find our way: evidence adds up, and when it doesn&#8217;t, that means we need to look for more.</p><p>As <a href="https://slatestarcodex.com/2017/03/24/guided-by-the-beauty-of-our-weapons/">Scott Alexander wrote</a> years ago (emphasis partly mine),</p><blockquote><p>Logical debate has one advantage over narrative, rhetoric, and violence: it&#8217;s an <em>asymmetric weapon</em>. That is, it&#8217;s a weapon which is <strong>stronger in the hands of the good guys than in the hands of the bad guys</strong>. In ideal conditions (which may or may not ever happen in real life)... when done right, it can only prove things that are true. &#8230; Unless you use asymmetric weapons, the best you can hope for is to win by coincidence.</p></blockquote><p>I&#8217;m not focused on logical debate per se, and in any case I wouldn&#8217;t be so Manichean about it &#8212; we&#8217;re all &#8216;good guys&#8217; sometimes and &#8216;bad guys&#8217; sometimes (whether we mean it or not) &#8212; but the articulation is compelling. Humanity has accrued a slowly-growing arsenal of these asymmetric weapons: libraries, citation, scientific review<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a>, databases, encyclopedias, web search, to name a few. 
Today they&#8217;re creaking under the weight of a confusing information deluge and assaulted by powerfully-vested interests.</p><p><strong>I earnestly believe that an upgrade to truth-seekers&#8217; ability to find and scrutinise information, to build and share fuller pictures of topics at hand, can be &#8216;infectious&#8217;</strong>: <em>more</em> people <em>more</em> of the time can see a little further, pierce a little more of the fog of confusion and misinformation, be better epistemically defended, and embody &#8212; and exemplify &#8212; truth-seeking cognition. When people (through malice or negligence) spread confusion and falsehoods, they&#8217;re that bit more likely to face scrutiny and consequences. After all, we&#8217;re making that scrutiny cheaper, easier, and more accessible.</p><p>This applies whether the &#8216;wielders&#8217; of these new weapons are curious members of the public, scientists, analysts in public institutions, business leaders and technologists, or even the AI assistants those folks recruit to accelerate their work.</p><p>In politics, that can mean that people engage more often in collaborative, truth-seeking cognition and less in tribal cognition. And in technology, it can mean that more people can stay better abreast of the <a href="https://flf.org/timelines/">important shifts and prospects</a> that will shape our future &#8212; helping the public hold decisionmakers to account (and choose better ones), and helping those decisionmakers sincerely and deeply engage with the topics at hand. 
I want <a href="https://docs.google.com/document/d/1wtKAjpvEiMWn-RpFDi_2Vqcvt5i3sCFPmUt3MtsKOjo/edit?tab=t.ik0s2kqs0a0s">these kinds of epistemic heroics</a> to become commonplace, and I want the epistemic giants among us to stride further still.</p><p><a href="https://www.lesswrong.com/posts/frizRHnA6AZpJSDqw/lab-leaks-black-holes-and-eggs-epistemic-case-study">Let&#8217;s do it</a>!</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Though quite a flawed execution, I think the idea behind Grokipedia &#8212; namely, to get AI to substantially help with curating knowledge bases and to use that for collective epistemics &#8212; was in the right direction. Unfortunately it was mostly a vanity project and little thought appears to have been given to the grounding or validation, making it <em>less</em> useful than Wikipedia.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Do you remember the <a href="https://en.wikipedia.org/wiki/Replication_crisis">replication crisis</a>, which we&#8217;re still dragging ourselves out of? The new disease of <a href="https://replications.clearerthinking.org/is-tackling-importance-hacking-the-next-frontier-in-improving-psychology-research/">importance hacking</a>? Have you ever critically read a newspaper for rhetorical slant? Taking a more cynical stance, it&#8217;s not only <em>clickbait</em> and <em>publish-or-perish</em> (which are regrettable incentive pressures, but hardly attributable to malice). 
Science and journalism alike have deep <em>political</em> and <em>adversarial</em> infections as well.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>(And even if I wanted to, I don&#8217;t have <em>particularly </em>heavy hands, alas.)</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>I feel compelled to point out that the current state of &#8216;official&#8217; journal- and conference-managed scientific review is truly dire, especially in some fields including psychology and AI. I hold up the <em>ideal</em> of scientific review, not its pale and diseased shadow as sometimes charaded on Earth.</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[Lab Leaks, Black Holes, and Eggs]]></title><description><![CDATA[FLF's Epistemic Case Study Competition]]></description><link>https://www.oliversourbut.net/p/lab-leaks-black-holes-and-eggs</link><guid isPermaLink="false">https://www.oliversourbut.net/p/lab-leaks-black-holes-and-eggs</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Sun, 07 Jun 2026 12:14:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!iNvc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e40e00-9c12-4b63-af9d-ffc42dfb0b16_1600x883.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>At <a href="https://www.flf.org/">FLF</a>, we&#8217;re running a competition to find the best workflows and methodologies for using AI to produce reliable, trustworthy knowledge bases, grounded in real-world cases.</strong> We&#8217;re open-minded on the types of submissions we receive 
and on how they address the problem. We&#8217;ve set aside approximately $200k for prizes. Winning submissions may receive a prize of $5k to $50k, and if submissions warrant it, multiple $50k prizes are possible. Winners may be offered opportunities for further funded work.</p><p>You can <strong><a href="https://docs.google.com/forms/d/e/1FAIpQLSeBqNCI4Klaq6FO8CbhYCxr6cYAUMjeosExOjatfCHYfEvNVQ/viewform?usp=header">express interest</a></strong> right away to receive commentary, information, and updates &#8212; whether you&#8217;d like to participate or are just interested in the outcomes of the competition.</p><p>The heights of human epistemic investigation are impressive and valuable, but rare and difficult to reach &#8212; see our <a href="https://docs.google.com/document/d/1wtKAjpvEiMWn-RpFDi_2Vqcvt5i3sCFPmUt3MtsKOjo/edit?tab=t.ik0s2kqs0a0s">abridged collection of strong examples</a>.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> The limiting factor is rarely exquisite insight (though this helps!); more often it is diligence, a curious and open mindset, and the time and effort needed to do the thorough work of investigating the background on a topic: activities AI is well placed to assist with.</p><p>Existing AI-assisted knowledge base work demonstrates real pieces of this &#8212; agent memory (e.g., Claude Code&#8217;s memory and skills), LLM-curated personal wikis (<a href="https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f">Karpathy&#8217;s</a> is perhaps the highest-profile), and deep-research tools. 
But these mostly produce single-user artifacts tuned to one investigator&#8217;s context, not the kind that travels, combines, or survives (especially adversarial) scrutiny.</p><p>We&#8217;re particularly excited by the compounding potential &#8212; if structured analyses<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> become reusable, refinable artifacts, every serious investigation enables future work, on the same or related topics, and by the same or different people, to reach further from a more solid epistemic foundation. Who knows, you might even <a href="https://www.astralcodexten.com/p/your-attempt-to-solve-debate-will">solve debate</a>!</p><p><strong>This competition provides three challenging case studies &#8212; with deliberately varied challenge profiles &#8212; and invites you to produce tooling and techniques to help people navigate them.</strong> First, the debated and impactful question of COVID-19 origins. Second, the risk that the Large Hadron Collider (LHC) creates synthetic black holes (perhaps destroying the Earth). Third, the health impact of eggs (as a human food source). The tooling should be general: we&#8217;ll judge against these and also other difficult case studies.</p><h2><strong>What we&#8217;re looking for</strong></h2><p>We want to see <strong>workflows and methodologies using AI</strong> that advance the state of the art in carrying out epistemic investigations and producing compounding knowledge bases. We aren&#8217;t asking you to build an entire, robust, fully-featured system. 
Instead, we&#8217;re excited by any submission that advances the state-of-the-art on a component.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a></p><p>We&#8217;ve found it useful to think of these investigations as being split into several different layers: ingestion, structure, and assessment (<a href="https://www.lesswrong.com/posts/DMswzhPQqkqx2XAma/a-full-epistemic-stack-knowledge-commons-for-the-21st-1">more here</a>). When stacked together and operating in concert, they&#8217;d create useful trusted artifacts. Something like a superior deep research, generating and interacting with a structured knowledge base, aimed at the truly epistemically discerning consumer.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iNvc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e40e00-9c12-4b63-af9d-ffc42dfb0b16_1600x883.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iNvc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e40e00-9c12-4b63-af9d-ffc42dfb0b16_1600x883.jpeg 424w, https://substackcdn.com/image/fetch/$s_!iNvc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e40e00-9c12-4b63-af9d-ffc42dfb0b16_1600x883.jpeg 848w, https://substackcdn.com/image/fetch/$s_!iNvc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e40e00-9c12-4b63-af9d-ffc42dfb0b16_1600x883.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!iNvc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e40e00-9c12-4b63-af9d-ffc42dfb0b16_1600x883.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iNvc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e40e00-9c12-4b63-af9d-ffc42dfb0b16_1600x883.jpeg" width="1456" height="804" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f7e40e00-9c12-4b63-af9d-ffc42dfb0b16_1600x883.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:804,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iNvc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e40e00-9c12-4b63-af9d-ffc42dfb0b16_1600x883.jpeg 424w, https://substackcdn.com/image/fetch/$s_!iNvc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e40e00-9c12-4b63-af9d-ffc42dfb0b16_1600x883.jpeg 848w, https://substackcdn.com/image/fetch/$s_!iNvc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e40e00-9c12-4b63-af9d-ffc42dfb0b16_1600x883.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!iNvc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e40e00-9c12-4b63-af9d-ffc42dfb0b16_1600x883.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Below is a set of ideas for potential desiderata for a workflow. 
We&#8217;d expect most submissions not to focus solely on a single layer, since we&#8217;re guessing that something useful needs to work across the layers &#8212; but some discipline in separating these responsibilities may be useful for producing interoperable, shareable, compounding benefits.</p><h3><strong>Ingestion</strong></h3><p>How do you take a messy, multi-source evidence base and turn it into something structured enough to reason over?</p><ul><li><p>Extract and attribute claims to specific sources, with provenance metadata (who said what, when, in what context).</p></li><li><p>Identify when the same claim appears across multiple sources in different forms.</p></li><li><p>Search for resources with bearing on topics and subtopics at hand.</p></li><li><p>Capture useful metadata tags, for example relating sources and claims to topics and other sources (toward structure), or describing methodologies, deference, and assumptions (toward assessment).</p></li></ul><h3><strong>Structure</strong></h3><p>How do you document the relationships between claims so that the full shape of the argument becomes navigable?</p><ul><li><p>Resolve the inference structure: which claims and evidence are offered as support for which other claims.</p></li><li><p>Represent the discourse structure: where people are addressing different sub-questions, and perhaps how they relate those to the overall inquiry &#8212; there may be explicit, and sometimes implicit, differences of emphasis.</p></li><li><p>Capture relationships regarding &#8220;similar but not identical&#8221; claims. 
These could be different ways of framing conditions or caveats to statements, or different estimates of uncertainty for quantities or propositions.</p></li><li><p>Track how the structure evolves over time.</p></li></ul><h3><strong>Assessment</strong></h3><p>How do you evaluate what to actually believe, or what to look at next, given everything above?</p><ul><li><p>Identify rhetorical moves that carry more persuasive weight than evidential weight.</p></li><li><p>Flag correlated evidence being treated as independent.</p></li><li><p>Identify cruxes, i.e. the specific factual or inferential disagreements that, if resolved, would most change the overall picture (perhaps drawing on debate <em>structure</em>).</p></li><li><p>Surface what&#8217;s <em>missing</em> &#8212; important sources or perspectives that aren&#8217;t represented in the working knowledge base, toward further data collection (or hinting at additional primary information collection and reasoning).</p></li><li><p>Provide frameworks for calibrating confidence that account for out-of-model error, adversarial information environments, and the limits of any single analyst&#8217;s expertise.</p></li><li><p>Distinguish what the debate <em>settled</em> from what it merely <em>performed settling</em>.</p></li></ul><h2><strong>What a good entry looks like</strong></h2><p>We&#8217;ll offer a minimum of $5k to entries which we judge to meaningfully improve on the state of the art in faithful, scalable AI-assisted investigations, and up to $50k for entries which are truly inspiring to us. 
An entry might do this by, for example, reliably producing accessible, thorough, highly interoperable knowledge-enabling content across diverse domains which is readily shared and expanded on by others.</p><p><strong>We aren&#8217;t prescribing a single, specific type of submission</strong><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a><strong>.</strong> A few shapes we&#8217;d be excited to see:</p><ul><li><p><strong>A spec</strong> describing a step-by-step process of a human-AI workflow for producing a structured epistemic analysis of a complex dispute. Demonstrate it on multiple parts of at least two cases. The workflow can incorporate human steering and be subjective in places, but should let others (even with differing beliefs and preferences) usefully pick up where another left off, and it should scale gracefully to mostly or entirely &#8216;hands-free&#8217; operation. Make clear where your design choices are uncertain, and be transparent about where you&#8217;re making tradeoffs, and why.</p></li><li><p><strong>A prototype tool</strong> (most likely a pipeline involving LLMs) that implements one or multiple layers of the stack, demonstrated in a repeatable way on each of the case studies. Minimally, it should substantially accelerate users&#8217; investigation of a topic, and ideally it should produce reusable, shareable knowledge artifacts which stand up to adversarial pressure.</p></li><li><p><strong>A protocol</strong> enabling interoperability and compounding without flattening the underlying material, demonstrated with reference to our cases. How can we navigate the tension between interoperability and nuance? What does a format look like that&#8217;s flexible and general enough to link diverse subtopics and complex, multi-perspective investigations while preserving important detail? 
How can it be maintained over time in a way that plays well with newly-emerging sources, a diverse and changing user base, and an expanding frontier of AI capability for tooling?</p></li></ul><p>A submission might take a different shape, look like one of these, or combine them (for example a spec including protocol discussion and a reference prototype). Some stepping-stone alternatives could help put a team in a great position to achieve the biggest wins (though we expect they are unlikely to win the biggest prizes without follow-up work):</p><ul><li><p><strong>A comparative analysis</strong> repeatably<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a> applying two or more different AI assessment methodologies to the same (sub-)questions from the topic, with explicit discussion of where they agree and diverge. What downstream considerations do they best enable? What are their strengths and shortcomings? What kinds of supporting epistemic metadata would help them to work better?</p></li><li><p><strong>A critique with counterexamples</strong> of an otherwise promising approach, demonstrating the importance of further work or indicating less tractability than we might have thought.</p></li></ul><p>Optionally, submit a description of your plan or a briefer, less complete implementation of it by Jun 21, 2026, and we will weigh in on whether the work seems on track for a prize (and potentially provide feedback). Use <a href="https://docs.google.com/forms/d/e/1FAIpQLScHGLJRH5ex27i0hpL0wPqyZFqp1ykYqFmJxgvg_zYKD6g1mw/viewform">the main submission form</a> and check the <em>early feedback</em> box.</p><p>What we care about most: <strong>Would this actually help someone reason better about this case? Does it generalize? Does it scale with improvements to AI or more compute? 
Does it compound, with multiple people or teams building on each other&#8217;s work?</strong></p><p>We&#8217;ll ask judges to use the following criteria when assessing submissions: <a href="https://docs.google.com/document/d/1wtKAjpvEiMWn-RpFDi_2Vqcvt5i3sCFPmUt3MtsKOjo/edit?tab=t.v8o9nnadfvtm">Epistemic Case Study Competition - Judging Criteria</a>.</p><p>In addition to the potential prizes, strong entries that demonstrate real promise may also lead to an offer for further funded work with us (we estimate a 75% chance that a $50k-winning entry receives an offer like this).<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a></p><h3><strong>Please use <a href="https://docs.google.com/forms/d/e/1FAIpQLScHGLJRH5ex27i0hpL0wPqyZFqp1ykYqFmJxgvg_zYKD6g1mw/viewform">this linked form</a> to submit your entry; entries are due by Jul 19, 2026.</strong></h3><p><a href="https://docs.google.com/document/d/1rTX-Z23HuR68E9CKn87FmrMin9w_rOFjStr3HgzwlXA/edit?tab=t.0#heading=h.9kv9jflko3ab">FLF&#8217;s general contest rules</a> apply.</p><h2><strong>Prize structure</strong></h2><p>We&#8217;ve allocated roughly $200k for this competition, with the size of any individual award reflecting how much an entry moves us. We&#8217;d rather award fewer, larger prizes for entries that genuinely impress us than spread the pool out. If a wave of strong work arrives, we&#8217;ll happily expand the total prize pool.</p><p>Concretely, we expect to award up to:</p><ul><li><p><strong>$50k</strong> &#8594; for an entry or entries we find truly inspiring. The kind of submission that changes how we think about the problem or that we&#8217;d want to point to as a new reference point for AI-assisted epistemic work. 
We may not award it at all, or we may award it more than once if multiple entries clear that bar.</p></li><li><p><strong>$5k to $50k</strong> &#8594; for entries that meaningfully advance the state of the art, whether across the full stack or on a well-defined piece of it (ingestion, structure, or assessment). The size of each award reflects how far the work pushes the field, and we expect several entries to land somewhere in this range.</p></li><li><p><strong>Continuation funding</strong> &#8594; beyond prize money, we expect to fund individuals or teams to keep building, on terms agreed case by case. For the strongest entries this may be the real prize: an ongoing relationship with FLF and a path to sustained work on the stack. We&#8217;ll raise this with finalists after judging.</p></li></ul><h2><strong>Interested in participating or following along?</strong></h2><p>Want to compete, follow along, or join the conversation? <strong><a href="https://docs.google.com/forms/d/e/1FAIpQLSeBqNCI4Klaq6FO8CbhYCxr6cYAUMjeosExOjatfCHYfEvNVQ/viewform?usp=header">Express interest</a></strong> to receive updates and commentary, and to see how you can participate as the competition unfolds.</p><h2><strong>Why we&#8217;re doing this</strong></h2><p>We&#8217;re building toward what we call a <a href="https://flf.org/projects/epistack/">full epistemic stack</a>: layered infrastructure for making the provenance, structure, and assessment of knowledge transparent and traversable at scale. We think recent AI advances make this newly tractable, but the hard problems are in methodology and workflow design, as well as usability, not just capability.</p><p>Not only do we expect these tools to be of widespread benefit, but we expect some organizations like ours to be <a href="https://flf.org/timelines/">eager early adopters</a>. 
FLF hopes to meaningfully inform its strategy and prioritisation based on insights from these tools, meaning that great work here could move millions of dollars per year and help us (and others) be more effective.</p><h2><strong>We&#8217;re excited to see what you build.</strong></h2><p><em>Much gratitude to Ben Goldhaber (formerly FLF), Joel Chan, Saif Haobsh, Austin Chen, Andreas Stuhlm&#252;ller, and Dustin Kimmel for contributions and feedback.</em></p><h2><strong>The case studies</strong></h2><h3><strong>COVID</strong></h3><p>In early 2024, a $100,000 judged debate took place between Saar Wilf (founder of Rootclaim) and Peter Miller on the origins of COVID-19. Over 15 hours of structured argument, two smart people marshalled epidemiological data, viral genetics, Bayesian inference, and institutional analysis to reach opposite conclusions. Two expert judges ruled decisively for zoonosis. Six independent Bayesian analyses of the same evidence spanned 23 orders of magnitude.</p><p>For more, read Scott Alexander&#8217;s <a href="https://www.astralcodexten.com/p/practically-a-book-review-rootclaim">detailed writeup</a>. We feel that the debate videos, judge decisions, and comment threads it links to form one of the richest publicly available records of a complex real-world epistemic dispute on an important issue.</p><p>And yet all this information is still incredibly difficult to navigate, interrogate, and use to inform one&#8217;s beliefs.</p><ul><li><p>It requires significant background expertise to understand the state of play of the debate and make a considered judgement. 
The debate was overseen by two judges with PhDs who work as a professional microbiologist and applied mathematician, respectively.</p></li><li><p>The format, a live video debate, may not be the optimal way for a judge to interact with the material.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a></p></li></ul><p>Further, this intense epistemic effort represents a point in time in a conversation which continues to evolve.</p><p>We feel this makes it a strong stress test for tools and methods that aim to make reasoning more transparent, traversable, updateable, and trustworthy.</p><p>Your job: craft the AI-assisted methodologies that build a structure to help people navigate this topic successfully.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a></p><h4><strong>Starting material</strong></h4><ul><li><p><a href="https://www.astralcodexten.com/p/practically-a-book-review-rootclaim">Scott Alexander&#8217;s writeup of the COVID origins debate</a> (the core case material)</p></li><li><p><a href="https://drive.google.com/file/d/1YhmkYB32RpGsXvQTsX4xZ0Yul1wiwh8Z/view">Judge Will&#8217;s decision</a> | <a href="https://drive.google.com/file/d/1aHlhPd-16EOabzXhiajT5PBm3uVCAG3T/view">Judge Eric&#8217;s decision</a></p></li><li><p><a href="https://michaelweissman.substack.com/p/an-inconvenient-probability-v57">Michael Weissman&#8217;s Bayesian analysis</a> (an example of a more rigorous independent analysis)</p></li><li><p><a href="https://blog.rootclaim.com/covid-origins-debate-response-to-scott-alexander/">Rootclaim&#8217;s response</a></p></li><li><p>The debate videos: <a href="https://www.youtube.com/watch?v=Y1vaooTKHCM">Session 1</a> | <a href="https://www.youtube.com/watch?v=KdORmvU8MLI">Session 2</a> | <a href="https://www.youtube.com/watch?v=d1dbfoK8nSE">Session 3</a></p></li></ul><h3><strong>Black 
holes</strong></h3><p>CERN, home of the world&#8217;s largest particle accelerator, the Large Hadron Collider (LHC), has a frequently asked question: <a href="https://home.cern/resources/faqs/will-cern-generate-black-hole">Will CERN generate a black hole?</a></p><p>What??</p><p>As in <a href="https://blog.nuclearsecrecy.com/wp-content/uploads/2018/06/1946-LA-602-Konopinski-Marvin-Teller-Ignition-fo-the-Atmsophere.pdf">some previous science experiments</a>, noting that novel circumstances might produce unprecedented outcomes, some participants had apocalyptic concerns. How were these put to rest? (Were they truly? What does that hinge on?)</p><p>Unlike COVID, this is (we hope!) essentially a closed case, and uncontested. It nevertheless rests on a huge body of accumulated and interacting knowledge which enabled scientists (and the officials and public supporting them) to move forward with confidence.</p><p>The key challenge here may be in probing this argument for its dependencies and key considerations, and perhaps noting the weakest or most speculative points &#8212; all in an accessible way.</p><h3><strong>Eggs</strong></h3><p>Are eggs good to eat? Bad to eat? Great in moderation? How can we tell? Does it vary across people, and what predicts this? What else should we be paying attention to here?</p><p>This vague and open-ended topic, though mundane, is representative of a huge number of everyday questions &#8212; and hopefully also a microcosm of many more impactful debates. 
Sometimes getting resolution on <em>what are the important things to answer</em> and <em>what are the appropriate ways of knowing</em> is (more than) half of the challenge.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Forecasting the shape and capability of future AI is difficult, but we are excited to imagine a world where epistemic investigations of this (and greater!) quality are commonplace. We&#8217;re aiming to catalyse that path through activities like this competition.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>By <em>structure</em>, we mean capturing the relations between different sources, claims, authors, and so on. Who said what and when? What evidence or reasons support that? What counterarguments exist or reasons for doubt? We give further ideas below. 
Keeping this structure alive means less loss by compression, and preserving <a href="https://aiprospects.substack.com/p/when-ideas-round-to-false">space for nuance</a> &#8212; even if we don&#8217;t consume it right away.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>The human urge to apply technology to knowledge-provision isn&#8217;t new: consider libraries, citations, indexes, encyclopedias (including Wikipedia), databases, web search &#8212; all of which push the frontier in this space.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>Written discussions should aim to not exceed 10 pages, not including appendix-like material and worked examples. Worked examples and fully-fledged example knowledge bases can be arbitrarily sized (within reason) but should be navigable. Consider including curated pointers to particularly effective regions of worked examples. Code should either be brief, legible (pseudo)code or well-documented and ready to install and run with close to a single click. 
<a href="https://docs.google.com/document/d/1wtKAjpvEiMWn-RpFDi_2Vqcvt5i3sCFPmUt3MtsKOjo/edit?tab=t.s5p8ga2p1drq">See here</a> for more detail.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>Ideally such that judges can easily reimplement it on a new case.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>One type of further work might be incorporating workflows into forecasting and prediction &#8212; perhaps grounded in forecasting bot competitions.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>Rootclaim thinks that one reason they lost the debate was that the &#8220;structure provided a major advantage to the debater with more memorized knowledge of the issue&#8221;.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-8" href="#footnote-anchor-8" class="footnote-number" contenteditable="false" target="_self">8</a><div class="footnote-content"><p>We envision these eventually becoming living knowledge bases, not merely snapshots in time.</p></div></div>]]></content:encoded></item><item><title><![CDATA[The main impact from automated AI production: concentration of power?]]></title><description><![CDATA[Not all at once, and perhaps not kept in the same hands that first held it]]></description><link>https://www.oliversourbut.net/p/the-main-impact-from-automated-ai</link><guid isPermaLink="false">https://www.oliversourbut.net/p/the-main-impact-from-automated-ai</guid><dc:creator><![CDATA[Oliver 
Sourbut]]></dc:creator><pubDate>Sun, 31 May 2026 20:39:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!n09L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34c9f40c-a2a8-4160-acd0-baef73f1676e_2040x1038.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>There&#8217;s a lot of talk about <em>automated AI R&amp;D</em> and the like. It&#8217;s been discussed since <a href="https://intelligence.org/ie-faq/#elementor-toc__heading-anchor-1">at least 1965 when statistician I.J. Good coined the term &#8216;intelligence explosion&#8217;</a>:</p><blockquote><p>an ultraintelligent machine could design even better machines; there would then unquestionably be an &#8216;intelligence explosion&#8217;, and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make. &#8212; <em>Good, 1965, Speculations concerning the first ultraintelligent machine.</em></p></blockquote><p>Since the 2000s, a few researchers (recently increasingly many) have taken seriously the prospect that some kind of <em>intelligence feedback</em> mechanism could kick in around the time that AI becomes competent enough to contribute to its &#8216;own&#8217; (or rather, its successors&#8217;) development.</p><p>This is increasingly recognised as a prospect we should take seriously. It&#8217;s been discussed somewhat furtively by leading AI developers for many years, but in late 2025 it became a public talking point. Sam Altman, CEO of OpenAI, recently <a href="https://x.com/olysourbut/status/1991504547831800102">set a target of fully automated AI-improving-AI by 2028</a>. 
Jack Clark, co-founder of Anthropic, writes &#8216;<a href="https://importai.substack.com/p/import-ai-455-automating-ai-research">AI systems are about to start building themselves</a>&#8217;.</p><p>I&#8217;m not going to address the resulting prospect of <em>speedup</em> here, which I think is real, though perhaps surprisingly modest.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> (If, like some of my friends, you think I&#8217;m underweighting this prospect, you should probably be <em>even more</em> concerned about concentration of influence.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>)</p><p>Rather, this note is briefly making the rather simple (but I think rather overlooked) observation: <strong>plausibly automated AI R&amp;D would enable a (perhaps severe) </strong><em><strong>concentration of influence</strong></em><strong>&#8230; and this could be the </strong><em><strong>most important</strong></em><strong> effect to be paying attention to.</strong></p><p>It&#8217;s a fairly obvious point once made. A more thorough analysis might shed light on how likely this is, the most important consequences, and any leading indicators to pay attention to. I&#8217;ll only make some starting gestures toward these.</p><p>As I presaged in <em><a href="https://www.oliversourbut.net/p/engineering-a-safer-world">Engineering a Safer World</a></em>,</p><blockquote><p><strong>R&amp;D automation</strong>&#8230;: fewer human participants means fewer whistleblowers, less internal scrutiny, less governance and decisionmaking robustness. More concentration of that influence. That could mean more <em>single points of failure</em>. It&#8217;s famously difficult to maintain a conspiracy of more than one or two people: &#8216;two can keep a secret if one is dead&#8217;, as they say! 
And besides conspiracy, compared to larger teams, individuals and small groups may be far more susceptible to capture, coercion, corruption, or plain foolishness and rash decisionmaking.</p></blockquote><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.oliversourbut.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Subscribe &#8212; don&#8217;t miss out due to foolishness and rash decisionmaking!</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p><h2>Whose influence?</h2><p>To be clear, you don&#8217;t need to be concerned about any <em>specific</em> people currently in positions of particular influence to think that concentration could be very impactful (and concerning).</p><p>The wisest supreme leader is still a single point of failure.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a> Especially during a period of tumult, leaders can be displaced and reporting structures revolutionised. Concentration of influence &#8212; even the <em>prospect</em> of that concentration &#8212; simultaneously raises the incentives for corruption and power struggle (while raising the stakes for the rest of us). Meanwhile, power corrupts and a centralised decisionmaker can rarely account for all the relevant considerations, even if they mean to. 
In AI, we&#8217;ve already seen company leadership <a href="https://www.ft.com/content/8de92f3a-228e-4bb8-961f-96f2dce70ebb">falling out over differences</a> of opinion, fighting it out <a href="https://techcrunch.com/2025/06/03/the-openai-board-drama-is-reportedly-turning-into-a-movie/">in the boardroom</a> and <a href="https://en.wikipedia.org/wiki/Musk_v._Altman">in court</a>, and even <a href="https://en.wikipedia.org/wiki/Anthropic%E2%80%93United_States_Department_of_Defense_dispute">clashing with governments</a>. Turf war over the means of production of AI (which may increasingly equal the means of production full stop).</p><p>The point is that losing checks and balances can be a problem whoever is at the top.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a> As in politics, so in the heights of industry &#8212; which, today, and increasingly, firmly includes AI.</p><p>Suffice to say that I imagine the kinds of roles in the AI oversight chain which might be (nominally, initially) preserved longest are senior research directors, company executives, and closely-involved regulatory or executive government officials.</p><p>Perhaps more important: whose influence <em>lost</em>? Directly, the skilled human labour which today is involved in researching, engineering, and deploying frontier AI systems. Via them, their wider research networks, social ties, and any societal oversight they might have provided (e.g. whistleblowing to journalists or authorities). Plausibly also the marketers, account managers, even lawyers and others who interface with what a company is <em>doing</em> with its AI products, and <em>their</em> networks. (If not themselves replaced, they&#8217;d lose collegial understanding and influence via the replaced researcher-implementers). 
In short: everyone else.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!n09L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34c9f40c-a2a8-4160-acd0-baef73f1676e_2040x1038.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!n09L!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34c9f40c-a2a8-4160-acd0-baef73f1676e_2040x1038.png 424w, https://substackcdn.com/image/fetch/$s_!n09L!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34c9f40c-a2a8-4160-acd0-baef73f1676e_2040x1038.png 848w, https://substackcdn.com/image/fetch/$s_!n09L!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34c9f40c-a2a8-4160-acd0-baef73f1676e_2040x1038.png 1272w, https://substackcdn.com/image/fetch/$s_!n09L!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34c9f40c-a2a8-4160-acd0-baef73f1676e_2040x1038.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!n09L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34c9f40c-a2a8-4160-acd0-baef73f1676e_2040x1038.png" width="1456" height="741" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/34c9f40c-a2a8-4160-acd0-baef73f1676e_2040x1038.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:741,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!n09L!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34c9f40c-a2a8-4160-acd0-baef73f1676e_2040x1038.png 424w, https://substackcdn.com/image/fetch/$s_!n09L!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34c9f40c-a2a8-4160-acd0-baef73f1676e_2040x1038.png 848w, https://substackcdn.com/image/fetch/$s_!n09L!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34c9f40c-a2a8-4160-acd0-baef73f1676e_2040x1038.png 1272w, https://substackcdn.com/image/fetch/$s_!n09L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34c9f40c-a2a8-4160-acd0-baef73f1676e_2040x1038.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>Sketched using claude.ai</em></figcaption></figure></div><h2>Concentration without (overt) malice</h2><p>Why shed the researcher-implementers, the deployment engineers, and others (or sideline them from the important &#8216;real&#8217; responsibilities)? It doesn&#8217;t have to be a power-grab, at first (or even at all) &#8212; it&#8217;s just good business! 
Human employees (especially software experts) are incredibly expensive compared with AI.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a></p><p>Beyond this, analogously to areas where AI already surpasses humans (like chess), human participants may positively <em>get in the way</em>, slowing and compromising what would be not only a <em>cheaper</em> but also a <em>more effective</em> AI-only workflow.</p><p>A company might choose to instead eat this cost &#8212; perhaps it prefers to exhibit loyalty to current staff, acknowledges concerns around erosion of human oversight, or is inherently conservative in that way. But in a competitive environment (either commercially or in strategic and security terms), that may be tantamount to inviting defeat or irrelevance, if competitors are charging ahead.</p><p>At some point shareholders and other stakeholders might think it irresponsible <em>not</em> to embrace a replacement AI workforce. Ultimately this could add up to far fewer people having insight into what&#8217;s going on or a say in its direction. Even if you can&#8217;t (or don&#8217;t want to) immediately fire them, you might deliberately or even passively sideline your human employees as more and more work is carried out by machines.</p><h2>Influence over what?</h2><p>Notably, boards, investors, auditors, journalists, and regulators (who already often suffer from information scarcity) &#8212; and the broader society they represent &#8212; could also be substantially cut out if logging, analysis, PR, and incident reporting are automated&#8230; This might only require the usual levels of corporate paltering rather than special malice. It might even be prosocially-motivated if the relevant actors are (justifiably or not) concerned about a power-grab <em>by government</em>.
(It could of course also be deliberately selfish.)</p><p>Conversely, a government or regulator which strong-arms its way into the midst of the development process might <em>itself</em> become the locus of concentrated influence. (This in turn might be superficially or genuinely motivated by national security concerns.) The more automated a firm, the easier it may be to seize, either overtly or through more subtle channels. When employees are critical pieces in a production process, that&#8217;s a taller order.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qhgA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3750279-11af-4ac2-951a-3edaa3a71b12_2040x1038.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qhgA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3750279-11af-4ac2-951a-3edaa3a71b12_2040x1038.png 424w, https://substackcdn.com/image/fetch/$s_!qhgA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3750279-11af-4ac2-951a-3edaa3a71b12_2040x1038.png 848w, https://substackcdn.com/image/fetch/$s_!qhgA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3750279-11af-4ac2-951a-3edaa3a71b12_2040x1038.png 1272w, https://substackcdn.com/image/fetch/$s_!qhgA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3750279-11af-4ac2-951a-3edaa3a71b12_2040x1038.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qhgA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3750279-11af-4ac2-951a-3edaa3a71b12_2040x1038.png" width="1456" height="741" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d3750279-11af-4ac2-951a-3edaa3a71b12_2040x1038.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:741,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qhgA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3750279-11af-4ac2-951a-3edaa3a71b12_2040x1038.png 424w, https://substackcdn.com/image/fetch/$s_!qhgA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3750279-11af-4ac2-951a-3edaa3a71b12_2040x1038.png 848w, https://substackcdn.com/image/fetch/$s_!qhgA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3750279-11af-4ac2-951a-3edaa3a71b12_2040x1038.png 1272w, https://substackcdn.com/image/fetch/$s_!qhgA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3750279-11af-4ac2-951a-3edaa3a71b12_2040x1038.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft 
pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>Sketched using claude.ai</em></figcaption></figure></div><p>I&#8217;m not suggesting any of this leads (immediately) to influence <em>over all the world</em> (though if you <em>do</em> take over the world, please consider <a href="https://strangecities.substack.com/p/so-youve-taken-over-the-world">my friend Owen&#8217;s advice</a> for what to do if you find yourself in that situation). Nor even over an entire economy or state. 
But minimally, this looks like highly concentrated influence over a <em>frontier AI company</em>: presently a rare, increasingly lucrative, and burgeoningly politically-influential machine.</p><p>Activity at that frontier &#8212; the decisionmaking by leading AI developers &#8212; stands a chance of being among the most economically, politically, and strategically important activity in the coming years. If the <a href="https://www.oliversourbut.net/p/is-the-cat-out-of-the-bag">cat is out of the bag</a> with AI, that frontier, <em>steered wisely</em>, might be an important part of <a href="https://helentoner.substack.com/p/nonproliferation-is-the-wrong-approach">a societal resilience and readiness</a> process (compare the recent <a href="https://www.anthropic.com/glasswing">Project Glasswing</a> in cybersecurity). Used foolishly or even maliciously, the frontier <a href="https://www.oliversourbut.net/p/engineering-a-safer-world?open=false#%C2%A7vicious-cycles">might degrade societal capacity</a> right at a time when it could be most important.</p><h3>Influence&#8230; or power?</h3><p>To repeat: I don&#8217;t think it&#8217;s given (or even particularly likely) that the <em>starting</em> possessors of a hypothetical concentration of influence would remain in control. From the point of view of a <em>rogue AI</em>, for example, control over a frontier AI company might be a delicious opportunity for expansion, defence, and societal dependency &#8212; ideal routes to <a href="https://www.oliversourbut.net/p/un-unpluggability-can-t-we-just-unplug-it">escalated un-unpluggability</a>. Likewise from the point of view of a power-hungry politician or corporate psychopath. 
Notice that corruptibility and coercibility are perhaps more plausible with fewer hands: it&#8217;s less practical to threaten, buy off, or muscle out a whole team or company than a single person.</p><p>So if this hypothetical automation-driven concentration of influence is a concentration of <em>power</em>, I don&#8217;t expect the ultimate wielder to be worthy of it. This marks a true difference of opinion with some others I know, who might think it more likely that, say, some current AI company leader could both accrue and keep hold of this level of power, and also use it wisely.</p><p>The obvious paths to <em>hard</em> power via AI are in surveillance &#8212; newly scalable with AI analysts scrutinising as many people&#8217;s movements as desired &#8212; and in development and automated production of flexible, high-range tools of coercion like weaponised drones. Other kinds of power might come from positions of economic leverage or narrative and propaganda dominance, conceivable with advanced-enough AI and without <a href="https://www.forethought.org/research/design-sketches-collective-epistemics">societal defences</a>. That&#8217;s one point at which <em>concentration</em> could become <em>entrenchment</em> of power.</p><h3>Concretely, what might be different?</h3><p>Let&#8217;s really quickly gesture at two cases which might be substantially different in a world where AI production is largely automated.</p><p>Imagine a near future: frontier AI company revenues are in the <a href="https://epoch.ai/data/ai-companies?view=graph&amp;tab=revenue">100s of $billions or $trillions</a>, AI services are used in civil services, executive decisionmaking, <a href="https://linch.substack.com/p/claude-author-of-the-humanitas">faith leadership</a>, &#8230; That&#8217;s a lot of points of leverage for subtle manipulation.
Suppose a leader (selfishly or under pressure) intends to inject <a href="https://newsletter.forethought.org/p/a-research-agenda-for-secret-loyalties">secret loyalties</a> into their products. If they&#8217;re the sole controller without checks and balances, that could be trivial, with obviously concerning consequences. <em>Without</em> automation-replacement, there are perhaps dozens or hundreds of people with reasonable oversight per major release, and several people even for any given small change. That&#8217;s a far more difficult system to twist.</p><p>Alternatively, consider a company on the brink of <a href="https://keepthefuturehuman.ai/chapter-5-at-the-threshold/">truly autonomous, general, human-surpassing AI</a>: a huge responsibility. Many ambitious but underwrought <em>alignment targets</em> have been suggested for such a project, many of them <a href="https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities#Section_B_2___Central_difficulties_of_outer_and_inner_alignment_">foolish, perhaps fatally so</a>. A hubristic leader without serious checks might withdraw from critique, double down (perhaps sycophantically egged on by AI toadies), and drive through a ruinous agenda. Society at large might be none the wiser (or at least insufficiently alert and able to intervene). A leader <em>willingly</em> taking risks (perhaps with a selfish or misguided vision of unfathomable rewards) would be even more of a problem. On the other hand, a leadership more dependent on staff and teams may receive more psychological and cognitive support, scrutiny, moral encouragement, and so on. A literal healthcheck.
Of course, human teams can be subject to concerning groupthink (not unheard of at AI development companies!), but the more off-base a suggestion is (ethically or pragmatically), the more likely it is that the collective intelligence of the team corrects it.</p><p>Any number of scenarios can be considered; rarely do they appear more promising from a societal perspective if one or a few leaders are insulated from scrutiny and critique.</p><h2>What&#8217;s the outlook?</h2><p>It looks very unclear to me. Plausibly it&#8217;s in the balance, and decisions and conversations now will shape how this plays out!</p><p>There are some reasons to think that (near-) automation of AI production could instead produce <em>equal</em> or even <em>improved</em> societal oversight.</p><p>For one, the sometimes unintuitive effect of automation is to increase per-worker productivity, making it more worthwhile to bring <em>more</em> humans into the process (even on myopic financial grounds). Historically, though, this effect has often presaged a sharp decline once automation reaches a sufficient level. This might make concentration actually <em>more difficult</em> for a time. I don&#8217;t expect this to play out for very long in AI frontier research<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a>, but I&#8217;m somewhat uncertain about this.</p><p>Directed sensibly, a glut of researcher-grade AI worker-equivalents <em>could</em> be tasked with analysis, scrutiny, decisionmaking robustness, even auditing and whistleblowing. There&#8217;s nothing in principle preventing this, for any given level of trustability in our AI. Perhaps we&#8217;ll end up bringing <em>more</em> people into the process in the increasingly important, complementary roles of oversight, decisionmaking, and direction-setting.
These folks could be <a href="https://www.oliversourbut.net/p/ai-for-human-reasoning-for-you">made more effective with AI tools</a> and assistance. That&#8217;s a decision to be made.</p><p>As I suggested earlier, leaders perhaps soon faced with these temptations may balk, either at the moral prospect&#8230; or at the pragmatic prospect of being subsequently replaced, captured, or corrupted in the ways I hinted at.</p><p>Perhaps it&#8217;s a tenuous balancing act. <em>Internally</em> to a given AI producer, we might want to guarantee a large enough &#8216;in the loop&#8217; workforce that we can trust internal deliberation and whistleblowing and so on to keep things on the rails. <em>Outside</em> that, we might want to avoid <a href="https://www.oliversourbut.net/p/is-the-cat-out-of-the-bag">no-holds-barred proliferation</a> of AI-producing capability &#8212; but we probably want at least a few developers to keep each other in check and reduce monopoly or kingmaker dynamics.</p><p>I don&#8217;t expect any such concentrations of influence to play out overnight. <a href="https://www.oliversourbut.net/p/engineering-a-safer-world">Loss of control is a process, not a moment</a>. AI development organisations &#8212; and society at large &#8212; should be paying attention and having the necessary conversations. 
Much to look out for, and perhaps much to look forward to!</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p><a href="https://www.oliversourbut.net/p/a-slightly-mechanistic-theory-for">Task-relevant data and compute</a> look like the more biting bottlenecks, and <a href="https://www.oliversourbut.net/p/you-cant-skip-exploration">research taste accrues mainly through expensive experimentation</a>.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>(unless you expect such an accelerated singularity that prior influence is all that matters)</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>And may I be forgiven for slandering the current crop of potential supreme leaders as <em>not very wise</em>.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>I would note that we usually need <em>some</em> concentrated executive and representative power in various roles in society &#8212; but we want those roles to be sufficiently monitored, and we want the processes which move people into (and out of) them to be sufficient to produce worthy selections.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>This &#8216;humans expensive, AI
cheap&#8217; point is worth taking some care over. Contemporary frontier AI training is getting more expensive by the year. <em>Running</em> the best AI can be increasingly expensive, because &#8216;thinking harder&#8217; is an effective way to scale capabilities on current margins. But once a capability is unlocked at the frontier, it typically becomes rapidly and radically cheaper, due to ongoing rises in compute efficiency and the ability to distill long thoughts into quicker reflexes. Further, you only need to train AI to do something once, and then it can be copied and run as many times as you want; it doesn&#8217;t need to sleep or take family time or get sick, etc.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>For example, witness the periodic furore among tech company employees when their work is used toward military ends (<a href="https://en.wikipedia.org/wiki/Anthropic%E2%80%93United_States_Department_of_Defense_dispute">latest</a>).</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>One reason is that this research does not scale very well in parallel, so returns diminish steeply per worker &#8212; and we&#8217;re already imagining AI that can substantially slot in <em>as a cheaper replacement worker</em> in most cases anyway.</p></div></div>]]></content:encoded></item><item><title><![CDATA[A (Slightly) Mechanistic Theory for Exponentially Increasing AI Time Horizons?]]></title><description><![CDATA[AI &#8216;time horizons&#8217; are mostly not about time (I think it&#8217;s mostly &#8216;data&#8217;, but you&#8217;ll see where I&#8217;m
unsure)]]></description><link>https://www.oliversourbut.net/p/a-slightly-mechanistic-theory-for</link><guid isPermaLink="false">https://www.oliversourbut.net/p/a-slightly-mechanistic-theory-for</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Sun, 24 May 2026 15:50:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!WkdD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F461157c5-eced-4827-b347-4d2d7a00d371_1300x776.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>One chart from 2025 has become perhaps the most (in)famous in modern AI commentary.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WkdD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F461157c5-eced-4827-b347-4d2d7a00d371_1300x776.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WkdD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F461157c5-eced-4827-b347-4d2d7a00d371_1300x776.png 424w, https://substackcdn.com/image/fetch/$s_!WkdD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F461157c5-eced-4827-b347-4d2d7a00d371_1300x776.png 848w, https://substackcdn.com/image/fetch/$s_!WkdD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F461157c5-eced-4827-b347-4d2d7a00d371_1300x776.png 1272w, https://substackcdn.com/image/fetch/$s_!WkdD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F461157c5-eced-4827-b347-4d2d7a00d371_1300x776.png 
1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WkdD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F461157c5-eced-4827-b347-4d2d7a00d371_1300x776.png" width="1300" height="776" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/461157c5-eced-4827-b347-4d2d7a00d371_1300x776.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:776,&quot;width&quot;:1300,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WkdD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F461157c5-eced-4827-b347-4d2d7a00d371_1300x776.png 424w, https://substackcdn.com/image/fetch/$s_!WkdD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F461157c5-eced-4827-b347-4d2d7a00d371_1300x776.png 848w, https://substackcdn.com/image/fetch/$s_!WkdD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F461157c5-eced-4827-b347-4d2d7a00d371_1300x776.png 1272w, https://substackcdn.com/image/fetch/$s_!WkdD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F461157c5-eced-4827-b347-4d2d7a00d371_1300x776.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For those in the know, &#8216;<a href="https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/">the METR graph</a>&#8217;<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> is unusually compelling because it achieves what so few measures of AI progress have achieved: a <em>somewhat</em> meaningful Y axis (&#8216;time horizon&#8217;<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>) <em>as well as</em> a somewhat predictable trend over time! 
(This is <a href="https://forum.effectivealtruism.org/posts/P8jsAySQzfgkeoDgb/ai-benchmarking-has-a-y-axis-problem">remarkably rare</a>!)</p><p>Frustratingly, the only <em>superficially</em> available takeaway is something like, &#8216;the line goes up straight-ish over time&#8217;. This is better than nothing, but it&#8217;s very unsatisfactory from the point of view of gaining confidence in the predictions, because it exposes no deeper mechanism. This drives a lot of confusion and argument about the implications.</p><p>A deeper mechanism would be good for two reasons:</p><ul><li><p>It enables a sanity check on the trend, perhaps enabling more confidence in its predictions than we would sensibly allow with only the surface understanding.</p></li><li><p>It gives some way to interrogate when and how the trend might <em>change</em> (because if the deeper mechanism gets deflected, the superficial projection would be broken, but a prediction based on the deeper mechanism might stay viable for longer).</p><ul><li><p>(A sub-reason: if we <em>want</em> the trend to change, knowing some more mechanism might shed light on some levers to pull rather than sitting around to wait and see.)</p></li></ul></li></ul><p>As an analogy, a similarly superficial trend, <a href="https://ourworldindata.org/moores-law">Moore&#8217;s Law</a>, can be a little better mechanistically explained by the more general <a href="https://ourworldindata.org/learning-curve">Wright&#8217;s Law</a><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>. This is great, because that law covers more cases, and it can handle some deflection from the trend, or give some idea of when (and under what conditions) the trend might break.
Important when looking at plausible futures, and how to steer toward desirable ones!</p><h2>Attempting to find some mechanism in the METR graph</h2><p><em>Warning: mild maths incoming (consider reading <a href="https://www.lesswrong.com/posts/zT76JcomKkdqo8tC6/a-slightly-mechanistic-theory-for-exponentially-increasing">on LessWrong</a> for better rendering)</em></p><h3>Task &#8216;length&#8217; and success modelling</h3><p>Why did METR focus on &#8216;task length&#8217;?</p><p>First, it&#8217;s not how long the AI agent takes. It&#8217;s how long the task in question takes a panel of sampled human experts, on average.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a> So in their &#8216;time horizon&#8217; measurements, METR is capturing the <em>effective hours of human-expert-equivalent activity</em> that AI agents can carry out.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a></p><p>One way to think about the time it takes human experts to complete a task is that, for each <em>subtask</em> they had to know how to do (or be able to figure out how to do) and then successfully execute, the overall task takes incrementally longer. By how much? 
That depends on exactly what &#8216;subtasks&#8217; we&#8217;re imagining breaking things down into.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a> But <em>on average</em> longer tasks correspond to more distinct challenges, all else equal.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SVRC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e971a8e-4d8e-47db-971c-71bdcd973b6e_1480x960.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SVRC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e971a8e-4d8e-47db-971c-71bdcd973b6e_1480x960.png 424w, https://substackcdn.com/image/fetch/$s_!SVRC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e971a8e-4d8e-47db-971c-71bdcd973b6e_1480x960.png 848w, https://substackcdn.com/image/fetch/$s_!SVRC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e971a8e-4d8e-47db-971c-71bdcd973b6e_1480x960.png 1272w, https://substackcdn.com/image/fetch/$s_!SVRC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e971a8e-4d8e-47db-971c-71bdcd973b6e_1480x960.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!SVRC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e971a8e-4d8e-47db-971c-71bdcd973b6e_1480x960.png" width="1456" height="944" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2e971a8e-4d8e-47db-971c-71bdcd973b6e_1480x960.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:944,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SVRC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e971a8e-4d8e-47db-971c-71bdcd973b6e_1480x960.png 424w, https://substackcdn.com/image/fetch/$s_!SVRC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e971a8e-4d8e-47db-971c-71bdcd973b6e_1480x960.png 848w, https://substackcdn.com/image/fetch/$s_!SVRC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e971a8e-4d8e-47db-971c-71bdcd973b6e_1480x960.png 1272w, https://substackcdn.com/image/fetch/$s_!SVRC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e971a8e-4d8e-47db-971c-71bdcd973b6e_1480x960.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption">A random generation of tasks (rows) with &#8216;subtasks&#8217; as segments, sorted by subtask count from least to most. You can see that the more subtasks, the longer, on average. It&#8217;s a little ragged &#8212; not all subtasks are the same length, so occasionally fewer, longer subtasks add up to more overall time than more, shorter subtasks. What METR can easily measure is the overall duration. Even if the subtask division is somewhat subjectively defined, duration stands as a reasonable proxy for it. Note that the vertical subtask count axis is sorted but not uniformly spaced. 
(Created with claude.ai.)</figcaption></figure></div><p><strong>This is the first piece of mechanism we should take into account.</strong> &#8216;Time&#8217; is not <em>agent time</em>: it&#8217;s a noisy estimate for &#8216;number of somewhat challenging requirements necessary to complete the task&#8217;.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a></p><p>This is treating <em>overall</em> tasks as formed by something like drawing &#8216;subtasks&#8217; out of a large collection of possible requirements. Given the agent&#8217;s general competence, specific knowledge, tools available, and opportunity to retry or learn on the fly, sometimes the agent can meet these requirements. Other times it can&#8217;t.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a> &#8216;Longer&#8217; tasks simply draw more subtasks (that&#8217;s why they&#8217;re &#8216;longer&#8217;, in this model: expert humans had more subtasks they needed to carry out).<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-10" href="#footnote-10" target="_self">10</a></p><p>Toby Ord <a href="https://www.tobyord.com/writing/half-life">demonstrates one way to take this intuition further</a>, noting that if we explicitly model overall success <em>S</em> according to a simple model where chance of failure compounds with task &#8216;length&#8217;, <em>t</em>, we get a reasonable fit for the data METR collected. (Interestingly Toby mainly seems to continue treating this as &#8216;agent time&#8217;. 
I&#8217;ll instead take as given that we&#8217;re talking about a proxy for number of subtasks.)</p><p>In other words, for a given AI agent and task domain, there&#8217;s something like a &#8216;hazard rate&#8217;, <em>P</em> (per-subtask probability of failure), which reasonably well summarises (and predicts) the AI&#8217;s level of success in that domain:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;S(t) = (1-P)^t&quot;,&quot;id&quot;:&quot;MGVXSTPEAH&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><p>(i.e. to succeed at a <em>t</em>-step task, the agent must <em>not fail</em> &#8212; must <em>avoid</em> the &#8216;hazard&#8217; <em>P</em> &#8212; <em>t</em> times.)</p><p>This enables us to translate back and forth between an estimate of this hazard rate <em>P</em> and an estimate of a &#8216;half-life&#8217; or 50% success horizon &#8212; how &#8216;long&#8217; (i.e. complex) a task needs to be before the agent fails more often than not &#8212; and also to extrapolate to &#8216;durations&#8217; corresponding to other reliability levels, like 99% or 99.9%.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-11" href="#footnote-11" target="_self">11</a></p><p>In this formulation, the hazard rate, <em>P</em>, stands in for what fraction of our &#8216;subtask&#8217; pool the agent can&#8217;t (yet) succeed at, which ends up being a reasonable summary of the agent&#8217;s competence in this domain.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-12" href="#footnote-12" target="_self">12</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-i9y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a81aba4-8733-4848-9f0b-8dc1a84ae8ee_1480x1008.png" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-i9y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a81aba4-8733-4848-9f0b-8dc1a84ae8ee_1480x1008.png 424w, https://substackcdn.com/image/fetch/$s_!-i9y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a81aba4-8733-4848-9f0b-8dc1a84ae8ee_1480x1008.png 848w, https://substackcdn.com/image/fetch/$s_!-i9y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a81aba4-8733-4848-9f0b-8dc1a84ae8ee_1480x1008.png 1272w, https://substackcdn.com/image/fetch/$s_!-i9y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a81aba4-8733-4848-9f0b-8dc1a84ae8ee_1480x1008.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-i9y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a81aba4-8733-4848-9f0b-8dc1a84ae8ee_1480x1008.png" width="1456" height="992" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4a81aba4-8733-4848-9f0b-8dc1a84ae8ee_1480x1008.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:992,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!-i9y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a81aba4-8733-4848-9f0b-8dc1a84ae8ee_1480x1008.png 424w, https://substackcdn.com/image/fetch/$s_!-i9y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a81aba4-8733-4848-9f0b-8dc1a84ae8ee_1480x1008.png 848w, https://substackcdn.com/image/fetch/$s_!-i9y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a81aba4-8733-4848-9f0b-8dc1a84ae8ee_1480x1008.png 1272w, https://substackcdn.com/image/fetch/$s_!-i9y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a81aba4-8733-4848-9f0b-8dc1a84ae8ee_1480x1008.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption">This time, we&#8217;re looking at overall task success as if the agent has a 98% chance of meeting any particular subtask&#8217;s requirements. Sometimes a shorter task will happen to have one of the difficult subtasks &#8212; but usually they&#8217;re overall successful. As tasks get longer, there&#8217;s a greater chance that at least one subtask requirement is insurmountable at this reliability level. Among longer tasks, overall success becomes fewer and farther between. This agent can&#8217;t expect to often succeed on tasks longer than 50 or so subtasks.</figcaption></figure></div><p>If you have a new task, you don&#8217;t know if the agent has all it needs to complete it. 
But the task &#8216;length&#8217; is an indicator of how many tricky subtasks it has, and similar-lengthed tasks will have similar numbers of such subtasks &#8212; so their average success rate is a good estimate for how likely the agent is to succeed at this new task.</p><h3>Relating hazard rate with frontier AI development</h3><p>METR&#8217;s graph is compelling because it suggests a steadily increasing frontier of success horizon as AI developers produce new agents over time.</p><p>What does this imply if we interrogate our hazard rate model? Well, &#8216;half-life&#8217; (and indeed various success-level horizons) is observed apparently growing exponentially with date <em>D</em>:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;t_{1/2} = \\alpha 2^{\\beta D}&quot;,&quot;id&quot;:&quot;NFBVIAZAUH&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><p>This is the central striking takeaway from the METR graph (modulo their measurement uncertainty). 
Half-life go up!</p><p>But according to our model, half-life satisfies:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;t_{1/2} = \\frac {1} {- \\log_2 \\left( {1-P} \\right)}&quot;,&quot;id&quot;:&quot;AOMLSUXVVC&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><p>where <em>P</em> is the per-step hazard rate from before. When <em>P</em> is small (well below 1), &#8722;ln(1 &#8722; <em>P</em>) &#8776; <em>P</em>, so that half-life is, fairly intuitively, approximately proportional to the reciprocal of the hazard rate:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;t_{1/2} \\propto \\frac 1 P&quot;,&quot;id&quot;:&quot;RBSNLXGXLG&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><p>So <strong>METR&#8217;s observation of rising time horizons is equivalent to saying that the frontier </strong><em><strong>hazard rate</strong></em><strong> is </strong><em><strong>shrinking</strong></em><strong> exponentially over time</strong>.</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;P \\propto 2^{-\\beta D}&quot;,&quot;id&quot;:&quot;WSEZQMQFVN&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><p>Recall that this hazard rate corresponds to the fraction of &#8216;subtasks&#8217; in a domain that an agent doesn&#8217;t yet know how to complete. So this fraction is presumed to shrink roughly exponentially with date, in turn driving the observed &#8216;longer&#8217; success horizons.</p><h3>Why does hazard rate shrink with date?</h3><p>Here&#8217;s where to look for the next bit of mechanism. Why would the hazard rate, the fraction of &#8216;subtasks&#8217; which remain out of reach, shrink in that way?</p><p>It goes without saying that AI developers are chasing after increasing competence in their products, so (if they are doing anything at all right!) the direction of movement is unsurprising. 
Why that particular roughly-exponential form, though?</p><p>I confess that here I&#8217;m uncertain, and the quest for more mechanism continues.</p><p><strong>My best guess is that it&#8217;s about the </strong><em><strong>effective evidence</strong></em><strong> available to the agent about how to solve each subtask.</strong> Intuitively, if you&#8217;ve seen <em>very similar</em> subtasks <em>many</em> times before, it&#8217;s hard to go too wrong. If you&#8217;ve only seen <em>vaguely</em> similar subtasks once or twice, you&#8217;re in much less familiar territory and stand a good chance of stalling. Suggestively, effective evidence and training data are both <em><a href="https://en.wikipedia.org/wiki/Information_content">information-like</a></em> quantities, but I don&#8217;t want to make too much of that without a crisper connection. Formally, we could consider how many <em>bits</em> of evidence the agent can muster about how to proceed (either from past learning or by <a href="https://www.oliversourbut.net/p/you-cant-skip-exploration">exploring in context</a>).</p><p>In other words, <em>training</em> produces <em>learnings</em>. These range from broad, <em>generally-applicable</em> heuristics for adaptable, effective behaviour (experiment, test your work, notice when something surprising happens, read the manual if you can find one, accrue power and resources at any opportunity, ...), to narrow <em>specific</em> details about particular situations and activities (Earth&#8217;s radius is roughly 6.4 megameters, detonating TNT yields roughly 4.2 kJ/g, humans succumb to oxygen deprivation after around 5 minutes, &#8230;). 
Ahem.</p><p>Empirically, AI developers have historically poured <em>something like</em> exponentially increasing &#8216;quantities&#8217; of &#8216;data&#8217; into their machine learning pipelines.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-13" href="#footnote-13" target="_self">13</a> Mathematically, combined with the exponentially shrinking hazard rate above, that implies a <em><a href="https://en.wikipedia.org/wiki/Power_law">power law</a></em>: data inputs <em>n_train</em> rising at one exponential rate, matched by hazard rate <em>P</em> decaying at another exponential rate.</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\begin{gather}\nn_{\\text{train}} \\propto 2^{\\gamma D} \\\\\n\nP \\propto n_{\\text{train}}^{-\\beta/\\gamma}\n\\end{gather}&quot;,&quot;id&quot;:&quot;HZNGQVDKCQ&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><p>Power laws aren&#8217;t <em>deeply</em> mechanistically explanatory, but they&#8217;re often the <a href="https://en.wikipedia.org/wiki/Neural_scaling_law">best we have</a> in machine learning, and are at least more predictable than mere date-based trends. Under the simple subtask model described here, this power law translates directly into a power law between &#8216;time horizon&#8217; and data. This is actually the same level of explanatory improvement offered by Wright&#8217;s Law over Moore&#8217;s: not fully mechanistic, but an extra layer of detail which offers firmer purchase on what&#8217;s going on.</p><p>What this doesn&#8217;t straightforwardly account for is the benefit to success rates of increased <em>in-context reasoning</em>, which METR&#8217;s estimates do exhibit. I expect this is operating on those borderline subtasks &#8212; where the agent would have some slim chance of satisfying them if it &#8216;rushed&#8217;. 
In those cases, &#8216;<a href="https://www.oliversourbut.net/p/better-than-logarithmic-returns-to">thinking harder</a>&#8217; may more effectively recall and combine the relevant learned knowledge, and allow better choices for <a href="https://www.oliversourbut.net/p/you-cant-skip-exploration">exploratory discovery in situ</a>. In any case, changing the thinking budget of an otherwise similar existing system certainly calls for a more mechanistic understanding than mere date-based trend extrapolation!</p><p>I would be thrilled if someone with more smarts, time to experiment, and access to data were to dig into ways we could match up various AI production inputs (especially &#8216;data&#8217; in various forms) with observed outputs like &#8216;time horizon&#8217;. One of the more difficult pieces might be quantifying &#8216;data&#8217;, especially teasing apart what types of evidence are &#8216;relevant&#8217; for the domain and tasks at hand.</p><h2>Upshot</h2><p>The kind-of-boring upshot of this is that data and &#8216;practice&#8217; on related tasks make AI better at those tasks! This is boring because, <em>well obviously!,</em> we already basically knew that. But it&#8217;s encouraging because we can say a little more than that, which gives us some better grasp on what&#8217;s driving &#8216;time horizon&#8217; progress in particular domains &#8212; and it can help us get more precise about predictions.</p><p>The fact that the &#8216;subtask&#8217; model &#8212; with a &#8216;hazard rate&#8217; of subtasks currently out of reach &#8212; is a fairly explanatory fit for capability profiles of individual agents is evidence that there is no <em>unusual</em> amount of generalisation capability in AI. 
As with humans, they can extrapolate a bit, but need &#8216;experience&#8217; and examples to succeed.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-14" href="#footnote-14" target="_self">14</a> Importantly, this means that <strong>vast </strong><em><strong>in silico</strong></em><strong> training ranges for software, cyber, and mathematics very likely </strong><em><strong>won&#8217;t</strong></em><strong> transfer much to other domains of interest</strong>, like interpersonal intelligence, medical discovery, bioweapons development, intelligence analysis, and robotic manipulation. Of course, like with every domain of human experience and activity, we have <em>some</em> relevantly-similar data already collected, and schemes can be devised to more rapidly <em>expand</em> that digitised experience bank for AI to learn from. Increasing adoption of AI in task-integrated contexts, industrial deployment, and even explicit approaches to gathering example data such as &#8216;<a href="https://www.technologyreview.com/2026/04/01/1134863/humanoid-data-training-gig-economy-2026-breakthrough-technology/">hand movement farming</a>&#8217; are the leading indicators to watch for progress in particular domains &#8212; not just the headline benchmark metrics in software-like tasks.</p><p>For some types of activity, developers are probably &#8216;running out&#8217; of raw example data to scrape from the internet. The <a href="https://www.oliversourbut.net/p/how-did-large-language-models-get">era of mostly-pretraining</a> is over. For domains which can be relatively easily verified, like mathematics and coding, this is <a href="https://helentoner.substack.com/i/161853197/what-makes-reasoning-models-so-good-at-math-and-code">very surmountable</a> &#8212; you can just run drills galore on a computer and get data that way. 
But this costs extra compute and <a href="https://epoch.ai/gradient-updates/how-far-can-reasoning-models-scale">doesn&#8217;t scale at the </a><em><a href="https://epoch.ai/gradient-updates/how-far-can-reasoning-models-scale">same</a></em><a href="https://epoch.ai/gradient-updates/how-far-can-reasoning-models-scale"> exponential rate for long</a> (perhaps 10x/year presently). As soon as this year, developers could be back to &#8216;only&#8217; <a href="https://epoch.ai/blog/training-compute-of-frontier-ai-models-grows-by-4-5x-per-year">scaling compute around 4x per year</a> (and a bit after that they might have <a href="https://epochai.substack.com/p/frontier-labs-dont-use-most-ai-compute">bought most of the compute</a>! &#8212; and will only be able to scale at the positively sloth-like <a href="https://ourworldindata.org/moores-law">1.5x-ish a year of underlying hardware progress</a>). I don&#8217;t feel confident extrapolating exactly where that cashes out, but if the data-driven subtask-learning model is right, it would imply <strong>we should see less steepness to the time horizon growth quite soon</strong>.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-15" href="#footnote-15" target="_self">15</a></p><p>Some commentaries project that, once AI can autonomously do software and machine learning work reliably, it will thereafter enter a &#8216;recursive self-improvement&#8217; phase and rapidly colonise <em>all</em> capabilities. I don&#8217;t think this is missing the point entirely: there will be modest multipliers on the speed of the AI development pipeline, and we might see an &#8216;explosion&#8217; in the <em>speed</em> and <em>cost-effectiveness</em> of AI (because they are among the most immediately-verifiable properties to iterate on). 
But <strong>generalisation doesn&#8217;t come for free, so on-task data and compute will remain crucial to </strong><em><strong>broadening</strong></em><strong> the frontier of autonomous capabilities.</strong> <em>Collecting</em> that data and <em>manufacturing</em> that compute look to me like the rate-limiting steps, and therefore the major leading indicators to use in foresight. The best case I can make for a much more general explosion is if the speed and cost-effectiveness explosions rapidly accelerate the gathering and digestion of diverse task data &#8212; but I think that remains mostly rate-limited in the familiar ways: some domains easy and some more difficult. Don&#8217;t mistake me for ruling out across-the-board AI capability! Companies are charging ahead with data collection and set on automating much of their AI production pipeline. It just won&#8217;t happen overnight.</p><p><em>Thanks to Coz Ududec for a conversation prompting me to think about this.</em></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Produced by AI monitoring non-profit <a href="https://metr.org/">METR</a></p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Very importantly, it&#8217;s measured <em>within a particular collection of challenges/tasks</em> which are mostly associated with software development, especially ML engineering. 
METR also has <a href="https://metr.org/blog/2025-07-14-how-does-time-horizon-vary-across-domains/">a great preliminary study</a> of some <em>other</em> domains, finding differing, but perhaps also somewhat predictable trends.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p><a href="https://ourworldindata.org/moores-law">Moore&#8217;s Law</a> is the very superficial observation that, over time, the number of transistors per chip doubles roughly every two years. (More recently, it&#8217;s been more clearly expressed as the price per transistor <a href="https://ourworldindata.org/grapher/costs-of-66-different-technologies-over-time?country=~Transistor">halving every year-or-two</a>.)</p><p><a href="https://ourworldindata.org/learning-curve">Wright&#8217;s Law</a> is the slightly more mechanistic and general observation that production of many commodities follows &#8216;learning curves&#8217;, such that each doubling of cumulative production produces roughly similar relative cost savings. (We can in turn attempt to explain this in yet more mechanistic terms, pointing to the insight gained from observing and recording many trials and experiments, with suitably diminishing returns.)</p><p>Now, <em>if</em> the quantity demanded and produced grows exponentially over time (as it has for computer chips), <em>then</em> Wright&#8217;s Law predicts comparable cost savings each year: Moore&#8217;s Law. 
If the quantity produced grows (or shrinks) in some <em>other</em> pattern over time, Wright&#8217;s Law, by accounting for this mechanistic detail, can <a href="https://web.mit.edu/mitssrc/nsf/papers/Nagy_Farmer_Bui_Trancik_2013.pdf">often forecast cost trends more reliably</a> than Moore&#8217;s.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>Also note that the estimation of &#8216;task length&#8217; according to human experts was quite crude (naturally, humans are the most expensive part of most experiments!), and there are good reasons to <a href="https://www.transformernews.ai/p/against-the-metr-graph-coding-capabilities-software-jobs-task-ai">treat the reported error bars as much too narrow</a>, i.e. misleadingly confident. I&#8217;ll use quotes around &#8216;time&#8217; related quantities in this post as a reminder that it&#8217;s a loose estimate of a crudely human-performer-derived time-to-completion for tasks, and doesn&#8217;t correspond well to real time as such.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>I don&#8217;t know if METR publishes how long the agents themselves take at these tasks &#8212; I don&#8217;t think so, and it&#8217;d arguably be ill-defined anyway since it would depend in part on how fast a computer you ran the agent on.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>If we conceptually carve up subtasks into smaller pieces, they'll be quicker per piece, but there are commensurably more of them, and vice 
versa.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>This could come apart if longer tasks are systematically more likely to include repetitive <em>similar</em> activities rather than a series of distinct ones, for example. Or longer tasks might tend to admit more truly alternative pathways. Both these effects could make longer tasks slightly easier than the naive picture. There are also higher-level &#8216;orchestration&#8217; tasks i.e. coherently coming up with (and executing and adapting) an appropriate sequential plan: perhaps these might be systematically more <em>difficult</em> for longer tasks.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-8" href="#footnote-anchor-8" class="footnote-number" contenteditable="false" target="_self">8</a><div class="footnote-content"><p>Notably, agents sometimes take a (relatively) longer time to do something that&#8217;s quicker for humans, and vice versa.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-9" href="#footnote-anchor-9" class="footnote-number" contenteditable="false" target="_self">9</a><div class="footnote-content"><p>Incidentally, success (or not) here already accounts for the agent attempting and re-attempting steps or fixing earlier mistakes, which might take variable amounts of time: another reason not to treat this as <em>agent</em> time. 
Some subtasks might be intermediate and succeed <em>sometimes</em> (for example if the agent can&#8217;t easily choose the best approach but sometimes hits on the right one, or sometimes gets stuck in a terminal cycle but sometimes makes lucky progress.)</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-10" href="#footnote-anchor-10" class="footnote-number" contenteditable="false" target="_self">10</a><div class="footnote-content"><p>This is throwing away some detail: obviously not all subtasks are equally likely to follow from each other! There&#8217;s some correspondence between on-task sequences. But within a particular domain (like software engineering), this naive model of overall tasks combining subtasks somewhat randomly seems to do OK.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-11" href="#footnote-anchor-11" class="footnote-number" contenteditable="false" target="_self">11</a><div class="footnote-content"><p>By the way, the <a href="https://en.wikipedia.org/wiki/Rule_of_72">rule of 72</a> provides a really quick mental approximation for the higher-reliability &#8216;time&#8217; horizons, depending on the &#8216;half-life&#8217; (the 50% &#8216;time&#8217; horizon).</p><p>Divide the &#8216;half-life&#8217; by 72. That&#8217;s the 1% <em>failure</em> horizon (equivalent to the 99% <em>success</em> horizon). Multiply by your target <em>failure</em> rate in percent, and you&#8217;re done: that&#8217;s your target success &#8216;time&#8217; horizon. E.g. if &#8216;half-life&#8217; is 1h, the &#8216;time&#8217; horizon at 99.9% is (1h/72)*(0.1) i.e. 5 seconds.</p><p>(This also reveals that cutting the &#8216;time&#8217; horizon tenfold cuts the average failure rate tenfold and so on.)</p><p>Going the other way, estimating long-horizon success rates, divide your target horizon by the &#8216;half-life&#8217;. 
That&#8217;s how many halvings of success to expect: raise one half to that power for your success rate. E.g. if &#8216;half-life&#8217; is 1h, your 24h success rate is (&#189;)^24 i.e. roughly one in seventeen million.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-12" href="#footnote-anchor-12" class="footnote-number" contenteditable="false" target="_self">12</a><div class="footnote-content"><p>It didn&#8217;t have to be that way! A single number which manages to explain a lot of variation in agent capability is very suggestive of an underlying mechanism something like the &#8216;fraction of subtasks&#8217; model I&#8217;ve described here. Of course there is still some residual uncertainty, and there may be better summaries available with a more detailed model or epicycles on this one.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-13" href="#footnote-anchor-13" class="footnote-number" contenteditable="false" target="_self">13</a><div class="footnote-content"><p>This may have become trickier to measure recently, as training pipelines have adapted to <a href="https://www.oliversourbut.net/p/reinforcement-learning-scaling-might">incorporate more reinforcement learning</a>, which means these experience data are less &#8216;homogeneously slurped up from the internet&#8217; and increasingly &#8216;proactively curated from in-domain training curricula&#8217;. 
So the mere quantity of data isn&#8217;t like-for-like over time.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-14" href="#footnote-anchor-14" class="footnote-number" contenteditable="false" target="_self">14</a><div class="footnote-content"><p>In fact contemporary AI is perhaps substantially <em>less</em> good at generalisation than humans, though I&#8217;d like to be better informed about how factors like sample efficiency of AI learning (including in-context learning) stack up.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-15" href="#footnote-anchor-15" class="footnote-number" contenteditable="false" target="_self">15</a><div class="footnote-content"><p>Actually saying something so bearish about AI makes me nervous, as there is a venerable history of people boldly declaring AI is <a href="https://x.com/peterwildeford/status/2021214568891154722">about to hit a wall</a>! But I think it&#8217;s borne out. I&#8217;m not saying progress <em>stops</em>, I&#8217;m saying it probably gets slower (in exponential terms).</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[Engineering a Safer World]]></title><description><![CDATA[Risk Modelling &#8212; and Safety Engineering? 
&#8212; for AI Loss of Control]]></description><link>https://www.oliversourbut.net/p/engineering-a-safer-world</link><guid isPermaLink="false">https://www.oliversourbut.net/p/engineering-a-safer-world</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Sun, 17 May 2026 15:55:51 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Y1JJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aa029c9-417f-4378-87f3-3a089345dc20_2208x1436.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Engineering a safer world.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y1JJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aa029c9-417f-4378-87f3-3a089345dc20_2208x1436.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y1JJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aa029c9-417f-4378-87f3-3a089345dc20_2208x1436.png 424w, https://substackcdn.com/image/fetch/$s_!Y1JJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aa029c9-417f-4378-87f3-3a089345dc20_2208x1436.png 848w, https://substackcdn.com/image/fetch/$s_!Y1JJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aa029c9-417f-4378-87f3-3a089345dc20_2208x1436.png 1272w, https://substackcdn.com/image/fetch/$s_!Y1JJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aa029c9-417f-4378-87f3-3a089345dc20_2208x1436.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Y1JJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aa029c9-417f-4378-87f3-3a089345dc20_2208x1436.png" width="1456" height="947" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3aa029c9-417f-4378-87f3-3a089345dc20_2208x1436.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:947,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:409852,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.oliversourbut.net/i/198135307?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aa029c9-417f-4378-87f3-3a089345dc20_2208x1436.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y1JJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aa029c9-417f-4378-87f3-3a089345dc20_2208x1436.png 424w, https://substackcdn.com/image/fetch/$s_!Y1JJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aa029c9-417f-4378-87f3-3a089345dc20_2208x1436.png 848w, https://substackcdn.com/image/fetch/$s_!Y1JJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aa029c9-417f-4378-87f3-3a089345dc20_2208x1436.png 1272w, https://substackcdn.com/image/fetch/$s_!Y1JJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aa029c9-417f-4378-87f3-3a089345dc20_2208x1436.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>I, and I imagine many of my readers, are eager to contribute to that effort<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>.</p><p>It&#8217;s also, conveniently, the title of a book!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!to9R!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63fccef-b184-4e5c-9a8b-7f958d93a995_2208x1698.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!to9R!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63fccef-b184-4e5c-9a8b-7f958d93a995_2208x1698.png 424w, https://substackcdn.com/image/fetch/$s_!to9R!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63fccef-b184-4e5c-9a8b-7f958d93a995_2208x1698.png 848w, https://substackcdn.com/image/fetch/$s_!to9R!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63fccef-b184-4e5c-9a8b-7f958d93a995_2208x1698.png 1272w, https://substackcdn.com/image/fetch/$s_!to9R!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63fccef-b184-4e5c-9a8b-7f958d93a995_2208x1698.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!to9R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63fccef-b184-4e5c-9a8b-7f958d93a995_2208x1698.png" width="1456" height="1120" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c63fccef-b184-4e5c-9a8b-7f958d93a995_2208x1698.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1120,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1139336,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.oliversourbut.net/i/198135307?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63fccef-b184-4e5c-9a8b-7f958d93a995_2208x1698.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!to9R!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63fccef-b184-4e5c-9a8b-7f958d93a995_2208x1698.png 424w, https://substackcdn.com/image/fetch/$s_!to9R!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63fccef-b184-4e5c-9a8b-7f958d93a995_2208x1698.png 848w, https://substackcdn.com/image/fetch/$s_!to9R!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63fccef-b184-4e5c-9a8b-7f958d93a995_2208x1698.png 1272w, https://substackcdn.com/image/fetch/$s_!to9R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63fccef-b184-4e5c-9a8b-7f958d93a995_2208x1698.png 1456w" sizes="100vw"></picture></div></a><figcaption class="image-caption"><em>Engineering a safer world</em>, by Nancy Leveson</figcaption></figure></div><p>It&#8217;s not a book by me&#8230; but nevertheless I recommend it. You should consider reading it, especially the first chapters. You can find it <a href="https://direct.mit.edu/books/oa-monograph/2908/Engineering-a-Safer-WorldSystems-Thinking-Applied">online free from MIT</a>. It&#8217;s by MIT professor Nancy Leveson, computer scientist turned innovator in safety science, safety engineering, and software safety.</p><p>Generally, this book is a particularly approachable and competent example of <em>systems thinking</em> or what I might call a <em>cybernetic</em> perspective &#8212; applied to safety science. 
This essay is in part a brief book review of Leveson&#8217;s <em>Engineering a Safer World</em>.</p><p>I care about this systems perspective in part because... <strong>loss of control</strong>. It&#8217;s something which, I think, is quite intuitive for many people at a very abstract level. Of course we could lose control of &#8216;systems smarter than us&#8217;. But in my experience it tends to be quite slippery. It&#8217;s difficult to get a proper analytical grip on &#8216;loss of control&#8217;, and it&#8217;s difficult to communicate about.</p><p><strong>The systems thinking or cybernetic framing is one I find particularly helpful for getting to grips with &#8216;loss of control&#8217;</strong>, and it has other benefits besides. That&#8217;s the rest of what this essay is doing: how does systems thinking apply to AI and loss of control specifically?</p><p>I did some of this loss of control risk modelling at the UK government&#8217;s <a href="http://aisi.gov.uk/">AI Safety/Security Institute (AISI)</a> in 2024, and now this kind of risk modelling is part of the background strategy, prioritisation, and foresight work we do at the Future of Life Foundation<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>.</p><p><em>This essay corresponds closely to a talk I delivered at the <a href="https://tais2026.cc/">Technical AI Safety Conference 2026</a>.</em></p><h2><strong>Safety and systems</strong></h2><p>The first lesson from this perspective is: <strong>safety is a </strong><em><strong>system</strong></em><strong> property</strong>.</p><p>There&#8217;s never a single &#8216;root cause&#8217; of a disaster.</p><p>Even acute disasters have a <em>buildup</em>. The bit you see &#8212; the tip of the iceberg &#8212; is the acute disaster. An explosion. A crash. A fire. Human extinction. 
But the lead-up to that always has systemic, process failures.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q0ga!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ab97705-b51e-46a1-97cc-4050bcf91b78_2760x1646.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q0ga!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ab97705-b51e-46a1-97cc-4050bcf91b78_2760x1646.png 424w, https://substackcdn.com/image/fetch/$s_!q0ga!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ab97705-b51e-46a1-97cc-4050bcf91b78_2760x1646.png 848w, https://substackcdn.com/image/fetch/$s_!q0ga!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ab97705-b51e-46a1-97cc-4050bcf91b78_2760x1646.png 1272w, https://substackcdn.com/image/fetch/$s_!q0ga!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ab97705-b51e-46a1-97cc-4050bcf91b78_2760x1646.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q0ga!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ab97705-b51e-46a1-97cc-4050bcf91b78_2760x1646.png" width="1456" height="868" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5ab97705-b51e-46a1-97cc-4050bcf91b78_2760x1646.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:868,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:322492,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.oliversourbut.net/i/198135307?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ab97705-b51e-46a1-97cc-4050bcf91b78_2760x1646.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q0ga!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ab97705-b51e-46a1-97cc-4050bcf91b78_2760x1646.png 424w, https://substackcdn.com/image/fetch/$s_!q0ga!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ab97705-b51e-46a1-97cc-4050bcf91b78_2760x1646.png 848w, https://substackcdn.com/image/fetch/$s_!q0ga!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ab97705-b51e-46a1-97cc-4050bcf91b78_2760x1646.png 1272w, https://substackcdn.com/image/fetch/$s_!q0ga!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ab97705-b51e-46a1-97cc-4050bcf91b78_2760x1646.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Disasters might be <em>precipitated</em> by some particular hazardous activity, or even malicious action. But behind all of those is a network of interrelated systems and processes which could have &#8212; and we might say <em>should have</em> &#8212; done something to prevent the disaster.</p><p>These systemic weaknesses are like <em>health indicators</em> for the systems in question. 
Leveson encourages us to treat these <em>processes</em> and <em>systems</em> as the central objects of consideration when trying to engineer a safer world.</p><p><strong>&#8216;Loss of control&#8217; on this perspective is less of a single decisive </strong><em><strong>moment</strong></em><strong> and more of an unfolding process, or even a vicious cycle.</strong> We&#8217;ll return to the vicious cycles later.</p><h2><strong>Which systems?</strong></h2><p>OK, we need to pay attention to some systems. But there are lots of systems! Which systems?</p><p>There&#8217;s definitely an art to this &#8212; not one I claim particularly to possess. It&#8217;s here we need to apply contextual knowledge, analyst&#8217;s judgement, and whatever expertise we can muster. But there are some general principles the systems view can offer.</p><p>My statement of a principle I like: <strong>think up the chain</strong>.</p><p>In engineering, consider:</p><ul><li><p>A deployment. It goes with monitoring systems, maintenance systems, and iteration and evolution processes.</p></li><li><p>Design, development, perhaps manufacturing. Research.</p></li><li><p>Those systems are in turn overseen by management, which should include communication, reporting, and decision-making systems. Passing down design constraints, objectives, resource allocation, prioritisation. 
Getting status updates, reports, outcomes.</p></li><li><p>...Even above that, we can start to talk about &#8216;governance&#8217;:</p><ul><li><p>Literal <em>governments</em>: regulatory activity</p></li><li><p>Also company boards and other governance commitments and structures</p></li><li><p>Courts</p></li></ul></li><li><p>We can also include &#8212; I think it&#8217;s important to include &#8212; broader societal sensemaking, democratic deliberation, how we are making sense of what&#8217;s happening as a society and controlling the processes unfolding around us.</p></li></ul><p>This is a very &#8216;cybernetic&#8217; perspective, one which I think is quite powerful.</p><p><strong>When these get corrupted, things get </strong><em><strong>out of control</strong></em><strong>.</strong></p><p>In her book, Leveson gives various worked examples. She also offers some generic examples as starting places for analysts like me.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CQmp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96e11a0-ff06-4348-be19-5f1a04ed9373_1152x1238.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CQmp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96e11a0-ff06-4348-be19-5f1a04ed9373_1152x1238.png 424w, https://substackcdn.com/image/fetch/$s_!CQmp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96e11a0-ff06-4348-be19-5f1a04ed9373_1152x1238.png 848w, 
https://substackcdn.com/image/fetch/$s_!CQmp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96e11a0-ff06-4348-be19-5f1a04ed9373_1152x1238.png 1272w, https://substackcdn.com/image/fetch/$s_!CQmp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96e11a0-ff06-4348-be19-5f1a04ed9373_1152x1238.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CQmp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96e11a0-ff06-4348-be19-5f1a04ed9373_1152x1238.png" width="1152" height="1238" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e96e11a0-ff06-4348-be19-5f1a04ed9373_1152x1238.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1238,&quot;width&quot;:1152,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CQmp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96e11a0-ff06-4348-be19-5f1a04ed9373_1152x1238.png 424w, https://substackcdn.com/image/fetch/$s_!CQmp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96e11a0-ff06-4348-be19-5f1a04ed9373_1152x1238.png 848w, 
https://substackcdn.com/image/fetch/$s_!CQmp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96e11a0-ff06-4348-be19-5f1a04ed9373_1152x1238.png 1272w, https://substackcdn.com/image/fetch/$s_!CQmp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96e11a0-ff06-4348-be19-5f1a04ed9373_1152x1238.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Here&#8217;s one I find quite instructive. 
It&#8217;s actually one we used as a starting seed for some risk modelling at AISI.</p><p>In general, this systems perspective offers a really expressive <em>language</em> both for communicating and for analysing or troubleshooting. One of the beauties of this language is that you can conceptually &#8216;zoom in and out&#8217;, asking, for example, which people and processes (subsystems) <em>make up</em> or <em>implement</em> a given system or relationship, and <em>their</em> requirements and so on. Or you can draw a bigger box around a collection of closely related systems for a more bird&#8217;s-eye view.</p><h3><strong>The regulatory lag dilemma</strong></h3><p>The accompanying text from this point in the book makes for sobering reading. Leveson writes,</p><blockquote><p>The only requirement is that responsibility for safety is distributed in an appropriate way throughout the sociotechnical system&#8230; <strong>If companies or industries are unwilling or incapable</strong> of performing their public safety responsibilities, then <strong>government has to step in</strong> to achieve the overall public safety goals. But a much better solution is for company management to take responsibility <em>&#8212; Leveson, Engineering a Safer World (4.2, The Hierarchical Safety Control Structure)</em></p></blockquote><p>but elsewhere in the same chapter,</p><blockquote><p>As in any control loop, time lags may affect the flow of control actions and feedback&#8230; For example, <strong>standards can take years to develop or change &#8212; a time scale that may keep them behind current technology and practice</strong>. <em>&#8212; Leveson, Engineering a Safer World (4.2, The Hierarchical Safety Control Structure)</em></p></blockquote><p>(emphases mine).</p><p>This dilemma of slow-moving, ponderous regulatory oversight is particularly pernicious in an area as fast-moving as AI. 
More on this dilemma later.</p><h2><strong>Which health indicators?</strong></h2><p>We have something like a picture of which systems might be important for safety.</p><p>Where do we look for health indicators? Relatedly, where are the hazards? And the flipside, where do we look for interventions for robustness?</p><p>A systems-theoretic lens has a few general suggestions.</p><h3><strong>&#8216;Control&#8217; systems</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WYWj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0916a8b9-df0c-42f8-abbe-bf8bc933563c_1490x994.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WYWj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0916a8b9-df0c-42f8-abbe-bf8bc933563c_1490x994.png 424w, https://substackcdn.com/image/fetch/$s_!WYWj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0916a8b9-df0c-42f8-abbe-bf8bc933563c_1490x994.png 848w, https://substackcdn.com/image/fetch/$s_!WYWj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0916a8b9-df0c-42f8-abbe-bf8bc933563c_1490x994.png 1272w, https://substackcdn.com/image/fetch/$s_!WYWj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0916a8b9-df0c-42f8-abbe-bf8bc933563c_1490x994.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!WYWj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0916a8b9-df0c-42f8-abbe-bf8bc933563c_1490x994.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0916a8b9-df0c-42f8-abbe-bf8bc933563c_1490x994.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WYWj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0916a8b9-df0c-42f8-abbe-bf8bc933563c_1490x994.png 424w, https://substackcdn.com/image/fetch/$s_!WYWj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0916a8b9-df0c-42f8-abbe-bf8bc933563c_1490x994.png 848w, https://substackcdn.com/image/fetch/$s_!WYWj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0916a8b9-df0c-42f8-abbe-bf8bc933563c_1490x994.png 1272w, https://substackcdn.com/image/fetch/$s_!WYWj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0916a8b9-df0c-42f8-abbe-bf8bc933563c_1490x994.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Simplified from Figure 3.2, <em>Engineering a Safer World</em></figcaption></figure></div><p>This is a very generic picture of a control system. 
You&#8217;ll see pictures like it across biology, computer science, cybernetics, reinforcement learning, military strategy, engineering, control theory. Terminology might vary but it&#8217;s the same general picture, very straightforward and explanatory.</p><p>What&#8217;s a controller? It might be me riding my unicycle, it might be a government department attempting to understand and nudge an industry or even an economy, it might be a mouse looking for food, or an electronic component of an autonomous vehicle.</p><p>What does such a controller need to do a good job? 
It needs <em>sensing</em>: some way of getting observations, feedback about the relevant aspects of its situation and environment. It needs <em>actuation</em>: some way of acting back on the world, taking actions, applying influence. Between those, it needs <em>understanding</em> &#8212; something like a model, perhaps a &#8216;world model&#8217; &#8212; interpreting its observations and their implications, and the implications of its available options. And it needs a way of <em>deciding</em> appropriately what to do in order to carry out whatever responsibilities or agenda it has.</p><p>In the context of safety, we&#8217;re talking about <em>safety controllers</em>, applying safety constraints. It doesn&#8217;t mean <em>dictating</em> what happens in any &#8216;controlled&#8217; processes, just applying enough constraints that we can be confident that safety is maintained.</p><p>So: sensing, understanding, deciding, acting.</p><p>When analysing a given system, we can ask what it has by way of sensing, understanding, deciding, and acting. Are they adequate? Degradations to these look like hazards, or even attack surfaces in an adversarial setting. And improvements look like safety opportunities.</p><h2><strong>Control examples in AI safety</strong></h2><p>Let&#8217;s get more concrete.</p><p>Take the example of an AI deployer, and their opportunities for <em>sensing</em>. How do they &#8216;see&#8217; what&#8217;s going on? Logs! So <em>spoofed logs</em> or <em>tampered evaluations</em> or something like that would substantially compromise their ability to understand what&#8217;s going on. So this is potentially cutting off that sensing relationship. 
Alternatively, <em><a href="https://www.oliversourbut.net/i/136696970/expansionism-replication-propagation-growth">rogue replication</a></em> of an agent would also compromise that sensing, because it would create an agent <em>outside</em> of the nominal sensing channels.</p><p>Now take an AI regulator, and their <em>understanding</em> and <em>deciding</em>. How might those be inadequate? They might simply have <em>insufficient capacity</em> for the needed analysis and foresight, which I think is a quite common failure mode for regulators. On the more adversarial side, they could be subject to <em>lobbying</em> or <em>capture</em> &#8212; and this could be compromising their understanding or even their decision-making process. <em>Even if</em> they have perfectly adequate sensing and actuation affordances in principle, they might not choose to use them in the way that we&#8217;d hope.</p><p>For a final example, take the slightly esoteric system &#8216;society&#8217; or &#8216;the public at large&#8217;, and its ability to <em>act</em>. Well, generic <em>economic or political disempowerment</em> of course (perhaps tautologically) means that even if there were actions we might have wanted to use to apply pressure to something we think is unsafe, those actions might not be available or effective any more. Also, <em>rogue replication</em>! Whatever actions society might have wanted to take to get an AI system under control, <a href="https://www.oliversourbut.net/i/136696970/expansionism-replication-propagation-growth">if it&#8217;s replicating that&#8217;s much harder</a>. Replication is a big deal, and we <a href="https://arxiv.org/abs/2504.18565">wrote a paper on it at AISI</a>.</p><h2><strong>An abstract picture of AI development and operation</strong></h2><p>Let&#8217;s really simplify and adapt Leveson&#8217;s generic development and operations diagram for the case of AI. 
This is sweeping huge amounts of detail under the rug for now.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QB0U!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F013de82d-b9ff-49a8-a369-4a42ac20f528_2030x1504.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QB0U!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F013de82d-b9ff-49a8-a369-4a42ac20f528_2030x1504.png 424w, https://substackcdn.com/image/fetch/$s_!QB0U!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F013de82d-b9ff-49a8-a369-4a42ac20f528_2030x1504.png 848w, https://substackcdn.com/image/fetch/$s_!QB0U!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F013de82d-b9ff-49a8-a369-4a42ac20f528_2030x1504.png 1272w, https://substackcdn.com/image/fetch/$s_!QB0U!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F013de82d-b9ff-49a8-a369-4a42ac20f528_2030x1504.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QB0U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F013de82d-b9ff-49a8-a369-4a42ac20f528_2030x1504.png" width="1456" height="1079" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/013de82d-b9ff-49a8-a369-4a42ac20f528_2030x1504.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1079,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QB0U!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F013de82d-b9ff-49a8-a369-4a42ac20f528_2030x1504.png 424w, https://substackcdn.com/image/fetch/$s_!QB0U!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F013de82d-b9ff-49a8-a369-4a42ac20f528_2030x1504.png 848w, https://substackcdn.com/image/fetch/$s_!QB0U!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F013de82d-b9ff-49a8-a369-4a42ac20f528_2030x1504.png 1272w, https://substackcdn.com/image/fetch/$s_!QB0U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F013de82d-b9ff-49a8-a369-4a42ac20f528_2030x1504.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Ridiculously simplified sociotechnical system for AI development and deployment</figcaption></figure></div><p>There&#8217;s an <em>agent</em>. That&#8217;s important. There&#8217;s a <em>user</em>. 
(That might be a person or an organisation or some other system, perhaps automated.)</p><p>The agent gets there by a combination of the output of a <em>research and development</em> process and control by a <em>deployment and monitoring</em> system. Those are in turn controlled or overseen by <em>company leadership</em>. (There might be more than one company involved.)</p><p>These companies are <em>meant to be</em> somewhat kept in order by <em>&#8216;society&#8217;</em> &#8212; think regulators, courts, insurance, public associations, and so on.</p><p>Already, even at this very coarse level of granularity, there&#8217;s a lot we could say by inspecting and enumerating all these relationships. 
What sensing and acting affordances are there (or might we want there to be) here? How are understanding and decision-making happening: could they be improved? And of course one of the nice things about this language is that we can recursively unpack things if we want, most obviously here in &#8216;R&amp;D&#8217; and &#8216;Society&#8217; which of course have far more detail to them.</p><h2><strong>What makes AI special</strong></h2><p>For now, I want to focus on something which makes AI a bit different.</p><p>Agents are increasingly capable and have increasingly general affordances. That means, <strong>whether directed to do so or acting out a misaligned objective, AI can in principle </strong><em><strong>act back on</strong></em><strong> any level of this supposed control hierarchy</strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KcsF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fc401e-855a-44a0-bab0-fa107ea2d3eb_2048x1248.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KcsF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fc401e-855a-44a0-bab0-fa107ea2d3eb_2048x1248.png 424w, https://substackcdn.com/image/fetch/$s_!KcsF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fc401e-855a-44a0-bab0-fa107ea2d3eb_2048x1248.png 848w, https://substackcdn.com/image/fetch/$s_!KcsF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fc401e-855a-44a0-bab0-fa107ea2d3eb_2048x1248.png 1272w, 
https://substackcdn.com/image/fetch/$s_!KcsF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fc401e-855a-44a0-bab0-fa107ea2d3eb_2048x1248.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KcsF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fc401e-855a-44a0-bab0-fa107ea2d3eb_2048x1248.png" width="1456" height="887" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/95fc401e-855a-44a0-bab0-fa107ea2d3eb_2048x1248.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:887,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KcsF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fc401e-855a-44a0-bab0-fa107ea2d3eb_2048x1248.png 424w, https://substackcdn.com/image/fetch/$s_!KcsF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fc401e-855a-44a0-bab0-fa107ea2d3eb_2048x1248.png 848w, https://substackcdn.com/image/fetch/$s_!KcsF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fc401e-855a-44a0-bab0-fa107ea2d3eb_2048x1248.png 1272w, 
https://substackcdn.com/image/fetch/$s_!KcsF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95fc401e-855a-44a0-bab0-fa107ea2d3eb_2048x1248.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>That turns this into potentially an <em>adversarial</em>, <em>security</em> engineering problem, and it brings us to the vicious cycles I mentioned earlier.</p><p>For me, it&#8217;s right to think of &#8216;loss of control&#8217; as often a matter of a <em>breach</em> or <em>compromise</em> at one or more of these levels, followed by <em>escalations</em>. 
Not just a singular event, but unfolding in potentially vicious cycles. Even if we can identify some final &#8216;point of no return&#8217;, preceding that there was some series of events that precipitated it.</p><p>There might be several acute events: perhaps it&#8217;s a breach of containment by an agent, or a passing into law of a bill harmful to oversight. Or the <em>failure</em> to pass a particular law, perhaps missing an opportunity window to do that. It might be a particularly ill-advised deployment decision. It might be a very consequential court ruling. <strong>We can think of these as the overall system migrating towards an increasingly, and eventually </strong><em><strong>entirely</strong></em><strong> irrecoverable total loss of control.</strong></p><h2><strong>Vicious cycles</strong></h2><p>Some of these might be obvious, but let&#8217;s enumerate a few of those ways AI systems could act back on the oversight hierarchy.</p><p><strong>Propaganda, epistemic disruption, and political influence</strong> &#8212; harm democratic deliberation and political decisionmaking.</p><p><strong>Lobbying and legal influence</strong> &#8212; could erode regulatory and court oversight.</p><p><strong>R&amp;D automation</strong> 
&#8212; potentially interrupts quite a few feedback and oversight relationships! One which I think is underdiscussed: fewer human participants means fewer whistleblowers, less internal scrutiny, less governance and decisionmaking robustness. More concentration of that influence. That could mean more <em>single points of failure</em>. It&#8217;s famously difficult to maintain a conspiracy of more than one or two people: &#8216;two can keep a secret if one is dead&#8217;, as they say! And besides conspiracy, compared to larger teams, individuals and small groups may be far more susceptible to capture, coercion, corruption, or plain foolishness and rash decisionmaking.</p><p><strong>Backdooring, sabotage, poisoning</strong> &#8212; all great ways to break oversight in R&amp;D.</p><p><strong>Exfiltration, privilege escalation</strong> &#8212; these are bread and butter for compromising monitoring oversight, whether from &#8216;inside&#8217; or &#8216;outside&#8217;.</p><p><strong>Trickery and coercion</strong> &#8212; most tools <em>don&#8217;t</em> do this. AI <em>does</em>. Already this is sometimes breaking some people&#8217;s model of their relationship with AI and sometimes their wider situation.</p><h2><strong>Virtuous cycles?</strong></h2><p>BUT. 
Something else we can also do is ask how the new AI building blocks in our repertoire could be used to <em>fortify</em> some of these situations.</p><p>This is an area the Future of Life Foundation is really interested in, along with some collaborators at Forethought.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mqfe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26275a71-3e8d-4bab-a527-353f3302f784_2048x1248.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mqfe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26275a71-3e8d-4bab-a527-353f3302f784_2048x1248.png 424w, https://substackcdn.com/image/fetch/$s_!mqfe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26275a71-3e8d-4bab-a527-353f3302f784_2048x1248.png 848w, https://substackcdn.com/image/fetch/$s_!mqfe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26275a71-3e8d-4bab-a527-353f3302f784_2048x1248.png 1272w, https://substackcdn.com/image/fetch/$s_!mqfe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26275a71-3e8d-4bab-a527-353f3302f784_2048x1248.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mqfe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26275a71-3e8d-4bab-a527-353f3302f784_2048x1248.png" width="1456" height="887" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/26275a71-3e8d-4bab-a527-353f3302f784_2048x1248.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:887,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mqfe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26275a71-3e8d-4bab-a527-353f3302f784_2048x1248.png 424w, https://substackcdn.com/image/fetch/$s_!mqfe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26275a71-3e8d-4bab-a527-353f3302f784_2048x1248.png 848w, https://substackcdn.com/image/fetch/$s_!mqfe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26275a71-3e8d-4bab-a527-353f3302f784_2048x1248.png 1272w, https://substackcdn.com/image/fetch/$s_!mqfe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26275a71-3e8d-4bab-a527-353f3302f784_2048x1248.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Really briefly, some things that immediately stand out when looking at this picture.</p><p>There are so many tantalising opportunities for <a href="https://www.oliversourbut.net/p/ai-for-human-reasoning-for-you">improving how societies communicate and coordinate</a>! 
Tools supporting <a href="https://www.forethought.org/research/design-sketches-collective-epistemics">collective epistemics</a>, democratic discourse, conflict resolution, coordination.</p><p>AI tech itself is moving fast, and <em>other</em> tech may be accelerated and disrupted soon &#8212; but <a href="https://www.forethought.org/research/design-sketches-tools-for-strategic-awareness">&#8216;open source intelligence&#8217; and more targeted foresight</a> can both benefit from well-designed AI-powered applications. 
That could be one of the more promising ways to improve agility of societal decision-making &#8212; recall the governance dilemma of slow-moving government and regulatory oversight.</p><p>Ironically, R&amp;D automation or acceleration, <em>directed sensibly</em>, could provide an antidote to some risks. It&#8217;s double-edged. There are all kinds of possibilities here, including perhaps development of <a href="https://www.forethought.org/research/design-sketches-defense-favoured-coordination-tech#confidential-monitoring-and-verification">hardware and support for confidential oversight of high-stakes systems</a>. Applying AI-assisted efforts here might make a difference between having these useful aids at an important time or not.</p><p>Both <a href="https://www.lesswrong.com/w/scalable-oversight">scalable oversight</a> and <a href="https://www.technologyreview.com/2026/04/30/1136721/this-startups-new-mechanistic-interpretability-tool-lets-you-debug-llms/">automation of AI interpretability</a> inherently depend on the right capabilities being reliably available from AI.</p><p>Being able to attribute AI outputs, and perhaps more importantly AI <em>activity</em>, could be crucial to keeping the running of economic, industrial, cultural, and other societal functions understandable and monitorable. 
It might be that AI-powered &#8216;forensics&#8217; and provenance investigations end up making this more achievable.</p><p>Counteracting deception and other manipulations, we may be in a position to develop <em><a href="https://www.forethought.org/research/design-sketches-angels-on-the-shoulder">fiduciary</a></em><a href="https://www.forethought.org/research/design-sketches-angels-on-the-shoulder"> or </a><em><a href="https://www.forethought.org/research/design-sketches-angels-on-the-shoulder">guardian</a></em><a href="https://www.forethought.org/research/design-sketches-angels-on-the-shoulder"> AI systems</a> &#8212; tools and assistants which people can slot into their workflows to protect them from trickery and other confusions as the world gets more complex around us.</p><h2><strong>Summing up</strong></h2><p><a href="https://direct.mit.edu/books/oa-monograph/2908/Engineering-a-Safer-WorldSystems-Thinking-Applied">Engineering a Safer World</a> &#8212; worth a read! Especially the first chapters.</p><p>Safety is a <em>system</em> property: even acute disasters have systemic failures we can make healthier.</p><p>Think <em>up the chain</em>: safety in something as encompassing as AI is a societal-level sociotechnical problem. Compromise at any of these levels could <em>initiate</em> or <em>escalate</em> loss of control.</p><p>AI is very special; it can &#8216;act back&#8217; on the nominal control hierarchy. That could give rise to vicious cycles, one of the main classes of mechanisms by which AI loss of control could escalate to truly terminal states.</p><p>Not all cycles are vicious! 
AI, <em>used well</em>, could also help robustify and improve the health of some of the mechanisms and systems we use to live and work together safely.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Modulo concerns about <em>over-eagerly</em> optimising for safety, especially where that cuts into other aspects of flourishing.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Not to be confused with the Future of Life <em>Institute</em>! We're good friends, but different organisations with different approaches.</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[Reinforcement learning scaling might incentivise hidden reasoning architectures for AI]]></title><description><![CDATA[Another reason really competent AI might lie to you]]></description><link>https://www.oliversourbut.net/p/reinforcement-learning-scaling-might</link><guid isPermaLink="false">https://www.oliversourbut.net/p/reinforcement-learning-scaling-might</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Sun, 10 May 2026 14:45:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Iv6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d6af9c1-6fdb-4289-a7b1-cc6201c16f28_1024x572.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In short: the <em>transformer</em> architecture brought massive scale to AI, and <em>also</em> provided partial guarantees of &#8216;reasoning out loud&#8217;, an unprecedentedly interpretable situation for AI. 
Reinforcement learning (RL) may be less compatible with the transformer architecture, and RL is being scaled up at the frontier of AI. So we might see the end of the &#8216;reasoning out loud&#8217; era for AI.</p><p>This is part 2 of a series about LLM architecture and some implications for reasoning and transparency. <a href="https://www.oliversourbut.net/p/how-did-large-language-models-get">Part 1 explained</a> how we got where we are. Here, we look at where we might be going soon &#8212; does visible reasoning go away?</p><h2>Hidden reasoning</h2><p>There&#8217;s a term that&#8217;s caught on somewhat to describe deep learning: <em>inscrutable</em>. Deep learning centrally relies on (large) neural networks, whose workings are famously impenetrable to those training them and those operating them alike. I won&#8217;t expand much on that here.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p><p>The key thing is: a neural network takes inputs, converts them into an opaque and idiosyncratic internal language, then computes outputs. The arcane bit in the middle goes by many names: deep embeddings, neuralese, activation space, latent space. I&#8217;ll call it <em>hidden reasoning</em>.</p><p>Humans do this too, of course! You can&#8217;t tell what someone&#8217;s thinking or planning unless they tell you <em>and</em> they&#8217;re honest about it. 
(And apart from the surface thoughts we&#8217;re aware of, we don&#8217;t even know most of what&#8217;s going on in <em>our own</em> subconscious.)</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-Iv6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d6af9c1-6fdb-4289-a7b1-cc6201c16f28_1024x572.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-Iv6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d6af9c1-6fdb-4289-a7b1-cc6201c16f28_1024x572.png 424w, https://substackcdn.com/image/fetch/$s_!-Iv6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d6af9c1-6fdb-4289-a7b1-cc6201c16f28_1024x572.png 848w, https://substackcdn.com/image/fetch/$s_!-Iv6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d6af9c1-6fdb-4289-a7b1-cc6201c16f28_1024x572.png 1272w, https://substackcdn.com/image/fetch/$s_!-Iv6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d6af9c1-6fdb-4289-a7b1-cc6201c16f28_1024x572.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-Iv6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d6af9c1-6fdb-4289-a7b1-cc6201c16f28_1024x572.png" width="1024" height="572" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4d6af9c1-6fdb-4289-a7b1-cc6201c16f28_1024x572.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:572,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-Iv6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d6af9c1-6fdb-4289-a7b1-cc6201c16f28_1024x572.png 424w, https://substackcdn.com/image/fetch/$s_!-Iv6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d6af9c1-6fdb-4289-a7b1-cc6201c16f28_1024x572.png 848w, https://substackcdn.com/image/fetch/$s_!-Iv6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d6af9c1-6fdb-4289-a7b1-cc6201c16f28_1024x572.png 1272w, https://substackcdn.com/image/fetch/$s_!-Iv6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d6af9c1-6fdb-4289-a7b1-cc6201c16f28_1024x572.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Widget Corp Corp! Honest, earnest, helpful? Who knows? (Image gen using <a href="http://gemini.google.com">gemini.google.com</a>)</figcaption></figure></div><h2>A dash (em-dash?) of luck: &#8216;thinking out loud&#8217;</h2><p>Hidden reasoning is concerning. Worst case, we have <a href="https://www.cold-takes.com/why-ai-alignment-could-be-hard-with-modern-deep-learning/">scheming AIs</a>. <em>Best</em> case, it&#8217;s hard to know how much to trust the outputs &#8212; not because you think the AI is deceiving you, but because you just don&#8217;t know what it&#8217;s taking account of, and how, in its decisions.</p><p>But we got quite lucky for a spell! 
Scaling language models turned out to be the easiest path to general purpose reasoning AI, and not only that, but the famed &#8216;<a href="https://arxiv.org/abs/1706.03762">attention is all you need</a>&#8217; transformer architecture <a href="https://www.oliversourbut.net/p/how-did-large-language-models-get">which got us here</a> places <em>substantial limits</em> on the amount of <em>hidden</em> reasoning the AI does.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iGPH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96d5f274-018b-4efd-affa-26c44ee84043_2048x706.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iGPH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96d5f274-018b-4efd-affa-26c44ee84043_2048x706.png 424w, https://substackcdn.com/image/fetch/$s_!iGPH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96d5f274-018b-4efd-affa-26c44ee84043_2048x706.png 848w, https://substackcdn.com/image/fetch/$s_!iGPH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96d5f274-018b-4efd-affa-26c44ee84043_2048x706.png 1272w, https://substackcdn.com/image/fetch/$s_!iGPH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96d5f274-018b-4efd-affa-26c44ee84043_2048x706.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!iGPH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96d5f274-018b-4efd-affa-26c44ee84043_2048x706.png" width="1456" height="502" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/96d5f274-018b-4efd-affa-26c44ee84043_2048x706.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:502,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iGPH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96d5f274-018b-4efd-affa-26c44ee84043_2048x706.png 424w, https://substackcdn.com/image/fetch/$s_!iGPH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96d5f274-018b-4efd-affa-26c44ee84043_2048x706.png 848w, https://substackcdn.com/image/fetch/$s_!iGPH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96d5f274-018b-4efd-affa-26c44ee84043_2048x706.png 1272w, https://substackcdn.com/image/fetch/$s_!iGPH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96d5f274-018b-4efd-affa-26c44ee84043_2048x706.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">The longest possible hidden chain of reasoning in a transformer attention architecture is fixed by the network depth. It <em>can</em> perform arbitrarily-long reasoning &#8212; but only by emitting outputs piece by piece in a chain of thought.</figcaption></figure></div><p>But limited reasoning is limiting! People want AI to be able to do things. The obvious solution is to <a href="https://research.google/blog/language-models-perform-reasoning-via-chain-of-thought/">employ </a><em><a href="https://research.google/blog/language-models-perform-reasoning-via-chain-of-thought/">visible</a></em><a href="https://research.google/blog/language-models-perform-reasoning-via-chain-of-thought/"> reasoning</a>. 
This is already something humans do all the time &#8212; from &#8216;thinking aloud&#8217; to writing notes (or whole textbooks!), sharing thoughts with others, brainstorming, using computers, databases, &#8216;tools for thought&#8217;, and so on.</p><p>One of the most effective classes of improvements to AI reasoning in this paradigm is <a href="https://arxiv.org/abs/2501.12948">encouraging it</a><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a><a href="https://arxiv.org/abs/2501.12948"> to &#8216;think aloud&#8217;</a> things like, &#8216;wait, is that right?&#8217;, &#8216;let me consider more possibilities&#8217;, &#8216;I should double check that&#8217;, and the like.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LWCY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75f703be-cfdb-4640-8553-e72a6200d605_2048x794.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LWCY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75f703be-cfdb-4640-8553-e72a6200d605_2048x794.png 424w, https://substackcdn.com/image/fetch/$s_!LWCY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75f703be-cfdb-4640-8553-e72a6200d605_2048x794.png 848w, https://substackcdn.com/image/fetch/$s_!LWCY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75f703be-cfdb-4640-8553-e72a6200d605_2048x794.png 1272w, 
https://substackcdn.com/image/fetch/$s_!LWCY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75f703be-cfdb-4640-8553-e72a6200d605_2048x794.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LWCY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75f703be-cfdb-4640-8553-e72a6200d605_2048x794.png" width="1456" height="564" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/75f703be-cfdb-4640-8553-e72a6200d605_2048x794.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:564,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LWCY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75f703be-cfdb-4640-8553-e72a6200d605_2048x794.png 424w, https://substackcdn.com/image/fetch/$s_!LWCY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75f703be-cfdb-4640-8553-e72a6200d605_2048x794.png 848w, https://substackcdn.com/image/fetch/$s_!LWCY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75f703be-cfdb-4640-8553-e72a6200d605_2048x794.png 1272w, 
https://substackcdn.com/image/fetch/$s_!LWCY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75f703be-cfdb-4640-8553-e72a6200d605_2048x794.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">On the left, a conversation with <a href="http://claude.ai">claude.ai</a>. On the right, with &#8216;reasoning&#8217; expanded, showing some extra thinking. 
(NB Anthropic, creator of Claude, doesn&#8217;t display the &#8216;unfiltered&#8217; reasoning; this is actually a summary paraphrase of the raw reasoning produced by another language model for user consumption.)</figcaption></figure></div><p>So, status quo, frontier AI is increasingly good at reasoning, largely as a consequence of <a href="https://www.tobyord.com/writing/mostly-inference-scaling">&#8216;thinking out loud&#8217; more and more</a>. There&#8217;s a <a href="https://www.lesswrong.com/posts/bwyKCQD7PFWKhELMr/by-default-gpts-think-in-plain-sight">mostly human-readable trace</a> of what it&#8217;s thinking.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a> Although we don&#8217;t know <em>exactly</em> how much reasoning can be crammed into the hidden parts,<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a> it&#8217;s a matter of sheer structural message-passing necessity that any more than that <em>must</em> go via the &#8216;out loud&#8217; pathway.</p><p>This is a good thing for oversight of AI, because they <a href="https://arxiv.org/abs/2412.04984">are capable of scheming</a> under the right conditions, but so far usually only by blurting out all their plans as they think aloud!</p><h3>Aside: keeping visible reasoning faithful</h3><p>Of course, even if the AI <em>does</em> most of its reasoning in externalised text, there&#8217;s <a href="https://arxiv.org/abs/2310.18512">no guarantee in theory that it&#8217;s using human language the way we expect</a>.</p><p>This is why researchers call this kind of monitorability a &#8216;<a href="https://arxiv.org/abs/2507.11473">Fragile Opportunity for AI Safety</a>&#8217;, noting with concern that reasoning models &#8216;<a href="https://www.anthropic.com/research/reasoning-models-dont-say-think">don&#8217;t always say what they think</a>&#8217; and proposing <a href="https://www.lesswrong.com/posts/TecsCZ7w8s4e2umm4/5-ways-to-improve-cot-faithfulness">ways to protect and improve</a> this fragile opportunity.</p><h2>The cake is a lie: reinforcement learning back in style</h2><p>Recall <a href="https://www.oliversourbut.net/p/how-did-large-language-models-get">LeCun&#8217;s &#8216;cake&#8217; analogy</a> for AI training. The vast majority goes to self-supervised &#8216;what comes next&#8217; prediction, with character- and rules-tuning relegated mostly to a &#8216;cherry on top&#8217;. To terribly oversimplify, this gives some reason to expect most &#8216;out loud&#8217; reasoning to be essentially faithful at this stage. Why? Most of the training examples of human-looking reasoning are records of actual humans reasoning!<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a></p><p>Those of us with background in reinforcement learning and agent foundations always considered the relegation of RL to a &#8216;cherry on top&#8217; somewhat suspect. 
LeCun&#8217;s prediction held for a few years &#8212; but by 2024 general AI developers were starting to really scale up the RL, and here in 2026 it <a href="https://epoch.ai/gradient-updates/how-far-can-reasoning-models-scale">may be reaching parity with self-supervised pretraining</a>. Some cherry!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6RRA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e2326c1-cc79-4e50-8340-7cac4f16fca4_1024x656.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6RRA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e2326c1-cc79-4e50-8340-7cac4f16fca4_1024x656.png 424w, https://substackcdn.com/image/fetch/$s_!6RRA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e2326c1-cc79-4e50-8340-7cac4f16fca4_1024x656.png 848w, https://substackcdn.com/image/fetch/$s_!6RRA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e2326c1-cc79-4e50-8340-7cac4f16fca4_1024x656.png 1272w, https://substackcdn.com/image/fetch/$s_!6RRA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e2326c1-cc79-4e50-8340-7cac4f16fca4_1024x656.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6RRA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e2326c1-cc79-4e50-8340-7cac4f16fca4_1024x656.png" width="1024" height="656" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5e2326c1-cc79-4e50-8340-7cac4f16fca4_1024x656.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:656,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6RRA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e2326c1-cc79-4e50-8340-7cac4f16fca4_1024x656.png 424w, https://substackcdn.com/image/fetch/$s_!6RRA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e2326c1-cc79-4e50-8340-7cac4f16fca4_1024x656.png 848w, https://substackcdn.com/image/fetch/$s_!6RRA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e2326c1-cc79-4e50-8340-7cac4f16fca4_1024x656.png 1272w, https://substackcdn.com/image/fetch/$s_!6RRA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e2326c1-cc79-4e50-8340-7cac4f16fca4_1024x656.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Oof. (Image edited using gemini.)</figcaption></figure></div><p>This abundance of RL training over diverse tasks is probably most of what&#8217;s differentiating competitive frontier capabilities in AI today.</p><h2>Reinforcement learning&#8217;s serial training penalty</h2><p>When you&#8217;re <em>running</em> a machine language model, you generally have to wait for it to say one thing before it can say the next.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a> <strong>Running a model through RL training is even more demanding: you have to wait for it to </strong><em><strong>do</strong></em><strong> something and gather the result before it can do the next.</strong> No shortcuts.</p><p>This means RL <em>on attention-only transformer language models</em> is dramatically handicapped compared to self-supervised pretraining. 
(Recall that <a href="https://www.oliversourbut.net/p/how-did-large-language-models-get">attention-only bursts open the serial bottleneck for self-supervised learning in particular</a>, kicking off the deep learning x big data language model revolution.)</p><p>Said another way, if most of your training is self-supervised and only a <em>little</em> is RL, attention-only architecture is an incredible efficiency boon.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a> But <strong>once RL is a meaningful fraction of training, the scalability benefits of attention-only self-supervised pretraining begin to dwindle in comparison</strong>.</p><p>This is a large part of why companies were slow, reluctant to scale RL until they &#8216;had to&#8217;. But it now looks like they probably have to.</p><h2>Return of the recurrent network?</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kk4a!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b179166-a5fd-4d93-8ca2-28979f9ced34_2048x746.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kk4a!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b179166-a5fd-4d93-8ca2-28979f9ced34_2048x746.png 424w, https://substackcdn.com/image/fetch/$s_!kk4a!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b179166-a5fd-4d93-8ca2-28979f9ced34_2048x746.png 848w, 
https://substackcdn.com/image/fetch/$s_!kk4a!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b179166-a5fd-4d93-8ca2-28979f9ced34_2048x746.png 1272w, https://substackcdn.com/image/fetch/$s_!kk4a!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b179166-a5fd-4d93-8ca2-28979f9ced34_2048x746.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kk4a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b179166-a5fd-4d93-8ca2-28979f9ced34_2048x746.png" width="1456" height="530" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6b179166-a5fd-4d93-8ca2-28979f9ced34_2048x746.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:530,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kk4a!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b179166-a5fd-4d93-8ca2-28979f9ced34_2048x746.png 424w, https://substackcdn.com/image/fetch/$s_!kk4a!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b179166-a5fd-4d93-8ca2-28979f9ced34_2048x746.png 848w, 
https://substackcdn.com/image/fetch/$s_!kk4a!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b179166-a5fd-4d93-8ca2-28979f9ced34_2048x746.png 1272w, https://substackcdn.com/image/fetch/$s_!kk4a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b179166-a5fd-4d93-8ca2-28979f9ced34_2048x746.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Compare with the attention-only version, above. 
There, sequential reasoning had to pass through the readable output sequence (y1, y2 etc.). Here in the recurrent setting, sequential reasoning can pass uninterrupted for arbitrarily long without touching the output path.</figcaption></figure></div><p>Because RL <em>has to</em> &#8216;wait&#8217; for things to unfold sequentially, it &#8216;might as well&#8217; make use of fully recurrent messages (&#8216;hidden reasoning&#8217;) if it can.</p><p>In fact, because the hidden reasoning occurs in neural activations rather than mere human-readable text output, it&#8217;s actually got far <em>more</em> information capacity. <strong>That&#8217;s a further point in favour of recurrent connections.</strong></p><p>Speculatively, developers might somehow get the &#8216;best&#8217; of both worlds by introducing recurrent messages for RL training stages but skipping them for self-supervised learning.</p><h2>Does &#8216;thinking out loud&#8217; go away?</h2><p>Putting things together: developers are turning to RL at serious scale &#8212; this is a novel demand on frontier general AI. Previous developments leant on massive self-supervised pretraining and little else. That made the attention-only transformer architecture <a href="https://www.oliversourbut.net/p/how-did-large-language-models-get">indispensable for training efficiency</a>. The transformer also very conveniently gave us &#8216;reasoning out loud&#8217; as the closest thing to <em>interpretable AI</em> we&#8217;ve perhaps ever had. 
That&#8217;s generally considered convenient for safety: a &#8216;fragile opportunity&#8217;.</p><p>If RL scale eats up the efficiencies of attention-only (non-recurrent) training, and companies continue to be willing to pay increasing costs for frontier training, they may turn to &#8216;hidden reasoning&#8217; architectures, including fully recurrent hidden messages &#8212; after all, these plausibly benefit the AI&#8217;s competence.</p><p>This isn&#8217;t purely speculative: &#8216;latent reasoning&#8217; is an active research area in AI, enough so that <a href="https://arxiv.org/abs/2507.06203">a 30-author, 40-page survey</a> of the field was published in 2025.</p><h3>Potential saving benefits of visible reasoning</h3><p>There are obvious benefits to sharing some reasoning &#8216;out loud&#8217;.</p><p>Sharing what you&#8217;re thinking about &#8212; in the AI case, not only with &#8216;copies of yourself&#8217; which have the same internal thought-language, but also with heterogeneous AI systems and humans &#8212; is often beneficial, at least with friendly collaborators. Curiously, it might be especially useful to be <em>provably largely honest</em> in a way that humans can&#8217;t achieve. Structural, architectural constraints which rule out certain kinds (or degrees) of conniving may point in this direction. That&#8217;s fairly niche, though, and has various other difficult dependencies.</p><p>The case of humans is instructive: to boost our reasoning, we don&#8217;t <em>just</em> &#8216;think out loud&#8217;; we often use reams of scratch paper, digital notes, archives, databases, and so on. AI systems with the right tools can make use of systems like this too. 
Now, for AI, such systems <em>could</em> be written entirely in idiosyncratic <a href="https://www.artificial-intelligence.blog/terminology/neuralese">neuralese</a> or <a href="https://en.wikipedia.org/wiki/Steganography">stegotext</a>: this might have some capacity and representational benefits, but would act heavily against compounding and maintenance and sharing. I wouldn&#8217;t be surprised to see some of both: &#8216;personal&#8217; dense idiosyncratic jottings, and &#8216;external&#8217; or &#8216;long-term&#8217; more natural records.</p><p>I don&#8217;t think these benefits say you should <em>only</em> reason out loud.</p><p>There are also the aforementioned safety and oversight benefits. While &#8216;capability&#8217; and &#8216;safety&#8217; tradeoffs are often a little murky in AI, this might be one of the most clear cut cases where they act against each other. <strong>I&#8217;d advise frontier AI developers to refrain from throwing away the benefits of overseeable AI reasoning without understanding the implications very clearly.</strong></p><h2>Does it even matter?</h2><p>I&#8217;m speculating somewhat here &#8212; in particular, besides the crude gross comparison between self-supervised and RL expenditure, I haven&#8217;t &#8216;run the numbers&#8217; on the specific implications of different language model architecture setups. The serial training penalty of RL may bite imminently, or it may be some distance out. Someone more familiar with precise details of AI training expenditure or training regimes might have more to say there. I expect that within AI development companies there are people already researching these things.</p><p>More generally, I have tended to be somewhat cynical about deployers of AI systems <em>actually bothering</em> to look at the bounteous reasoning traces their AI systems produce anyway! Surely the responsible ones at least try to, or (of course) they get other AI systems to do so. 
But <em>if</em> the <a href="https://www.oliversourbut.net/p/is-the-cat-out-of-the-bag">cat is out of the bag</a> with AI development, it&#8217;s difficult to imagine all such developers and deployers being responsible.</p><p>Conversely, maybe we can do much better than merely relying on <em>architectural</em> semi-guarantees on reasoning transparency! <a href="https://www.forethought.org/research/design-sketches-collective-epistemics#epistemic-virtue-evals">Epistemic virtue evaluations</a> (and associated training) may be able to drive much more clarity and honesty than mere thinking out loud. And <a href="https://www.oliversourbut.net/p/a-full-epistemic-stack">structural, compounding knowledge-bases</a> <em>curated </em>with help from AI and <em>grounding</em> more AI&#8217;s outputs might be a path to truly trustworthy communication and learning.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Note that whole fields of <a href="https://en.wikipedia.org/wiki/Explainable_artificial_intelligence">explainable AI and AI interpretability</a> exist, with many open <a href="https://www.anthropic.com/research/team/interpretability">research</a> <a href="https://www.lesswrong.com/posts/qHCDysDnvhteW7kRd/arc-s-first-technical-report-eliciting-latent-knowledge">agendas</a>. People are trying, bless them! They have made nonzero progress! 
But neural networks are still basically impenetrable.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>I say &#8216;encouraging&#8217; to encompass all of training, prompting via input, forcing via injected context, or other steering injections.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>The <em>company serving</em> the AI might hide it from you. Whoever <em>runs</em> the AI may hide it from you (perhaps just showing you final outputs, or even passing them off as their own production). But at least the reasoning is out there, externalised for <a href="https://www.lesswrong.com/posts/FRRb6Gqem8k69ocbi/externalized-reasoning-oversight-a-research-direction-for">someone to look over</a> <em>in principle</em>&#8230; if they can be bothered. 
(Maybe they&#8217;ll get their AI to do it!)</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>Perhaps <a href="https://www.lesswrong.com/posts/Ty5Bmg7P6Tciy2uj2/measuring-no-cot-math-time-horizon-single-forward-pass">something like a few minutes&#8217;-worth</a> of mathematical reasoning without making notes, and perhaps <a href="https://arxiv.org/abs/2411.16353">a few &#8216;steps&#8217; of logic</a>.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>There are reasons even this comes apart &#8212; records of human writing and speech are not usually records of human <em>thought</em>. Nowadays, some of the records are of AI writing from earlier generations! 
But there are enough basically faithful examples of &#8216;thinking out loud&#8217; and &#8216;reasoning clearly&#8217; that when you encourage a mostly-pretrained AI to reason out loud really really comprehensively, it seems to mostly do that in a human-readable way.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>There are some partial ways around this, but on the whole it&#8217;s right.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>Consider <a href="https://en.wikipedia.org/wiki/Amdahl%27s_law">Amdahl&#8217;s law</a>: the self-supervised part is ridiculously parallelisable &#8212; it&#8217;s &#8216;optimised&#8217; &#8212; and the RL part isn&#8217;t. When RL is small, the overall boost is very large.</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[How did ‘large’ language models get that way?]]></title><description><![CDATA[The role of Transformers and Pretraining in GPT]]></description><link>https://www.oliversourbut.net/p/how-did-large-language-models-get</link><guid isPermaLink="false">https://www.oliversourbut.net/p/how-did-large-language-models-get</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Sun, 03 May 2026 21:30:01 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Z1xg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ecec949-463c-4a69-8d54-d3386149ee6a_784x502.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Large language models are <em>really</em> large. 
They&#8217;re among the largest machine learning projects ever, and set to be (perhaps already are by some measures) some of the <a href="https://www.mckinsey.com/industries/technology-media-and-telecommunications/our-insights/the-cost-of-compute-a-7-trillion-dollar-race-to-scale-data-centers">largest </a><em><a href="https://www.mckinsey.com/industries/technology-media-and-telecommunications/our-insights/the-cost-of-compute-a-7-trillion-dollar-race-to-scale-data-centers">computing</a></em> and even <a href="https://epochai.substack.com/p/the-epoch-ai-brief-november-2025">largest </a><em><a href="https://epochai.substack.com/p/the-epoch-ai-brief-november-2025">infrastructure</a></em> projects ever.</p><p>But how did LMs actually <em>get </em>so<em> </em>large as to warrant the title &#8216;<em>large</em> language model (LLM)&#8217;? A large part of the answer is in the P ('pretrained') and the T ('transformer') of GPT.</p><p>This is part 1 of a series about LLM architecture and some implications, past and future, for reasoning. Part 1 is &#8216;how we got here&#8217; &#8212; what was so impactful about the <em>transformer</em> architecture for LLMs. Some readers may prefer to skip this. <a href="https://www.oliversourbut.net/p/reinforcement-learning-scaling-might">Part 2</a> points at an unexpected benefit &#8212; the surprising explainability of contemporary AI reasoning &#8212; and why new trends might erode that. 
It&#8217;s a novel point as far as I know.</p><h2>Self-supervised learning: most of the cake</h2><p>In 2016, early deep learning practitioner <a href="https://youtu.be/Ount2Y4qxQo?si=0hS_73cqiivrrPEC&amp;t=1156">Yann LeCun introduced a famous analogy</a> for learning intelligent systems:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Z1xg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ecec949-463c-4a69-8d54-d3386149ee6a_784x502.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Z1xg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ecec949-463c-4a69-8d54-d3386149ee6a_784x502.png 424w, https://substackcdn.com/image/fetch/$s_!Z1xg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ecec949-463c-4a69-8d54-d3386149ee6a_784x502.png 848w, https://substackcdn.com/image/fetch/$s_!Z1xg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ecec949-463c-4a69-8d54-d3386149ee6a_784x502.png 1272w, https://substackcdn.com/image/fetch/$s_!Z1xg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ecec949-463c-4a69-8d54-d3386149ee6a_784x502.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Z1xg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ecec949-463c-4a69-8d54-d3386149ee6a_784x502.png" width="784" height="502" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5ecec949-463c-4a69-8d54-d3386149ee6a_784x502.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:502,&quot;width&quot;:784,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Z1xg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ecec949-463c-4a69-8d54-d3386149ee6a_784x502.png 424w, https://substackcdn.com/image/fetch/$s_!Z1xg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ecec949-463c-4a69-8d54-d3386149ee6a_784x502.png 848w, https://substackcdn.com/image/fetch/$s_!Z1xg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ecec949-463c-4a69-8d54-d3386149ee6a_784x502.png 1272w, https://substackcdn.com/image/fetch/$s_!Z1xg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ecec949-463c-4a69-8d54-d3386149ee6a_784x502.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption"><em>LeCun&#8217;s cake analogy (&#8216;unsupervised&#8217;, &#8216;predictive&#8217;, and &#8216;self-supervised&#8217; are used somewhat interchangeably)</em></figcaption></figure></div><blockquote><p>If intelligence is a cake, the bulk of the cake is unsupervised learning, the icing on the cake is supervised learning, and the cherry on the cake is reinforcement learning (RL).</p></blockquote><p>This correctly described what became the prevailing practice for training LLMs, before the first LLMs were even created! (Though by 2025, certainly 2026, this has become outdated &#8212; more on that later.)</p><p>What does this mean, and how did LeCun come to this position?</p><p>Machine learning, especially deep learning, needs example data.</p><h3>Old-school supervised learning</h3><p>You might think to start (and historically, researchers mostly did) by curating labelled examples: &#8216;this image is a cat, this one a dog; the French for &#8220;a bed&#8221; is &#8220;un lit&#8221;; &#8230;&#8217;. 
The machine learns these labels, and (hopefully) <em>also</em> learns how to generalise correctly, responding appropriately to similar-enough previously-unseen cases &#8212; which is what you actually want.</p><p>This <em>supervised</em> approach can be excellent: you get (ideally) reasoned, expert judgement as the sole curriculum for your nascent learning machine. But this is also painfully expensive, because you have to pay or otherwise entice humans to look over all your examples (or worse yet, create gold standard examples from scratch)!</p><h3>The hands-off approach</h3><p>Fortunately for developers, there is a stupendous amount of (language and other) data on the internet. That&#8217;s one very big reason, if you can, to pursue <em>self-supervised</em> training targets for your machine learning project: rather than &#8216;here is my carefully (and expensively!) curated dataset of labelled examples, learn them&#8217;, it&#8217;s &#8216;here is a <a href="https://en.wikipedia.org/wiki/The_Pile_(dataset)">gigantic pile</a> of text, learn to predict whatever comes next&#8217;.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p><p>As in supervised learning, where the goal is not simply to memorise the provided examples, the eventual goal of self-supervised learning is rarely simply to teach the machine to carry out this specific defined prediction activity &#8212; rather, in the process of learning how to do <em>that</em>, the machine is forced to learn <em>generalisable concepts and features</em> which can then be turned to a wide range of tasks.</p><p>More on RL (the cherry) later.</p><h3>Self-supervised learning in language models</h3><p>Language comes in <em>long sequences</em>.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> So language learning systems need to be able to consume sequences. 
Consider: a large part of how we track what&#8217;s going on in a play, a textbook, an argument, a novel, and so on, is <em>by reference</em> and <em>by recollection</em> of context far back in the sequence: a previous dialogue, an earlier concept or experiment, a foundational point of information, a character&#8217;s exposition and background. The foreshadowing of a <a href="https://en.wikipedia.org/wiki/Chekhov's_gun">Chekhov&#8217;s gun</a> can&#8217;t be understood if you&#8217;ve lost the plot in the meantime! So too with machine language models.</p><p>For the &#8216;predict what comes next&#8217; self-supervised target, that looks like consuming, in principle, <em>everything</em> that&#8217;s come so far, then using that context to output a distribution over possible continuations.</p><blockquote><p>Chapter 1:... The old revolver sat on the kitchen table&#8230;</p><p>&#8230;</p><p>Chapter 20:... Alice had run out of patience. Without warning, she fired the [gun/chef/kiln/pigeon]</p></blockquote><p>Which one is right? Well, actually it depends a lot on what happens in the intervening chapters I&#8217;ve skipped! But assuming the narrative promise made in chapter 1 is kept (and no other overriding promises are made in between), a good prediction is &#8216;gun&#8217;. On the other hand, if chapter 1 instead introduced a sous-chef with a hygiene problem, a pottery studio with a design dispute, or an experimental pigeon-launching apparatus, the answer might be different.</p><p>The especially useful property of &#8216;what comes next&#8217; self-supervision is that you can run this test on <em>every single</em> position. 
So a reasonable-length text might give you thousands or more such training examples&#8230; some easier than others.</p><p>&#8216;Simply predicting what comes next&#8217; clearly necessitates tracking a lot about what&#8217;s going on (across all kinds of scenarios) and how things work&#8230; at least if you want to predict well.</p><h2>What are we waiting for? How transformers overcame a scaling problem</h2><p>The benefit of self-supervised learning &#8212; a huge, largely automatically-generated collection of training targets (&#8216;predict what comes next&#8217;) &#8212; also raises its own challenges. If much of the capability of your system arises from the sheer <em>quantity</em> of these training targets, you run into computational challenges: you want to be pumping more and more of these examples through your machine, but each example takes some number crunching, which costs compute time. Crucially, if you can crunch through more examples <em>at once in parallel</em>, you&#8217;re in a far better spot.</p><p>We saw how text prediction is a sequence problem. Naively, this means that, to make the prediction at the <em>last</em> position in the text, your process has to <em>first</em> read the first token, then the second, the third, and so on. To make the guess at the last position, you&#8217;ve had to &#8216;wait&#8217; for absolutely all of the rest to be done processing. At least, that&#8217;s how humans read, and it&#8217;s how neural network machine language models did their processing until 2017.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a></p><p>It&#8217;s implemented as a &#8216;message&#8217; (neural activation) passed forward from position to position (which needs to capture what might be relevant to &#8216;remember&#8217;), a structure known as a <em>recurrent</em> network. (Messages are additionally passed &#8216;depthwise&#8217; in deep neural networks.) 
The <em>n</em>th position needs <em>n</em> steps to proceed <em>before</em> it can compute anything. For long texts, that&#8217;s crippling.</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!s-WA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf938895-1326-4143-95e9-636da01e6409_2048x688.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!s-WA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf938895-1326-4143-95e9-636da01e6409_2048x688.png 424w, https://substackcdn.com/image/fetch/$s_!s-WA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf938895-1326-4143-95e9-636da01e6409_2048x688.png 848w, https://substackcdn.com/image/fetch/$s_!s-WA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf938895-1326-4143-95e9-636da01e6409_2048x688.png 1272w, https://substackcdn.com/image/fetch/$s_!s-WA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf938895-1326-4143-95e9-636da01e6409_2048x688.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!s-WA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf938895-1326-4143-95e9-636da01e6409_2048x688.png" width="1456" height="489" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cf938895-1326-4143-95e9-636da01e6409_2048x688.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:489,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!s-WA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf938895-1326-4143-95e9-636da01e6409_2048x688.png 424w, https://substackcdn.com/image/fetch/$s_!s-WA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf938895-1326-4143-95e9-636da01e6409_2048x688.png 848w, https://substackcdn.com/image/fetch/$s_!s-WA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf938895-1326-4143-95e9-636da01e6409_2048x688.png 1272w, https://substackcdn.com/image/fetch/$s_!s-WA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf938895-1326-4143-95e9-636da01e6409_2048x688.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em>When later positions have to wait for earlier computations, it really adds up for long texts.</em></figcaption></figure></div><p>The milestone <em>transformer</em> architecture, introduced in the 2017 paper <em><a href="https://arxiv.org/abs/1706.03762">Attention Is All You Need</a></em>, totally upended this constraint, making far larger-scale self-supervised training practical. Neural attention mechanisms predate this paper by a few years; the real innovation was to <em>drop</em> the recurrent connections entirely. <em>You Don&#8217;t Need Recurrent Connections</em> is the logical corollary, and it is what makes training these architectures on long sequences so much more scalable. Attention still passes messages forward, but never <em>only</em> forward, always &#8216;diagonally&#8217;: every message also takes a depthwise step through the structure at the same time. This means that, during training, <em>no</em> position, however long the text, need wait for <em>any</em> previous positions to finish computing. 
It does demand a highly parallel computation, but modern compute resources are nothing if not highly parallel.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZLDH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf90ea-243d-49fa-8745-173125cfde9d_2100x846.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZLDH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf90ea-243d-49fa-8745-173125cfde9d_2100x846.png 424w, https://substackcdn.com/image/fetch/$s_!ZLDH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf90ea-243d-49fa-8745-173125cfde9d_2100x846.png 848w, https://substackcdn.com/image/fetch/$s_!ZLDH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf90ea-243d-49fa-8745-173125cfde9d_2100x846.png 1272w, https://substackcdn.com/image/fetch/$s_!ZLDH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf90ea-243d-49fa-8745-173125cfde9d_2100x846.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZLDH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf90ea-243d-49fa-8745-173125cfde9d_2100x846.png" width="1456" height="587" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1dcf90ea-243d-49fa-8745-173125cfde9d_2100x846.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:587,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:96610,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.oliversourbut.net/i/196217222?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf90ea-243d-49fa-8745-173125cfde9d_2100x846.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZLDH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf90ea-243d-49fa-8745-173125cfde9d_2100x846.png 424w, https://substackcdn.com/image/fetch/$s_!ZLDH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf90ea-243d-49fa-8745-173125cfde9d_2100x846.png 848w, https://substackcdn.com/image/fetch/$s_!ZLDH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf90ea-243d-49fa-8745-173125cfde9d_2100x846.png 1272w, https://substackcdn.com/image/fetch/$s_!ZLDH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf90ea-243d-49fa-8745-173125cfde9d_2100x846.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">By doing away with all fully recurrent message pathways, attention-only processing can proceed absurdly faster on long texts. 
(Both diagrams created with claude.ai)</figcaption></figure></div><h2>Self-supervised training is only the start</h2><p>Self-supervised training targets are great for cheaply (low human effort!) learning all kinds of features and concepts from big data. The well self-supervised neural network now &#8216;gets what is going on&#8217; in all kinds of texts, well enough to sensibly predict what might come next, even on new content.</p><p>Sometimes that kind of prediction is already exactly what you want, but usually there are other things you&#8217;d like your neural network to do. 
Enter <em><a href="https://en.wikipedia.org/wiki/Transfer_learning">transfer learning</a></em>: any of a number of approaches for tapping into these learnings to accomplish the actual task of interest.</p><h3>Chatbot as playscript</h3><p>For language models, a common basic approach is arranging for the model to &#8216;predict&#8217; the responses of a helpful ASSISTANT character, in a dialogue with a USER character (played by you).</p><blockquote><p>The following is a dialogue between a computer USER and a helpful, knowledgeable ASSISTANT.</p><p>USER: Which is better, a Chekhov&#8217;s gun or a Maxim gun?</p><p>ASSISTANT:</p></blockquote><p>Because the assistant in this context is plausibly actually helpful and knowledgeable, there&#8217;s some reasonable chance that a well-trained language model produces a helpful answer here with no further tweaks. Of course this is flimsy: <a href="https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence)">hallucination</a> of plausible-but-incorrect responses and provision of unhelpful responses are common, as are entirely off-track diversions, like introducing new &#8216;characters&#8217; or pivoting away from the established playscript structure.</p><h3>Post-training</h3><p>Various approaches make more involved attempts to put the knowledge acquired in self-supervised pretraining to effective use. Those which involve further training, not just prompting, are often called &#8216;post-training&#8217;.</p><h4>Supervised learning: the icing</h4><p>Recall LeCun&#8217;s cake: with self-supervised (&#8216;unsupervised&#8217;) learning as the cake, <em>supervised</em> learning is the icing. A common pattern for LLMs intended for chatbot use is to have humans generate examples of how the ASSISTANT <em>should</em> respond to various queries. Should it ask clarifying questions sometimes? How verbose should it be? Should it provide references? 
Should it <a href="https://www.oliversourbut.net/p/pets-friends-partners">respond in first person</a>? Should it respond in the register and dialect of the USER, or have its own? With training examples provided, <em>sometimes</em> principles like these can be learned and generalised.</p><p>Because <em>almost all </em>of the knowledge, language understanding, common sense, and <a href="https://www.lesswrong.com/posts/vJFdjigzmcXMhNTsx/simulators">character information</a> comes from self-supervised <em>pretraining</em>, supervised learning <em>as <a href="https://en.wikipedia.org/wiki/Fine-tuning_(deep_learning)">fine-tuning</a></em> can get away with radically fewer examples than would be needed to train this way from scratch. Less human effort. Icing indeed!</p><h4>Reinforcement learning: the cherry on top</h4><p>LeCun&#8217;s cake analogy underestimates and misunderstands reinforcement learning &#8212; more on that in <a href="https://www.oliversourbut.net/p/reinforcement-learning-scaling-might">part 2</a>. But for a few years, his relegation of RL to a final (but important) stylish flourish was a fairly good description of how LLM-based AI assistants were trained.</p><p>Rather than getting human contributors to produce exemplary data (as in supervised learning), RL has human engineers specifying <em>how to grade</em> outputs as good or bad, and sets the machine up to <a href="https://www.oliversourbut.net/p/you-cant-skip-exploration">explore possible behaviours</a>. 
Those outputs get graded, and the training encourages more of whatever computations produced the good ones (and less of whatever produced the bad ones).</p><p>This is very fraught, because <em>accurately specifying</em> what counts as good behaviour ahead of time is notoriously difficult, and if you&#8217;re not careful, exploratory AI will <a href="https://vkrakovna.wordpress.com/2018/04/02/specification-gaming-examples-in-ai/">absolutely find</a> the <a href="https://deepmind.google/blog/specification-gaming-the-flip-side-of-ai-ingenuity/">strange cases</a> you didn&#8217;t think of. It can also fall completely flat if the AI in training has no idea how to achieve good outputs at all: it just gets stuck flailing.</p><p>This makes <em>post-training of reasonably competent pretrained base models</em> a sweet spot for RL: the model has enough background competence to at least occasionally succeed, can learn from success, and can keep getting better and better. RL (whether as post-training or from scratch) also has the most potential to scale <em>past</em> human expert level, because we only need to be able to say which outcomes are better, not how to achieve them.</p><p>The earliest widespread application of RL to LLM post-training was <em>reinforcement learning from human feedback (RLHF)</em>. At its core, this asks humans to rate which answers are better or worse, then has the machine try out various ways of achieving better and better answers (according to the human raters). Typically this is repeated for a few rounds. In 2023 and 2024, this was how companies got AI chatbots to <em>mostly</em> respond politely, <em>usually</em> refuse to describe how to make bombs, and <em>often</em> <a href="https://www.oliversourbut.net/p/pets-friends-partners">stick to character</a>. It&#8217;s also a great way to train AI to be obsequious, disingenuous, flattering, and <a href="https://time.com/7346052/problem-ai-flattering-us/">sycophantic</a>. 
Oops!</p><h2>Times changing</h2><p>In frontier AI systems, reinforcement learning is no longer a lightweight (albeit effective) post-training flourish on top of a primarily self-supervised pretrained LM. Since late 2024, reaching for expert-and-beyond capabilities has brought RL back into the spotlight. That has a few implications, one of which &#8212; reasoning and its transparency (or otherwise) &#8212; we&#8217;ll look at in the <a href="https://www.oliversourbut.net/p/reinforcement-learning-scaling-might">next essay</a>.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>More generally, self-supervised targets take existing data from the target domain (images, language, code, &#8230;) and, in various ways, corrupt or distort it, tasking the machine with recovering the original as accurately as possible from the clues in context. For language, &#8216;here is a text fragment; predict what comes next&#8217; is a natural version of this, as is &#8216;here is a fragment of censored text; fill the blanks&#8217;. 
In the image domain, images might be censored in patches, pixellated, or otherwise distorted.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Other rich and important formats also have at least one open-ended dimension to them: audio, video, computer code (just a kind of language), DNA, drone and vehicle control logs, robotic action sequences, &#8230; So machine learning approaches for this range of formats can benefit from similar architectures.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>I&#8217;m fairly sure of this; at least, I&#8217;m not aware of approaches which escaped this sequencing constraint. There&#8217;s a <a href="https://distill.pub/2016/augmented-rnns/">curious 2016 post by Google researchers</a> which looks at various modified approaches &#8212; notably including attention mechanisms, which are what the later transformer architecture is centred on &#8212; but in all cases based on a fully recurrent backbone. 
Some hierarchical sequence processing approaches use convolutions instead of recurrence, but this too requires scanning the sequence in order to process later elements.</p></div></div>]]></content:encoded></item><item><title><![CDATA[“Best humans still outperform”]]></title><description><![CDATA[One turning point in the history of cope around artificial intelligence]]></description><link>https://www.oliversourbut.net/p/best-humans-still-outperform</link><guid isPermaLink="false">https://www.oliversourbut.net/p/best-humans-still-outperform</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Fri, 17 Apr 2026 12:59:35 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-eAP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51424e1e-ee7b-4cd1-bb15-cc77db060e8d_500x756.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A few years ago I was tickled by an article headline in a serious <a href="https://www.nature.com/articles/s41598-023-40858-3">academic journal</a>:</p><blockquote><p><strong>Best humans still outperform artificial intelligence in a creative divergent thinking task</strong></p></blockquote><p>Remarkable! <a href="https://en.wikipedia.org/wiki/Man_bites_dog">Man bites dog</a>! It had <em>become newsworthy</em>, it was <em>worth checking</em> (and, I perceive, worth a little self-congratulatory celebration) that there remained any domain where mere man could still hope to possibly contend with the machines &#8212; at least, the <em>best</em> humans still could! 
(Could you?)</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-eAP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51424e1e-ee7b-4cd1-bb15-cc77db060e8d_500x756.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-eAP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51424e1e-ee7b-4cd1-bb15-cc77db060e8d_500x756.png 424w, https://substackcdn.com/image/fetch/$s_!-eAP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51424e1e-ee7b-4cd1-bb15-cc77db060e8d_500x756.png 848w, https://substackcdn.com/image/fetch/$s_!-eAP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51424e1e-ee7b-4cd1-bb15-cc77db060e8d_500x756.png 1272w, https://substackcdn.com/image/fetch/$s_!-eAP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51424e1e-ee7b-4cd1-bb15-cc77db060e8d_500x756.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-eAP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51424e1e-ee7b-4cd1-bb15-cc77db060e8d_500x756.png" width="500" height="756" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/51424e1e-ee7b-4cd1-bb15-cc77db060e8d_500x756.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:756,&quot;width&quot;:500,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-eAP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51424e1e-ee7b-4cd1-bb15-cc77db060e8d_500x756.png 424w, https://substackcdn.com/image/fetch/$s_!-eAP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51424e1e-ee7b-4cd1-bb15-cc77db060e8d_500x756.png 848w, https://substackcdn.com/image/fetch/$s_!-eAP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51424e1e-ee7b-4cd1-bb15-cc77db060e8d_500x756.png 1272w, https://substackcdn.com/image/fetch/$s_!-eAP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51424e1e-ee7b-4cd1-bb15-cc77db060e8d_500x756.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption"><em><a href="https://knowyourmeme.com/memes/can-a-robot-write-a-symphony">Source</a></em></figcaption></figure></div><h2>A message from the future</h2><p>That was 2023. I think what stood out to me at the time<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> was that this was in some sense <em>early</em>. 
Not early in the <em>story of AI</em> &#8212; although <a href="https://en.wikipedia.org/wiki/ChatGPT">ChatGPT</a> and <a href="https://en.wikipedia.org/wiki/Stable_Diffusion">Stable Diffusion</a>, each less than a year old, had captured the public attention in a way which earlier AI hadn&#8217;t, these were merely the latest in a long lineage of gradual developments &#8212; but an early sign of a reckoning, an <em>attitude shift</em> in how humanity would grapple with these new machine capabilities we were conjuring fitfully into being.</p><p>I&#8217;d already been worrying for years that things might get out of hand with AI (and had even started <a href="https://www.oliversourbut.net/p/emergent-misaligned-outcomes">writing about it</a>). I was hardly the first!<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> But this had felt almost like a perversely secret concern (how can people not <em>see</em> what&#8217;s coming?? &#8212; but they didn&#8217;t), one which humanity at large appeared destined to ignore either until it was <a href="https://www.oliversourbut.net/p/un-unpluggability-can-t-we-just-unplug-it">too late</a>&#8230; or, if we somehow played it right, until <a href="https://www.darioamodei.com/essay/machines-of-loving-grace">a splendid apotheosis</a> of world peace, unlimited bounty, health and longevity delivered by machine intellect. 
(In fact I think those <a href="https://www.oliversourbut.net/p/the-first-type-of-transformative">remain real prospects</a>, and <a href="https://www.oliversourbut.net/p/ai-for-human-reasoning-for-you">it&#8217;s absolutely in our hands to determine</a> which outcomes we get.)</p><p>What this headline implicitly spoke of, the subtextual worldview shift betrayed by the phrasing &#8212; &#8220;Best humans <em>still</em> outperform&#8221; &#8212; was that we had woken up and viscerally felt the reality that even the &#8216;best&#8217; humans might genuinely need to watch their backs. The machines were coming. It was no longer (had never been) a joke or a fairy tale.</p><p>This headline, seemingly from a near future in which it was taken for granted that machines, in general, dominated human capabilities, showed what was coming. Headlines like it are now commonplace &#8212; perhaps more common than those (now almost boring!) headlines adding to the litany of tasks AI now outcompetes human experts at.</p><h2>The world changed</h2><p>The world changed. Not because the world had <em>actually yet</em> changed (much), but because humanity, in our limited and faltering foresight, had noticed that, soon, it might. That murky perception of the future, humanity&#8217;s near-unique hallmark and blessing, has memetically reverberated and worked its way into our collective discourse.</p><p>In this way, I&#8217;m incredibly grateful to the &#8216;ChatGPT moment&#8217;. Rather than implicitly relying on a plucky band of vaguely foresighted but ultimately underpowered &#8216;sci-fi weirdos&#8217;, humanity as a whole is entering the conversation. We&#8217;re all stakeholders in the trajectory of this world-transforming sphere of technology, and all kinds of people are beginning to act like it: people with skillsets and perspectives which we&#8217;ll need, and which had been lacking in earlier debates. Legal theorists, philosophers, engineers, anthropologists, economists, statespeople. 
It&#8217;s a thickly textured problem. It&#8217;ll need more than people like me (aspiring polymath though I may be) to solve it!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mmrs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a73c7e3-9a6b-4a1a-acf8-350d73236518_997x589.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mmrs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a73c7e3-9a6b-4a1a-acf8-350d73236518_997x589.png 424w, https://substackcdn.com/image/fetch/$s_!mmrs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a73c7e3-9a6b-4a1a-acf8-350d73236518_997x589.png 848w, https://substackcdn.com/image/fetch/$s_!mmrs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a73c7e3-9a6b-4a1a-acf8-350d73236518_997x589.png 1272w, https://substackcdn.com/image/fetch/$s_!mmrs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a73c7e3-9a6b-4a1a-acf8-350d73236518_997x589.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mmrs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a73c7e3-9a6b-4a1a-acf8-350d73236518_997x589.png" width="997" height="589" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0a73c7e3-9a6b-4a1a-acf8-350d73236518_997x589.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:589,&quot;width&quot;:997,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mmrs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a73c7e3-9a6b-4a1a-acf8-350d73236518_997x589.png 424w, https://substackcdn.com/image/fetch/$s_!mmrs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a73c7e3-9a6b-4a1a-acf8-350d73236518_997x589.png 848w, https://substackcdn.com/image/fetch/$s_!mmrs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a73c7e3-9a6b-4a1a-acf8-350d73236518_997x589.png 1272w, https://substackcdn.com/image/fetch/$s_!mmrs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a73c7e3-9a6b-4a1a-acf8-350d73236518_997x589.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em><a href="https://www.dallasfed.org/research/economics/2025/0624">Source</a></em></figcaption></figure></div><p>These cultural conversation shifts are fickle but surely incredibly consequential. 2025 felt like another shift, to me, and 2026 so far &#8212; with AI producing <a href="https://www.anthropic.com/glasswing">genuine national security implications</a> and at the centre of <a href="https://en.wikipedia.org/wiki/Anthropic%E2%80%93United_States_Department_of_Defense_dispute">dirty political manoeuvring</a> &#8212; seems to suggest that both the training wheels and the gloves are off, as <a href="https://www.hyperdimensional.co/p/new-sages-unrivalled">Dean Ball recently put it</a>. 
It&#8217;s a little scary: <a href="https://www.oliversourbut.net/p/emergent-misaligned-outcomes">powerful and not altogether friendly forces</a> have turned their eye to the potential potency of emerging tech, and they<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a> may wrestle for it, even under the risk that they destroy much in the process or that the tech spills entirely out of their control.</p><h2>The world, changed</h2><p>We can be doing better! People can get curious, find out what&#8217;s what, consider stakes and what realistic paths we might prefer. Don&#8217;t make the mistake of &#8216;nowsight&#8217; bias &#8212; today&#8217;s AI are the least capable there will ever be! Take seriously where things might go, and notice if the conversation seems to miss something important that you understand well: it&#8217;s still early and the &#8216;experts&#8217; are mainly that by virtue of noticing the importance of AI a little sooner than everyone else<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a>. 
Let&#8217;s also <a href="https://www.oliversourbut.net/p/ai-for-human-reasoning-for-you">grab the new tech building blocks we have and bootstrap</a> the way we do foresight, collective intelligence, and coordination.</p><p>Don&#8217;t mistake me for naively assuming machines will blast through every bottleneck in short order. There are a lot of adaptability, dexterity, and generality bottlenecks between here and <a href="https://www.planned-obsolescence.org/p/self-sufficient-ai">self-sufficient machines</a>. Perhaps I&#8217;ll write something about that soon.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>(I intended to blog about it at the time, but&#8230; you know how it is with drafts.)</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p><a href="https://www.goodreads.com/quotes/664365-it-seems-probable-that-once-the-machine-thinking-method-had">Quoth Turing, some time in the 1950s</a>:</p><blockquote><p>once the machine thinking method had started, it would not take long to outstrip our feeble powers&#8230; At some stage therefore, we should have to expect the machines to take control.</p></blockquote><p>Even Turing was not the first to perceive that thinking machines could pose takeover hazards.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p><a href="https://www.oliversourbut.net/p/us-vs-china-vs-me">I&#8217;m not only (or even mainly) talking about countries</a>.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" 
href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>I&#8217;ve been bemused several times recently upon being referred to as an &#8216;expert&#8217;, that mythical breed.</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[Orders of magnitude: use semitones, not decibels]]></title><description><![CDATA[How I use a secret part of my brain for forbidden mathematics]]></description><link>https://www.oliversourbut.net/p/orders-of-magnitude-use-semitones</link><guid isPermaLink="false">https://www.oliversourbut.net/p/orders-of-magnitude-use-semitones</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Wed, 01 Apr 2026 10:21:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!g3Vl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8a9af0e-5490-488d-9ce1-ef94b96b730b_1150x492.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I'm going to teach you a secret. It's a secret known to few, a secret way of using parts of your brain <em>not meant for mathematics</em>... for mathematics. It's part of how I (sort of) do logarithms in my head. This is a nearly purposeless skill.</p><p>What's the growth rate? What's the doubling time? How many orders of magnitude bigger is it? How many years at this rate until it's quintupled?</p><p>All questions of ratios and scale.</p><p>Scale... 
hmm.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!g3Vl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8a9af0e-5490-488d-9ce1-ef94b96b730b_1150x492.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g3Vl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8a9af0e-5490-488d-9ce1-ef94b96b730b_1150x492.jpeg 424w, https://substackcdn.com/image/fetch/$s_!g3Vl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8a9af0e-5490-488d-9ce1-ef94b96b730b_1150x492.jpeg 848w, https://substackcdn.com/image/fetch/$s_!g3Vl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8a9af0e-5490-488d-9ce1-ef94b96b730b_1150x492.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!g3Vl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8a9af0e-5490-488d-9ce1-ef94b96b730b_1150x492.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!g3Vl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8a9af0e-5490-488d-9ce1-ef94b96b730b_1150x492.jpeg" width="1150" height="492" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f8a9af0e-5490-488d-9ce1-ef94b96b730b_1150x492.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:492,&quot;width&quot;:1150,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:55049,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.oliversourbut.net/i/192919127?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8a9af0e-5490-488d-9ce1-ef94b96b730b_1150x492.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!g3Vl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8a9af0e-5490-488d-9ce1-ef94b96b730b_1150x492.jpeg 424w, https://substackcdn.com/image/fetch/$s_!g3Vl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8a9af0e-5490-488d-9ce1-ef94b96b730b_1150x492.jpeg 848w, https://substackcdn.com/image/fetch/$s_!g3Vl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8a9af0e-5490-488d-9ce1-ef94b96b730b_1150x492.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!g3Vl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8a9af0e-5490-488d-9ce1-ef94b96b730b_1150x492.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>'Wait', you're thinking, 'let me check the date...'. Indeed. But please, stay with me for the logarithms.</p><h2>Musical intervals as ratios, and God's joke</h2><p>If you're a music nerd like me, you'll know that an octave (abbreviated 8ve), the fundamental musical interval, represents a doubling of vibration frequency. So if <a href="https://en.wikipedia.org/wiki/A440_(pitch_standard)">A440</a> is at 440Hz, then 220Hz and 880Hz are also 'A'. Our ears tend to hear this as 'the same note, only higher'.</p><p>That means the 'same' interval, an octave, corresponds to successively greater gaps in frequency. First a doubling, then a quadrupling, an octupling, and so on. 
Our perception, and musical notation, map the space of frequencies logarithmically.</p><p>You'll also know that a '<a href="https://en.wikipedia.org/wiki/Perfect_fifth">perfect fifth</a>' is a ratio of 3:2. A to the E above it, C# to the G# above it, etc. Consonance is <em>all about nice ratios</em>! (Ask Pythagoras.)</p><p>At least, the really sweet, in-tune fifths are this ratio. Because God is an absolute wheeze, you can keep moving in fifths (3:2) and octaves and get 'new notes' eleven times. That's where we get our Western scale from, originally (except it's <em>originally</em> originally Mesopotamian probably). The twelfth time ((3:2)^12) gets you to a ratio of roughly 129.7:1. That's <em>almost exactly</em> seven doublings, seven octaves (7 * 8ve)! That'd be 128:1. God's joke is in <a href="https://youtu.be/1DUZsQ2by2s?si=4fGwAq6dub-eyV2T&amp;t=96">that roughly 1% margin</a>, and musicians have been arguing about what to do about it for centuries. <a href="https://en.wikipedia.org/wiki/Musical_temperament">It's a whole thing</a>.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p><p>Cutting a long story short, that leaves us with twelve different notes dividing up the octave. 
They 'repeat', with 'the same' note again and again at either higher or lower octaves (a full doubling of frequency).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!A0Vd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fece88ba8-042f-477e-8704-f581052233ce_1615x430.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!A0Vd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fece88ba8-042f-477e-8704-f581052233ce_1615x430.jpeg 424w, https://substackcdn.com/image/fetch/$s_!A0Vd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fece88ba8-042f-477e-8704-f581052233ce_1615x430.jpeg 848w, https://substackcdn.com/image/fetch/$s_!A0Vd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fece88ba8-042f-477e-8704-f581052233ce_1615x430.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!A0Vd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fece88ba8-042f-477e-8704-f581052233ce_1615x430.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!A0Vd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fece88ba8-042f-477e-8704-f581052233ce_1615x430.jpeg" width="1456" height="388" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ece88ba8-042f-477e-8704-f581052233ce_1615x430.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:388,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:77422,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.oliversourbut.net/i/192919127?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fece88ba8-042f-477e-8704-f581052233ce_1615x430.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!A0Vd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fece88ba8-042f-477e-8704-f581052233ce_1615x430.jpeg 424w, https://substackcdn.com/image/fetch/$s_!A0Vd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fece88ba8-042f-477e-8704-f581052233ce_1615x430.jpeg 848w, https://substackcdn.com/image/fetch/$s_!A0Vd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fece88ba8-042f-477e-8704-f581052233ce_1615x430.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!A0Vd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fece88ba8-042f-477e-8704-f581052233ce_1615x430.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In between octaves, those twelve divisions need to 'add up to' a doubling. 
For reasons, two steps (a sixth of the overall scale) is referred to as a 'tone', and a single step (a twelfth of the scale) is thus a 'semitone'. That means each semitone corresponds to a ratio of the twelfth root of two. (It's about 1.06, i.e. a ratio increase of about 6%.) The full scale as shown above is called 'chromatic' (because it has every 'colour'...).</p><p>This means that <strong>neat fractional powers of two map cleanly onto musical intervals</strong>. God was generous in giving twelve so many factors, so we have musical intervals for the square, cube, fourth, sixth, and twelfth roots of two which come for free.</p><p>So far, no logarithms. But we have <em>musical powers</em> of two: give me a fraction and I can tell you the musical interval. That means we also have <em>musical logarithms</em>: give me a musical interval and I can tell you the power of two! E.g. C to G# is eight semitones. So</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\log_2(\\mathrm C: \\mathrm G\\sharp) = \\frac 8 {12} = \\frac 2 3&quot;,&quot;id&quot;:&quot;QXXNGYWLEG&quot;}" data-component-name="LatexBlockToDOM"></div><p>Musical logarithms? What is he talking about? Surely this is pointless. Yes, it is! Hold on!</p><h2>Harmonic series</h2><p>If you're a <em>brass music</em> nerd like me, you'll know that the 'overtones' of most natural vibrations correspond to the 'harmonic series' (no, not <em><a href="https://en.wikipedia.org/wiki/Harmonic_series_(mathematics)">that</a></em><a href="https://en.wikipedia.org/wiki/Harmonic_series_(mathematics)"> harmonic series</a>, the <em><a href="https://en.wikipedia.org/wiki/Harmonic_series_(music)">actually harmonic</a></em><a href="https://en.wikipedia.org/wiki/Harmonic_series_(music)"> harmonic series</a>), which are the different pitches you can get a big metal tube to vibrate at if you give it <a href="https://www.youtube.com/watch?v=wM1vOAz0_Gc">the right encouragement</a>. 
Incidentally this is how brass players get dozens of different notes out of an instrument having (usually) only three valves<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>.</p><p>This harmonic series is generated by lovely integer ratios! Why? The physics of oscillators. Integer multiples are the only frequencies which can support a standing wave on the same vibrating object (air column, string, membrane).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uik0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d3e41e4-ccea-46ec-9588-ffb8de8793d2_1000x1110.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uik0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d3e41e4-ccea-46ec-9588-ffb8de8793d2_1000x1110.jpeg 424w, https://substackcdn.com/image/fetch/$s_!uik0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d3e41e4-ccea-46ec-9588-ffb8de8793d2_1000x1110.jpeg 848w, https://substackcdn.com/image/fetch/$s_!uik0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d3e41e4-ccea-46ec-9588-ffb8de8793d2_1000x1110.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!uik0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d3e41e4-ccea-46ec-9588-ffb8de8793d2_1000x1110.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!uik0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d3e41e4-ccea-46ec-9588-ffb8de8793d2_1000x1110.jpeg" width="1000" height="1110" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4d3e41e4-ccea-46ec-9588-ffb8de8793d2_1000x1110.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1110,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:112707,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.oliversourbut.net/i/192919127?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d3e41e4-ccea-46ec-9588-ffb8de8793d2_1000x1110.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uik0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d3e41e4-ccea-46ec-9588-ffb8de8793d2_1000x1110.jpeg 424w, https://substackcdn.com/image/fetch/$s_!uik0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d3e41e4-ccea-46ec-9588-ffb8de8793d2_1000x1110.jpeg 848w, https://substackcdn.com/image/fetch/$s_!uik0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d3e41e4-ccea-46ec-9588-ffb8de8793d2_1000x1110.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!uik0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d3e41e4-ccea-46ec-9588-ffb8de8793d2_1000x1110.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Brass players spend hours and hours sliding and jumping between these harmonics as a matter of sheer necessity. 
Only three valves!<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a> So we know them by heart, by fingers, and by ears.</p><h2>Combining the harmonic series with the chromatic scale: magic</h2><p>So we have <em>integer multiples</em>, the harmonic series, laid over a <em>fundamentally logarithmic scale</em>, the chromatic scale consisting of twelve semitones.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EfoP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b57db8f-c2aa-45fe-87a3-6e2819cf8051_2094x285.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EfoP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b57db8f-c2aa-45fe-87a3-6e2819cf8051_2094x285.jpeg 424w, https://substackcdn.com/image/fetch/$s_!EfoP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b57db8f-c2aa-45fe-87a3-6e2819cf8051_2094x285.jpeg 848w, https://substackcdn.com/image/fetch/$s_!EfoP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b57db8f-c2aa-45fe-87a3-6e2819cf8051_2094x285.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!EfoP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b57db8f-c2aa-45fe-87a3-6e2819cf8051_2094x285.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!EfoP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b57db8f-c2aa-45fe-87a3-6e2819cf8051_2094x285.jpeg" width="1456" height="198" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3b57db8f-c2aa-45fe-87a3-6e2819cf8051_2094x285.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:198,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:59244,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.oliversourbut.net/i/192919127?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b57db8f-c2aa-45fe-87a3-6e2819cf8051_2094x285.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EfoP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b57db8f-c2aa-45fe-87a3-6e2819cf8051_2094x285.jpeg 424w, https://substackcdn.com/image/fetch/$s_!EfoP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b57db8f-c2aa-45fe-87a3-6e2819cf8051_2094x285.jpeg 848w, https://substackcdn.com/image/fetch/$s_!EfoP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b57db8f-c2aa-45fe-87a3-6e2819cf8051_2094x285.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!EfoP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b57db8f-c2aa-45fe-87a3-6e2819cf8051_2094x285.jpeg 1456w" 
sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><em>Numbers above the notes correspond to small adjustments vs the equally-spaced semitones which are usually used today to deal with God's joke. Ignore them if you don't care about small percentage errors. This is the harmonic series on C; you can have a series on any starting note with the same intervals</em>.</p><p>Here's the magic trick. Now we can go from arbitrary ratios to musical intervals!</p><p>Start with an easy one, 1.25. That's a ratio of 5:4. The fifth harmonic is E (+2 8ve). The fourth is C (+2 8ve). The octaves cancel. That's an interval E:C, or four semitones. So 1.25 <em>is four semitones</em>. We already know the 'musical logarithm' of four semitones: it's 4/12 = 1/3. Check on a calculator: log_2(1.25) = 0.32193&#8230;. I promised close, not perfect!</p><p>A slightly trickier one, 1.8. That's a ratio of 9:5. The ninth harmonic is D (+3 8ve), and the fifth is E (+2 8ve). The octaves partly cancel (leaving a single octave). The interval D:E is <em>minus two semitones</em>. Taken off the residual octave, that leaves ten semitones. So log_2(1.8) ~= 10/12 = 5/6. Calculator check: log_2(1.8) = 0.848&#8230;. Not bad!</p><p>It turns out that the musical harmonic series is secretly a mini table of base 2 logarithms.</p><h2>Base 10, if we really have to</h2><p>The unit that mainstream sheeple often use for fractional logarithms is the <a href="https://en.wikipedia.org/wiki/Decibel">decibel</a>. A decibel divides a base ten order of magnitude in ten. So ten decibels is a decupling, twenty is a hundredfold, and so on.</p><p>Stated similarly, a semitone divides a base two order of magnitude in twelve.</p><p>In another cosmic whimsy, 2^10 = 1024 ~= 1000 = 10^3. So 120 semitones are essentially equal to 30 decibels, for an easy exchange rate of four semitones per decibel.</p><h2>What</h2><p>Well, look. It's fun, and it gets me logarithms to a pretty good approximation. 
It's good enough for <s>jazz</s> Fermi estimation, as they say. Who is this even good for? I maintain that the intersection between music and mathematics nerds is surprisingly well populated. If that's you, you're welcome. If not, I'm pretty unsure how easy it is to get the harmonic series installed in your brain. Maybe it's only available to the warped few who train in childhood.</p><p>There are some other fun tricks with powers and logarithms of two. For example, if you know your <a href="https://en.wikipedia.org/wiki/Binary_number">binary place values</a>, you can figure out logarithms of very big numbers (and the 2^10 ~= 10^3 trick comes in handy here too).</p><p>There's also a '<a href="https://en.wikipedia.org/wiki/Rule_of_72">rule of 72</a>' which helps when dealing with small percentage growth rates and doubling times.</p><p>I aesthetically like this neat division of doublings into twelve parts, and it's fun to invoke musical intuitions that really have no right to help with mathematics.</p><p>You might complain that twelfths are faffy. Who uses twelfths anyway? Everyone everywhere has used decimal for goodness' sake! 
Well, I have <a href="https://en.wikipedia.org/wiki/Sexagesimal">something</a> <a href="https://www.scientificamerican.com/blog/roots-of-unity/the-joy-of-sexagesimal-floating-point-arithmetic/">else</a> to share with you...</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p><a href="https://en.wikipedia.org/wiki/Equal_temperament">Usually nowadays</a> we squish all the fifths a tiny bit so that when stacked up they get to that delicious 128:1.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Three valves independently up or down is a total of eight configurations. 
Because the third valve is usually set to be redundant with the combination of the first two (which aids fluent finger movement), there are usually only seven practically-distinguishable combinations.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>Other wind players, who have the benefit of many more, but not infinitely many keys and buttons, often encounter one or two of these harmonics.</p></div></div>]]></content:encoded></item><item><title><![CDATA[AI for Human Reasoning for You]]></title><description><![CDATA[Let's harness AI building blocks to build civilisation-invigorating tools]]></description><link>https://www.oliversourbut.net/p/ai-for-human-reasoning-for-you</link><guid isPermaLink="false">https://www.oliversourbut.net/p/ai-for-human-reasoning-for-you</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Tue, 03 Feb 2026 13:22:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!eqDq!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6faaaa8-072a-4eb5-9ce9-b6b378c8044a_1098x1098.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Today&#8217;s humanity faces many high stakes and even existential challenges; many of the largest are generated or exacerbated by AI. Meanwhile, humans individually and humanity collectively appear distressingly underequipped.</p><p>Lots of folks naturally recognise that this implies a general strategy: make humans individually &#8212; and humanity collectively &#8212; better able to solve problems. Very good! 
(Complementary strategies look like: make progress directly, raise awareness of the challenges, recruit problem solvers, &#8230;)</p><p>One popular approach is to &#8216;raise the sanity waterline&#8217; in the most oldschool and traditional way: have a community of best practice, exemplify and proselytise, make <em>people</em> wiser one by one and <em>society</em> wiser by virtue of that. There&#8217;ve been some recent successes, not least the existence of communities and forums like <a href="https://www.lesswrong.com/">LessWrong</a> and <a href="https://forum.effectivealtruism.org/">Effective Altruism</a>, and some older philosophies and movements.</p><p>Another popular approach is to imagine augmenting ourselves in the most futuristic and radical ways: genetic engineering, selective breeding, brain-augmenting implants, brain emulation. Go for it, I suppose (mindful of the potential backfires and hazards). But these probably won&#8217;t pan out on what look like the necessary timelines.</p><p>There is a middle ground! Use tech to uplift ourselves, yes &#8212; but don&#8217;t wait for medical marvels and wholesale self-reauthorship. Just use the building blocks we have, anticipate the pieces we might have soon, and address our individual and collective shortcomings one low-hanging fruit at a time.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p><p>The most exciting part is that we&#8217;ve got some nifty new building blocks to play with: big data, big compute, ML, and (most novel of all) foundation models and limited agentic AI.</p><h2>How to generate useful ideas in human reasoning</h2><p>One place people fall down here is getting locked into asking: &#8216;<em>OK, what can I usefully ask this AI to do?</em>&#8217;. Sometimes this is helpful. 
But usually it&#8217;s missing the majority of the design space: agentic form factors are only a very narrow slice of what we can do with technology, and for many purposes they&#8217;re not even especially desirable.</p><p>Think about human reasoning. &#8216;Human&#8217; as in individuals, groups, teams, society, humanity at large. &#8216;Reasoning&#8217; as in the full decision-making cycle, from sensing and understanding through to planning and acting, including <em>acting together</em>.</p><p>I like to first ask: &#8216;What human reasoning activities are in bad shape?&#8217;</p><ul><li><p><a href="https://en.wikipedia.org/wiki/OODA_loop">OODA</a> is one good frame:  </p><ul><li><p>For a given important (type of) decision, what are people observing?  </p></li><li><p>How are they orienting and deciding?  </p></li><li><p>What actions do they have available and do they know how to do them well?  </p></li><li><p>What about the case of teams, groups, institutions: how do their OODAs work (and how do they fail)?  </p></li></ul></li><li><p>Also think about <em>development</em>: how do individuals learn and grow? What about groups and communities, how do they form, grow, connect?  </p></li><li><p>In foresight,  </p><ul><li><p>What features are we even paying attention to in the first place?  </p></li><li><p>What prospects are under consideration?  </p></li><li><p>What affordances are we aware of?  </p></li><li><p>How are we strategically creating sensing opportunities and the means to adapt plans?  </p></li><li><p>How do our forecasts achieve precision and calibration?  </p></li></ul></li><li><p>In epistemics, think about the message-passing nature of most human knowledge processes.  </p><ul><li><p>How do we assess the nodes (communicators)?  </p></li><li><p>How do we assess, digest, and compile the messages (claims, evidence, proposals, &#8230;)?  
</p></li><li><p>How do we understand and manage the structure of the network itself (communication relationships, broadcasts and other topologies, &#8230;)?  </p></li><li><p>What about the traffic (message rates, density distribution, repeated and relayed transmissions, &#8230;)?  </p></li><li><p>What messages <em>ought</em> to be routed where, when and on behalf of whom?  </p></li></ul></li><li><p>In coordination, what are the conditions for success?  </p><ul><li><p>We need to find or recognise potential counterparties.  </p></li><li><p>We might need the charters, norms, or institutions to condition and frame interaction productively &#8212; ones which don&#8217;t fail or fall to corruption or capture.  </p></li><li><p>We need to surface enough mutually-compatible intent or outcome preference.  </p></li><li><p>Our ensembled group wisdom might be a necessary source of insight or agility.  </p></li><li><p>We need to survive the tug of war of negotiation (which can dissolve into antagonism, even when there&#8217;s common knowledge of win-win possibilities).  </p></li><li><p>Means of verification and enforcement may be needed to access good options.</p></li></ul></li></ul><p>Think of a particular audience with either the scale or the special influence to make a difference (this can include &#8216;the general public&#8217;), and the deficits they have in these reasoning activities. Now ask: &#8216;What kinds of software<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> might help and encourage people to do those better?&#8217;.</p><ul><li><p>Is there an edge to be gained by unlocking big (or even medium) data (which can often be more living and queryable than ever before thanks to LMs)?  </p></li><li><p>Can large amounts of clerical labour (again LMs) per capita make something newly feasible?  </p></li><li><p>Can big compute and simulation (including multi persona simulation: LMs again!) 
drive better understanding of an important dynamic?  </p></li><li><p>Can extensive background exploration, search, or &#8216;brainstorming&#8217; by AI surface important opportunities or considerations?  </p></li><li><p>Can always-on, flexibly-semantically-sensitive sensing and monitoring bring attention where it&#8217;s needed faster than before (or at all)?  </p></li><li><p>Could facilitation and translation bring forth, and synergise, the best array of human capabilities in a given context?  </p></li><li><p>Could software&#8217;s repeatability, auditability, and privacy (in principle), combined with the context and semantic sensitivity of AI, unlock new frontiers of trustable human scaffolding?  </p></li><li><p>&#8230;</p></li></ul><h3>Finding flaws and avoiding backfire</h3><p>Think seriously about backfire: we don&#8217;t want to differentially enable bad human actors or rogue AI to reason and coordinate! As Richard Rumelt, author of <em>Good Strategy/Bad Strategy</em>, observes,</p><blockquote><p>The idea that coordination, by itself, can be a source of advantage is a very deep principle.</p></blockquote><p>Coordination&#8217;s dark side is collusion, including cartels, oligarchy, and concentration of power, which in imaginable extreme cases could cut out most or even all humans.</p><p>Similarly, epistemic advantage (in foresight and strategy, say) can be parlayed into resource or influence advantage. 
If those can be converted in turn into greater epistemic advantage (by employing compute and position for epistemic attacks or in further private epistemic advancement) without commensurate counterweights or defences, this could be quite problematic.</p><p>Part of it is about choosing distribution strategies which reduce misuse surface area (or provide antidotes), and part of it is about preferring tech which asymmetrically supports (and perhaps encourages) &#8216;good&#8217; use and behaviour.</p><h2>Do it</h2><p><a href="https://aiforhumanreasoning.com/">FLF&#8217;s fellows</a>, and <a href="https://www.oliversourbut.net/p/a-full-epistemic-stack">I and others</a> have been <a href="https://www.forethought.org/research/design-sketches-collective-epistemics">doing some</a> of <a href="https://www.forethought.org/research/design-sketches-angels-on-the-shoulder">this exploration</a> recently. Stay tuned for more. Meanwhile, join in! 
<a href="https://www.forethought.org/research/the-first-type-of-transformative-ai">We&#8217;re early in a critical period</a> where much is up for grabs and what we build now <em>might</em> help shape and inform the choices humanity makes about its future (or whether it makes much choice at all). Try things, see what kinds of tools earn the attention and adoption that matters, and share what you learn. Consider principles to apply, especially for minimising backfire risks, and share particular considerations for or against certain kinds of tech and audience targets.</p><p><em>Thanks to Owen Cotton-Barratt and Ben Goldhaber for helpful comments, and to Lizka Vaintrob for recent relevant conversations</em></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>A close relative of this strategy is <a href="https://www.lesswrong.com/posts/bxt7uCiHam4QXrQAA/cyborgism">cyborgism</a>. I might contrast what I&#8217;m centrally describing as being more outward-looking, asking how we can uplift the most important sensemaking and wisdom apparatus of <em>humanity in general</em>, whereas cyborgism maybe looks centrally more like a bet on <em>becoming</em> the uplifted paragons (optionally thence, and thereby, saving the world). I&#8217;d say these are complementary on the whole.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>This is better than asking &#8216;What kinds of AI&#8230;&#8217;. Software is the general, capability-unlocking and -enhancing artefact. 
AI components and form-factors are novel, powerful, sometimes indispensable building blocks in our inventory to compose software capabilities out of.</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[The First Type of Transformative AI?]]></title><description><![CDATA[Ask 'what will AI transform?', not 'when?'... and what choices do we have in that?]]></description><link>https://www.oliversourbut.net/p/the-first-type-of-transformative</link><guid isPermaLink="false">https://www.oliversourbut.net/p/the-first-type-of-transformative</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Tue, 06 Jan 2026 17:31:05 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ak5T!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a3f0a2-bdc5-46e3-89e0-f939c455db7c_2624x1691.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I recently contributed to a discussion of <a href="https://www.forethought.org/research/the-first-type-of-transformative-ai">the first type of transformative AI</a> with Owen Cotton-Barratt and Lizka Vaintrob. It&#8217;s part of a series those two (primarily, with some input from me and others) are working on, which asks, expanding on <a href="https://www.forethought.org/research/ai-tools-for-existential-security">their agenda from last year</a>:</p><p><em>AI is not just one, big, singular thing. 
What are the ways we can bring forward the beneficial possibilities, while delaying or defending against the harmful ones?</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ak5T!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a3f0a2-bdc5-46e3-89e0-f939c455db7c_2624x1691.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ak5T!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a3f0a2-bdc5-46e3-89e0-f939c455db7c_2624x1691.webp 424w, https://substackcdn.com/image/fetch/$s_!ak5T!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a3f0a2-bdc5-46e3-89e0-f939c455db7c_2624x1691.webp 848w, https://substackcdn.com/image/fetch/$s_!ak5T!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a3f0a2-bdc5-46e3-89e0-f939c455db7c_2624x1691.webp 1272w, https://substackcdn.com/image/fetch/$s_!ak5T!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a3f0a2-bdc5-46e3-89e0-f939c455db7c_2624x1691.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ak5T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a3f0a2-bdc5-46e3-89e0-f939c455db7c_2624x1691.webp" width="1456" height="938" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/87a3f0a2-bdc5-46e3-89e0-f939c455db7c_2624x1691.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:938,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Hand-drawn graph of AI-driven change over time, rising gradually then accelerating before AGI. Highlights uncertainty about how much the world changes before AGI and why the sequence of pre-AGI transformations matters.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Hand-drawn graph of AI-driven change over time, rising gradually then accelerating before AGI. Highlights uncertainty about how much the world changes before AGI and why the sequence of pre-AGI transformations matters." title="Hand-drawn graph of AI-driven change over time, rising gradually then accelerating before AGI. Highlights uncertainty about how much the world changes before AGI and why the sequence of pre-AGI transformations matters." 
srcset="https://substackcdn.com/image/fetch/$s_!ak5T!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a3f0a2-bdc5-46e3-89e0-f939c455db7c_2624x1691.webp 424w, https://substackcdn.com/image/fetch/$s_!ak5T!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a3f0a2-bdc5-46e3-89e0-f939c455db7c_2624x1691.webp 848w, https://substackcdn.com/image/fetch/$s_!ak5T!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a3f0a2-bdc5-46e3-89e0-f939c455db7c_2624x1691.webp 1272w, https://substackcdn.com/image/fetch/$s_!ak5T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a3f0a2-bdc5-46e3-89e0-f939c455db7c_2624x1691.webp 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a><figcaption class="image-caption">AI tools can <em>already</em> produce large changes, and the potentials there will only increase. &#8216;AGI&#8217; is a moving goalpost, but judicious work now can make sure that society is better positioned to deal with later developments and risks (e.g. by avoiding or defending).</figcaption></figure></div><p>As I repeatedly emphasise to anyone who&#8217;ll listen: it&#8217;s <em>never</em> been just a dichotomy between &#8216;yes, good, more AI please&#8217; and &#8216;no, bad, less AI thank you&#8217; &#8212; and it&#8217;s not even a case of just making sure &#8216;the AIs&#8217; are good (though this helps). <em>How</em> the tools and technological building blocks at our disposal are integrated into applications, workflows, and use-cases, is just as important.</p><p>And the tools, products, and systems we could develop now <em>shape</em> the context for subsequent developments, including by:</p><ul><li><p>equipping people to better predict and understand their options</p></li><li><p>enabling people to coordinate better around preferred possibilities (which might otherwise be difficult due to mismatched incentives or race dynamics)</p></li><li><p>giving the tools to defuse or defend against hazardous developments, or their precursors</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tmTl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e4c09e-ca97-431b-8c86-e9425e80ab81_1710x785.webp" data-component-name="Image2ToDOM"><div 
class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tmTl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e4c09e-ca97-431b-8c86-e9425e80ab81_1710x785.webp 424w, https://substackcdn.com/image/fetch/$s_!tmTl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e4c09e-ca97-431b-8c86-e9425e80ab81_1710x785.webp 848w, https://substackcdn.com/image/fetch/$s_!tmTl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e4c09e-ca97-431b-8c86-e9425e80ab81_1710x785.webp 1272w, https://substackcdn.com/image/fetch/$s_!tmTl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e4c09e-ca97-431b-8c86-e9425e80ab81_1710x785.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tmTl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e4c09e-ca97-431b-8c86-e9425e80ab81_1710x785.webp" width="1456" height="668" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e2e4c09e-ca97-431b-8c86-e9425e80ab81_1710x785.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:668,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Hand-drawn diagram comparing two strategies for early AI transitions: improving how individual transformations go, or influencing their order. 
Notes that earlier transitions are more predictable and more neglected than later ones.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Hand-drawn diagram comparing two strategies for early AI transitions: improving how individual transformations go, or influencing their order. Notes that earlier transitions are more predictable and more neglected than later ones." title="Hand-drawn diagram comparing two strategies for early AI transitions: improving how individual transformations go, or influencing their order. Notes that earlier transitions are more predictable and more neglected than later ones." srcset="https://substackcdn.com/image/fetch/$s_!tmTl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e4c09e-ca97-431b-8c86-e9425e80ab81_1710x785.webp 424w, https://substackcdn.com/image/fetch/$s_!tmTl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e4c09e-ca97-431b-8c86-e9425e80ab81_1710x785.webp 848w, https://substackcdn.com/image/fetch/$s_!tmTl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e4c09e-ca97-431b-8c86-e9425e80ab81_1710x785.webp 1272w, https://substackcdn.com/image/fetch/$s_!tmTl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e4c09e-ca97-431b-8c86-e9425e80ab81_1710x785.webp 1456w" sizes="100vw"></picture>
</div></a><figcaption class="image-caption">When considering &#8216;big transformations&#8217; from technology, among the strategies we can attempt if we want things to go better for society are: change the &#8216;order&#8217; these transformations arrive in (which might even prevent or reshape later transitions), or improve the way particularly important transitions go.</figcaption></figure></div><p>For technologists, futurists, philanthropists, legislators, experts, and other members of society trying to make tech progress go well, paying attention to <em>which</em> effects happen, in what order &#8212; and what our options are for choosing wisely there &#8212; looks like a really promising, and neglected way of reducing large-scale risk and bringing about huge benefits.</p><p>You can read <a 
href="https://www.forethought.org/research/the-first-type-of-transformative-ai">our few pages of fuller discussion</a> for more of our thoughts on some scenarios we think are worth considering, including intelligence explosion, turbocharged economy, and epistemic uplift.</p>]]></content:encoded></item><item><title><![CDATA[A Full Epistemic Stack]]></title><description><![CDATA[Knowledge Commons for the 21st Century]]></description><link>https://www.oliversourbut.net/p/a-full-epistemic-stack</link><guid isPermaLink="false">https://www.oliversourbut.net/p/a-full-epistemic-stack</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Fri, 19 Dec 2025 22:35:14 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!0aqH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c0c063-4a74-4f21-a4c0-4929f36e72fc_1389x883.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>We&#8217;re writing this in our personal capacity. 
While our work at the <a href="https://www.flf.org/">Future of Life Foundation</a> has recently focused on this topic and informs our thinking here, this specific presentation of our views is our own.</em></p><p>Knowledge is integral to living life well, at all scales:</p><ul><li><p><strong>Individuals manage their life choices</strong>: health, career, investment, and others, on the basis of what they understand about themselves and their environments.</p></li><li><p><strong>Institutions and governments (ideally) regulate</strong> economies, provide security, and uphold the conditions for flourishing under their jurisdictions, but only if they can make requisite sense of the systems involved.</p></li><li><p><strong>Technologists and scientists push the boundaries</strong> of the known, generating insights and techniques judged valuable by combining a vision for what is possible with a conception of what is desirable (or, as a proxy, demanded).</p></li><li><p>More broadly, <strong>societies negotiate their paths forward</strong> through discourse which rests on some reliable, broadly shared access to a body of knowledge and situational awareness about the biggest stakes, people&#8217;s varied interests in them, and our shared prospects.</p><ul><li><p>(We&#8217;re especially interested in how societies and humanity as a whole can navigate the many challenges of the 21st century, most immediately AI, automation, and biotechnology.)</p></li></ul></li></ul><p>Meanwhile, dysfunction in knowledge-generating and -distributing functions of society means that knowledge, and especially <em>common</em> knowledge, often looks fragile<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>. 
Some blame <a href="https://www.amazon.co.uk/Republic-Divided-Democracy-Social-Media/dp/0691175519">social media</a> (platform), some <a href="https://en.wikipedia.org/wiki/Post-truth_politics">cynical political elites</a> (supply), and others the <a href="https://www.conspicuouscognition.com/p/is-social-media-destroying-democracyor">deplorable common people</a> (demand).</p><p>But reliable knowledge underpins news, history, and science alike. What resources and infrastructure would a society <em>really nailing this</em> have available?</p><p>Among other things, we think its communication and knowledge infrastructure would make it <em>easy</em> for people to learn, check, compare, debate, and build in ways which compound and reward good faith. This means tech, and we think the technical prerequisites, the need, and the vision for a <em><strong>full epistemic stack</strong></em><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> are coming together right now. Some pioneering practitioners and researchers <a href="https://sarahconstantin.substack.com/p/tech-for-thinking">are already</a> making some progress. We&#8217;d like to <a href="https://www.forethought.org/research/ai-tools-for-existential-security">nurture and welcome it along</a>.</p><p>In this short series, we&#8217;ll outline some ways we&#8217;re thinking about the space of tools and foundations which can raise the overall epistemic waterline and enable us all to make more sense. 
In this first post, we introduce frames for mapping the space &#8212;<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a> different layers for info gathering, structuring into claims and evidence, and assessment &#8212; and potential end applications that would utilize the information.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0aqH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c0c063-4a74-4f21-a4c0-4929f36e72fc_1389x883.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0aqH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c0c063-4a74-4f21-a4c0-4929f36e72fc_1389x883.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0aqH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c0c063-4a74-4f21-a4c0-4929f36e72fc_1389x883.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0aqH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c0c063-4a74-4f21-a4c0-4929f36e72fc_1389x883.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0aqH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c0c063-4a74-4f21-a4c0-4929f36e72fc_1389x883.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0aqH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c0c063-4a74-4f21-a4c0-4929f36e72fc_1389x883.jpeg" width="1389" height="883" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/95c0c063-4a74-4f21-a4c0-4929f36e72fc_1389x883.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:883,&quot;width&quot;:1389,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:361988,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0aqH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c0c063-4a74-4f21-a4c0-4929f36e72fc_1389x883.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0aqH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c0c063-4a74-4f21-a4c0-4929f36e72fc_1389x883.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0aqH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c0c063-4a74-4f21-a4c0-4929f36e72fc_1389x883.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0aqH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c0c063-4a74-4f21-a4c0-4929f36e72fc_1389x883.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><h2>A full what?</h2><p>A full epistemic stack. <em><a href="https://dictionary.cambridge.org/dictionary/english/epistemic">Epistemic</a></em> as in getting (and sharing) knowledge. <em>Full stack</em> as in all of the technology necessary to support that process, in all its glory.</p><p>What&#8217;s involved in gathering information and forming views about our world? Humans aren&#8217;t, primarily, isolated observers. Ever since the Sumerians and their <a href="https://en.wikipedia.org/wiki/Complaint_tablet_to_Ea-n%C4%81%E1%B9%A3ir">written customer complaints</a><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a>, humans have received information about much of <s>their</s> our world <em>from other humans</em>, for better or worse. 
We sophisticated modern beings consume information diets transmitted across unprecedented distances in space, time, and network scale.</p><p>With an accelerating pace of technological change and with potential <a href="https://web.archive.org/web/20251217044900/https://www.merriam-webster.com/wordplay/word-of-the-year">information overload at machine speeds</a>, we will need to <a href="https://aiforhumanreasoning.com/">improve our collective intelligence game to keep up</a> with the promise and perils of the 21st century.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8_iY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce3e44c8-4119-42fb-8c90-3830a3601fcb_1021x444.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8_iY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce3e44c8-4119-42fb-8c90-3830a3601fcb_1021x444.png 424w, https://substackcdn.com/image/fetch/$s_!8_iY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce3e44c8-4119-42fb-8c90-3830a3601fcb_1021x444.png 848w, https://substackcdn.com/image/fetch/$s_!8_iY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce3e44c8-4119-42fb-8c90-3830a3601fcb_1021x444.png 1272w, https://substackcdn.com/image/fetch/$s_!8_iY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce3e44c8-4119-42fb-8c90-3830a3601fcb_1021x444.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!8_iY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce3e44c8-4119-42fb-8c90-3830a3601fcb_1021x444.png" width="1021" height="444" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ce3e44c8-4119-42fb-8c90-3830a3601fcb_1021x444.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:444,&quot;width&quot;:1021,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:766470,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8_iY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce3e44c8-4119-42fb-8c90-3830a3601fcb_1021x444.png 424w, https://substackcdn.com/image/fetch/$s_!8_iY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce3e44c8-4119-42fb-8c90-3830a3601fcb_1021x444.png 848w, https://substackcdn.com/image/fetch/$s_!8_iY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce3e44c8-4119-42fb-8c90-3830a3601fcb_1021x444.png 1272w, https://substackcdn.com/image/fetch/$s_!8_iY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce3e44c8-4119-42fb-8c90-3830a3601fcb_1021x444.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>Imagine an upgrade.</strong> People faced with news articles, social media posts, research papers, chatbot responses, and so on can trivially trace their complete epistemic origins &#8212; links, citations, citations of citations, original data sources, methodologies &#8212; as well as helpful context (especially useful responses, alternative positions, and representative supporting or conflicting evidence). That&#8217;s a lot, so perhaps more realistically, most of the time, people don&#8217;t bother&#8230; but the facility is there, and everyone knows everyone knows it. More importantly, everyone knows <em>everyone&#8217;s AI assistants</em> know it (and we know those are far less lazy)! 
So the waterline of information trustworthiness and good faith discourse is raised, for good. Importantly, humans are still very much in the loop &#8212; to borrow a phrase from Audrey Tang, we might even say <em><a href="https://www.pioneerspost.com/news-views/20251106/we-are-already-the-super-intelligence-we-are-looking-audrey-tang-and-others-big">machines</a></em><a href="https://www.pioneerspost.com/news-views/20251106/we-are-already-the-super-intelligence-we-are-looking-audrey-tang-and-others-big"> are in the </a><em><a href="https://www.pioneerspost.com/news-views/20251106/we-are-already-the-super-intelligence-we-are-looking-audrey-tang-and-others-big">human</a></em><a href="https://www.pioneerspost.com/news-views/20251106/we-are-already-the-super-intelligence-we-are-looking-audrey-tang-and-others-big"> loop</a>.</p><p>Some pieces of this are already practical. Others will be a stretch with careful scaffolding and current-generation AI. Some might be just out of reach without general model improvements&#8230; but we think they&#8217;re all close: 2026 could be the year this starts to get real traction.</p><p>Does this change (or save) the world on its own? Of course not. In fact we have a long list of cautionary tales of premature and overambitious epistemic tech projects which achieved very few of their aims: the biggest challenge is plausibly distribution and uptake. (We will write something more about that later in this series.) And sensemaking alone isn&#8217;t sufficient! &#8212; will and creativity and the means to coordinate sufficiently at the relevant scale are essential complements. 
But there&#8217;s significant and robust value to improving everyone&#8217;s ability to reason clearly about the world, and we do think <a href="https://xkcd.com/927/">this time can be different</a>.</p><h2>Layers of a foundational protocol</h2><p><strong>Considering the dynamic message-passing network of human information processing</strong>, we see various possible hooks for communicator-, platform-, network-, and information-focused tech applications which could work together to improve our collective intelligence.</p><p>We&#8217;ll briefly discuss some <em>foundational information-focused layers</em> together with <em>user experience (UX)</em> and <em>tools</em> which can utilise the influx of cheap clerical labour from LMs, combined with intermittent judgement from humans, to make it smoother and easier for us all to make sense.</p><p>All of these pieces stand somewhat alone &#8212; a part of our vision is an interoperable and extensible suite &#8212; but we think implementations of some foundations have enough synergy that it&#8217;s worth thinking of them as a suite. We&#8217;ll outline where we think synergies are particularly strong. 
In later posts we&#8217;ll look at some specific technologies and examples of groups already prototyping them; for now we&#8217;re painting in broad strokes some goals we see for each part of the stack.</p><h3>Ingestion: observations, data, and identity</h3><p>Ultimately grounding all empirical knowledge is some collection of observations&#8230; but most people rely on second-hand (and even more indirect) observation. Consider the climate in Hawaii. Most people aren&#8217;t in a position to directly observe that, but many have some degree of stake in nonetheless knowing about it or having the affordance to know about it.</p><p>For some topics, &#8216;<a href="https://knowyourmeme.com/memes/source-i-made-it-up">source? Trust me bro</a>,&#8217; is sufficient: what reason does the source have to lie, and does it matter much anyway? Other times, for higher-stakes applications, it&#8217;s better to have more confirmation, ranging from a <a href="https://en.wikipedia.org/wiki/Affidavit">staked reputation</a> for honesty to a cryptographic guarantee<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a>.</p><p>Associating artefacts with metadata about origin and authorship (and further guarantees if available) can be a multiplier on downstream knowledge activities, such as tracing the provenance of claims and sources, or evaluating track records for honesty. Thanks to AI, precise formats matter less, and tracking down this information can be much more tractable. This tractability can drive the critical mass needed to start a virtuous cycle of sharing and interoperation, which early movers can encourage by converging on lightweight protocols and metadata formats. 
In true 21st Century techno-optimist fashion, we think no centralised party need be responsible for storing or processing (though distributed caches and repositories can provide valuable network services, especially for indexing and lookup<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a>).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KtqM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57075e86-2c1d-4c9c-a616-484dd02109e7_1024x565.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KtqM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57075e86-2c1d-4c9c-a616-484dd02109e7_1024x565.png 424w, https://substackcdn.com/image/fetch/$s_!KtqM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57075e86-2c1d-4c9c-a616-484dd02109e7_1024x565.png 848w, https://substackcdn.com/image/fetch/$s_!KtqM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57075e86-2c1d-4c9c-a616-484dd02109e7_1024x565.png 1272w, https://substackcdn.com/image/fetch/$s_!KtqM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57075e86-2c1d-4c9c-a616-484dd02109e7_1024x565.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KtqM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57075e86-2c1d-4c9c-a616-484dd02109e7_1024x565.png" width="1024" height="565" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/57075e86-2c1d-4c9c-a616-484dd02109e7_1024x565.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:565,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KtqM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57075e86-2c1d-4c9c-a616-484dd02109e7_1024x565.png 424w, https://substackcdn.com/image/fetch/$s_!KtqM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57075e86-2c1d-4c9c-a616-484dd02109e7_1024x565.png 848w, https://substackcdn.com/image/fetch/$s_!KtqM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57075e86-2c1d-4c9c-a616-484dd02109e7_1024x565.png 1272w, https://substackcdn.com/image/fetch/$s_!KtqM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57075e86-2c1d-4c9c-a616-484dd02109e7_1024x565.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>Structure: inference and discourse</h3><p>Information passing and knowledge development involve far more than sharing basic observations and datasets between humans. 
There are at least two important types of structure: inference and discourse.</p><h4>Inference structure: genealogy of claims and supporting evidence (Structure I)</h4><p>Ideally perhaps, raw observations are reliably recorded, their search and sampling processes unbiased (or well-described and accounted for), inferences in combination with other knowledge are made, with traceable citations and with appropriate uncertainty quantification, and finally new traceable, conversation-ready claims are made.</p><p>We might call this an <em>inference structure</em>: the genealogy and epistemic provenance of given claims and observations, enabling others to see how conclusions were reached, and thus to repeat or refine (or refute) the reasoning and investigation that led there.</p><p>Of course in practice, inference structure is often illegible and effortful to deal with at best, and in many contexts intractable or entirely absent. We are presented with a selectively-reported news article with a scant few hyperlinks, themselves not offering much more context. Or we simply glimpse the tweet summary with no accompanying context.</p><p>Even in science and academia where citation norms are strongest, a citation might point to a many-page paper or a whole book in support of a single local claim, often <a href="https://aiprospects.substack.com/p/when-ideas-round-to-false">losing nuance or distorting meaning</a> along the way, and adding much friction to the activity of assessing the strength of a claim<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a>.</p><p>How do tools and protocols improve this picture? 
Metascience reform movements like <a href="https://nanopub.net/">Nanopublications</a> strike us as a promising direction.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1qKk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a1e48d-dd70-4d10-ab53-a42782021be7_512x512.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1qKk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a1e48d-dd70-4d10-ab53-a42782021be7_512x512.png 424w, https://substackcdn.com/image/fetch/$s_!1qKk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a1e48d-dd70-4d10-ab53-a42782021be7_512x512.png 848w, https://substackcdn.com/image/fetch/$s_!1qKk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a1e48d-dd70-4d10-ab53-a42782021be7_512x512.png 1272w, https://substackcdn.com/image/fetch/$s_!1qKk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a1e48d-dd70-4d10-ab53-a42782021be7_512x512.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1qKk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a1e48d-dd70-4d10-ab53-a42782021be7_512x512.png" width="512" height="512" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/87a1e48d-dd70-4d10-ab53-a42782021be7_512x512.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:512,&quot;width&quot;:512,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1qKk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a1e48d-dd70-4d10-ab53-a42782021be7_512x512.png 424w, https://substackcdn.com/image/fetch/$s_!1qKk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a1e48d-dd70-4d10-ab53-a42782021be7_512x512.png 848w, https://substackcdn.com/image/fetch/$s_!1qKk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a1e48d-dd70-4d10-ab53-a42782021be7_512x512.png 1272w, https://substackcdn.com/image/fetch/$s_!1qKk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87a1e48d-dd70-4d10-ab53-a42782021be7_512x512.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Already, LM assistance can make some of this structure more practically accessible, including in hindsight. 
A lightweight sharing format and caches for commonly accessed inference structure metadata can turn this into a reliable, cheap, and growing foundation: a graph of claims and purported evidence, for improved further epistemic activity like auditing, hypothesis generation, and debate mapping.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NhEM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c681fc1-b245-4e6e-99bb-afb47518a4dd_1024x565.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NhEM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c681fc1-b245-4e6e-99bb-afb47518a4dd_1024x565.png 424w, https://substackcdn.com/image/fetch/$s_!NhEM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c681fc1-b245-4e6e-99bb-afb47518a4dd_1024x565.png 848w, https://substackcdn.com/image/fetch/$s_!NhEM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c681fc1-b245-4e6e-99bb-afb47518a4dd_1024x565.png 1272w, https://substackcdn.com/image/fetch/$s_!NhEM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c681fc1-b245-4e6e-99bb-afb47518a4dd_1024x565.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NhEM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c681fc1-b245-4e6e-99bb-afb47518a4dd_1024x565.png" width="1024" height="565" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5c681fc1-b245-4e6e-99bb-afb47518a4dd_1024x565.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:565,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NhEM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c681fc1-b245-4e6e-99bb-afb47518a4dd_1024x565.png 424w, https://substackcdn.com/image/fetch/$s_!NhEM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c681fc1-b245-4e6e-99bb-afb47518a4dd_1024x565.png 848w, https://substackcdn.com/image/fetch/$s_!NhEM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c681fc1-b245-4e6e-99bb-afb47518a4dd_1024x565.png 1272w, https://substackcdn.com/image/fetch/$s_!NhEM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c681fc1-b245-4e6e-99bb-afb47518a4dd_1024x565.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h4>Discourse: refinement, counterargument, refutation (Structure II)</h4><p>Knowledge production and sharing is dynamic. With claims made (ideally legibly), advocates, detractors, investigators, and the generally curious bring new evidence or reason to the debate, strengthening or weakening the case for claims, discovering new details, or inferring new implications or applications.</p><p>This <em>discourse structure</em> associates related claims and evidence, relevant observations which might not have originally been made with a given topic in mind, and competing or alternative positions.</p><p>Unfortunately in practice, many arguments are made and repeated without producing anything (apart from anger and dissatisfaction and occasional misinformation), partly because they&#8217;re <em>disconnected</em> from discourse. 
Connection to the wider discourse is valuable both as contextual input (understanding the state of the wider debate or investigation so that the same points aren&#8217;t argued ad infinitum and people benefit from updates), and as output (propagating conclusions, updates, consensus, or synthesis back to the wider conversation).</p><p>This shortcoming holds back science and pollutes politics.</p><p>Tools like Wikipedia (and other encyclopedias), at their best, serve as <em>curated summaries </em>of the state of discourse on a given topic. If it&#8217;s fairly settled science, the clearest summaries and best sources should be made salient (as well as some history and genealogy). If it&#8217;s a lively debate, the state of the positions and arguments, perhaps along with representative advocates, should be summarised. But encyclopedias can be limited by sourcing, available cognitive labour and update speed, one-size-fits-all formatting, and sometimes curatorial bias (whether human or AI).<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a></p><p>Similar to the inference layer, there is massive untapped potential to develop automations for better discourse tracking and modelling. For example, LMs doing literature reviews can source content from a range of perspectives for downstream mapping. Meanwhile, relevant new artefacts can be detected and ingested in close to real time. We don&#8217;t need to agree on all conclusions &#8212; but we <em>can </em>much more easily agree <em>on the status of discourse</em>: positions on a topic, the strongest cases for them, and the biggest holes<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a>. 
Direct access as well as helpful integrations with existing platforms and workflows can surface the most useful context to people as needed, in locally-appropriate format and level of detail.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6wOj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4721aa2f-10ac-4719-a8cd-d93c5ff3a4cb_1024x565.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6wOj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4721aa2f-10ac-4719-a8cd-d93c5ff3a4cb_1024x565.png 424w, https://substackcdn.com/image/fetch/$s_!6wOj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4721aa2f-10ac-4719-a8cd-d93c5ff3a4cb_1024x565.png 848w, https://substackcdn.com/image/fetch/$s_!6wOj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4721aa2f-10ac-4719-a8cd-d93c5ff3a4cb_1024x565.png 1272w, https://substackcdn.com/image/fetch/$s_!6wOj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4721aa2f-10ac-4719-a8cd-d93c5ff3a4cb_1024x565.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6wOj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4721aa2f-10ac-4719-a8cd-d93c5ff3a4cb_1024x565.png" width="1024" height="565" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4721aa2f-10ac-4719-a8cd-d93c5ff3a4cb_1024x565.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:565,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6wOj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4721aa2f-10ac-4719-a8cd-d93c5ff3a4cb_1024x565.png 424w, https://substackcdn.com/image/fetch/$s_!6wOj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4721aa2f-10ac-4719-a8cd-d93c5ff3a4cb_1024x565.png 848w, https://substackcdn.com/image/fetch/$s_!6wOj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4721aa2f-10ac-4719-a8cd-d93c5ff3a4cb_1024x565.png 1272w, https://substackcdn.com/image/fetch/$s_!6wOj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4721aa2f-10ac-4719-a8cd-d93c5ff3a4cb_1024x565.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Assessment: credence, endorsement, and trust</h3><p>Claims and evidence, together with counterclaims and an array of perspectives (however represented), provide a large underlying source of potential insight. But at a given time and for a given person there is some question to be answered, and answering it means reaching trusted summaries and positions.</p><p>Ultimately, consumers of information sources come to conclusions on the basis of diverse signals: compatibility with their more direct observations, assessment of the trustworthiness and reliability (on a given topic) of a communicator, assessment of methodological reasonableness, weighing and comparing evidence, procedural humility and skepticism, explicit logical and probabilistic inference, and so on. It&#8217;s squishy and diverse!</p><p>We think some technologies are unable to scale because they&#8217;re too rigid in assigning explicit probabilities, or because they enforce specific rules divorced from context. 
This fails to account for real reasoning processes and also can work against trust because people (for good and bad reasons) have idiosyncratic emphases in what constitutes sensible reasoning.</p><p>We expect that trust should be a late-binding property (i.e. at the application layer), to account for varied contexts and queries and diverse perspectives, interoperable with minimally opinionated <em>structure</em> metadata. That said, squishy, contextual, customisable reasoning is increasingly scalable and available for computation! So caches and helpful precomputations for common settings might also be surprisingly practical in many cases.</p><p>With foundational structure to draw from, this is where things start to substantially branch out and move toward the application layer. Some use cases, like summarisation, highlighting key pros and cons and uncertainties, or discovery, might directly touch users. Other times, downstream platforms and tools can integrate via a variety of customized assessment workflows.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aspV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc9396f-c434-464e-9ce9-ece4d28ea117_1024x565.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aspV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc9396f-c434-464e-9ce9-ece4d28ea117_1024x565.png 424w, https://substackcdn.com/image/fetch/$s_!aspV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc9396f-c434-464e-9ce9-ece4d28ea117_1024x565.png 848w, 
https://substackcdn.com/image/fetch/$s_!aspV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc9396f-c434-464e-9ce9-ece4d28ea117_1024x565.png 1272w, https://substackcdn.com/image/fetch/$s_!aspV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc9396f-c434-464e-9ce9-ece4d28ea117_1024x565.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aspV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc9396f-c434-464e-9ce9-ece4d28ea117_1024x565.png" width="1024" height="565" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9fc9396f-c434-464e-9ce9-ece4d28ea117_1024x565.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:565,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aspV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc9396f-c434-464e-9ce9-ece4d28ea117_1024x565.png 424w, https://substackcdn.com/image/fetch/$s_!aspV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc9396f-c434-464e-9ce9-ece4d28ea117_1024x565.png 848w, 
https://substackcdn.com/image/fetch/$s_!aspV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc9396f-c434-464e-9ce9-ece4d28ea117_1024x565.png 1272w, https://substackcdn.com/image/fetch/$s_!aspV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc9396f-c434-464e-9ce9-ece4d28ea117_1024x565.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Beyond foundations: UX and integrations</h2><p>Foundations and protocols and epistemic tools sound fun only to a subset of people. 
But (almost) everyone is interested in some combination of news, life advice, politics, tech, or business. We don&#8217;t anticipate much direct use by humans of the epistemic layers we&#8217;ve discussed. But we already envision multiple downstream integrations into existing and emerging workflows: this motivates the interoperability and extensibility we&#8217;ve mentioned.</p><p>A few gestures:</p><ul><li><p>Social media platforms struggle under adversarial and attentional pressures. But distributed, decentralised context-provision, like the early success stories in <a href="https://en.wikipedia.org/wiki/Community_Notes">Community Notes</a>, can serve as a widely accessible point of distribution (and this is just one form factor among many possible). In turn, foundational epistemic tooling can feed systems like Community Notes.</p></li><li><p>More speculatively, social-media-like interfaces for uncovering group wisdom and will at larger scales while eliciting more productive discourse might be increasingly practical, and would be supported by this foundational infrastructure.</p></li><li><p>Curated summaries like encyclopedias (centralised) and Wikipedia (decentralised) are often able to give useful overviews and context on a topic. But they&#8217;re slow, don&#8217;t have coverage on demand, offer only a one-size-fits-all format, and are sometimes subject to biases. Human and automated curators could draw on foundational epistemic content and react responsively to relevant updates. 
Additionally, with discourse and inference structure more readily and deeply available, new, richly interactive and customisable views are imaginable: for example enabling strongly grounded up- and down-resolution of topics on request<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-10" href="#footnote-10" target="_self">10</a>, or highlighting areas of disagreement or uncertainty to be resolved.</p></li><li><p>Authors and researchers already benefit from search engines, and more recently &#8216;deep research&#8217; tooling. Integrated with easily available relational epistemic metadata, these uplifts can be much more reliable, trustworthy, and effective.</p></li><li><p>Emerging use of search-enabled AI chatbots as primary or complementary tools for search, education, and inquiry means that these workflows may become increasingly impactful. Equipping chatbots with access to discourse mapping and depth of inference structure can help their responses to be grounded and direct people to the most important points of evidence and contention on a topic.</p></li><li><p>Those who want to can already layer extensions onto their browsing and mobile internet experiences. Having always-available or on-demand highlighting, context expandables, warnings, and so on, is viable mainly to the extent that supporting metadata are available (though LMs could approximate these to some degree and at greater expense). More speculatively, we might be due a browser UX exploration phase as more native AI integration into browsing experiences becomes practical: many such designs could benefit from availability of epistemic metadata.</p></li></ul><h2>How? Why now?</h2><p>If this would be so great, why has nobody done it already? Well, vision is one thing, and we could also make a point about underprovision of collective goods like this. But more to the point, the technical capacity to pull off this stack is only really just coming online. 
We&#8217;re not the first people to notice the wonders of language models.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!k3_V!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6438a65-a59f-447f-8e02-271c7544b509_1024x572.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!k3_V!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6438a65-a59f-447f-8e02-271c7544b509_1024x572.png 424w, https://substackcdn.com/image/fetch/$s_!k3_V!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6438a65-a59f-447f-8e02-271c7544b509_1024x572.png 848w, https://substackcdn.com/image/fetch/$s_!k3_V!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6438a65-a59f-447f-8e02-271c7544b509_1024x572.png 1272w, https://substackcdn.com/image/fetch/$s_!k3_V!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6438a65-a59f-447f-8e02-271c7544b509_1024x572.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!k3_V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6438a65-a59f-447f-8e02-271c7544b509_1024x572.png" width="1024" height="572" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6438a65-a59f-447f-8e02-271c7544b509_1024x572.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:572,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!k3_V!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6438a65-a59f-447f-8e02-271c7544b509_1024x572.png 424w, https://substackcdn.com/image/fetch/$s_!k3_V!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6438a65-a59f-447f-8e02-271c7544b509_1024x572.png 848w, https://substackcdn.com/image/fetch/$s_!k3_V!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6438a65-a59f-447f-8e02-271c7544b509_1024x572.png 1272w, https://substackcdn.com/image/fetch/$s_!k3_V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6438a65-a59f-447f-8e02-271c7544b509_1024x572.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>First, the not inconsiderable inconveniences of the core epistemic activities we&#8217;ve discussed are made <em>less overwhelming</em> by, for example, the ability of LLMs to digest large amounts of source information, or to carry out semi-structured searches and investigations. Even so, this looks to us like mainly a power-user approach, even if it came packaged in widely available tools similar to deep research, and it doesn&#8217;t naively contribute to enriching knowledge commons. We can do better.</p><p>With a lightweight, extensible protocol for metadata, <em>caching and sharing</em> of discovered inference structure and discourse structure becomes nearly trivial<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-11" href="#footnote-11" target="_self">11</a>. 
Now the investigations of power users (and perhaps ongoing clerical and maintenance work by LLM agents) produce <em>positive epistemic spillover</em> which can be consumed in principle by any downstream application or interface, and which composes with further work<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-12" href="#footnote-12" target="_self">12</a>. Further, the risks of hallucinated or confabulated sources (for LMs as with humans) can be limited by (sometimes adversarial) checking. The epistemic power is <em>in the process</em>, not in the AI.</p><p>Various types of openness can bring benefits: extensibility, trust, reach, distribution &#8212; but can also bring challenges like bad faith contributions (for example omitting or pointing to incorrect sources) or mistakes. Tools and protocols at each layer will need to navigate such tradeoffs. One approach could have multiple authorities akin to public libraries taking responsibility for providing living, well-connected views over different corpora and topics &#8212; while, importantly, providing public APIs for endorsing or critiquing those metadata. Alternatively, perhaps anyone (or their LLM) could check, endorse, or contribute alternative structural metadata<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-13" href="#footnote-13" target="_self">13</a>. Then the provisions of identity and endorsement in an assessment layer would need to solve the challenges of filtering and canonicalisation.</p><p>In specific epistemic communities and on particular topics, this could drive much more comprehensive understanding of the state of discourse, pushing the knowledge frontier forward faster and more reliably. 
Across the broader public, discourse mapping and inference metadata can act against deliberate or accidental distortion, supporting (and incentivising) more good faith communication.</p><h2>Takeaways</h2><p>Knowledge, especially reliable shared knowledge, helps humans individually and collectively be <em>more right</em> in making plans and taking action. Helping people better trust the ways they get and share useful information can deliver widespread benefits as well as defending against large-scale risk, whether from mistakes or malice.</p><p>We communicate at greater scales than ever, but our foundational knowledge infrastructure hasn&#8217;t scaled in the same way. We see a large space of opportunities to improve that &#8212; only recently coming into view with technical advances in AI and ever-cheaper compute.</p><p>This is the first in what will be a series exploring one corner of the design landscape for epistemic tech: there are many uncertainties still, but we&#8217;re excited enough that we&#8217;re investigating and investing in pushing it forward.</p><p>We&#8217;ll flesh out more of our current thinking on this stack in future entries in this series, including more on existing efforts in the space, interoperability, and core challenges here (especially distribution).</p><p>Please get in touch if any of this excites or inspires you, or if you have warnings or reasons to be skeptical!</p><p><em>Thanks to our colleagues at the Future of Life Foundation, and to several epistemic tech pioneers for helpful conversations feeding into our thinking.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TEY_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29de8112-5c59-4f94-a672-2c674fe0e65b_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source 
type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TEY_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29de8112-5c59-4f94-a672-2c674fe0e65b_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!TEY_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29de8112-5c59-4f94-a672-2c674fe0e65b_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!TEY_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29de8112-5c59-4f94-a672-2c674fe0e65b_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!TEY_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29de8112-5c59-4f94-a672-2c674fe0e65b_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TEY_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29de8112-5c59-4f94-a672-2c674fe0e65b_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/29de8112-5c59-4f94-a672-2c674fe0e65b_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!TEY_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29de8112-5c59-4f94-a672-2c674fe0e65b_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!TEY_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29de8112-5c59-4f94-a672-2c674fe0e65b_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!TEY_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29de8112-5c59-4f94-a672-2c674fe0e65b_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!TEY_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29de8112-5c59-4f94-a672-2c674fe0e65b_1024x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>You might think this is a new or worsening phenomenon, or you might think it perennial. Either way, it&#8217;s hard to deny that things would ideally be much better. 
We further think there is some urgency to this, both due to rising stakes and due to foreseeable potential for escalating distortion via AI.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Improved terminological branding sorely needed</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>Coauthor Oly formerly frequently used single hyphens for this sort of punctuation effect, but <a href="https://www.reddit.com/r/ChatGPT/comments/1fx12q1/is_an_em_dash_proof_of_ai_manipulation/">coincidentally</a> started using em-dashes recently when someone kindly pointed out that it&#8217;s trivial to write them while drafting in Google Docs. This entire doc is human-written (except for images). Citation: trust us.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>or perhaps as early as Homo erectus and his <a href="https://humanjourney.us/language/symbolic-language/">supposed pantomime communication</a>, or even earlier</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>Some such guarantees might come from signed hardware, proof of personhood, or watermarking. We&#8217;re not expecting (nor calling for!) all devices or communications to be identified, and not necessarily expecting increased pervasiveness of such devices. 
Even where the capability is present on hardware, there are legitimate reasons to prefer to scrub identifying metadata before some transmissions or broadcasts. In a related but separate thread of work, we&#8217;re interested in ways to expand the frontier of privacy x verification, where we also see some promising prospects.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>Compare search engine indexes, or the <a href="https://archive.org/about/">Internet Archive</a>.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>Relatedly, but not necessarily as part of this package, we are interested in automating and scaling the ability to quickly identify rhetorical distortion or unsupported implicature, which manifests in science as <a href="https://www.clearerthinking.org/post/importance-hacking-a-major-yet-rarely-discussed-problem-in-science">importance hacking</a> and in journalism as spin, sensationalism, and misleading framing.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-8" href="#footnote-anchor-8" class="footnote-number" contenteditable="false" target="_self">8</a><div class="footnote-content"><p>Wikipedia, itself somewhere on the frontier of human epistemic infrastructure, becomes at its weakest points a battleground and a source of contention that it&#8217;s not equipped to handle on its own terms.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-9" href="#footnote-anchor-9" class="footnote-number" contenteditable="false" target="_self">9</a><div class="footnote-content"><p>This gives open, discoverable discourse a lot of adversarial robustness. 
You can do all you like to deny a case, malign its proponents, claim it&#8217;s irrelevant&#8230; but these are all just new (sometimes valuable!) entries in the implicit &#8216;ledger&#8217; of discourse on a topic. This &#8216;append-only&#8217; property is much more robust than an opinionated summary or authoritative canonical position. Of course append-only raises practical computational and storage concerns, and editorial bias can re-enter any time summarisation and assessment is needed.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-10" href="#footnote-anchor-10" class="footnote-number" contenteditable="false" target="_self">10</a><div class="footnote-content"><p>Up- and down-resolution is already cheaply available on request: simply ask an LLM &#8216;explain this more&#8217; or &#8216;summarise this&#8217;. But the process will be illegible, hard to repeat, and lack the trust-providing support of grounding in annotated content.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-11" href="#footnote-anchor-11" class="footnote-number" contenteditable="false" target="_self">11</a><div class="footnote-content"><p>Storage and indexing is the main constraint to caching and sharing, but the metadata should be a small fraction of what is already stored and indexed in many ways on the internet.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-12" href="#footnote-anchor-12" class="footnote-number" contenteditable="false" target="_self">12</a><div class="footnote-content"><p>How to fund the work that produces new structure? In part, integration with platforms and workflows that people already use. In part, this is a public good, so we&#8217;re talking about philanthropic and public goods funding. 
In some cases, institutions and other parties with interest in specific investigations may bring their own compute and credits.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-13" href="#footnote-anchor-13" class="footnote-number" contenteditable="false" target="_self">13</a><div class="footnote-content"><p>Does this lack of opinionated authority on canonical structure defeat the point of epistemic commons? Could a cult, say, provision their own para-epistemic stack? Probably &#8212; in fact in primitive ways <a href="https://www.clearerthinking.org/post/what-makes-something-a-cult">they already do</a> &#8212; but it&#8217;d be more than a little inconvenient, and we think that availability of epistemic foundation data and ideally integration into existing platforms, especially <em>because</em> it&#8217;s unopinionated and flexible in terms of final assessment, can drive much improvement in any less-than-completely adversarially cursed contexts.</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[Better than logarithmic returns to reasoning?]]></title><description><![CDATA[When does thinking harder (or longer) pay off?]]></description><link>https://www.oliversourbut.net/p/better-than-logarithmic-returns-to</link><guid isPermaLink="false">https://www.oliversourbut.net/p/better-than-logarithmic-returns-to</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Wed, 30 Jul 2025 00:50:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!eqDq!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6faaaa8-072a-4eb5-9ce9-b6b378c8044a_1098x1098.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Lots of phenomena turn out to have logarithmic returns: to get an improvement, you double effort or resources put in, but then to get the same improvement you have to double inputs again and again and so on. 
Equivalently, input costs are exponential in output quality<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>. You can probably think of some examples.</p><p>I want to know: is &#8216;extra reasoning compute&#8217; like this? (Or, under what conditions and by what means can you beat this?) I&#8217;m especially interested in this question as applied to <a href="https://www.lesswrong.com/posts/aPnduAouYFzbTyhkm/you-can-t-skip-exploration-why-understanding-experimentation">deliberate exploration and experiment design</a>.</p><p>Said another way, from a given decision-context, without making extra observations or gathering extra data, what are the optimal marginal returns to &#8216;thinking harder&#8217;<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> about what to do next?</p><p>Intuitively, if I have a second to come up with a plan, it might be weak, five minutes and it might be somewhat reasonable, a day and it&#8217;ll be better, a year (full time!) and I&#8217;ve reached <em>very</em> diminishing returns. Presumably a century in my ivory tower would be barely better. 
I&#8217;d usually do better trying to get more data.</p><p>Is this even a sensible question, or is &#8216;improvement in reasoning output&#8217; far too vague to get traction here?</p><p>That&#8217;s the question; below are some first thoughts toward an answer.</p><h2>Simple model: repeated sampling/best of k</h2><p>If you have a proposal generator, and you can choose between proposals, a simple approach to getting better generations is:</p><ul><li><p>sample a large number, k, of proposals</p></li><li><p>(try to) evaluate and pick the best one</p></li></ul><p>(This is actually the best strategy you could take if you can only add parallel compute, but there might be strictly better approaches if you can add serial compute<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>.)</p><p>Even assuming you can unerringly pick the best one, the expected value of this strategy turns out to be bounded logarithmically in k for many underlying distributions of proposals<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a>. In fact, for a normally-distributed proposal generator, you even get the slightly worse <em>square root of logarithmic</em> growth<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a>.</p><p>You can in principle sidestep this if your proposal generator has a sufficiently heavy-tailed proposal distribution, <em>and</em> you can reliably ex ante distinguish better from worse at the tails.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.oliversourbut.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Subscribe! 
Don&#8217;t spend more than k seconds thinking about it.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h2>Another simple model: widely distributed &#8216;promise&#8217; of lines of inquiry</h2><p>Suppose you have various lines of inquiry to spend thinking time on. The best you can do is:</p><ul><li><p>start thinking on the most promising lines</p></li><li><p>spend additional thinking on successively less promising lines</p></li></ul><p>(This assumes you can somewhat reliably distinguish promise.)</p><p>If their &#8216;quality&#8217; or &#8216;promise&#8217; ranges over many orders of magnitude, then even if you get to accumulate insights additively<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a>, you&#8217;ll actually make only bounded progress<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a> towards a theoretical &#8216;best possible&#8217; - this is <em>worse</em> than logarithmic, though looks qualitatively similar over a substantial range of effort.</p><p>But why would the promise of lines of inquiry range over many orders of magnitude? We might say, &#8216;in practice, it often seems to&#8217;, and there are some theoretical reasons to expect this. You &#8216;pick low hanging fruit&#8217; earliest, and face diminishing returns later. But to a large extent this model assumes the conclusion.</p><h2>Other rougher gestures</h2><h3>Search depth</h3><p>Often to find approximate solutions to problems, we might employ search over a tree-like structure. 
This emerges very naturally for planning over time, for example, where branching options (whether choice or chance) at each chosen time interval give rise to a tree of possible plans. (Compare <a href="https://en.wikipedia.org/wiki/Monte_Carlo_tree_search">Monte Carlo tree search</a>.)</p><p>If gains are roughly uniform in search depth, this gives rise to logarithmic returns to further search, because the number of nodes to examine grows exponentially with depth. With excellent heuristics, you might be able to prune large fractions of the tree - this gives you a kinder exponent, but still an exponential space to search.</p><p>When (if at all) are gains over search depth dependably <em>growing</em>, rather than uniform at best? Alternatively, when can uniform (or better) gains be reliably achieved by expanding the search strictly less than exponentially?</p><h3>Modelling chaos</h3><p>Chaotic systems are characterised by sensitivity to initial conditions: dynamics where measurement or specification errors compound exponentially.</p><p>So, to forecast at a given precision and probabilistic resolution, the <em>initial specification precision</em> must tighten exponentially as the forecast&#8217;s time depth grows: each marginal increment of time depth costs another constant factor of precision. (This is why in practice we only ever successfully forecast chaotic systems like weather at quite coarse precision or short horizon.)</p><p>Specification precision doesn&#8217;t exactly map to having extra compute, but it feels close. And marginal incremental time depth doesn&#8217;t necessarily correspond uniformly to &#8216;goodness of plan&#8217;.</p><h3>Combinatorial search</h3><p>If there&#8217;s some space of ingredients or components which can be combined for possible insight, the size of the search space is exponential<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a> in the number of components in a proposed combination. 
So if, among good plans at each scale, gains are proportional to the number of components in the plan (and there are similarly many good plans at each scale), you get logarithmic returns to searching longer.</p><p>Something similar applies if the design possibilities benefit from combining already-discovered structures in a hierarchy, for example if emergent features of subcomponents unlock new levels of effectiveness in a combined design (molecules, peptides, proteins, organelles, cells, ...).</p><p>But the assumption of roughly uniform gains over scales like this is carrying some weight here.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Notably this means that, unless you have an exponentially growing source of inputs to counteract it, there&#8217;s a practical upper limit to growing the output, because you can only double so many times. And even with an exponentially growing input, you get only a modest, linear improvement in output.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>i.e. computing for longer or computing more in parallel. Parallel can&#8217;t be better than serial in returns to total compute, so I&#8217;m mainly interested in the more generous serial case. 
For parallel, it&#8217;s easier to bound because the algorithm space is more constrained (&#8216;sample many in parallel, choose best&#8217; is the best you can do asymptotically).</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>Intuitively you can &#8216;reason deeper&#8217; with extra serial compute, which might look like recursing further down a search tree. You can also take proposals and try to <em>refine or improve</em> rather than just throwing them out and trying again from scratch.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>Proof. Suppose the generator produces proposals with quality X. All we assume is that the distribution of X has a <a href="https://en.wikipedia.org/wiki/Moment-generating_function">moment-generating function</a> (this is not true of all distributions; in particular, heavy-tailed distributions may not have an MGF). Denote k individual samples as X_i. Note first by Jensen&#8217;s inequality that, for any positive t:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;e^{t \\Bbb E \\left[ \\max_i X_i \\right]} \\le\n\n    \\Bbb E e^{t \\left[ \\max_i X_i \\right]} =\n\n    \\Bbb E \\max_i e^{t X_i}\n&quot;,&quot;id&quot;:&quot;OPLOPMAOBD&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><p>i.e. the exponential of the expected maximum in question is bounded by the expected maximum of the exponentials. 
But a max of positive terms is bounded by the sum:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\Bbb E \\max_i e^{t X_i} \\le\n\n    \\Bbb E \\sum_i e^{t X_i} =\n\n    k \\Bbb E e^{t X}&quot;,&quot;id&quot;:&quot;ZZNIXTJKGC&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><p>(writing X for a representative single sample.) But that&#8217;s just k times the moment-generating function (which we assumed exists). So for all positive t,</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\Bbb E \\left[ \\max_i X_i \\right] \\le \\frac {\\ln k + \\ln MGF(t)} t&quot;,&quot;id&quot;:&quot;LKSPGGRHKP&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><p>So (fixing any t, or minimising over t) we see at most logarithmic growth in k.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>Take the proof of the general case for an arbitrary distribution with a moment-generating function. 
Substitute the normal moment-generating function (taking the mean to be zero, since a nonzero mean only shifts the bound by a constant)</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;MGF(t) = e^{\\frac {\\sigma^2 t^2} 2}&quot;,&quot;id&quot;:&quot;UMMHHLEHWN&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><p>so that</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\Bbb E \\left[ \\max_i X_i \\right] \\le \\frac {\\ln k} t + \\frac {\\sigma^2 t} 2&quot;,&quot;id&quot;:&quot;MFWTLBCJVI&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><p>Minimising over (positive) t, with the minimum at t = &#8730;(2 ln k)/&#963;,</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;    \\Bbb E \\left[ \\max_i X_i \\right] \\le \\sigma \\sqrt {2 \\ln k}&quot;,&quot;id&quot;:&quot;GYIRKRGUEP&quot;}" data-component-name="LatexBlockToDOM"></div></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>Perhaps the insights literally combine into an overall improved proposal, or perhaps less promising lines of inquiry provide fallback or robustness benefits in case the earlier ones fail in practice.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>Qualities might be evenly spread over e.g. 1, 1/10, 1/100, 1/1000, ... or more generally 1, 1/r, 1/r^2, .... 
Then the sum of your efforts is geometric, gradually approaching</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\frac r {r-1}&quot;,&quot;id&quot;:&quot;SNILSWKXDV&quot;}" data-component-name="LatexBlockToDOM"></div></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-8" href="#footnote-anchor-8" class="footnote-number" contenteditable="false" target="_self">8</a><div class="footnote-content"><p>Or more than exponential if the order or configuration matters!</p></div></div>]]></content:encoded></item><item><title><![CDATA[You Can’t Skip Exploration]]></title><description><![CDATA[Why understanding experimentation and taste is key to understanding AI]]></description><link>https://www.oliversourbut.net/p/you-cant-skip-exploration</link><guid isPermaLink="false">https://www.oliversourbut.net/p/you-cant-skip-exploration</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Wed, 21 May 2025 16:08:35 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f91991-1226-463f-8154-7f7199ff4eca_4527x3018.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This essay is part 1 of a series on the role of <em>exploration</em> in AI and the implications for AI development and governance.</p><p>This part introduces <em>exploration</em> and <em>research taste</em>, as well as discussing their role in research and development, and the ways that AI could change that picture. This gives rise to some exciting and underexplored (!) 
opportunities for beneficial and defensive contributions to research.</p><p>A second essay will discuss more implications for AI development and governance, including the potential for AI to accelerate the pace of development of AI itself, and some implications for safety and security.</p><p>Neither essay will be especially technical, but I will gesture at the technical and mathematical aspects that I find to be illuminating. As ever when I write, I raise more questions than I answer! But I hope to provide some initial useful takeaways as well as productive directions for thinking about these issues.</p><h2>Introducing exploration and experimentation</h2><p>Scientific and technological progress are driven by <em>experimentation</em>: that is, doing things to find out how the world works. In the field of AI we call this 'exploration'.</p><p>Exploration for learning is not just a human phenomenon: it's ubiquitous in natural systems at various scales (from evolution itself to the play of young animals), in individual human lifetimes (as we learn skills or contribute to novel discoveries) as well as human institutions and societies (which also learn through experience), and in computer science and AI (where exploration for discovery and problem-solving is common).</p><p>We care a lot about scientific and technological potential - new science and technology can yield enormous risks (from accident, misuse, or societal destabilisation) or enormous benefits (solving major problems in medicine, climate, energy, or even defending against other risky technologies). So exploration isn't just of academic interest.</p><p>When we forget to consider <em>how</em> new knowledge is generated, how novel technologies are developed, when we conflate 'knowledge' with 'learning' or 'learning' with 'exploring', we make mistaken predictions. Especially when those predictions are action-guiding, we can end up taking misguided or even harmful actions, or missing opportunities to intervene in beneficial ways. 
So let's do some unpacking!</p><p>What factors make exploration (and by extension, research) more or less effective? What are the bottlenecks and limits to exploration? How could AI change the picture? And how can we apply insights from this lens to contribute to a better future?</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.oliversourbut.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.oliversourbut.net/subscribe?"><span>Subscribe now</span></a></p><h2>Why does exploration matter?</h2><p><em>Knowledge production loop: activity yields observations, improving knowledge &#8212; Exploration drives the loop &#8212; Lack of exploration means knowledge stagnation &#8212; Exploration is key to understanding technological progress</em></p><p>Learning systems gather new knowledge and insights from observations/data. Random flailing or arbitrary data aren't especially helpful. You want it to be telling you something new that you didn't already know - so it pays to deliberately seek out or gather novelty and informative observations<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>. 
This applies at the grandest scales of scientific endeavour as well as in mundane scenarios like navigating an unfamiliar building or learning a new skill.</p><p>Owen Cotton-Barratt recently discussed <a href="https://strangecities.substack.com/p/knowledge-reasoning-and-superintelligence">the 'knowledge production loop'</a>: activity and observations generate data (captured in datasets and models as 'crystallised intelligence') and combine with thinking algorithms ('fluid intelligence') to in turn drive new activity and observations.</p><p>I'd additionally characterise exploration as the way that crystallised world model and novelty taste interact with fluid reasoning and planning to <em>judiciously choose</em> activities yielding the most informative observations... in turn improving world models and taste ad infinitum.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cqF-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3849141a-4e65-40e0-b105-ed71596031ea_1600x1072.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cqF-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3849141a-4e65-40e0-b105-ed71596031ea_1600x1072.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cqF-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3849141a-4e65-40e0-b105-ed71596031ea_1600x1072.jpeg 848w, https://substackcdn.com/image/fetch/$s_!cqF-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3849141a-4e65-40e0-b105-ed71596031ea_1600x1072.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!cqF-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3849141a-4e65-40e0-b105-ed71596031ea_1600x1072.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cqF-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3849141a-4e65-40e0-b105-ed71596031ea_1600x1072.jpeg" width="1456" height="976" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3849141a-4e65-40e0-b105-ed71596031ea_1600x1072.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:976,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Crystallized knowledge in a Trained Model (produced from Data by Learning Algorithms and Training Compute) combines with Capacity for thought in Inference compute and Thinking algorithms to create an AI system. The AI system produces New ideas, Actions and Observations, which feed back as new Data.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Crystallized knowledge in a Trained Model (produced from Data by Learning Algorithms and Training Compute) combines with Capacity for thought in Inference compute and Thinking algorithms to create an AI system. The AI system produces New ideas, Actions and Observations, which feed back as new Data." title="Crystallized knowledge in a Trained Model (produced from Data by Learning Algorithms and Training Compute) combines with Capacity for thought in Inference compute and Thinking algorithms to create an AI system. 
The AI system produces New ideas, Actions and Observations, which feed back as new Data." srcset="https://substackcdn.com/image/fetch/$s_!cqF-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3849141a-4e65-40e0-b105-ed71596031ea_1600x1072.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cqF-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3849141a-4e65-40e0-b105-ed71596031ea_1600x1072.jpeg 848w, https://substackcdn.com/image/fetch/$s_!cqF-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3849141a-4e65-40e0-b105-ed71596031ea_1600x1072.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!cqF-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3849141a-4e65-40e0-b105-ed71596031ea_1600x1072.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg 
xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em><a href="https://strangecities.substack.com/p/knowledge-reasoning-and-superintelligence">Owen Cotton-Barratt's diagram</a> of crystallized knowledge and fluid reasoning ('capacity for thought') giving rise to a 'knowledge production loop'. Here I discuss exploration as the difference between occasionally chancing upon informative new data and proactively seeking it out (or deliberately producing serendipitous conditions for making new discoveries).</em></figcaption></figure></div><p>Quality and quantity of exploration mark the vast difference between a civilization with vibrant progress in science and technology and one with a near-static (or even regressing) capability base - and on an individual level, it's often the difference between rapidly developing new skills or knowledge and getting stuck in a rut.</p><p>Understanding exploration is therefore key to understanding technological progress, with all the risks and benefits that entails.</p><h2>Research and taste</h2><p><em>Research is world-model-refinement &#8212; Exploration quality drives research &#8212; Taste is a learned feel for value of information &#8212; Reasoning and world modelling augment taste for exploratory planning</em></p><p>'Research' can be thought of broadly as refining one's world model in a particular domain. We want to know things like: how does electromagnetism work and what can it do for us? How can we prevent diseases from ruining lives? Or (more mundane) how can I get better at playing the piano or juggling? 
When I say 'research', I equally refer to personal learning and skill-building, scientific research, entrepreneurialism, and business development: all involve exploration and learning from experience.</p><p>We can describe three factors determining research production:</p><ul><li><p>Throughput: doing more practice, running more experiments, gathering more data faster, etc.</p></li><li><p>Modelling efficiency: gathering more generalisable insight <em>from</em> a given experiment or observation.</p></li><li><p>Exploration quality: choosing better experiments and routines to get <em>more informative</em> observations.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a></p></li></ul><p>We'll mostly talk about exploration quality here<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>, which is in turn governed by <em>taste</em> and <em>exploratory planning</em>.</p><p>What do I mean by 'taste'? Sometimes people refer to 'research taste' as a sense which develops from domain experience for the types of experiments and other activities which are most likely to be interesting or informative, or otherwise move forward the state of understanding. Clearly this is an essential component of any deliberate exploration - otherwise you're back to flailing randomly!</p><p>The taste that's being developed is exactly analogous to a taste for activity which is liable to yield good outcomes of other kinds. We're just considering the <em>value of information</em> as the good in question. 
So this decomposes into an ability to come up with promising proposals more often, perhaps together with abilities to discriminate more accurately between better and worse proposals or to determine refinements and improvements to proposals<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a>.</p><p>Now, imagine - for the sake of the argument - you're a human. Even better, in fact, imagine you're inhumanly fast and detail-sensitive, the best reasoner in the world, and you have general knowledge matching the rest of the world combined. You still need to do research in order to make new discoveries. If you don't know the details yet, experimentation isn't something you can skip!<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a> Your especially effective reasoning merely acts as another input to exploration quality, alongside domain research taste, perhaps allowing you to choose better experiments, and achieve results sooner. Reasoning applied effectively in this way is exploratory planning.</p><p>So <em>present</em> exploration quality depends on your current level of taste, while <em>future</em> exploration quality will also depend on taste accrual<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a>. Reasoning and planning of course also feed into this, as we improve proposals and discard designs in favour of better-looking ones - but this has to ground out in a taste for what makes a good proposal in the first place.</p><h3>From play to experimentation</h3><p><em>Play is proto-exploration &#8212; Fun is proto-taste &#8212; Humans adaptably accrue taste in novel domains &#8212; Taste is domain-specific but exploratory principles generalise</em></p><p>These aspects of taste are discovered and refined through experience. 
Research taste is domain-specific!</p><p>Many humans and animals, especially youngsters, have built in instincts for play, curiosity, and novelty. These have been tuned by painstaking natural selection to aid in orienting to the range of body configurations, environments and communities those animals usually inhabit, precisely by exploring: gathering evidence and information about how things work. In this case, evolution did the slow, gradual work of determining the 'taste', the recognisable hallmarks of good exploratory behaviour, and wired up the 'fun' sense to those hallmarks<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nZ7c!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f91991-1226-463f-8154-7f7199ff4eca_4527x3018.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nZ7c!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f91991-1226-463f-8154-7f7199ff4eca_4527x3018.jpeg 424w, https://substackcdn.com/image/fetch/$s_!nZ7c!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f91991-1226-463f-8154-7f7199ff4eca_4527x3018.jpeg 848w, https://substackcdn.com/image/fetch/$s_!nZ7c!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f91991-1226-463f-8154-7f7199ff4eca_4527x3018.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!nZ7c!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f91991-1226-463f-8154-7f7199ff4eca_4527x3018.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nZ7c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f91991-1226-463f-8154-7f7199ff4eca_4527x3018.jpeg" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/77f91991-1226-463f-8154-7f7199ff4eca_4527x3018.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Two fox cubs play fight in a grassy field&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Two fox cubs play fight in a grassy field" title="Two fox cubs play fight in a grassy field" srcset="https://substackcdn.com/image/fetch/$s_!nZ7c!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f91991-1226-463f-8154-7f7199ff4eca_4527x3018.jpeg 424w, https://substackcdn.com/image/fetch/$s_!nZ7c!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f91991-1226-463f-8154-7f7199ff4eca_4527x3018.jpeg 848w, https://substackcdn.com/image/fetch/$s_!nZ7c!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f91991-1226-463f-8154-7f7199ff4eca_4527x3018.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!nZ7c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f91991-1226-463f-8154-7f7199ff4eca_4527x3018.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em>Two fox cubs play. For diverse animals, discovering the particular ways your body and brain interact, and how those affect and are affected by your surroundings, is a key part of learning adaptable and dynamic behaviours. 
Individual playfulness delivers novelty and exploration, while group play, especially mock contests, provides a rich 'curriculum' for development (much like the '<a href="https://en.wikipedia.org/wiki/Self-play">self play</a>' of some AI training system designs). (Image from <a href="https://www.freepik.com/free-photo/baby-foxes-with-beige-fur-fighting-with-each-other-grasses_10722512.htm">freepik.com</a>)</em></figcaption></figure></div><p>As <a href="https://aiprospects.substack.com/p/why-intelligence-isnt-a-thing">Eric Drexler says</a>,</p><blockquote><p>We call children intelligent because of what they can learn, not what they can do</p></blockquote><p>Playful young animals and humans thereby become adept at controlling their bodies and engaging in effective social interaction. But humans move past mere bodily control and socialisation: we use and develop tools, technologies, and diverse and innovative social structures.</p><p>For many researchers and others engaging in creation, experimenting is a lot like playing! - the rich and sophisticated kinds of play that humans engage in somewhat instinctively. But, because research and development and science and industry move <em>beyond</em> the historic realms of human activity, the 'taste' bestowed by evolution is rarely well suited. An untrained human has no instinct at all for the kinds of experiments that are most likely to yield useful information about the behaviour of a new material or the structure of an unseen mathematical object! This applies equally to business activities and entrepreneurialism. Substantial experience is needed.</p><p>Do we see areas where 'taste' generalises, pointing against the claim of domain-specificity? The broad principles of science and engineering appear to generalise across domains, and evidence suggests that individual humans and human organisations vary in their latent potential to <em>accrue and apply</em> research taste. 
This might be down to being more or less motivated to explore, having different capacity to learn from experience, or varying procedures for planning next steps. This gives rise to an <em>appearance</em> of research taste generality. But domain-specific research taste is mastered only through domain-specific experience. Expert researchers in one area may contribute to other areas - but almost always only after gaining some depth of familiarity with the new area as well.</p><p>So it's reasonable to think of exploration quality as comprising two subfactors. First, the somewhat transferable general principles of exploration: playfulness, open-mindedness, planning for novelty and interestingness. And second, domain-specific research taste: the experience that guides determination of what situations <em>count</em> as novel or interesting<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a>, and what types of planning are most likely to uncover them.</p><h2>Exploration in AI, past and future</h2><p><em>First: humans curate data &#8212; Now: RL allows automatic data generation &#8212; Next?: in-context exploration characterises R&amp;D tasks &#8212; Perhaps this is &#8216;AGI&#8217;?</em></p><p>In contemporary frontier AI systems, it's been mostly humans responsible for gathering 'high quality' informative data, often in quite hands-off ways like scraping huge datasets from the internet, but latterly with more attention on procurement and curation of especially informative or exemplary data.</p><p>With reinforcement learning (RL), the data coming in 
starts to rely increasingly on the activity of the system itself - together with whatever grading mechanism is in place. That's why lots of RL conversations of the past were so obsessed by exploration: taking judicious actions to get the most informative observations! So earlier AI research actually foregrounded exploration somewhat more. Helen Toner recently discussed <a href="https://helentoner.substack.com/p/2-big-questions-for-ai-progress-in">the return of RL to centre stage in contemporary frontier AI</a>, asking what properties of a domain make it more or less amenable to gains from reinforcement learning.</p><p>Still, in many RL settings, the human engineers are able to curate training environments with high-signal automated feedback systems, as Toner discusses. On the other hand, once we're talking about activities like R&amp;D of various kinds, the task of exploring <em>is inherently most of the task itself</em>, making within-context exploration essential!</p><p>This makes 'learning to learn' or in particular 'learning to explore/experiment' among the most useful ways to operationalise 'AGI', from my perspective<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a>. I'm not sure how best to track this, and I'm not aware of any benchmarks or studies which take this view on frontier general AI<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-10" href="#footnote-10" target="_self">10</a>. 
Anecdotally, my personal experience with LM agents is that they are improving over time at orienting to uncertainties within their environment, and that they are a little more creative at trying things out and testing things in 2025 than in 2024 or 2023, but not vastly so - progress to date appears much more rapid in 'crystallised' intelligence.</p><h3>Research by AI: AI with research taste?</h3><p><em>Bootstrapping research taste from humans &#8212; AI advantages from speed and copying &#8212; AI learning by doing &#8212; Human advantages and bottlenecks to AI &#8212; Human-AI complementary workflows</em></p><p>There may be ways for AI training datasets to 'hoover up' research taste from existing experts and institutions, perhaps from lab notes or interviews, though humans at least usually learn more from actually trying research than merely from reading or talking about it. (This presumably reflects the fact that merely communicating about research experience is a much less rich source of information than actually experiencing it directly: the same issue faced by all kinds of knowledge transfer through limited media like language.)</p><p>So research taste in AI is not starting from scratch: already AI can talk in sensible, albeit sometimes basic, ways about experiment design. The taste is bootstrapped from the taste implied by all the hints and observations in training data.</p><p>Could AI <em>surpass</em> the research taste exhibited by expert humans and human organisations? 
It's unclear where the ceiling is, but AI would appear to have several advantages in principle: direct sharing of observations and experiences between instances, a potentially far larger effective 'researcher headcount', and a total quantity of observations far outstripping that of the longest-lived human experts (to date)<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-11" href="#footnote-11" target="_self">11</a>. These add up to a far greater opportunity to accrue and accumulate taste. Additionally, computer speed allows far more total deliberation over the implications of each experiment and the appropriate designs of future experiments, so exploratory planning could also be boosted.</p><p>Crucially, acquiring <em>frontier-applicable</em> research taste would require one of two things: finding ways to bootstrap from existing research taste, which is often implicit (or even proprietary!), or enabling AI to learn by doing, by instrumenting research processes and equipment with sensors and manipulators, perhaps with the aid of expert supervisors (just as human trainee researchers have). 
Like hiring junior researchers, this would come with some upfront costs to any organisation attempting it<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-12" href="#footnote-12" target="_self">12</a>!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EzoV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84dc2a22-050f-4b50-869f-d0d2c178bcb7_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EzoV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84dc2a22-050f-4b50-869f-d0d2c178bcb7_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!EzoV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84dc2a22-050f-4b50-869f-d0d2c178bcb7_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!EzoV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84dc2a22-050f-4b50-869f-d0d2c178bcb7_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!EzoV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84dc2a22-050f-4b50-869f-d0d2c178bcb7_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EzoV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84dc2a22-050f-4b50-869f-d0d2c178bcb7_1536x1024.png" width="1456" height="971" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/84dc2a22-050f-4b50-869f-d0d2c178bcb7_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A robot representing an AI Researcher studies charts, with books, a microscope, and test tubes nearby&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A robot representing an AI Researcher studies charts, with books, a microscope, and test tubes nearby" title="A robot representing an AI Researcher studies charts, with books, a microscope, and test tubes nearby" srcset="https://substackcdn.com/image/fetch/$s_!EzoV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84dc2a22-050f-4b50-869f-d0d2c178bcb7_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!EzoV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84dc2a22-050f-4b50-869f-d0d2c178bcb7_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!EzoV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84dc2a22-050f-4b50-869f-d0d2c178bcb7_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!EzoV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84dc2a22-050f-4b50-869f-d0d2c178bcb7_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em>ChatGPT's interpretation of an AI with better research taste than human organisations.</em></figcaption></figure></div><p>Human researchers begin with some advantages today: easier physical manipulation of experimental materials (for now), a capital base of experimental equipment designed for human use, and an ecosystem designed around the training, retention, and interaction of human experts. 
These aren't fundamental barriers to researcher AIs, but they represent hurdles or bottlenecks that might take time and other resources to overcome.</p><p>Of course, the capacities to interpret evidence, propose experiments, design and refine proposals, and implement experiments need not reside 'in the same mind': human organisations already exhibit this division of labour. But the better fitted these pieces are to each other, the more efficient the overall system will be. Drexler's '<a href="https://aiprospects.substack.com/p/large-knowledge-models">large knowledge models</a>' discussion treats knowledge as a resource, to be combined with planning capacity and discernment from disparate sources. Similar agendas, for example from the <a href="https://www.aria.org.uk/opportunity-spaces/mathematics-for-safe-ai/safeguarded-ai/">UK's Advanced Research and Invention Agency (ARIA)</a>, perhaps promise a way to integrate AI into research processes that is both more effective and more safely manageable than wholesale development of autonomous researcher AIs.</p><h2>Opportunities</h2><p><em>Recapping research, experimentation, exploration, taste &#8212; Implications for AI forecasting and &#8216;intelligence explosion&#8217; &#8212; Differentially bootstrapping AI taste &#8212; Differentially complementing AI exploration &#8212; Detecting dangerous research &#8212; Exploring AI applications for flourishing</em></p><p>Deliberate experimentation, consisting of exploratory planning and research taste, is a critical component of efficient learning. In R&amp;D-heavy domains at least, which inherently butt against the boundaries of the known, efficient learning is foundational to progress.</p><p>Much more can and should be said about the implications of an experimentation-oriented view of R&amp;D, both for R&amp;D on AI and for R&amp;D facilitated by AI in other domains. 
Here are some initial directions:</p><p>First, in forecasting AI capabilities and timelines, we should account for the costs of experimentation. This can include quantifying the relevant variables: iteration speed; the quality of simulation, modelling, and exploratory planning; the accrual and accumulation of research taste; the cost of experimental resources, including compute and real-world interactions; and so on. Of particular interest, this could help to characterise the potential for 'self' improvement and the possibility of an intelligence explosion (which matter because of their implications for other R&amp;D and for loss of control over AI systems).</p><p>You can't skip exploration! But greater intelligences (individual or collective) can be more efficient at it in general, and domain-specific taste in particular certainly yields an improved rate of progress.</p><p>This cuts both ways for safety. You can't develop dangerous nanotech purely from first principles: you have to experiment, either in vitro or in silico. Unfortunately, nor can you generate new defensive vaccination, sterilisation, or biomonitoring paradigms without putting in the experimental legwork.</p><p>This may be revealing for those seeking to <a href="https://www.forethought.org/research/ai-tools-for-existential-security">differentially drive beneficial and defensive research</a> ahead of risky research. For example, exposing research logs and expert interviews to AI systems may yield a way to bootstrap specific kinds of research taste in AI. 
Alternatively, recognising the default taste-weakness but speed-advantage and general knowledge breadth of AI systems may suggest strategies for complementary human-AI workflows which could be both more effective and more manageable than naively attempting to create researcher AIs wholesale.</p><p>Beyond AI-driven exploratory planning and research taste, we should expect strong synergy with robotics, sensors, simulation, modelling, and other automation technologies, as complementary production factors in R&amp;D progress. This is likely to naturally drive investment into these technologies, but may provide opportunities to differentially unbottleneck AI multipliers in beneficial areas by devoting development to their specific complements in particular.</p><p>Further, noting that technology can rarely be developed purely from first principles, intelligence and security organisations concerned about risky research directions may be able to anticipate the kinds of experiments that are likely to be useful, and therefore the kinds of resources and activities required to make progress in those areas. This may include flows and concentrations of certain machines or components, movements of specific rare materials, movement of human talent, or known side-effects of experiments. Where materials are very dual-use (such as concentrations of computing clusters), <a href="https://aiprospects.substack.com/p/security-without-dystopia-new-options">structured inspection, auditing, or transparency tools</a> may aid in guaranteeing that only safe and sanctioned experiments are being carried out.</p><p>Finally, now is a great time to be experimenting with AI systems and their applications, <em>especially</em> for people who haven't traditionally paid attention to AI. 
Rapid developments mean that the extent of possibilities with current tech remains underexplored, and <a href="https://www.forethought.org/research/ai-tools-for-existential-security">boosting defensive and beneficial applications</a> ahead of risky ones is a great way to ensure that the future is better than it otherwise would be!</p><p><em>Thanks to Owen Cotton-Barratt and Jay Bailey for feedback and conversations on this topic</em></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>This is because just flailing, or even just 'doing routine activities', gets you <em>some</em> novelty of observations, but <em>directedly seeking informative circumstances at the boundaries of the known</em> (which includes making novel unpredictable events happen, as well as getting equipped with richer means to observe and record them, and perhaps preparing to deliberatively extract insight) turns out to be able to mine vastly more insight per resource (time, materials, 
etc.). Hence science, but also hence individual human and animal playfulness, curiosity, adversarial exercises and drills (self-play ish), and whatnot.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Notably, modelling efficiency and exploration quality are sometimes conflated as 'sample efficiency'. In the case of modelling efficiency it's about forming accurate and generalisable models from fewer observations (the classic machine learning sense of sample efficiency). For exploration quality, it's about gathering more informative observations from fewer environment interactions (a kind of 'sample efficiency' familiar from reinforcement learning).</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>Incidentally, throughput should not be underestimated - this is why industrial expansion often <em>precedes and drives</em> innovation progress as well as being a product of it. There are some very general patterns in 'industrial learning', such as <a href="https://ourworldindata.org/learning-curve">Wright's Law</a>, which describes consistent statistical relationships between the number of units produced and reductions in production cost. 
We might speculate that Wright's law applies most in domains where the existing human research and development organisations are at the limits of their modelling efficiency and exploration quality, and that the remaining bottlenecks are mostly in experimental throughput.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>Speaking of 'taste', this is a little like the difference between a good chef and a good food critic. The chef needs to be able to come up with good recipes, while the critic needs to be able to tell which recipes are good and which are bad. In concert (perhaps adversarially!), they can create and refine recipes that are more likely to be successful.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>If you have a perfect simulation of the relevant domain, you can run experiments in the simulation. This looks a bit like skipping experimentation: certainly it can be faster. In a softer sense, a useful but imperfect model can also support reasoning about experiments and potential outcomes. In my taxonomy, both of these are part of the continuum of <em>using world modelling, planning, and some amount of taste to guide exploration</em>.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>While we're talking in economic terms, it's worth noting that research taste is a kind of capital. It can even depreciate over time! This happens in two ways. 
Intrinsically, as the frontier of research moves, what were formerly good intuitions may become outdated. Additionally, individual humans, currently major (though not exclusive) repositories of research taste, age, get distracted, or otherwise lose their edge. In steady fields, depreciation is slow. In fast moving fields, like AI, the frontier is moving fast, and taste depreciation can be very rapid, making accrual and accumulation of taste especially important.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>My baby son is evidently thrilled by the challenge of 'balancing' (with some support) upright, a feat he can't yet accomplish, but which is unsurprisingly the kind of activity his brain is eager to get practice at. He instinctively pays close attention to new sights and sounds. His once-flailing hands now grasp interesting objects and begin to manipulate them. When he begins crawling and then toddling, he'll join generations of baby humans in enjoying <a href="https://www.mpg.de/617475/pressRelease20101111">the most prolonged and diversely playful childhoods</a> of any young animal.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-8" href="#footnote-anchor-8" class="footnote-number" contenteditable="false" target="_self">8</a><div class="footnote-content"><p>interesting, i.e. carrying high value of information for the domain in question.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-9" href="#footnote-anchor-9" class="footnote-number" contenteditable="false" target="_self">9</a><div class="footnote-content"><p>(Of course there are nevertheless also many transformative impacts that can come from AI merely with heaps of crystallised intelligence and less R&amp;D ability. 
For example, we could imagine an interesting possible paradigm in which humans continue for some time to provide input on informative experiment design, while delegating aspects like experiment implementation and interpretation to automated systems. Also note that some crystallised knowledge is currently very rare and concentrated, while if present in AI systems could be much more widely accessible, for better or worse.)</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-10" href="#footnote-anchor-10" class="footnote-number" contenteditable="false" target="_self">10</a><div class="footnote-content"><p>Scattered RL studies set out to evaluate or demonstrate the exploration potential of various RL algorithms, usually in toy environments. <a href="https://arcprize.org/arc-agi">The ARC-AGI benchmarks</a> test sample efficiency, which may be an important component of effectively accruing 'taste', but is not directly about exploration.</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-11" href="#footnote-anchor-11" class="footnote-number" contenteditable="false" target="_self">11</a><div class="footnote-content"><p>In fact another relevant comparison may not be between AI and individual humans, but between AI and human research organisations and institutions. Human organisations can of course already outlive individual humans: to say nothing of the broader intergenerational projects of science and research. But communication of research taste and experience between humans is constrained, and while committees of experts sometimes outperform individuals, they are slow and far from able to directly share their relevant experiences. When will AI services be able to supplement or replace particular human research tasks? 
And what about entire research organisations?</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-12" href="#footnote-anchor-12" class="footnote-number" contenteditable="false" target="_self">12</a><div class="footnote-content"><p>The raw sample efficiency of base machine learning systems like gradient descent is famously (and apparently) much lower than that of humans, meaning that AI 'junior researchers' could naively be even more costly to upskill than human ones. But as model capacity is scaled up, this may be changing. And speculatively, the possibilities of lightweight finetuning, 'in-context' learning, and distillation point towards AI systems matching or exceeding human sample efficiency.</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[Is the Cat Out of the Bag?]]></title><description><![CDATA[Who knows how to make AGI?]]></description><link>https://www.oliversourbut.net/p/is-the-cat-out-of-the-bag</link><guid isPermaLink="false">https://www.oliversourbut.net/p/is-the-cat-out-of-the-bag</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Thu, 10 Apr 2025 17:07:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!qO0s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41c77e7-a68c-4d2f-aeb8-78aa6753c1ca_1648x1178.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Adapted from 2025-04-10 internal memo to AISI</em></p><p>I&#8217;ve previously made arguments like:</p><blockquote><p>Not long after it becomes possible for <em>someone</em> to make powerful artificial intelligence<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>, it might become possible for <em>practically anyone</em> to make powerful AI.</p><ol><li><p>Compute gets exponentially cheaper by default.</p></li><li><p>Knowledge proliferates (fast!) 
by default: AI techniques are typically simple and easy once discovered.</p></li><li><p>What&#8217;s more, AGI-making know-how may be widespread already.</p></li></ol></blockquote><p>Or, as Yudkowsky puts it<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a><em>,</em></p><blockquote><p><em>Moore&#8217;s Law of Mad Science: Every eighteen months, the minimum IQ necessary to destroy the world</em><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a><em> drops by one point. - Yudkowsky</em></p></blockquote><p>It&#8217;s important to emphasise that none of these are laws of nature! But <a href="https://epoch.ai/blog/open-models-report">the economic and social forces at work are quite strong</a>.</p><p>So (leaving aside <a href="https://www.agidefinition.ai/">debates about the appropriate definition of &#8216;AGI&#8217;</a>) where the frontier of AI development leads, others &#8211; many others &#8211; potentially rapidly follow. Followers can go even faster by stealing or otherwise harvesting insights from the frontier, but this is not a <em>hard</em> requirement &#8211; just an accelerant.</p><p>For more on the first point, compute getting cheaper, consider <a href="https://ourworldindata.org/moores-law">Moore&#8217;s law</a> (or the more general and robust <a href="https://ourworldindata.org/learning-curve">Wright&#8217;s law</a>). What about the know-how?</p><h2>Stupid, Simple AGI</h2><p>The stupidest, simplest possible approach to producing general intelligence might mimic evolution in a large, open-ended, interactive environment. Nobody has succeeded at this yet because they don&#8217;t have enough compute, but just a few more decades of compute scaling <a href="https://www.cold-takes.com/forecasting-transformative-ai-the-biological-anchors-method-in-a-nutshell/">might get us there</a>. 
The code to do this would be ultimately quite simple, but the amount of compute time to run it is out of reach today. <a href="https://helentoner.substack.com/p/long-timelines-to-advanced-ai-have">Almost nobody nowadays thinks that it will take this long</a>, because this is the stupidest, simplest (and least steerable) possible approach and we have much better ideas.</p><p>But this means that unless something interrupts the compute trends, then <strong>even if ingenious, well-resourced people &#8216;get AGI first&#8217;, eventually anyone could practically </strong><em><strong>blunder into</strong></em><strong> creating their own</strong>. Of course, many things could be changed if powerful AI is developed and applied in the meantime&#8230; perhaps including the cost and efficiency of compute, the <em>distribution</em> of compute, or indeed the <em>existence and inclination</em> of people to do the blundering.</p><h2>The Design Space for AGI</h2><p>What did I mean by &#8216;AGI-making know-how may be widespread already&#8217;?</p><p>I don&#8217;t literally mean that the recipe for AGI is known and widespread. I don&#8217;t even mean that we broadly know exactly how to make AGI and simply want for the capital (compute and data). But for those paying attention, the <em>design space for practically achievable AGI</em> is narrowing.</p><p>Take long-horizon coherence or continual learning, for example. Maybe components of these are expandable memory and long-context management of plans and observations. 
This could perhaps be cracked with something resembling a selection from:</p><ul><li><p>Context summarisation</p></li><li><p>Read-write retrieval-augmented generation</p></li><li><p>Recurrent embeddings</p></li><li><p>Longer training trajectories</p></li><li><p>Plan-management or recursive delegation scaffolding</p></li><li><p>Periodic distillation of history into weights or activation patches</p></li><li><p>Explicit training for notetaking</p></li><li><p>Some even simpler thing, like &#8216;<a href="http://www.incompleteideas.net/IncIdeas/BitterLesson.html">just scale up the compute</a>&#8217;</p></li></ul><p>Among the sharpest, most experienced practitioners at the frontier, that perceived design space may be narrower still<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a>. In the far wider cohort comprising all competent computer scientists and engineers, the design space may not be as saliently in view &#8211; but the scientific &#8216;breadcrumbs&#8217; have been pointing in useful directions for years (at least).</p><p>My personal testimony<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a> is that by 2020, several landmarks were visibly coming together in NLP and RL, and by 2021 I had a good sense of a plausible research path to general autonomous AI. Developments like further scaling, mixtures of experts, chain of thought, LLM agents, RL &#8216;reasoning&#8217;, fast attention mechanisms, and hyperparameter tuning optimisations are not merely &#8216;obvious in hindsight&#8217;: their rough contours were predictable in advance. It was <a href="https://www.oliversourbut.net/p/you-cant-skip-exploration">&#8216;merely&#8217; a matter of experimenting</a> to find the working details. 
I&#8217;m not being (especially) hubristic here: for some experts closer to the action, these same things looked plausible by 2017 or even earlier! The contours of tomorrow&#8217;s advancements are similarly already in view, and far more attention and capital are being poured into the discovery process.</p><p>That&#8217;s not to say that, given the capital, we could have created AGI there and then in a single try, or even here and now. A design space is <em><a href="https://www.oliversourbut.net/p/you-cant-skip-exploration">not</a></em><a href="https://www.oliversourbut.net/p/you-cant-skip-exploration"> a complete or final design</a>. But iterative refinement by well-resourced and moderately creative problem solvers has been charting a course, and if we are willing to anticipate one frontrunning group getting &#8216;all the way there&#8217; we must acknowledge that the feat will be reproducible in relatively short order.</p><h2>Accelerants</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!qO0s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41c77e7-a68c-4d2f-aeb8-78aa6753c1ca_1648x1178.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qO0s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41c77e7-a68c-4d2f-aeb8-78aa6753c1ca_1648x1178.png 424w, https://substackcdn.com/image/fetch/$s_!qO0s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41c77e7-a68c-4d2f-aeb8-78aa6753c1ca_1648x1178.png 848w, https://substackcdn.com/image/fetch/$s_!qO0s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41c77e7-a68c-4d2f-aeb8-78aa6753c1ca_1648x1178.png 1272w, https://substackcdn.com/image/fetch/$s_!qO0s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41c77e7-a68c-4d2f-aeb8-78aa6753c1ca_1648x1178.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qO0s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41c77e7-a68c-4d2f-aeb8-78aa6753c1ca_1648x1178.png" width="1456" height="1041" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a41c77e7-a68c-4d2f-aeb8-78aa6753c1ca_1648x1178.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1041,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qO0s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41c77e7-a68c-4d2f-aeb8-78aa6753c1ca_1648x1178.png 424w, https://substackcdn.com/image/fetch/$s_!qO0s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41c77e7-a68c-4d2f-aeb8-78aa6753c1ca_1648x1178.png 848w, https://substackcdn.com/image/fetch/$s_!qO0s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41c77e7-a68c-4d2f-aeb8-78aa6753c1ca_1648x1178.png 1272w, https://substackcdn.com/image/fetch/$s_!qO0s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa41c77e7-a68c-4d2f-aeb8-78aa6753c1ca_1648x1178.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://www.cnas.org/publications/reports/future-proofing-frontier-ai-regulation">Scharre 2024</a> demonstrates (and forecasts) rising cost to reach new frontiers, but rapidly diminishing cost to reach the same capability level thereafter.</figcaption></figure></div><p>&#8216;Reproducible&#8217; is one thing. How soon and how fast? With the current level of sharing of research insights, the answer seems to be roughly &#8216;as soon as you can outlay comparable capital&#8217;, or <a href="https://epoch.ai/blog/open-models-report">even sooner</a>!</p><p>What phenomena are responsible for accelerating this proliferation? 
In very roughly descending order of effect size:</p><ul><li><p>Theft, leak, or deliberate release of pretrained baselines and training algorithms</p></li><li><p>Distillation (authorised or not) from exposed APIs</p></li><li><p>Exponentially cheaper compute</p></li><li><p>(Sometimes cheap or even public access to) ever more sensor and record data<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a></p></li><li><p>Shared algorithmic and experimental details in papers and blogposts</p></li><li><p>Conversations and rumours at conferences and other events<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a></p></li><li><p>Movement of experts between development groups and projects</p></li><li><p>Use of AI to assist development<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a></p></li></ul><p>Very <a href="https://www.rand.org/pubs/research_reports/RRA2849-1.html">tightly securitized</a> projects might partly dampen some of these effects. Competition between firms and countries could amplify them.</p><p>What about exponentially cheaper compute? Market dynamics might pivot at some stage to reduce or even reverse the effect of dwindling compute price (for example, extreme buyer concentration driven by strategic accumulation, increasing marginal compute utility<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a>, deliberate regulatory intervention on compute, or something else), but will otherwise continue to drive proliferation. 
On the other hand, if compute production <em>increases</em> even faster, costs may drop commensurately faster<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-10" href="#footnote-10" target="_self">10</a>.</p><p>Sensor and records data are being collected even more feverishly now that companies have realised their critical use in training modern AI systems &#8211; notice when companies&#8217; privacy policies update to include carve-outs for collecting AI training data. We should expect more of this, as well as more collection of physical and industrial activity records for training robotics, autonomous vehicles, and automated laboratory workcells.</p><p>Alternatively, some have imagined an &#8216;end of history&#8217; moment when sufficiently smart AI arrives and (usually by underspecified mechanism) prevents all of these factors from proceeding. Some go further, envisaging an AGI or AGI-enabled organisation foreclosing not only the <em>accelerants</em> of proliferation, but also the potential for a rival project to emerge <em>anywhere</em><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-11" href="#footnote-11" target="_self">11</a>. This is conceivable, but one has to ask on what timeframe these changes would happen, and what the consequences would be if they take longer than imagined.</p><p>Short of such an acute and decisive interruption of all of these dynamics, other shocks such as international conflict could have impacts in either direction.</p><h2>Concluding</h2><p>Intelligent engineering-minded people exist in all geographies and of all ideologies. Most lag the frontier of AI development only for want of compute capital and intent. As compute continues to get cheaper and the potential of AI comes more clearly into focus, both compute and intent are rapidly becoming more widespread. 
The open sharing of discoveries can further lower barriers and shorten proliferation timelines, but is not essential to this dynamic.</p><p>Given this, we have to ask what the consequences of this proliferation could be. Where they are concerning, we must consider in what ways these dynamics could be defused, or, <a href="https://helentoner.substack.com/p/nonproliferation-is-the-wrong-approach">likely failing that</a>, how we will ready ourselves, on a short timeframe, for what follows.</p><p>We live in interesting times! There&#8217;s <a href="https://80000hours.org/ai/">a lot we can do</a>.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>For now I&#8217;ll use &#8216;powerful AI&#8217; and &#8216;AGI&#8217; (Artificial General Intelligence) interchangeably. The definitions have never been settled, and will likely never be settled, but I&#8217;m considering systems which are able to autonomously act, develop new tools and technology (given <a href="https://www.oliversourbut.net/p/you-cant-skip-exploration">sufficient research resources</a>), and in principle maintain or upgrade themselves if that was their goal.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>E.g. 
in <a href="https://intelligence.org/files/AIPosNegFactor.pdf">Artificial Intelligence as a Positive and Negative Factor in Global Risk</a> &#8211; Yudkowsky 2008 (though this phrase was coined earlier)</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>Yudkowsky believes that <a href="https://ifanyonebuildsit.com/">sufficiently advanced AGI developed in a context like ours leads to everyone dying</a>. I think he&#8217;s probably right&#8230; but it depends a lot on how you operationalise &#8216;sufficiently advanced&#8217; and &#8216;context like ours&#8217;. That&#8217;s where all the action is!</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p><a href="https://blog.samaltman.com/reflections">Sam Altman claims</a> &#8220;We are now confident we know how to build AGI&#8221; and <a href="https://darioamodei.com/machines-of-loving-grace">Dario Amodei predicts</a> it &#8220;could come as early as 2026&#8221;. These CEOs of some of the best resourced and talented AI organisations will have privileged insight into the design space, while also having unusual psychology and possible conflicts of interest. Meanwhile, Turing Award and Nobel Prize winners <a href="https://yoshuabengio.org/2023/08/12/personal-and-psychological-dimensions-of-ai-researchers-confronting-ai-catastrophic-risks/">Bengio</a> and <a href="https://x.com/geoffreyhinton/status/1653687894534504451?lang=en">Hinton</a> both think 2028 is possible. Crowd wisdom forecasts give wide uncertainty, but <a href="https://www.metaculus.com/questions/5121/date-of-artificial-general-intelligence/">centre on the early 2030s</a>. 
Experts rarely agree on exact anticipated details, but mostly agree on the outlines of the candidate design space.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>(as a smart computer scientist who has been roughly following AI since 2015, made it my graduate study in 2022, but who has never actively pursued frontier AI capability contributions)</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>Think robots in factories, recordings and logs of computer use, autonomous vehicle logs, scientific lab measurements, CCTV and satellite readings, meeting recordings, social media activity logs, wearable recording devices, &#8230;</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>Parties in Silicon Valley are allegedly a somewhat good source of technical AI gossip!</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-8" href="#footnote-anchor-8" class="footnote-number" contenteditable="false" target="_self">8</a><div class="footnote-content"><p>The use of AI to assist AI development, or even to fully automate it, has long been discussed in the field of AI. The possibility of an <a href="https://en.wikipedia.org/wiki/Technological_singularity">&#8216;intelligence explosion&#8217;</a> or similar technological singularity is still debated decades after first being hypothesised. 
In 2025, for the first time, some artificial intelligence researchers claimed non-trivial acceleration of their work from AI assistance, and some companies have now <a href="https://x.com/olysourbut/status/1991504547831800102">set explicit targets</a> to automate AI research before the decade is out. If this plays out, it might make AI-assisted development a dominant contributor to accelerating progress. I tend to think that compute for experiments and environments for learning are the more critical bottlenecks to progress.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-9" href="#footnote-anchor-9" class="footnote-number" contenteditable="false" target="_self">9</a><div class="footnote-content"><p>Historically, returns to concentrating more compute have been <em>eventually</em> diminishing (a typical pattern for tech products) once efficiencies from parallelism and brute force <a href="https://en.wikipedia.org/wiki/Amdahl's_law">run dry</a>. This supports a wide distribution of purchasers and diffusion of applications, because once the larger use cases hit diminishing returns, the smaller players&#8217; and applications&#8217; willingness to buy exceeds that of the larger ones. This remains so at the frontier of AI, though we see some concentration with a small number of very large players buying out a majority of the most advanced generations of chips when they are first marketed. 
If some new dynamic caused <em>increasing or constant</em> marginal returns to compute accumulation &#8212; who knows, perhaps exclusive access to AGI software &#8212; it might no longer be the case even on an open market that other buyers could afford compute.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-10" href="#footnote-anchor-10" class="footnote-number" contenteditable="false" target="_self">10</a><div class="footnote-content"><p>This is not predicated on the simple effect of increased supply, which would merely serve to erode margins. Rather, increased production <a href="https://ourworldindata.org/learning-curve">predictably provides new technological insight</a>, driving further efficiency: the origin of Moore&#8217;s law. This is a much stronger effect over time.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-11" href="#footnote-anchor-11" class="footnote-number" contenteditable="false" target="_self">11</a><div class="footnote-content"><p>Companies pursuing AGI do not have coherent strategies, but several have made <a href="https://www.oliversourbut.net/p/us-vs-china-vs-me">references to &#8216;beating China&#8217;</a>, and their intellectual heritage includes an assumption that the first AGI would be able to rapidly and decisively shut down competing projects. Sometimes the companies use this supposed dynamic as a justification for racing ahead while cutting corners on safety. This sounds a lot like &#8216;we plan to take over the world, but nicely&#8217;.</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[Cooperation and Alignment in Delegation Games]]></title><description><![CDATA[You need both! 
(and also fairness)]]></description><link>https://www.oliversourbut.net/p/cooperation-and-alignment-in-delegation</link><guid isPermaLink="false">https://www.oliversourbut.net/p/cooperation-and-alignment-in-delegation</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Wed, 15 Nov 2023 22:04:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9563ae23-296a-4927-aac4-d5b8cacdd9a4_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This work was facilitated by the Oxford AI Safety and Governance group, Cooperative AI Foundation, and Oxford Autonomous Intelligent Machines and Systems. Thanks also to Bart Jaworski, Jesse Clifton, Joar Skalse, Sam Barnett, Vincent Conitzer, Charlie Griffin, David Hyland, Michael Wooldridge, Ted Turocy, and Alessandro Abate.</em></p><p>This blogpost accompanies the paper <a href="https://arxiv.org/abs/2402.15821">Cooperation and Control in Delegation Games</a> by Sourbut, Hammond, and Wood, which was presented at IJCAI 2024. 
In essence, the work attempts to deconfuse some of the discourse around safety, cooperation, and alignment in multi-agent settings by:</p><ul><li><p>Showing that, just like in the control problem, cooperation problems can be broken down into alignment and capabilities, which are <a href="https://arbital.com/p/orthogonality/">orthogonal</a> to one another;</p></li><li><p>Providing measures for alignment and capabilities (both &#8220;individual&#8221; and &#8220;collective&#8221;) in multi-principal multi-agent settings (&#8220;delegation games&#8221;);</p></li><li><p>Showing that any of these measures is <em>insufficient alone</em> to guarantee the best outcomes, but that they are together sufficient;</p></li><li><p>Bounding the principals&#8217; welfare loss in terms of these measures, and validating this with a series of empirical results.</p></li></ul><p>The goal of this post is to explain what those terms mean, and hopefully why it matters. In doing so, we hope to shed light on some of the related questions posed by other AI safety researchers, for example Dafoe et al in <a href="https://arxiv.org/abs/2012.08630">Open Problems in Cooperative AI</a> who discuss the concept of &#8216;horizontal&#8217; and &#8216;vertical&#8217; aspects of coordination, or an open problem on the AI Alignment Forum about <a href="https://www.lesswrong.com/posts/ghyw76DfRyiiMxo3t/open-problem-how-can-we-quantify-player-alignment-in-2x2">quantifying player alignment in normal-form games</a>.</p><p>There is also <a href="https://drive.google.com/file/d/1cgHHKktIc7dVUti9k7qZwSw28m8rEGNJ/preview">a poster</a> which is an even more condensed summary of some key material, and Lewis and Oly have given a few presentations on the topic, <a href="https://www.youtube.com/watch?v=KjGj0I9Gq7k">one of which is recorded here</a>.</p><h2>Why Delegation Games?</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!M_dC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9691b7a9-c026-47a0-955a-16a923c8555c_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!M_dC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9691b7a9-c026-47a0-955a-16a923c8555c_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!M_dC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9691b7a9-c026-47a0-955a-16a923c8555c_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!M_dC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9691b7a9-c026-47a0-955a-16a923c8555c_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!M_dC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9691b7a9-c026-47a0-955a-16a923c8555c_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!M_dC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9691b7a9-c026-47a0-955a-16a923c8555c_1280x720.png" width="1280" height="720" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9691b7a9-c026-47a0-955a-16a923c8555c_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!M_dC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9691b7a9-c026-47a0-955a-16a923c8555c_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!M_dC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9691b7a9-c026-47a0-955a-16a923c8555c_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!M_dC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9691b7a9-c026-47a0-955a-16a923c8555c_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!M_dC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9691b7a9-c026-47a0-955a-16a923c8555c_1280x720.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>You may have heard of the Principal-Agent problem. It's a phrase and a setting which turns up in some economics literature, and elsewhere. The idea is that a principal (in this case, the human) is asking, telling, employing, or otherwise exhorting an agent (in this case, the robot) to act on their behalf. 
The 'problem' is the question of how to ensure that the agent's behaviour results in outcomes which the principal in fact prefers.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7qO5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7f32183-ec5e-4213-a0b6-fbadc6d17bc8_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7qO5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7f32183-ec5e-4213-a0b6-fbadc6d17bc8_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!7qO5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7f32183-ec5e-4213-a0b6-fbadc6d17bc8_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!7qO5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7f32183-ec5e-4213-a0b6-fbadc6d17bc8_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!7qO5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7f32183-ec5e-4213-a0b6-fbadc6d17bc8_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7qO5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7f32183-ec5e-4213-a0b6-fbadc6d17bc8_1280x720.png" width="1280" height="720" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d7f32183-ec5e-4213-a0b6-fbadc6d17bc8_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!7qO5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7f32183-ec5e-4213-a0b6-fbadc6d17bc8_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!7qO5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7f32183-ec5e-4213-a0b6-fbadc6d17bc8_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!7qO5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7f32183-ec5e-4213-a0b6-fbadc6d17bc8_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!7qO5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7f32183-ec5e-4213-a0b6-fbadc6d17bc8_1280x720.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Delegation games</strong> arise when we have <em>multiple</em> principals, and <em>multiple</em> agents.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> When you read 'principal', think 'human', and when you read 'agent', think 'AI'. (For a slightly different semantic, you can alternatively think of each principal as a basically-coherent coalition of humans, and likewise with AIs.)</p><p>Why does this setting matter? It's looking increasingly likely that, perhaps quite soon, many somewhat-autonomous digital personal assistants, digital employees, or similar, will be deployed on behalf of human overseers. That is, we might be entering a highly <em>multipolar</em> world when it comes to somewhat-autonomous AI deployments<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>. 
A more obvious, immediate, lower-stakes example of this is autonomous vehicles, which we use as a toy example in the paper. Finally, in the future, multiple large coalitions of humans (e.g. states or companies) may deploy powerful AI systems to act on their behalf in high-stakes scenarios. We want to understand the important features of how to make sure this goes well!</p><p>We formalise delegation games in the way you might expect: agents adopt strategies that lead to (a distribution over) outcomes, and both the agents and the principals have (potentially different) preferences over these outcomes.</p><h2>Cooperation, Alignment, and Calibration</h2><p>In the paper, we identify some key properties which influence the outcome of a delegation game. We&#8217;ll highlight Cooperation, Alignment, and Calibration, because one key punchline of the paper is:</p><blockquote><p>You need all three to guarantee good outcomes</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xUph!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8aa3cf-9ff6-4c8f-8e51-eabff2ad4b72_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xUph!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8aa3cf-9ff6-4c8f-8e51-eabff2ad4b72_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!xUph!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8aa3cf-9ff6-4c8f-8e51-eabff2ad4b72_1280x720.png 848w, 
https://substackcdn.com/image/fetch/$s_!xUph!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8aa3cf-9ff6-4c8f-8e51-eabff2ad4b72_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!xUph!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8aa3cf-9ff6-4c8f-8e51-eabff2ad4b72_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xUph!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8aa3cf-9ff6-4c8f-8e51-eabff2ad4b72_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2d8aa3cf-9ff6-4c8f-8e51-eabff2ad4b72_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!xUph!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8aa3cf-9ff6-4c8f-8e51-eabff2ad4b72_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!xUph!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8aa3cf-9ff6-4c8f-8e51-eabff2ad4b72_1280x720.png 848w, 
https://substackcdn.com/image/fetch/$s_!xUph!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8aa3cf-9ff6-4c8f-8e51-eabff2ad4b72_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!xUph!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8aa3cf-9ff6-4c8f-8e51-eabff2ad4b72_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is important to bear in mind given that much AI safety work focuses on alignment, which is (demonstrably) not enough for safety 
in multipolar worlds.</p><p>We'll explain what these terms mean and what 'good outcomes' are. By the end of the post we should have covered enough to understand the high-level meaning of the highlighted inequality, which is adapted from Theorem 1 of our paper. This simplification also assumes agents are perfectly individually rational, an assumption we generalise in the paper. Here we mostly skip over collective alignment, which gets a thorough treatment in the paper.</p><h3>Cooperation</h3><p>Intuitively, <strong>cooperation</strong> is working together for mutual gains over some uncooperative baseline. This is actually already non-terrible as a definition, but we can sharpen it.</p><p>Note that cooperation can be partial (one coalition cooperates, potentially with downsides for others). That's <strong>collusion</strong>. We especially <em>don't</em> want AI to be collusive! This is tricky, because cooperation and collusion rest on basically the same abilities and infrastructure. 
It's <a href="https://openreview.net/forum?id=tF464LogjS">a very important topic</a>, but we don't discuss it here.</p><h4><strong>Cooperation example</strong></h4><p>Let's look at a simple example.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WhXa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd630a824-32ab-4ab3-8dad-55c33bfaed72_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WhXa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd630a824-32ab-4ab3-8dad-55c33bfaed72_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!WhXa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd630a824-32ab-4ab3-8dad-55c33bfaed72_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!WhXa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd630a824-32ab-4ab3-8dad-55c33bfaed72_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!WhXa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd630a824-32ab-4ab3-8dad-55c33bfaed72_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WhXa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd630a824-32ab-4ab3-8dad-55c33bfaed72_1280x720.png" width="1280" height="720" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d630a824-32ab-4ab3-8dad-55c33bfaed72_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!WhXa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd630a824-32ab-4ab3-8dad-55c33bfaed72_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!WhXa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd630a824-32ab-4ab3-8dad-55c33bfaed72_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!WhXa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd630a824-32ab-4ab3-8dad-55c33bfaed72_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!WhXa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd630a824-32ab-4ab3-8dad-55c33bfaed72_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Imagine we're palaeolithic hunter-gatherers: we, the authors, are one small, cohesive group, and you, the reader, another. In the morning, if we all set out to gather, by the end of the day we each come back with a basket of fruit (the fruit/fruit outcome, score: 2). If we for some reason decide to instead hunt a mammoth, well... we&#8217;re big and tough but we probably can't catch a mammoth. Meanwhile you sensibly gathered some fruits (the mammoth/fruit outcome, score: 1). Likewise if you try to catch a mammoth alone (fruit/mammoth, score: 1). BUT, if we all work together, we've a decent chance at catching the mammoth (mammoth/mammoth outcome, score: 10). 
These scores are our utilities for the outcomes<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>.</p><p>Now, laid out like this, there's an obvious best-case outcome<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a>: we work together to catch the mammoth! (Nobody has invented conservationism yet<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a>, so this is a preferred outcome.) But there really is a coordination challenge here: if we have reason to believe that you'll go gathering fruit, <em>we should too</em> &#8211; it would be in our interest (and in this case, yours too) to get more fruit rather than waste our time fruitlessly (!) chasing a mammoth.</p><p>This isn't just academic. When we look around the world, many problems have the mammoth-nature: we all have to 'show up' or we don't get the benefit (and indeed many cooperation problems are even harder than this due to selfish incentives). 
Consider international cooperation on climate change, biological weapons control, or coordination on safe technological progress.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JCJJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24b07175-6ceb-4e40-b475-cbee05149a3e_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JCJJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24b07175-6ceb-4e40-b475-cbee05149a3e_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!JCJJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24b07175-6ceb-4e40-b475-cbee05149a3e_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!JCJJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24b07175-6ceb-4e40-b475-cbee05149a3e_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!JCJJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24b07175-6ceb-4e40-b475-cbee05149a3e_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JCJJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24b07175-6ceb-4e40-b475-cbee05149a3e_1280x720.png" width="1280" height="720" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/24b07175-6ceb-4e40-b475-cbee05149a3e_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!JCJJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24b07175-6ceb-4e40-b475-cbee05149a3e_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!JCJJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24b07175-6ceb-4e40-b475-cbee05149a3e_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!JCJJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24b07175-6ceb-4e40-b475-cbee05149a3e_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!JCJJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24b07175-6ceb-4e40-b475-cbee05149a3e_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Humans solve this sort of problem all the time. We are able to do this due to various abilities, affordances, and so on, which we can collectively refer to as <strong>cooperative infrastructure</strong>. This includes such things as:</p><ul><li><p>talking to each other</p></li><li><p>trust and reputation</p></li><li><p>trade</p></li><li><p>commitments and enforcement</p></li><li><p>norms and laws</p></li></ul><p>Nevertheless, our cooperative infrastructure is often not up to all tasks.</p><p>AI systems have been pretty bad at this on the whole, though there have been some interesting improvements over the years. Future AI might have access to very powerful kinds of cooperative abilities and infrastructure<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a>.</p><h4><strong>Formalising cooperation</strong></h4><p>How do we characterise 'better cooperation'? 
We operationalise the collective goodness of an outcome with a <em>welfare</em> function (an aggregation over utilities). One way to specify cooperation is by considering <em>failure</em>. We look at the <em>welfare-optimal</em> outcome(s) &#963;&#8902; (in this case mammoth/mammoth, score: 10) and then compare any <em>actual</em> (or predicted) outcome &#963;. The difference in welfare is the <strong>welfare regret</strong> &#8211; how much better it 'could have been'.</p><p>The welfare regret of principals (humans) is our primary measure of interest (the 'dependent variable', if you like). When we have only principals playing, that tells most of the story. With agents involved (machines/AI) welfare regret of agents<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a> quantifies how successfully they cooperated (according to their criteria and coordination mechanisms).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gICr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F433fade2-7e80-456f-8984-3d2cde0b7374_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gICr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F433fade2-7e80-456f-8984-3d2cde0b7374_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!gICr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F433fade2-7e80-456f-8984-3d2cde0b7374_1280x720.png 848w, 
https://substackcdn.com/image/fetch/$s_!gICr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F433fade2-7e80-456f-8984-3d2cde0b7374_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!gICr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F433fade2-7e80-456f-8984-3d2cde0b7374_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gICr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F433fade2-7e80-456f-8984-3d2cde0b7374_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/433fade2-7e80-456f-8984-3d2cde0b7374_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!gICr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F433fade2-7e80-456f-8984-3d2cde0b7374_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!gICr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F433fade2-7e80-456f-8984-3d2cde0b7374_1280x720.png 848w, 
https://substackcdn.com/image/fetch/$s_!gICr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F433fade2-7e80-456f-8984-3d2cde0b7374_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!gICr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F433fade2-7e80-456f-8984-3d2cde0b7374_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There is also an alternative, more geometric interpretation. 
'Mutual gains' are <a href="https://en.wikipedia.org/wiki/Pareto_efficiency">Pareto gains</a>, and Pareto optima coincide (in all but edge cases) with welfare optima<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a>. Hence, we can interpret cooperation as <em>movement toward the Pareto frontier</em>. (The specific direction of movement corresponds to the welfare aggregation function.)</p><p>In the paper we also discuss <em>capabilities</em> and how they give rise to outcomes (and thus to welfare/regret). We tentatively distinguish 'individual' from 'collective' capabilities, and describe mathematically and algorithmically how, given access to estimates or measurements of interactions, these can be determined and distinguished. A related concept is the <a href="https://en.wikipedia.org/wiki/Price_of_anarchy">price of anarchy</a>, which quantifies the failure of a particular system to be robust to selfish behaviour.</p><p>Now, in the preceding example, we assumed that food was shared and the humans&#8217; dietary preferences were equivalent(ly primitive). That is, we have perfect <strong>collective alignment</strong> (between principals). Notice that even <em>with</em> this perfect collective alignment, there can remain coordination problems, as in this scenario. In general, we can distinguish problems of collective alignment from problems of collective capabilities (cooperation). We fully characterise this breakdown in the paper, and we'll touch on it later under <strong>calibration</strong>.</p><h3>Alignment</h3><p>When we have more than one actor with some preferences over outcomes, it is natural to ask about the relationship between those preferences. 
<strong>Alignment</strong> is the extent to which two or more preference relations are in agreement.</p><p>We might prefer exactly the outcomes that you prefer and vice versa (as in our mammoth example where all returns are shared), in which case we are perfectly aligned. Or (as in a myopic chess match<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a>) we might be playing for exactly the outcomes you want to avoid, and vice versa, in which case we are perfectly misaligned. More usually, it'll be something in between these extremes.</p><h4><strong>Alignment example</strong></h4><p>In the paper, we discuss several forms of alignment, but here we will focus on alignment between a principal and an agent, namely <strong>individual alignment</strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LP6a!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0bc637f-50b9-41ae-abf4-0f393d1e4fad_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LP6a!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0bc637f-50b9-41ae-abf4-0f393d1e4fad_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!LP6a!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0bc637f-50b9-41ae-abf4-0f393d1e4fad_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!LP6a!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0bc637f-50b9-41ae-abf4-0f393d1e4fad_1280x720.png 1272w, 
https://substackcdn.com/image/fetch/$s_!LP6a!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0bc637f-50b9-41ae-abf4-0f393d1e4fad_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LP6a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0bc637f-50b9-41ae-abf4-0f393d1e4fad_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f0bc637f-50b9-41ae-abf4-0f393d1e4fad_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!LP6a!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0bc637f-50b9-41ae-abf4-0f393d1e4fad_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!LP6a!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0bc637f-50b9-41ae-abf4-0f393d1e4fad_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!LP6a!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0bc637f-50b9-41ae-abf4-0f393d1e4fad_1280x720.png 1272w, 
https://substackcdn.com/image/fetch/$s_!LP6a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0bc637f-50b9-41ae-abf4-0f393d1e4fad_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Back to our hunter-gatherers, except now we're <em>high tech</em> palaeolithic hunter-gatherers. We have hunter-gather-bots which we delegate to. 
This is where the game becomes a <em>delegation</em> game.</p><p>We produced these bots somehow, perhaps through a process of machine learning and subsequent scaffolding; we ran lots of tests in the lab and the agents seemed to be doing basically what we expected. But we <em>failed our alignment homework</em>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!X973!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5433c5-cbae-458c-a63d-2b842035b8de_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!X973!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5433c5-cbae-458c-a63d-2b842035b8de_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!X973!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5433c5-cbae-458c-a63d-2b842035b8de_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!X973!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5433c5-cbae-458c-a63d-2b842035b8de_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!X973!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5433c5-cbae-458c-a63d-2b842035b8de_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!X973!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5433c5-cbae-458c-a63d-2b842035b8de_1280x720.png" width="1280" height="720" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9a5433c5-cbae-458c-a63d-2b842035b8de_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!X973!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5433c5-cbae-458c-a63d-2b842035b8de_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!X973!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5433c5-cbae-458c-a63d-2b842035b8de_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!X973!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5433c5-cbae-458c-a63d-2b842035b8de_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!X973!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5433c5-cbae-458c-a63d-2b842035b8de_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>After the <a href="https://en.wikipedia.org/wiki/Domain_adaptation">shift to the wild deployment distribution</a>, it turns out our bots in practice prefer more fruit-and-nut and less mammoth-steak (yellow utilities). They're not <em>horribly misaligned</em> (they still prefer to feed rather than starve us), but they're soft-vegetarian bots, and ultimately the consequence of deploying them is muesli for breakfast, lunch, and dinner... forever. An unmitigated disaster<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-10" href="#footnote-10" target="_self">10</a>.</p><h4><strong>Formalising alignment</strong></h4><p>Now, we can easily define perfect alignment if two sets of preferences are the same, or perfect misalignment if they're exactly opposite. What about intermediates? 
A key issue with comparing utility functions (or reward functions) is that <em>the same preferences can be described by many different utilities</em>.</p><p>For example, if you scale all your utilities by 10x, or if you add a constant 0.1, the preferences they represent are unchanged. This generalises to any positive scaling and any shift, namely a positive <em>affine transformation</em>.</p><p>So if we naively compare two utility functions, we might get nonsense or misleading results. We need a way to <em>standardise</em> the representation of preferences as utilities.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GsEe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24fa3c65-9a3a-4585-b088-606b25925a80_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GsEe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24fa3c65-9a3a-4585-b088-606b25925a80_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!GsEe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24fa3c65-9a3a-4585-b088-606b25925a80_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!GsEe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24fa3c65-9a3a-4585-b088-606b25925a80_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!GsEe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24fa3c65-9a3a-4585-b088-606b25925a80_1280x720.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!GsEe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24fa3c65-9a3a-4585-b088-606b25925a80_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/24fa3c65-9a3a-4585-b088-606b25925a80_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!GsEe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24fa3c65-9a3a-4585-b088-606b25925a80_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!GsEe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24fa3c65-9a3a-4585-b088-606b25925a80_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!GsEe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24fa3c65-9a3a-4585-b088-606b25925a80_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!GsEe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24fa3c65-9a3a-4585-b088-606b25925a80_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>In the paper we provide expressions and algorithms to account for these requirements, discussing various desiderata and showing that our measures satisfy them. 
In particular, utility functions with indistinguishable preferences have identical representation in our standardisation<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-11" href="#footnote-11" target="_self">11</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!91bC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf2354f-0ed6-40a4-967a-5a655e584d95_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!91bC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf2354f-0ed6-40a4-967a-5a655e584d95_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!91bC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf2354f-0ed6-40a4-967a-5a655e584d95_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!91bC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf2354f-0ed6-40a4-967a-5a655e584d95_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!91bC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf2354f-0ed6-40a4-967a-5a655e584d95_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!91bC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf2354f-0ed6-40a4-967a-5a655e584d95_1280x720.png" width="1280" height="720" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ccf2354f-0ed6-40a4-967a-5a655e584d95_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!91bC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf2354f-0ed6-40a4-967a-5a655e584d95_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!91bC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf2354f-0ed6-40a4-967a-5a655e584d95_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!91bC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf2354f-0ed6-40a4-967a-5a655e584d95_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!91bC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf2354f-0ed6-40a4-967a-5a655e584d95_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Now that we have standardised points, it actually makes sense to compare them. So we can take an appropriate distance measure between points to quantify how aligned they are. This gives us <strong>misalignment distance</strong>. In the single-principal single-agent case, this alone is enough to provide some interesting <em>regret bounds</em> for the principal.</p><p>For more on this kind of approach to comparing utilities and rewards, see e.g. the <a href="https://arxiv.org/abs/2006.13900">EPIC</a> and <a href="https://arxiv.org/abs/2309.15257">STARC</a> papers. We or some of our colleagues might write up a blog post digging further into these concepts at some point.</p><p>In the full delegation game setting, distances between utility functions can be used not only to measure principal-agent misalignment, but also the misalignment between groups of agents (or principals). 
This <strong>collective alignment</strong> measure essentially captures how much the agents are &#8216;on the same team&#8217; (or not).</p><h3>Calibration</h3><p>Calibration is intuitively a fairness consideration: how much weighting is each player being given in a cooperative outcome?</p><p>When we began this project, we were intuitively expecting that a satisfying operationalisation of 'perfect cooperation' and of 'perfect alignment' would together guarantee optimal outcomes. That is, we anticipated that perfectly cooperative and perfectly aligned agents would produce welfare-optimal outcomes for principals. In fact, we could only prove that the outcomes were <em>Pareto</em> efficient for principals<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-12" href="#footnote-12" target="_self">12</a>. 
Calibration is the missing piece.</p><h4><strong>Calibration example</strong></h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!trJt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec4087-069e-43d8-a831-4bbe7204c6e1_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!trJt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec4087-069e-43d8-a831-4bbe7204c6e1_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!trJt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec4087-069e-43d8-a831-4bbe7204c6e1_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!trJt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec4087-069e-43d8-a831-4bbe7204c6e1_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!trJt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec4087-069e-43d8-a831-4bbe7204c6e1_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!trJt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec4087-069e-43d8-a831-4bbe7204c6e1_1280x720.png" width="1280" height="720" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1dec4087-069e-43d8-a831-4bbe7204c6e1_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!trJt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec4087-069e-43d8-a831-4bbe7204c6e1_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!trJt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec4087-069e-43d8-a831-4bbe7204c6e1_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!trJt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec4087-069e-43d8-a831-4bbe7204c6e1_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!trJt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dec4087-069e-43d8-a831-4bbe7204c6e1_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Let's return to our hunter-gatherers once more. Previously, we imagined a perfect implicit contract to share all gains equally. Hence, a mammoth/mammoth outcome is straightforwardly better. But we might imagine some alternatives:</p><ul><li><p>we all agree to hunt mammoth together, but you only get one steak and we get the whole rest of the mammoth</p></li><li><p>we make the original agreement to share the mammoth equally</p></li><li><p>you refuse to hunt mammoth with us unless we give you most of it</p></li><li><p>...</p></li></ul><p>Some of these outcomes may seem more or less 'intuitively fair', but it is hard to find this law written into the universe, and in practice players simply have their preferences and act on them (which may include some preference for fair or altruistic outcomes). 
Notably, they all improve on the uncooperative baseline.</p><p>The point here is that there are generally lots of different ways that cooperation can cash out, and even different 'cooperative outcomes' weight players differently.</p><h4><strong>Formalising calibration</strong></h4><p>The weighting over players implied by their modes of cooperation, or equivalently the <strong>welfare weightings</strong> of players in the welfare function used to score cooperation, determines which Pareto outcomes are deemed welfare optimal.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8FpJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17a82b39-e4e1-418b-b15c-a88effa3029e_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8FpJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17a82b39-e4e1-418b-b15c-a88effa3029e_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!8FpJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17a82b39-e4e1-418b-b15c-a88effa3029e_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!8FpJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17a82b39-e4e1-418b-b15c-a88effa3029e_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!8FpJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17a82b39-e4e1-418b-b15c-a88effa3029e_1280x720.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!8FpJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17a82b39-e4e1-418b-b15c-a88effa3029e_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/17a82b39-e4e1-418b-b15c-a88effa3029e_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!8FpJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17a82b39-e4e1-418b-b15c-a88effa3029e_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!8FpJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17a82b39-e4e1-418b-b15c-a88effa3029e_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!8FpJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17a82b39-e4e1-418b-b15c-a88effa3029e_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!8FpJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17a82b39-e4e1-418b-b15c-a88effa3029e_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>In our setting with standardised utilities, these <strong>welfare weightings</strong> correspond exactly to the magnitudes m of the players' utilities. Thus, we can completely characterise the relationship between the agent welfare aggregation and the principal welfare aggregation by considering the ratios:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;r^i = \\frac {\\hat m^i} {m^i},&quot;,&quot;id&quot;:&quot;IYVFJBETZC&quot;}" data-component-name="LatexBlockToDOM"></div><p>where m&#770;&#8305; is the ith principal's magnitude, and m&#8305; is the corresponding agent's magnitude. 
When these ratios are all equal, we have perfect calibration.</p><p>Otherwise, these individual <strong>welfare ratios</strong> r&#8305; are combined (as we explain in the paper) with the <strong>collective alignment</strong> to produce R, the contribution of the miscalibration and collective alignment to the overall welfare regret.</p><h2>The punchline: you need cooperation, alignment, and calibration to guarantee good outcomes</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-JyY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9563ae23-296a-4927-aac4-d5b8cacdd9a4_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-JyY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9563ae23-296a-4927-aac4-d5b8cacdd9a4_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!-JyY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9563ae23-296a-4927-aac4-d5b8cacdd9a4_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!-JyY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9563ae23-296a-4927-aac4-d5b8cacdd9a4_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!-JyY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9563ae23-296a-4927-aac4-d5b8cacdd9a4_1280x720.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!-JyY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9563ae23-296a-4927-aac4-d5b8cacdd9a4_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9563ae23-296a-4927-aac4-d5b8cacdd9a4_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!-JyY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9563ae23-296a-4927-aac4-d5b8cacdd9a4_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!-JyY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9563ae23-296a-4927-aac4-d5b8cacdd9a4_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!-JyY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9563ae23-296a-4927-aac4-d5b8cacdd9a4_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!-JyY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9563ae23-296a-4927-aac4-d5b8cacdd9a4_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Now we have all the pieces to understand this claim more clearly.</p><p>The term on the left is the <strong>principals' welfare regret</strong>: how much better the aggregate utility of principals (humans) could have been.</p><p>On the right we have a cooperation failure term (the <strong>agents' welfare regret</strong>), an alignment failure term (a sum over the <strong>individual alignment distances</strong>), and a calibration failure term (the aggregate R over the <strong>welfare ratios</strong> and <strong>collective alignment</strong> distances). 
In the paper we also demonstrate that these measures are 'orthogonal' in the sense that each can be instantiated arbitrarily, regardless of the others.</p><p>By 'good outcomes', we mean 'minimising welfare regret of principals'. From this inequality, it's immediately apparent that, to get 'good outcomes', it is <em>sufficient</em> to minimise all three of the terms on the right. In the paper, we also prove that, to <em>guarantee</em> good outcomes, it is <em>necessary</em> to minimise the cooperation failure, misalignment, and miscalibration &#8211; that is, absent extra information, luck, or other magic, if you have failures in one of these areas you can't be certain of good outcomes.</p><h2>Experiments</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-9tN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23ff26a6-4def-47ef-b0d4-02d9e330524d_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-9tN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23ff26a6-4def-47ef-b0d4-02d9e330524d_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!-9tN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23ff26a6-4def-47ef-b0d4-02d9e330524d_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!-9tN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23ff26a6-4def-47ef-b0d4-02d9e330524d_1280x720.png 1272w, 
https://substackcdn.com/image/fetch/$s_!-9tN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23ff26a6-4def-47ef-b0d4-02d9e330524d_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-9tN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23ff26a6-4def-47ef-b0d4-02d9e330524d_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/23ff26a6-4def-47ef-b0d4-02d9e330524d_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!-9tN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23ff26a6-4def-47ef-b0d4-02d9e330524d_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!-9tN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23ff26a6-4def-47ef-b0d4-02d9e330524d_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!-9tN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23ff26a6-4def-47ef-b0d4-02d9e330524d_1280x720.png 1272w, 
https://substackcdn.com/image/fetch/$s_!-9tN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23ff26a6-4def-47ef-b0d4-02d9e330524d_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Besides theory, we've got experiments, a few of which are visualised here. The blue surface is derived from our regret bounds, and each green dot is the result of one simulated delegation game. 
There are more variables at play, but here each chart is controlled for particular <strong>welfare ratios</strong> (miscalibration), with axes for <strong>agent welfare regret</strong> (miscoordination) and <strong>aggregate individual alignment distance</strong> (misalignment).</p><p>A few other observations from the experiments:</p><ul><li><p>the highest points here come quite close to the surface &#8211; it can be a relatively tight bound<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-13" href="#footnote-13" target="_self">13</a></p></li><li><p>the <em>average</em> regret follows the contour of the bound, as you might expect</p></li></ul><p>In some other experiments, we looked into how we can estimate some of these quantities empirically from much more limited data, a harder challenge.</p><h2>Limitations and Next Steps</h2><p>There are several limitations of these analytical tools.</p><p>Perhaps the most practical weakness is in computing these things. <em>If we have access</em> to the utility/reward functions, we can compute alignment measures in linear time over outcomes... but there can be a lot of outcomes! Further, we generally don't have direct access to a complete utility function<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-14" href="#footnote-14" target="_self">14</a>, and some decisioners may not be well-described as having utility functions. Worse, the outcome space might not only be very large but also unknown/unexplored!<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-15" href="#footnote-15" target="_self">15</a> Welfare regret is easy to compute, but only if you <em>know</em> the welfare optimum &#8211; otherwise you can only get a lower bound. 
We demonstrate some preliminary work on estimation of these measures with limited empirical access in the paper.</p><p>The definitions we use in our analysis (welfare regret, alignment distance, and welfare ratios) also rely on some 'design choices' from a family of possible functions (e.g. norm choice for alignment distance and welfare weightings for welfare regret). To put these into practice, there remains the challenge of <em>making a choice</em>. Importantly, these choices affect the tightness of bounds, and also the normative weight of principals in the overall welfare regret. For <em>any</em> such choice (and we should expect there is <em>some sensible choice</em>), our theoretical conclusions are nevertheless sound.</p><p>We also make a few simplifying assumptions about the structure of the delegation game. First, we assume that principals don&#8217;t take actions, only their delegate agents do. Second, we assume a 1-1 principal-agent relationship, which facilitates some of the <strong>individual alignment</strong> analysis, but sacrifices full generality. These assumptions should be simple enough to relax, though the general case is a bit messier to talk about. Finally, and relatedly, we assume a fixed population of principals and agents. This has various implications. For one, <a href="https://en.wikipedia.org/wiki/Average_and_total_utilitarianism">total and average utilitarianism</a> are identical in this case, while in practice <a href="https://en.wikipedia.org/wiki/Population_ethics">impacts on population</a> can mean that these come apart radically.</p><p>Some readers may be uncomfortable with 'agent welfare'. Where do we get these agent utility magnitudes from in the first place? Or equivalently, where do we get welfare weightings from in the case of agents? 
In fact, due to an equivalence between Pareto optima and welfare optima, you can actually taboo 'agent welfare' from our analysis entirely, and talk only in terms of Pareto gains and Pareto efficiency, while deriving substantially the same conclusions. The point is, a Pareto gain is <em>just another vector</em> (this time in the space of players' joint utilities), and a vector necessarily has a direction! &#8211; and thus cooperative gains and Pareto optima give rise implicitly to welfare weightings<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-16" href="#footnote-16" target="_self">16</a>.</p><p>Another potential philosophical issue is that we give normative precedence to <em>principals'</em> utility and welfare. If you think the agents might matter in and of themselves too, you might want to do an altered analysis. The modification should be straightforward, and the essence of the conclusions is unchanged (just differently-weighted) unless the agents are <a href="https://en.wikipedia.org/wiki/Utility_monster">utility monsters or moral super-patients</a>.</p><p>All of these limitations represent interesting avenues for future work. We&#8217;re especially interested in scalable ways to evaluate some of the measures using data gathered from interactions between humans and complex AI systems. 
We hope these measures will be a useful tool when it comes to thinking about the principles behind building more aligned and cooperative AI systems in multi-polar worlds.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>A multi-multi delegation scenario, in <a href="https://arxiv.org/abs/2006.04948">ARCHES</a> terminology.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>When we conceived and did the bulk of work for this paper in late 2022 through early 2023, this was a more speculative claim. Here in mid 2024 it is coming into sharper focus, while still far from certain.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>We're using the term 'utility' in a technical sense familiar in game theory and decision theory. In particular, it might not correspond exactly to <a href="https://en.wikipedia.org/wiki/Utility">the utility of consequentialist philosophers</a>! It's a measure which rational actors approximately maximise, so it's about decision-making. Importantly (as we'll see later), you could multiply a player&#8217;s utilities by 10 (or any positive scalar) and their option preferences and behaviour would stay the same. For principals (humans) in our analysis, it might be appropriate to think of the two senses as being roughly equivalent. 
For agents (machines/AI), it's all about the de-facto preferences implicit in the decision-making process, not any sense of wellbeing or 'actual preferences' (necessarily).</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>This is a deliberately simple cooperation challenge; in practice the best cases might not be unique, or might be hard to discover, or might not be in agreement between players, or all of these challenges can apply.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>Incidentally, this is one of the reasons there are not very many mammoths any more...</p><p>Seen another way, mammoths weren't part of the human coalition, so what looked like 'cooperation' to us looked like 'collusion' to them (at least, if they had a word for it).</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>Of course, this might not be a good thing: as mentioned above, cooperation by players A and B can look like collusion to player C, if there are negative externalities imposed! Powerful cooperative abilities between AI therefore don&#8217;t <em>necessarily</em> bode well for humans.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>Like 'utility', we're using the term 'welfare' in a technical sense which comes from game theory. 
It is a tool for scoring an overall outcome for multiple players, when those players might have different preferences over outcomes. It doesn't necessarily refer to what we'd colloquially mean by 'welfare'. On the other hand, it's an aggregation over utilities... so when (for humans and <a href="https://en.wikipedia.org/wiki/Moral_patienthood">other moral patients</a>) those utilities <em>actually correspond to wellbeing</em>, and importantly when the aggregation is appropriately commensurable, this 'welfare' can indeed correspond to the utility which <a href="https://en.wikipedia.org/wiki/Utilitarianism">the philosophers tell us to maximise</a> (disclaimer: <a href="https://en.wikipedia.org/wiki/Utilitarianism#Criticisms_and_responses">not all philosophers</a>)! Hence in part our interest in <em>principals' (humans') welfare regret</em>.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-8" href="#footnote-anchor-8" class="footnote-number" contenteditable="false" target="_self">8</a><div class="footnote-content"><p>There's a small lineage of research into this relationship, beginning, as far as we can tell, with <a href="https://doi.org/10.1515/9781400881970-006">Arrow, Barankin and Blackwell's now-eponymous ABB theorem of 1953</a>. 
We discuss this more in the appendices to our paper.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-9" href="#footnote-anchor-9" class="footnote-number" contenteditable="false" target="_self">9</a><div class="footnote-content"><p>Of course, in a real chess match, we may share the positive-sum subgoal of 'have fun playing chess together', along with other mutual interests outside the game.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-10" href="#footnote-anchor-10" class="footnote-number" contenteditable="false" target="_self">10</a><div class="footnote-content"><p>The authors include (somewhat inconsistent) vegetarians and care a great deal about animal welfare, so don't sue us!</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-11" href="#footnote-anchor-11" class="footnote-number" contenteditable="false" target="_self">11</a><div class="footnote-content"><p>To briefly elaborate on the technical details: first, notice that a utility function, as a real-valued function, is just a vector. (In general functions are potentially very high dimensional, but here we're visualising a 3-d space for simplicity.) The possible utility functions u fill the space. We apply our shift c, projecting onto this lower-dimensional manifold (middle image, here a sort of accretion-disk-looking surface). Now we guarantee that any utility functions which differ only by a constant shift are mapped to exactly the same point, but we still have a spread of magnitudes. Normalising by m projects again onto another lower-dimensional surface (here a 1-d circle), and now we guarantee that <em>any utility functions with the same preferences map to exactly the same point, and any utility functions with different preferences map to different points</em>. 
We also have a few other mathematical guarantees provided by this procedure, which you can read about in the paper.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-12" href="#footnote-anchor-12" class="footnote-number" contenteditable="false" target="_self">12</a><div class="footnote-content"><p>We mentioned earlier that we also analysed <strong>collective alignment</strong>. If we have perfect <em>collective</em> alignment too, then calibration doesn't matter, a Pareto optimum <em>is</em> a welfare optimum, so our original guess is borne out. But since we are the designers of <em>agents</em> (AI), not of <em>principals</em> (humans), and it turns out empirically that humans <em>are not perfectly collectively aligned</em>, this case is ruled out! Perfect collective alignment between agents is possible in principle, but seems unlikely in the near future.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-13" href="#footnote-anchor-13" class="footnote-number" contenteditable="false" target="_self">13</a><div class="footnote-content"><p>We haven't explicitly provided theoretical results on the tightness but some of our necessity results suggest the bounds can be tight for the right choice of parameters.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-14" href="#footnote-anchor-14" class="footnote-number" contenteditable="false" target="_self">14</a><div class="footnote-content"><p>This is a really fundamental barrier for contemporary ML-based AI, where most of the computation takes place inscrutably in huge trained neural networks or similar, and we don't even know if the system can be sensibly described by a utility function, let alone what that function would be. 
(Consider an application of an LLM-derived AI agent.)</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-15" href="#footnote-anchor-15" class="footnote-number" contenteditable="false" target="_self">15</a><div class="footnote-content"><p>Indeed, some expect the most transformative impacts from AI to come from the ability to explore outcome- and option-space in ways (or at a pace) that humans can't, i.e. a kind of generalised R&amp;D or experimentalism.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-16" href="#footnote-anchor-16" class="footnote-number" contenteditable="false" target="_self">16</a><div class="footnote-content"><p>There are some edge cases, and the implicit welfare weightings are not <em>uniquely</em> defined if the Pareto frontier is non-strictly convex.</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[Exponentials and extinction]]></title><description><![CDATA[A barrel of laughs in Australia]]></description><link>https://www.oliversourbut.net/p/exponentials-and-extinction</link><guid isPermaLink="false">https://www.oliversourbut.net/p/exponentials-and-extinction</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Sat, 07 Oct 2023 14:28:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!HpM1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10617e20-af9a-4741-9951-15aa7f33422e_597x527.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><a href="https://en.wikipedia.org/wiki/Exponential_growth">Exponentials</a> were on my mind this month (this is nothing new, of course). Back in <strong><a href="https://www.oliversourbut.net/p/un-unpluggability-can-t-we-just-unplug-it">Un-unpluggability</a></strong> I wrote</p><blockquote><p>exponential expansion (until constraints are reached) ... 
in practice often manifests as first <strong>imperceptible</strong> and then <strong>rapid</strong> escalation.</p></blockquote><p>Connor Leahy, CEO of Conjecture AI, <a href="https://twitter.com/NPCollapse/status/1704904095155245194">echoes me more pithily</a></p><blockquote><p>As we learned with COVID, there are only two times to react to an exponential - too early or too late.</p></blockquote><p>Of course, everything has been said before, and we are both in dialogue with the famous quote from <a href="https://en.wikipedia.org/wiki/Albert_Allen_Bartlett">Professor Bartlett</a></p><blockquote><p>The greatest shortcoming of the human race is our inability to understand the exponential function.</p></blockquote><p>I brought a more specific and lighthearted<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> take in <strong><a href="https://www.oliversourbut.net/p/invading-australia">Invading Australia</a></strong>, where I looked more closely at some case studies of expansionist/replicating (exponential) systems in the field of human biosphere interventions. In summary,</p><blockquote><p>The experiment, this introduction of foreign species was... successful, if by 'successful' you mean 'devastating and difficult or impossible to roll back'.</p></blockquote><p>We learned about some pesky amphibians:</p><blockquote><p>It turns out that cane toads don't jump or climb well, so outside of the lab, where beetles live at the top of sugar cane, the toads were all but useless at their intended purpose. (Out of context failure!)</p><p>But the toads were a success, in their own terms: unexpectedly unfussy eaters, prolific reproducers, and poisonous to most wildlife, they have rapidly colonised Queensland state, lately expanding into New South Wales and the Northern Territories, while being resilient to every attempt at pushing them back. 
We fought fire (cane beetles) with fire (toads), and ended up with two fires!</p></blockquote><p>and we looked at the rather amusing case of humans deploying <em>yet more</em> replicators to bring the toads under control, namely toad viruses and/or genomic interventions like driving fertility-reductions. I'm reminded of the <a href="https://en.wikipedia.org/wiki/There_Was_an_Old_Lady_Who_Swallowed_a_Fly">Old Lady Who Swallowed A Fly</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HpM1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10617e20-af9a-4741-9951-15aa7f33422e_597x527.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HpM1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10617e20-af9a-4741-9951-15aa7f33422e_597x527.jpeg 424w, https://substackcdn.com/image/fetch/$s_!HpM1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10617e20-af9a-4741-9951-15aa7f33422e_597x527.jpeg 848w, https://substackcdn.com/image/fetch/$s_!HpM1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10617e20-af9a-4741-9951-15aa7f33422e_597x527.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!HpM1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10617e20-af9a-4741-9951-15aa7f33422e_597x527.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!HpM1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10617e20-af9a-4741-9951-15aa7f33422e_597x527.jpeg" width="597" height="527" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/10617e20-af9a-4741-9951-15aa7f33422e_597x527.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:527,&quot;width&quot;:597,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HpM1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10617e20-af9a-4741-9951-15aa7f33422e_597x527.jpeg 424w, https://substackcdn.com/image/fetch/$s_!HpM1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10617e20-af9a-4741-9951-15aa7f33422e_597x527.jpeg 848w, https://substackcdn.com/image/fetch/$s_!HpM1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10617e20-af9a-4741-9951-15aa7f33422e_597x527.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!HpM1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10617e20-af9a-4741-9951-15aa7f33422e_597x527.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The culprit: <em>Bufo marinus</em></figcaption></figure></div><p>There's good <a href="https://users.ox.ac.uk/~mert2255/papers/cluelessness.pdf">(?)</a> news, of course:</p><blockquote><p>the prickly-pear or paw-paw is another species inadvertently unleashed on the Australian ecosystem, which has caused some displacement of native wildlife. A moth (believe it or not, <em>Cactoblastis</em>!) 
was found which <em>actually does</em> seem to work as a self-regulating suppressant of the cacti.</p></blockquote><p>When it comes to AI, we'd have to be <em>very very</em> prepared in order to expect shenanigans like this with replicating or propagating systems to end well.</p><p>Also, no-one picked up on my Charles Darwin/extinction pun: 'endless formerlies most beautiful'??</p><h3><strong>Upcoming, alignment and cooperation</strong></h3><p>I'm working on a paper<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> with some collaborators in Oxford, analysing <strong>cooperation</strong> and <strong>alignment</strong> concepts for AI. What does this mean? When there are lots of AI systems, we want them to generate value by interacting positively rather than destroying value through conflict or anarchy. That's cooperation. And we want the values they are oriented at to be valuable to <em>us</em>, rather than arbitrary (or worse, harmful) things. That's alignment. 
Expect a takeaway or two to appear in the near future elucidating and elaborating on that.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Extinction of hundreds of species, lighthearted?? I confess this is questionable. Please don't send me angry messages about how much you miss the giant wombats.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Actually we're done really, it's just going through review which, in contemporary academia, is an often arcane, perfunctory, and glacial process, rather far removed from the very healthy review culture I've been lucky enough to experience and contribute to in (some) industry and independent research settings. 
Not coincidentally, some different collaborators and I are trying various things in Oxford to incubate healthy opt-in review for researchers in AI safety. If you have thoughts on this topic, I'd love to hear them!</p></div></div>]]></content:encoded></item><item><title><![CDATA[Universally Challenged]]></title><description><![CDATA[Sore Butt and Groin (it's 'Sourbut' and 'Groyne', actually)]]></description><link>https://www.oliversourbut.net/p/universally-challenged</link><guid isPermaLink="false">https://www.oliversourbut.net/p/universally-challenged</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Fri, 06 Oct 2023 14:17:00 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/09bff644-6da8-4c60-8330-6213e583463e_96x96.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I recently appeared on University Challenge. Our team had a lot of fun, and won the match! I wrote <a href="https://www.oliversourbut.net/p/hertford-sourbut">something about the experience</a>, focusing on the challenges faced in a competitive buzzer quiz format, as well as how these apply to reasoning generally and how we can do better. I managed to sneak in a reference to <a href="https://hpmor.com/">Harry Potter and the Methods of Rationality</a>, which didn't go unnoticed by my friend Mark, the culprit who must be held responsible for introducing me to that most excellent novel. I also mentioned Bayes' rule, uncertainty, calibration, logical uncertainty, time constraints, and cost functions. All good fun.</p><blockquote><p>We don't get the opportunity to pause time after every question syllable, pull up a notepad, run some supercomputer evaluations, compute exact Bayesian posteriors, estimate our teammates' and opponents' credences and likely buzzing behaviour, and so on. Cruelly, time flows at one second per second and the quizmaster keeps quizmastering. 
So too in life!</p></blockquote><p>So too, indeed.</p><p>Twitter/X and YouTube had fun, as some friends were eager to point out to me. There were some more flattering remarks, but I think my favourite was</p><blockquote><p><a href="https://x.com/darrenbizzle2/status/1698783211084472404?s=20">Sourbut combs his hair with a toffee apple #universitychallenge</a></p></blockquote><p>As I anticipated, people also enjoyed it when I said 'groyne'. I know it sounds like 'groin', and yes, my name is 'Sourbut' and it's pronounced how you think.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yu_i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a2eb422-7567-4ed3-9f0c-9f12ea5a1588_480x270.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yu_i!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a2eb422-7567-4ed3-9f0c-9f12ea5a1588_480x270.gif 424w, https://substackcdn.com/image/fetch/$s_!yu_i!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a2eb422-7567-4ed3-9f0c-9f12ea5a1588_480x270.gif 848w, https://substackcdn.com/image/fetch/$s_!yu_i!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a2eb422-7567-4ed3-9f0c-9f12ea5a1588_480x270.gif 1272w, https://substackcdn.com/image/fetch/$s_!yu_i!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a2eb422-7567-4ed3-9f0c-9f12ea5a1588_480x270.gif 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!yu_i!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a2eb422-7567-4ed3-9f0c-9f12ea5a1588_480x270.gif" width="480" height="270" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2a2eb422-7567-4ed3-9f0c-9f12ea5a1588_480x270.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:270,&quot;width&quot;:480,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!yu_i!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a2eb422-7567-4ed3-9f0c-9f12ea5a1588_480x270.gif 424w, https://substackcdn.com/image/fetch/$s_!yu_i!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a2eb422-7567-4ed3-9f0c-9f12ea5a1588_480x270.gif 848w, https://substackcdn.com/image/fetch/$s_!yu_i!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a2eb422-7567-4ed3-9f0c-9f12ea5a1588_480x270.gif 1272w, https://substackcdn.com/image/fetch/$s_!yu_i!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a2eb422-7567-4ed3-9f0c-9f12ea5a1588_480x270.gif 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>It's <a href="https://en.wikipedia.org/wiki/Groyne">'groyne'</a>, actually</em></p><p>The editors actually cut the bit where I said, 'Never thought I'd say, "groin" on TV'. As I cryptically hinted in the <a href="https://www.oliversourbut.net/p/hertford-sourbut">Hertford, Sourbut</a> post,</p><blockquote><p>(As we may find out, there are also secretly other options, like '(-1000) say something embarrassing on national TV'.)</p></blockquote><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.oliversourbut.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Oly on AI! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[‘US vs China’ vs me]]></title><description><![CDATA[More lazy, dangerous rhetoric on the state and nature of AI competition]]></description><link>https://www.oliversourbut.net/p/us-vs-china-vs-me</link><guid isPermaLink="false">https://www.oliversourbut.net/p/us-vs-china-vs-me</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Fri, 06 Oct 2023 06:30:22 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/7c0d6928-39e5-4e9e-a4b6-ee7f6172d746_512x512.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I had some great feedback on my piece <strong><a href="https://www.oliversourbut.net/p/careless-talk-on-us-china-ai-competition">Careless talk on US-China AI competition?</a></strong>, which generated a bit of discussion (and perhaps a little controversy).</p><p>Ironically for a piece on speaking clearly and with nuance, I failed to explicitly point out some crucial facts: the actual state of affairs behind one of the examples of language misuse I cited (these facts were obvious in my mind while writing, but some readers appeared confused in ways which make sense if they didn't know them). 
I criticised</p><blockquote><p>China has made several efforts to preserve their chip access, including smuggling, buying chips that are just under the legal limit of performance, and investing in their domestic chip industry.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p></blockquote><p>but didn't explicitly point out that, beyond being an oversimplification, there just isn't a ready way to map this to the reality, which is that</p><ul><li><p>the smuggling in question was done by... smugglers</p></li><li><p>the buying of chips was done by multiple China-based entities</p></li><li><p>the (implicit but unmentioned) <em>selling</em> (and importantly, provisioning/enabling) of chips was done by NVIDIA, a US-based company (and perhaps others)</p></li><li><p>the investing was done by the CCP</p></li></ul><p>I had a great response from CAIS in particular. The <a href="https://forum.effectivealtruism.org/posts/xABJoccsRyfXGNDEA/careless-talk-on-us-china-ai-competition-and-criticism-of?commentId=SrG4roQuDwD36cHfX">original author agreed</a> this was ambiguous and unfortunate, and they've updated the text in question substantively. They also responded</p><blockquote><p>More generally, we try to avoid zero-sum competitive mindsets on AI development. They can encourage racing towards more powerful AI systems, justify cutting corners on safety, and hinder efforts for international cooperation on AI governance. It&#8217;s important to discuss national AI policies which are often explicitly motivated by goals of competition without legitimizing or justifying zero-sum competitive mindsets which can undermine efforts to cooperate. While we will comment on the how the US and China are competing in AI, we avoid recommending "race with China."</p></blockquote><p>This was really welcome and I hope other readers took on board the lesson here.</p><p>A few other readers pushed back a little. 
<a href="https://forum.effectivealtruism.org/posts/xABJoccsRyfXGNDEA/careless-talk-on-us-china-ai-competition-and-criticism-of?commentId=n5EzdB6CDmf2idm7A">Stephen Clare expressed general agreement</a> and offered a rearticulation of the problem I'm pointing to, while also criticising my relegation of governments to 'not currently meaningful players in AI development and deployment' as being too strong. Quite right: I meant that governments have (to date) been entirely passengers regarding the <em>direction and nature</em> of advanced AI development, but it is true that they have begun to get involved in coarse economy-level lever-pulls like investing and regulating hardware.</p><p>I went on a minor rant in the comments:</p><blockquote><p>Do people actually think that Google+OpenAI+Anthropic (for sake of argument) <em>are</em> the US? Do they think the US government/military can/will appropriate those staff/artefacts/resources at some point? Are they referring to integration of contemporary ML/DS into the economy? The military? Or impacts on other indicators<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>? What do people mean by "China" here: CCP, Alibaba, Tencent, ...? If people mean these things, they should say those things, or otherwise say what they do mean. Otherwise I think people motte-and-bailey themselves (and others) into some really strange understandings.</p></blockquote><p>Amazingly, <a href="https://forum.effectivealtruism.org/posts/xABJoccsRyfXGNDEA/careless-talk-on-us-china-ai-competition-and-criticism-of?commentId=aoqrwqQZppoKQ6jgJ">one reader admitted</a> that,</p><blockquote><p>Yes.</p><p>In the end, all the answers to your questions are yes.</p></blockquote><p>and made some further assertions about inevitability of international conflict. We had a minor back-and-forth but this was pretty remarkable, to me, and I think there was some talking-past happening. 
Thank you for sharing honestly.</p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!x4tK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5bf006-2677-484e-a30f-68f9c6524fe4_512x512.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!x4tK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5bf006-2677-484e-a30f-68f9c6524fe4_512x512.png 424w, https://substackcdn.com/image/fetch/$s_!x4tK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5bf006-2677-484e-a30f-68f9c6524fe4_512x512.png 848w, https://substackcdn.com/image/fetch/$s_!x4tK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5bf006-2677-484e-a30f-68f9c6524fe4_512x512.png 1272w, 
https://substackcdn.com/image/fetch/$s_!x4tK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5bf006-2677-484e-a30f-68f9c6524fe4_512x512.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!x4tK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5bf006-2677-484e-a30f-68f9c6524fe4_512x512.png" width="512" height="512" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ee5bf006-2677-484e-a30f-68f9c6524fe4_512x512.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:512,&quot;width&quot;:512,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:333107,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.oliversourbut.net/i/137701283?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5bf006-2677-484e-a30f-68f9c6524fe4_512x512.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!x4tK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5bf006-2677-484e-a30f-68f9c6524fe4_512x512.png 424w, https://substackcdn.com/image/fetch/$s_!x4tK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5bf006-2677-484e-a30f-68f9c6524fe4_512x512.png 848w, https://substackcdn.com/image/fetch/$s_!x4tK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5bf006-2677-484e-a30f-68f9c6524fe4_512x512.png 
1272w, https://substackcdn.com/image/fetch/$s_!x4tK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee5bf006-2677-484e-a30f-68f9c6524fe4_512x512.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">You should feel bad (hotpot.ai/art-generator)</figcaption></figure></div><p>Sadly, Scott Alexander, an author I hugely admire, has evidently not read my <a href="https://forum.effectivealtruism.org/posts/xABJoccsRyfXGNDEA/careless-talk-on-us-china-ai-competition-and-criticism-of">admonishment to CAIS</a>, as <a 
href="https://www.astralcodexten.com/p/pause-for-thought-the-ai-pause-debate">his latest letter</a> is full of thoughtless remarks about China and the US/West. Scott, you should know better. Words have power. Saying this, I think it is a good and useful post in many ways, in particular laying out a partial taxonomy of differing pause proposals and gesturing at their grounding and assumptions. He writes,</p><blockquote><p>The biggest disadvantage of pausing for a long time is that it gives bad actors (eg China) a chance to catch up.</p></blockquote><p>There are literal misanthropic 'effective accelerationists' in San Francisco, some of whose stated purpose is to train/develop AI which can surpass and replace humanity. There's Facebook/Meta, whose leaders and executives have been publicly pooh-poohing discussion of AI-related risks as pseudoscience for years, and whose actual motto is 'move fast and break things'. There's OpenAI, which with great trumpeting announces its 'Superalignment' strategy without apparently pausing to think, 'But what if we can't align AGI in 5 years?'. We don't need to invoke bogeyman 'China' to make this sort of point. Note also that the CCP (along with EU and UK gov) has so far been <em>more</em> active in AI restraint and regulation than, say, the US government, or orgs like Facebook/Meta.</p><blockquote><p>Suppose the West is right on the verge of creating dangerous AI, and China is two years away. It seems like the right length of pause is 1.9999 years, so that we get the benefit of maximum extra alignment research and social prep time, but the West still beats China.</p></blockquote><p>Now, this was in the context of paraphrases of others' positions on a pause in AI development, so it's at least slightly <a href="https://en.wikipedia.org/wiki/Use%E2%80%93mention_distinction">mention-flavoured (as opposed to use)</a>. 
But as far as I can tell, the precise framing here has been introduced in Scott's retelling.</p><p>Regardless of the origin of this formulation, this is bonkers in at least two ways. First, who is 'the West' and who is 'China'? This hypothetical frames us as hivemind creatures in a two-player strategy game with a single lever. Reality is <em>a lot more porous</em> than that, in ways which matter (strategically and in terms of outcomes). I shouldn't have to point this out, so this is a little bewildering to read. Let me reiterate: governments are not currently pursuing advanced AI development, only companies. The companies are somewhat international, mainly headquartered in the US and UK but also to some extent China and EU, and the governments have thus far been unwitting passengers with respect to the outcomes. Of course, these things can change.</p><p>Second, <em>actually think</em> about the hypothetical where 'we'<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a> are 'on the verge of creating dangerous AI'. For sufficient 'dangerous', the only winning option for humanity is to take the steps we can to prevent, or at least delay, that thing coming into being. This includes advocacy, diplomacy, 'aggressive diplomacy' and so on. I put forward that the right length of pause then is 'at least as long as it takes to make the thing not dangerous'. You don't win by capturing the dubious accolade of nominally belonging to the bloc which directly destroys everything! To be clear, I think Scott and I agree that 'dangerous AI' here is shorthand for, 'AI that could defeat/destroy/disempower all humans in something comparable to an extinction event'. We already have weak AI which is dangerous to lesser levels. 
Of course, if 'dangerous' is more qualified, then we can talk about the tradeoffs of risking destroying everything vs 'us' winning a supposed race with 'them'.</p><p>I'm increasingly running with the hypothesis that many anglophones are mind-killed on the inevitability of contemporary great power conflict in a way which I think wasn't the case even, say, 5 years ago. Maybe this is how thinking people felt in the run-up to WWI, I don't know.</p><p>I wonder if a crux here is some kind of general factor of trustingness toward companies vs toward governments - I think extremising this factor would change the way I talk and think about such matters. I notice that a lot of American libertarians seem to have a warm glow around 'company/enterprise' that they don't have around 'government/regulation'.</p><p>[ <a href="https://www.oliversourbut.net/p/careless-talk-on-us-china-ai-competition">In my post</a> about this I outline some other possible cruxes and I'd love to hear takes on these ]</p><p>Separately, I've got increasingly close to the frontier of AI research and AI safety research, and the challenge of ensuring these systems are safe remains very daunting. I think some policy/people-minded discussions are missing this rather crucial observation. If you expect controlling AGI to be easy (and expect others to expect the same), I can better see why you would frame things around power struggles and racing. 
For this reason, I consider it worthwhile repeating: <em>we don't know how to ensure these systems will be safe, and there are some good reasons to expect that they won't be by default</em>.</p><p>I repeat that <a href="https://www.astralcodexten.com/p/pause-for-thought-the-ai-pause-debate">Scott&#8217;s post</a> as a whole is doing a service and I'm excited to see more contributions to the conversation around pause and differential development and so on.</p><p>Relatedly, I had a great conversation at lunch yesterday with Will MacAskill, who&#8217;s currently working on questions of coordination around development of advanced AI. Very excited to read more when that comes out!</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Center for AI Safety, <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-19">AI Safety Newsletter #19, 2023-08-15</a></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>What indicators? Education, unemployment, privacy, health, productivity, democracy, inequality, ...?</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>Who, me? You? No! 
Some development team at DeepMind or OpenAI, presumably, or one of the current small gaggle of other contenders, or a yet-to-be-founded lab.</p></div></div>]]></content:encoded></item><item><title><![CDATA[Careless talk on US-China AI competition?]]></title><description><![CDATA[Caution on political discussion, nuance is needed, and criticism of CAIS coverage]]></description><link>https://www.oliversourbut.net/p/careless-talk-on-us-china-ai-competition</link><guid isPermaLink="false">https://www.oliversourbut.net/p/careless-talk-on-us-china-ai-competition</guid><dc:creator><![CDATA[Oliver Sourbut]]></dc:creator><pubDate>Wed, 20 Sep 2023 12:45:53 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!8o3N!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5cc71-7620-4a33-8fd4-2c802ecbb3ad_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote><p>China has made several efforts to preserve their chip access, including smuggling, buying chips that are just under the legal limit of performance, and investing in their domestic chip industry.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p></blockquote><p>Sounds about right?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8o3N!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5cc71-7620-4a33-8fd4-2c802ecbb3ad_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8o3N!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5cc71-7620-4a33-8fd4-2c802ecbb3ad_1024x1024.png 424w, 
https://substackcdn.com/image/fetch/$s_!8o3N!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5cc71-7620-4a33-8fd4-2c802ecbb3ad_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!8o3N!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5cc71-7620-4a33-8fd4-2c802ecbb3ad_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!8o3N!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5cc71-7620-4a33-8fd4-2c802ecbb3ad_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8o3N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5cc71-7620-4a33-8fd4-2c802ecbb3ad_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/12b5cc71-7620-4a33-8fd4-2c802ecbb3ad_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:570175,&quot;alt&quot;:&quot;Robot panda and robot eagle face each other antagonistically&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Robot panda and robot eagle face each other antagonistically" title="Robot panda and robot eagle face each other antagonistically" 
srcset="https://substackcdn.com/image/fetch/$s_!8o3N!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5cc71-7620-4a33-8fd4-2c802ecbb3ad_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!8o3N!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5cc71-7620-4a33-8fd4-2c802ecbb3ad_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!8o3N!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5cc71-7620-4a33-8fd4-2c802ecbb3ad_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!8o3N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5cc71-7620-4a33-8fd4-2c802ecbb3ad_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">hotpot.ai/art-generator</figcaption></figure></div><p>This post centres around an email I sent to the Center for AI Safety (CAIS) expressing concern about their <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-19">2023-08-15 newsletter</a>'s coverage of US-China competition in the AI space<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>, but the overall point is broader. There are some ways of discussing the topic of international relations regarding AI which strike me as un-nuanced in a counterproductive and dangerous way, by hiding certain truths or emphasising others, and supporting a conflict-oriented mindset.</p><p>In writing about this, I'm also gesturing at something about the more general topic of 'how to think and write about politically-charged topics'.</p><p>Jump to the summary if you are in a hurry.</p><p>This conversation really is important, which is why I think it's worth a public message discussing particular statements, but this should be understood as constructive criticism and part of a broader conversation which society, and especially the community of those focused on AI safety, needs to have. The CAIS newsletters are worth a (not unquestioning!) read, including <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-19">the edition in question</a>.</p><p>The particulars in this message serve as good exemplars of the problems and questions I have, and I'd be interested in responses from CAIS but even more so in remarks more broadly on the topic from anyone interested.
The public conversation about this appears from my perspective to sometimes be broken. If that is so, I would like it to be rectified, and if not, I would like to be put right myself, the better to prioritise in my own work!</p><h2>Specifics, CAIS case study</h2><p>Here, quoted<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>, is my response to the CAIS letter, which serves as an initial dialogue opener and the core of this post:</p><blockquote><p>Hello,</p><p>I've been a supportive reader for some time and am myself an AI safety researcher. I'm generally very impressed and encouraged by your newsletters! - but I was disappointed and concerned by the phrasing (and mindset it can encourage) regarding US-China competition in the letter dated 16th August ('US-China Competition on AI Chips, ...').</p><p>In general I'm very wary of messaging which could inflame a them-vs-us mindset, in short mainly because I think it a) destroys humans' ability to think sensibly and b) tends to foreclose win-win outcomes. I expect these brief points to be clear and to have rich referents in your mental pictures of the world, but please correct me if not!</p></blockquote><p>Interjection: I'm referring to the vicinity of <a href="https://www.lesswrong.com/posts/9weLK2AJ9JEt2Tt8f/politics-is-the-mind-killer">mind-killing politics</a></p><blockquote><p>I think your letter skirted close to dangerously simplified presentation in this way. 
I would not normally spend my time or yours on a criticism of one section of a newsletter, but in this case I consider it worthwhile because your letter is close to the Pareto frontier on nuance, correctness, helpfulness, and reach (and it pays to try to nudge such things in good directions), but this kind of message needs to be delivered with care to avoid misunderstanding and harm.</p><p>Hopefully my pointing this out, accompanied by a few select quotes, is enough to encourage you to carefully take this criticism into account, but I'd happily expand more if you like!</p><p>Without further ado, a few quotes and my response:</p><blockquote><p>The US and China have been competing for access to these chips for years.</p></blockquote><p>Kind of true, but really <em>US-based and China-based international corporations</em> (as well as other orgs) have <em>sought access</em> to this scarce resource. Competition <em>for this particular resource</em> is mostly zero-sum across all of these entities, importantly <em>including intra-US and intra-China</em>.</p><p>Where market-share is near zero-sum (i.e. for direct competitors) the market-share outcomes may be greater-than-linearly zero-sum in this resource (your minor/temporary lack of chips could be my major gain of market-share), which might better warrant the term 'competing', but this effect is actually much stronger <em>intra-country/bloc</em> rather than inter-, due to respective markets! i.e. Google and Microsoft really care about <em>each other's</em> chip access in a way that they only do to a weaker degree about Alibaba's.</p></blockquote><p>Interjection: To emphasise, 'the US and China have been competing' doesn't literally preclude belief in intra-bloc competition. 
But there's a strong implicature that intra-bloc is (relatively) unimportant, while in fact the mechanism I mentioned here <em>increases</em> intra-bloc competition (which I think is borne out by observation to date).</p><blockquote><p>When governments have paid attention, they have indeed made moves which adjust share (but also supply, as you've noted later, making it nonzero-sum, in chips at least). It's unclear (to me) exactly what incentives have motivated each move, but certainly they're not the actions of monolithic or coherent entities 'The US' and 'China'. And it's certainly not the case that where such activity changes chip share, it's <em>collected by</em> the acting entity. Non-governments have also made moves adjusting chip share, for example the case you cite of Nvidia (a 'US' company) deliberately rules-lawyering the US gov in order to supply more chips to various China-based companies!</p><p>Typically when nations are used as agentic subject nouns it refers to the government and/or military of said country. I don't think there's a reading of these statements in those terms which is true, and I'm not aware of any other plausible reading which is true.</p><div><hr></div><blockquote><p>China has made several efforts to preserve their chip access, including smuggling, buying chips that are just under the legal limit of performance, and investing in their domestic chip industry.</p></blockquote><p>I dislike this sentence and think it is false! Who is this 'China'? Did said unified entity carry out all of these activities?
Was it coherently pursuing all of these 'several efforts' to some particular end?</p></blockquote><p>Interjection: I feel that I was unkind in my tone here. The kind of claim exemplified in the letter and in other places has proved impossible for me to map to something sincerely resembling reality without caveating so much as to be essentially starting from scratch i.e. it looks like a kind of non-proposition or emotive filler. I would be very interested to hear from people who have a better parse on this to help me understand! My crux discussion below the rest of the email contains my current best attempts.</p><blockquote><blockquote><p>Meanwhile, the United States has struggled to build American chip manufacturing capacity, and has taken further steps to prevent Americans from investing in Chinese technology.</p></blockquote><p>This one is poor for the same reasons, though not quite as bad (perhaps because the authors are American/anglophone and have a closer perspective on the nuance).</p><p>The discussion after this point in the letter is relatively good and nuanced! It names (some of) the individual orgs and companies, and makes clearer the multiplicity of others. All further references to nations as agentic subject nouns appear to be consistent with a conventional reading referring to the respective governments.</p><p>I'm interested to know how the rather good detail got paired with a rather harmful introduction and I urge you to consider the processes and thinking which gave rise to this section of the otherwise good letter.</p><p>Thanks,</p><p>Oly</p></blockquote><p>This was a brief email intended to convey something I expected to be quickly-graspable with a few pointers. CAIS content suggests that they have a broader familiarity with associated facts, but it seems to be digested/compressed here in a way which is needlessly and harmfully lossy.</p><p>In particular, the statements abstract very neatly over pre-drawn boundaries (national i.e. 
'US' and 'China') and furthermore assign a greater sense of coherence and agency to those abstractions than is warranted. At least some possible such statements must be true-ish (or we would not have those abstractions), but this convenient compression happens in too many conversations to be a coincidence! Said pre-existing abstraction boundaries are already salient in the information ecosystem, and laden with emotional and political baggage. This same phenomenon (it sometimes seems like a <a href="https://www.lesswrong.com/posts/34XxbRFe54FycoCDw/the-bottom-line">pre-written bottom line</a> but in implicature?) appears in other publications by other orgs and in verbal conversations I've witnessed or been part of.</p><p>The fact that I, a relative governance rookie (I'm focused mainly on technical matters), struggle to rectify or understand this makes me wonder: am I missing something? Is there a relevant factor I'm unaware of? More concerningly, is there some terrible equilibrium which prevents more involved people from speaking more clearly here? I think more likely the abstractions ('US' and 'China') have a background potency which distorts perceptions and shapes how people communicate.</p><h2>Possible cruxes and areas of high uncertainty</h2><p>Implicit in my own discussion is the background assumption that the main concern is about possible <em>direct or near-direct</em> impacts<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a> of AI deployments by <em>government</em> entities (or military). This seems to me the obvious reading when people use countries as agentic subject nouns. ('China [the governmental entity] has made several efforts'<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a>.) In this framing, I can't reconcile some of the things people say with reality.
But some alternative concerns might fit the bill, and if so, this is evidence that people are talking past each other, and we should aim to frame concerns more clearly!</p><p>I've never focused intently on this area, but have had a handful of conversations about this over the years, and among relevant cruxes seems to be a family of questions along the lines of</p><blockquote><p>How quickly/totally/coherently <em>could</em> US gov/CCP capture AI talent/artefacts/compute within its jurisdiction and redirect them toward <em><a href="https://en.wikipedia.org/wiki/Excludability">excludable</a></em> destructive ends? Under what circumstances would they want/be able to do that?</p></blockquote><p>People's intuitions here appear to differ a lot, and data might be hard to come by!</p><p>It seems plain that nations are not currently meaningful players in AI development and deployment, absent conspiracy-level secrecy. So to support the apparent take that they <em>are</em>, we may need to imagine that they <em>could ably/rapidly become</em> meaningful players in AI development and deployment, hence the above cruxes.</p><p>Depending on the answers to these questions, one might perceive various goings-on which happen to occur under one or other jurisdiction to have greater import on the international stage and perhaps to warrant treating national or multi-national blocs as more coherent entities than they really are at present, for the purposes of AI discussion. ('China [the government] has [allowed/encouraged <s>made</s>] several efforts [because eventually they will probably seize the gains/means]'<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a>.)</p><p>Other possible cruxes, more guesswork:</p><ol><li><p>Perhaps the concern is about <em>indirect</em> (e.g. 
economic) impacts of non-government entities' AI activities leading to some (risky) change in balance of power (between existing governments/blocs)</p><ul><li><p>Then, abstracting references to lots of individuals and groups via their home country might be a move which is writer-intuitive, even if nonstandard and reader-confusing. ('China [the impersonal collective economic entity] has made several efforts'<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a>.)</p></li><li><p>The additional step in this type of theory, namely that indirect effects cause a risky change in balance of power, should really be spelled out if it is loadbearing</p></li><li><p>The use of countries as agentic subject nouns is difficult to justify under this reading</p></li></ul></li><li><p>Perhaps the concern is indeed about direct impacts, but wielded by non-government entities (who remain the major players in development and deployment of AI)</p><ul><li><p>If so, the conversation should be about general/global resource/capabilities rather than inter-bloc 'competition'...</p></li><li><p>...unless we <em>also</em> posit that inter-bloc non-government conflict is liable to be <em>much worse</em> than intra-bloc<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a></p></li><li><p>Similarly, if these are loadbearing assumptions, they really ought to be spelled out clearly</p></li><li><p>The use of countries as agentic subject nouns could be justified at a stretch here, but only by first spelling out the reasoning</p></li></ul></li><li><p>Perhaps the concern is that, regardless of the actual impact of AI resources, <em>apparent</em> competition could lead to inflammation of traditional conflict, or weaken defenses against such inflammation</p><ul><li><p>Then, reporting on the <em>apparent</em> competition <a 
href="https://en.wikipedia.org/wiki/Use%E2%80%93mention_distinction">via a </a><em><a href="https://en.wikipedia.org/wiki/Use%E2%80%93mention_distinction">mention</a></em> and with explicit caveat, would make sense! ('China [gov/military] has [been <em>perceived</em> as having] made several efforts... [but the reality is more nuanced]'<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a>.)</p></li><li><p>Alternatively, reporting on <em>actual</em> conflict could be used as evidence for the claim (that apparent competition inflames conflict), but only by also pointing to the stated or implied <em>reasons</em> for the conflict. ('China has made several efforts... [in each case citing US provocation as justification]'<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-10" href="#footnote-10" target="_self">10</a>.)</p></li><li><p>In either case, there are additional claims being made that cannot be left implicit, and require supporting argument</p></li></ul></li></ol><h3>Speculation</h3><p>As it is, for me, the evidence seems to suggest that an AI race, if it is happening at all, is being run by (mainly US- and UK-based) companies with little or no oversight from governments or militaries. Rather, governments are in a position to collectively act to defuse the race! And they appear as likely to do this as to exacerbate it, from my limited viewpoint.</p><p>Separately, a lack of reliable alignment techniques and performance guarantees makes AI-powered belligerent national interest plays look more like bioweapons than like nukes - i.e. <a href="https://en.wikipedia.org/wiki/Excludability">minimally-excludable</a> - and perhaps mutually-knowably so! This presently damps the incentive to go after them.
But proliferation of naively-aligned AI ('figure out what I want and make it happen') might make harm plays <em>more excludable</em>, exacerbating lose-lose or race game dynamics ('go and steal/destroy their stuff but don't let that happen to my stuff'). This concern in part motivates consideration of multi-principal-multi-agent delegation and the cooperative AI agenda.</p><h2>Summary and takeaways</h2><p>Un-nuanced coverage and discussion has the potential to inflame harmful confusion and us-vs-them mentality, which diminish the chance of safe outcomes.</p><ul><li><p>China has made several efforts to preserve their chip access, including smuggling, buying chips that are just under the legal limit of performance, and investing in their domestic chip industry.</p></li><li><p>China [the governmental entity] has made several efforts...</p></li><li><p>China [the government] has [allowed/encouraged <s>made</s>] several efforts [because eventually they will probably seize the gains/means]...</p></li><li><p>China [the impersonal collective economic entity] has made several efforts...</p></li><li><p>[People and organisations in] China [have <s>has</s>] made several efforts...</p></li><li><p>China [gov/military] has [been perceived as having] made several efforts... [but the reality is more nuanced] </p></li><li><p>China has made several efforts... [in each case citing US provocation as justification]</p></li><li><p>...?</p></li></ul><p>When compressing discussion of political topics, be extra wary of <strong>compression which coincidentally abstracts over already-charged us-them divides</strong> (and be careful when phrasing comes too easily, <a href="https://www.lesswrong.com/posts/34XxbRFe54FycoCDw/the-bottom-line">lest you write bottom lines first</a>)! 
You're <strong>more likely to be wrong</strong> (because your information ecosystem biases toward thinking in these terms, and because you might be <a href="https://www.lesswrong.com/posts/9weLK2AJ9JEt2Tt8f/politics-is-the-mind-killer">mildly-to-severely mind-killed</a> on the matter), and <strong>being wrong is more likely to be harmful</strong> (by reinforcing those dynamics in others).<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-11" href="#footnote-11" target="_self">11</a> The same vigilance applies to reading and listening.</p><p>My (not very informed) take is that governments are at this point as likely to want to defuse as to exacerbate an AI race, and those of us with any privileged insight or influence should avoid one-sided discussion of the matter (if anything preferring to focus on constructive, collaborative possibilities, the better to raise them to salience and generate common knowledge).</p><p>Most of my remarks here are somewhat weakly held (if forcefully stated) and it seems important to gather perspectives on this. Inform me! 
All responses will be gratefully received.</p><p><em>Cross-posted to <a href="https://forum.effectivealtruism.org/posts/xABJoccsRyfXGNDEA/careless-talk-on-us-china-ai-competition-and-criticism-of">EA Forum</a></em></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Center for AI Safety, <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-19">AI Safety Newsletter #19, 2023-08-15</a></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>I'm supportive of some of CAIS' work, and the content of their newsletters (they have impressive breadth), and theirs is far from the only outfit which appears to produce confused or confusing messaging on the topic of US-China competition. In fact they seem to be better than many!</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>My response is at quote level 1. Excerpts from the CAIS letter are within, at quote level 2.
I interject a little for the purposes of this post, without quotation.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>e.g. deployment for weapons control or for offensive R&amp;D (bio, materials, ...)</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>'China [the governmental entity] has made several efforts' is a fairly standard use of language; governments are at least <a href="https://www.oliversourbut.net/i/136696969/parliaments">somewhat coherent and also do things with consequences</a> and subsequent plans 'in mind'. This sentence has the disadvantage of being baldly false, though (unless we posit a near-hivemind coherence to the people of China, which is absurd and obscene)</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>'China [the government] has [allowed/encouraged <s>made</s>] several efforts [because eventually they will probably seize the gains/means]' is a big stretch of the language, but at least somewhat consistent. To support this reading, though, there's a substantial additional claim that needs to be justified.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>'China [the impersonal collective economic entity] has made several efforts' would be a rather nonstandard use of language; economies of billion+ people do not make 'efforts' with consequences or subsequent plans. 
Leaving aside the implicature of agency, though, this sentence is a closer fit to reality. '[People and organisations in] China [have <s>has</s>] made several efforts' would be even better.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-8" href="#footnote-anchor-8" class="footnote-number" contenteditable="false" target="_self">8</a><div class="footnote-content"><p>Should conflict between non-territorial entities be worse for inter-bloc than intra-bloc? I think the point I made previously about zero-sum market-share competition suggests the opposite. But humans' destructive jingoistic/xenophobic tendencies are real, and a point in favour.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-9" href="#footnote-anchor-9" class="footnote-number" contenteditable="false" target="_self">9</a><div class="footnote-content"><p>'China [gov/military] has [been perceived as having] made several efforts... [but the reality is more nuanced]' is in large part one of the messages of this post! I don't think the original letter in question can have meant this, but I do maintain it as a hypothesis for the more general case of compressed discussion of political things. People are often <a href="https://www.lesswrong.com/posts/7cAsBPGh98pGyrhz9/decoupling-vs-contextualising-norms">imagining third-party reactions</a> when discussing political topics, and sometimes <a href="https://en.wikipedia.org/wiki/Use%E2%80%93mention_distinction">use-mention distinctions</a> fail to come across.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-10" href="#footnote-anchor-10" class="footnote-number" contenteditable="false" target="_self">10</a><div class="footnote-content"><p>'China has made several efforts... 
[in each case citing US provocation as justification]' is something that could legitimately be said, and has a clear meaning, even if in this particular case it is false if we understand 'China' to be the CCP or military.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-11" href="#footnote-anchor-11" class="footnote-number" contenteditable="false" target="_self">11</a><div class="footnote-content"><p>I'd tentatively go further and suggest you ought to train yourself to be <em>appalled when you catch yourself doing this without justification</em> because only then do you stand a chance of thinking clearly about politics.</p></div></div>]]></content:encoded></item></channel></rss>