Citations Needed
Magic Encyclopedias to Save the World
Last week FLF launched a competition “to find the best workflows and methodologies for using AI to produce reliable, trustworthy knowledge bases”. I had (and still have) a substantial role in that effort. Why do I think it’s so important? For a lot of reasons, actually! I’ll gesture at a few here.
Conjuring a magic encyclopedia
For now, assume with me that it can be done. Wish away with me the various technical and financial challenges. Great! Now we can rapidly conjure up a deeply, fully researched knowledge base on any topic. All claims point back to who’s said them, in what context, and (importantly) with what justifications and evidence (if any). Any quibbles or nuances which have been expressed on a point are similarly readily available. It’s not opinionated: all competing viewpoints, with their associated justifications, are laid out side by side and comparable.
That’s way, way too much information! Imagine trying to read everything ever about diet or shipping or taxes or microbes. It’s not happening. So as well as this, we now magically have tools which gather similar points together, summarise, and can make a decent stab at which points we’ll consider most or least relevant. We can dig deeper (or send AI agents to scout deeper) as desired. And when new interesting and informative content arises, or in contexts where nuance and clarification are helpful, it can be bubbled up to our attention.
All this is doable today: enough web searches, enough cross-referencing of tweets, articles, journals, following of citation chains, gathering and comparing of hypotheses and points of view, etc. will make progress. But it’s exhausting. When someone does go to those lengths, their partial — but heroic — efforts to map out what’s been said often languish either unpublished or unrecognised.
Don’t we already have this? A shining example is Wikipedia, where the collective curatorial effort of a wide range of editors gradually maps out an expanding core of topics and commentary. But Wikipedia has lags, biases, and (perhaps most importantly) huge gaps, especially on important frontier questions. (Let’s not talk about Grokipedia.[1])
Meanwhile, the tech to ‘smartly browse’ and bubble up informative pieces is nascent too, in bits and pieces like AI chatbots and community notes: already useful in their ways, but faltering, unreliable.
It’s these comparison points, and the early progress I see, which give me some excitement that the grander vision is viable and that we can take steps towards it now.
Who cares?
I’m not naive. I know that many (most?) humans a lot of the time aren’t actually interested in finding out or sharing what’s true; mainly they want to say what makes themselves and their friends seem popular and cool… and enemies seem dastardly and disgusting. We all have these impulses to a greater or lesser extent. Yes, you too! Sometimes those impulses seem deranged (they’re not designed for the modern world); other times they might even make sense, at least selfishly.
Nevertheless (and perhaps mysteriously!), a lot of the time, some people actually want to find out true things and share them. (I do! Do you?) Hence journalism, science… even hearsay and rumour (at their — perhaps rare — finest). We recognise that, when they’re actually anchored and doing their best to be right (or at least less wrong), those are absolutely foundational to wellbeing and prosperity in a modern society. Without (good) journalism, politics runs astray and tyrants abound. Without (grounded) science and technology, public health suffers, food supply and shelter and infrastructure decay, and progress falters.
As Ben Goldhaber and I previously wrote:
Knowledge is integral to living life well, at all scales:
Individuals manage their life choices: health, career, investment, and others on the basis of what they understand about themselves and their environments.
Institutions and governments (ideally) regulate economies, provide security, and uphold the conditions for flourishing under their jurisdictions, only if they can make requisite sense of the systems involved.
Technologists and scientists push the boundaries of the known, generating insights and techniques judged valuable by combining a vision for what is possible with a conception of what is desirable (or as proxy, demanded).
More broadly, societies negotiate their paths forward through discourse which rests on some reliable, broadly shared access to a body of knowledge and situational awareness about the biggest stakes, people’s varied interests in them, and our shared prospects.
(We’re especially interested in how societies and humanity as a whole can navigate the many challenges of the 21st century, most immediately AI, automation, and biotechnology.)
But our knowledge-producing institutions are plagued by publish-or-perish and clickbait incentives alike[2], and the social media landscape is even worse, riddled with misinformation and brainrot from all political quarters. I care about this. So do you, I daresay.
I especially care now, as society is poised before a series of important decisions about our future relationship with technology, especially AI. It could be ruinous, with tyranny, neo-feudalism, or extinction real prospects. Or it could be fantastic. Just wanting it to be OK isn’t enough: we have to seek, generate, share, and defend important knowledge — about developments in technology, as well as about trends in politics and power — and act on it.
How do we actually help?
There’s no single path or silver bullet. But the incredibly high-level picture is: better communication of knowledge is usually good. It helps people be more informed and make better decisions according to their needs. A better shared understanding makes it easier for people to work together toward shared goals (even if they don’t agree on all priorities). On average, if people make better decisions and can work together better, we’ll get more flourishing and less catastrophic risk.
We’re trying to stimulate one piece of this picture with the knowledge-base direction. Heavy-handedly adjudicating what’s true rarely works.[3] Instead, equip people with the fullest picture possible, as accessibly as possible, and we find our way: evidence adds up, and when it doesn’t, that means we need to look for more.
As Scott Alexander wrote years ago (emphasis partly mine),
Logical debate has one advantage over narrative, rhetoric, and violence: it’s an asymmetric weapon. That is, it’s a weapon which is stronger in the hands of the good guys than in the hands of the bad guys. In ideal conditions (which may or may not ever happen in real life)... when done right, it can only prove things that are true. … Unless you use asymmetric weapons, the best you can hope for is to win by coincidence.
I’m not focused on logical debate per se, and in any case I wouldn’t be so Manichean about it: we’re all ‘good guys’ sometimes and ‘bad guys’ sometimes (whether we mean it or not). Still, the articulation is compelling. Humanity has accrued a slowly-growing arsenal of these asymmetric weapons: libraries, citation, scientific review[4], databases, encyclopedias, web search, to name a few. Today they’re creaking under the weight of a confusing information deluge and assaulted by powerfully-vested interests.
I earnestly believe that an upgrade to truth-seekers’ ability to find and scrutinise information, to build and share fuller pictures of topics at hand, can be ‘infectious’: more people more of the time can see a little further, pierce a little more of the fog of confusion and misinformation, be better epistemically defended, and embody — and exemplify — truth-seeking cognition. When people (through malice or negligence) spread confusion and falsehoods, they’re that bit more likely to face scrutiny and consequences. After all, we’re making that scrutiny cheaper, easier, and more accessible.
This applies whether the ‘wielders’ of these new weapons are curious members of the public, scientists, analysts in public institutions, business leaders and technologists, or even the AI assistants those folks recruit to accelerate their work.
In politics, that can mean that people engage more often in collaborative, truth-seeking cognition and less in tribal cognition. And in technology, it can mean that more people can stay better abreast of the important shifts and prospects that will shape our future — helping the public hold decisionmakers to account (and choose better ones), and helping those decisionmakers sincerely and deeply engage with the topics at hand. I want these kinds of epistemic heroics to become commonplace, and I want the epistemic giants among us to stride further still.
[1] Though the execution was quite flawed, I think the idea behind Grokipedia (namely, to get AI to substantially help with curating knowledge bases and to use that for collective epistemics) was in the right direction. Unfortunately it was mostly a vanity project, and little thought appears to have been given to grounding or validation, making it less useful than Wikipedia.
[2] Do you remember the replication crisis, which we’re still dragging ourselves out of? The new disease of importance hacking? Have you ever critically read a newspaper for rhetorical slant? Taking a more cynical stance, it’s not only clickbait and publish-or-perish (which are regrettable incentive pressures, but hardly attributable to malice). Science and journalism alike have deep political and adversarial infections as well.
[3] (And even if I wanted to, I don’t have particularly heavy hands, alas.)
[4] I feel compelled to point out that the current state of ‘official’ journal- and conference-managed scientific review is truly dire, especially in some fields including psychology and AI. I hold up the ideal of scientific review, not its pale and diseased shadow as sometimes charaded on Earth.



