In an effort to write more and to think less, I’ll be publishing something every day during November. I’ve a bunch of drafts and partially-written posts which have merit but no polish; I’m hoping these short, low-polish, daily essays will give me an opportunity to say something rather than let the ideas decay slowly.
If any of these ideas make it into a full post, I’ll link to the full post.
These mini-essays will be lower quality, poorly thought out, basically unedited, and in general a lot more off-the-cuff. Imagine them as closer to Boyd’s stream-of-consciousness or late-night bar-thoughts than to the other essays I write. But please let me know if any of them resonate!
AI Safety has a scaling problem (2 November 2025)
AI safety research is (mostly) done via fellowship funnels that filter the population of the internet down to people who have shown they can do AI safety research. This process works well, but it requires one-on-one mentorship from academics and researchers, and thus cannot scale. Anthropic recently ran a fellowship for which they accepted 32 people from a total of 2,000 applicants (a 1.6% acceptance rate). MATS mentors were recently talking on Twitter about the absurdly high qualifications of the applicants. High bars are good, but it’s not good that people who can do AI safety are not in fact doing AI safety. This is a resource-constraint problem, and it deserves attention. I’ve got ideas for an AI research bounty, in which anyone can post cash prizes for completing some well-defined piece of work, and anyone can submit a completion of that work and receive the prize. But a proper description will have to wait for the full essay.
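To gesture at the shape of the thing (everything here is hypothetical, the real design is for the full essay), the core of a bounty board is just two objects: a bounty with a spec and a prize, and submissions against it.

```python
from dataclasses import dataclass, field

@dataclass
class Submission:
    author: str
    artifact_url: str  # a paper, repo, or write-up
    accepted: bool = False

@dataclass
class Bounty:
    spec: str       # the well-defined piece of work
    prize_usd: int
    poster: str
    submissions: list[Submission] = field(default_factory=list)
    open: bool = True

    def submit(self, author: str, artifact_url: str) -> Submission:
        """Anyone can submit a claimed completion of the work."""
        sub = Submission(author, artifact_url)
        self.submissions.append(sub)
        return sub

    def accept(self, submission: Submission) -> int:
        """The poster accepts one submission; the prize pays out."""
        submission.accepted = True
        self.open = False
        return self.prize_usd
```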
When to be risky and when to be safe (1 November 2025)
There’s a trade-off that’s often made but rarely considered, best described through the analogy of dating: when dating to marry1, you see some people attempt to make every date go well, and try to ensure everyone they date will like them. This seems like a reasonable goal, but I fear it collapses every potential date into a single point, as though they’d all respond identically. If everyone reacted the same way, it would be reasonable to play it safe and attempt to please everyone. But that’s not the case. In reality, (1) everyone reacts to different things in different (often conflicting) ways, and (2) you don’t care about making everyone think you’re a chill dude (a mediocre response), you care about making one person think you’re the best person on the planet (an extreme response). You will not get an extreme response by attempting to please everyone; you must be specific enough that you displease many people in order to maximally fit with one person.
This applies more generally: you can either pursue a strategy that optimises your “median” performance (never having any terrible experiences, but also never having any amazing ones), or one that optimises your “maximum” performance (often having bad experiences, but occasionally having amazing ones).
Which strategy is optimal comes down to the cost of failure, the number of good outcomes you need, and the bar above which you succeed2. When building a start-up, you cannot afford to play it safe, because if you take “normal” decisions you’ll end up like “normal” start-ups, which is to say you’ll go out of business. You must be weird and take big risks, because the bar for success sits above the median performance. You don’t need to figure out the right way to run a business a hundred times, you just need to do it once. So exploration trumps exploitation (in this example). Likewise for finding a job: you just need one company to give you an offer; you don’t care about a hundred companies thinking you’re a nice guy but not quite what they’re looking for. You’d rather have 99 companies think you’re not a good fit and one company think you’re perfect than have 100 companies think you might be a good fit. “Might be a good fit” does not get you hired.
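To make the trade-off concrete, here’s a minimal simulation (a sketch with made-up numbers, nothing rigorous): both strategies have the same average outcome, but when the bar for success sits above the median, only the high-variance strategy clears it with any regularity.

```python
import random

random.seed(0)

BAR = 2.0        # success threshold, well above the average outcome
ATTEMPTS = 100   # dates, job applications, product decisions...
TRIALS = 10_000

def best_of(attempts: int, sigma: float) -> float:
    """Best outcome across many attempts; each outcome ~ Normal(0, sigma)."""
    return max(random.gauss(0, sigma) for _ in range(attempts))

# Same mean, different variance: "safe" rarely strays far from average,
# "risky" produces plenty of duds but also occasional extremes.
safe_wins = sum(best_of(ATTEMPTS, sigma=0.5) > BAR for _ in range(TRIALS))
risky_wins = sum(best_of(ATTEMPTS, sigma=2.0) > BAR for _ in range(TRIALS))

print(f"safe strategy clears the bar in  {safe_wins / TRIALS:.1%} of runs")
print(f"risky strategy clears the bar in {risky_wins / TRIALS:.1%} of runs")
```

The point isn’t the exact numbers; it’s that maximising your best draw rewards variance, while maximising your median punishes it.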