
Scaling Laws: What Keeps OpenAI’s Product Policy Staff Up at Night? A Conversation with Brian Fuller

August 8, 2025
The Lawfare Podcast
https://feeds.acast.com/public/shows/60518a52f69aa815d2dba41c

How rules get built inside the black box

Behind the polished interface of a popular chatbot sits an architecture of judgment calls: design choices, trade-offs, and conversations that decide what a model will say, who it will serve, and which harms it must avoid. Those decisions are the work of product policy — an awkwardly named discipline that blends strategy, ethics, engineering translation, and practical governance. The people who practice it are not just lawyers or public affairs specialists; they are the glue between researchers chasing capability and organizations trying to shepherd those capabilities into a world that can absorb them.

Where strategy meets software

Product policy teams begin long before a product ships. They translate a product’s imagined features into a landscape of possible harms, then build mitigations and tests to ensure those harms remain theoretical. The work is strategic: it aligns with long-term business objectives—growing user adoption and demonstrating utility—while also responding to the shifting regulatory terrain in places like Brussels and Washington, D.C. This dual lens forces a constant negotiation between commercial timelines and social thresholds.

From brainstorm to risk matrix

A practical ritual animates the work. Teams sketch features, imagine interface flows and edge cases, and convene stakeholders across engineering, legal, research, and operations. They build risk matrices listing undesirable outcomes and propose mitigations. The mitigations must be feasible, consistent with product vision, and verifiable by testing. When a test fails, the product is rolled back and the mitigations are rethought. That iterative loop — brainstorm, mitigate, test, repeat — is the closest thing to an operating manual for responsible deployment.
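The brainstorm-mitigate-test loop described above can be sketched as a simple data structure. This is a minimal illustration only; the risk categories, mitigation names, and pass/fail tests below are hypothetical stand-ins, not OpenAI's actual process or tooling.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Risk:
    outcome: str                # undesirable outcome surfaced in brainstorming
    severity: str               # e.g. "low", "medium", "high"
    mitigation: str             # proposed control for this outcome
    test: Callable[[], bool]    # verification: True if the mitigation holds up

@dataclass
class RiskMatrix:
    risks: List[Risk] = field(default_factory=list)

    def review(self) -> List[Risk]:
        """Run every test and return the risks whose mitigations failed,
        i.e. the ones that must be rethought before the product ships."""
        return [r for r in self.risks if not r.test()]

# Hypothetical matrix: the lambdas stand in for real adversarial evals.
matrix = RiskMatrix([
    Risk("model reveals private data", "high",
         "output filtering", test=lambda: True),
    Risk("model gives harmful instructions", "high",
         "refusal training", test=lambda: False),
])

for r in matrix.review():
    print(f"rework mitigation for: {r.outcome}")
```

In this sketch a failed test sends the risk back into the loop rather than out the door, mirroring the iterate-until-verified discipline the episode describes.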

Trade-offs that feel philosophical and intensely practical

Good policy work recognizes that values often trade off against one another. Safety and privacy, for example, can sit at opposing poles: a world optimized for safety may tolerate invasive monitoring, while a privacy-first approach can leave dangerous gaps in protective capacity. Product policy is the discipline of navigating those trade-offs, intentionally populating teams with diverse perspectives so decisions do not default to a single cultural or professional lens.

Local values, global consequences

Products built in a handful of cities are used everywhere. The diffusion of modern generative models is historic in speed and scale, meaning policy choices reflect and affect an entire planet. To avoid parochial outcomes, product policy work needs a global aperture: consulting regional partners, sending delegations to distant contexts, and translating local concerns into product requirements so that different communities’ needs are not an afterthought.

When mitigations surprise the designers

Models do not obey intuition. A policy that seems straightforward can produce bizarre, unintended outputs when a model invents ways to satisfy constraints. One concrete lesson arises from content moderation on image generation: simplistic rules about nudity produced photorealistic figures that circumvented the intended ban. The error was not theoretical; it revealed how policies must be written with a deep understanding of model behavior and how models can exploit rule boundaries.

Testing, humility, and collaborative design

The antidote to these surprises is humility plus multidisciplinary collaboration. Writing rules in isolation invites failure. When policy is authored in tandem with engineers and researchers, the resulting controls are more likely to be robust. The people doing the testing must be empowered to fail tests honestly; sandbagging or constructing tests to reach a predetermined conclusion undermines the entire safety regime. Formal processes that require transparent testing and realistic failure modes strengthen the product lifecycle.

Scaling risks: from snarky replies to biosecurity

Concerns have shifted as models have evolved. Early anxieties centered on toxic language or political slant; today’s frontier worries are heavier and more consequential. Models that can hypothesize practical ways to harm others — especially in biological, chemical, radiological, or nuclear realms — create new governance imperatives. The diffusion of know-how is increasingly the bottleneck: if dangerous technical knowledge becomes easy to generate, traditional controls based on limiting access to equipment or privileged expertise become insufficient.

What industry leadership looks like

Private labs can’t substitute for governments, but they can help define what “safe enough” looks like. That means publishing rigorous standards, convening expert validators, and cooperating across jurisdictions so that one lab’s lax practices do not become the weak link for global security. It also means avoiding the theatrical posture of being a lone gatekeeper; leadership in this domain is about building consensus and shared metrics for evaluation.

Process, people, and the path forward

Policy work is also an apprenticeship. The fastest route into the space is not a single credential but a pattern of proximity: working alongside engineers, arguing in the room where trade-offs are made, and cultivating the humility to admit what one does not know. That openness allows experts to inform decisions and produces policies that behave well in messy, real-world conditions.

  • Embed early: Involve policy perspectives from ideation through testing to anticipate harms.
  • Test honestly: Create transparent, adversarial testing regimes and act on failures.
  • Globalize governance: Translate local concerns into product design and consult diverse communities.
  • Value pluralism: Deliberately include conflicting viewpoints to prevent single-value dominance.

Policy is not a one-time checkbox but an ongoing craft of translation: converting human values into rules that live inside systems that do not share human instincts. The work is imperfect and iterative, but when done with intellectual humility and collaborative rigor it is the most effective mechanism society has for shaping what powerful technologies will be permitted to do.

Even as models accelerate, the deeper test will be whether those charged with guarding the road can keep pace not by asserting certainty, but by designing institutions and processes that adapt when the map changes.

Insights

  • Embed policy experts with engineering teams early to identify and mitigate risks before deployment.
  • Design tests to be adversarial and transparent so failures reveal true weaknesses rather than masked success.
  • Include cultural and regional perspectives in policy design to avoid narrow value assumptions and unintended harms.
  • Treat policy-making as collaborative work across research, engineering, legal, and operations to produce robust outcomes.
  • Prioritize potential existential and biosecurity risks as models grow more capable, rather than focusing only on surface harms.

