Inside the expert network training every frontier AI model | Garrett Lord (Handshake CEO)
Why Handshake's Student Network Became an AI Training-Data Powerhouse
When product teams and researchers talk about what actually moves the needle for large language models today, the answer increasingly points to high-quality human-created data. Garrett Lord, co-founder and CEO of Handshake, explains how a decade-old campus recruiting platform transformed overnight into one of the fastest-growing suppliers of post-training data for frontier AI labs. The story is as much about product-market fit as it is about recognizing an unfair advantage: access to millions of students, thousands of advanced-degree experts, and a trusted campus brand.
From Careers Platform to Human Data Marketplace
Handshake began as a social careers network for college students and early-career professionals. That long-term accumulation of profiles, academic signals, and university partnerships created a rare asset: a direct, targetable audience of 18+ million users that includes hundreds of thousands of PhDs and master’s students. With labs shifting from pre-training on ubiquitous internet text to post-training that requires specialized, verifiable, and often multimodal data, that audience became a new product.
What Post-Training Data Looks Like
Post-training work covers supervised fine-tuning, reinforcement learning from human feedback (RLHF), trajectory capture, rubric-based evaluation, and multimodal labeling (audio, video, tool use). Garrett describes how experts—PhD researchers, domain specialists, and professional practitioners—are paid to discover model failure modes, provide ground-truth answers, and record step-by-step tool use. These units of data are often returned as structured JSON and packaged to be directly useful for post-training experiments and evaluation.
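To make "units of data returned as structured JSON" concrete, here is a minimal sketch of what one such unit might look like. The field names and schema are illustrative assumptions, not Handshake's actual format:

```python
import json

# Hypothetical shape of a single post-training data unit; field names
# are illustrative, not Handshake's actual schema.
unit = {
    "task_id": "chem-0042",
    "domain": "chemistry",
    "type": "rubric_eval",          # e.g. sft, rlhf_preference, trajectory
    "prompt": "Predict the major product of ...",
    "expert_response": "The major product is ... because ...",
    "rubric": [
        {"criterion": "identifies correct mechanism", "met": True},
        {"criterion": "cites relevant selectivity rule", "met": True},
    ],
    "tool_trace": [],               # step-by-step tool calls, if any
    "metadata": {"expert_credential": "PhD", "review_passed": True},
}

def validate(u: dict) -> bool:
    """Minimal check that a unit is well-formed before packaging."""
    required = {"task_id", "domain", "type", "prompt", "expert_response"}
    return required <= u.keys() and all("criterion" in r for r in u.get("rubric", []))

record = json.dumps(unit)           # units ship as structured JSON
assert validate(json.loads(record))
```

A schema like this is what makes the data "directly useful": a lab can filter by domain, check rubric coverage, and route trajectory-type units into tool-use training without manual triage.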
Why Experts Matter More Than Generalists
As models become more capable, the low-skill, generalist labor that once sufficed for simple labeling is less valuable. What frontier labs now need are experts who can break models in deep subdomains—mathematics, chemistry, physics, law, medicine—and produce constrained, verifiable datasets that improve reasoning, tool use, and domain-specific capabilities. Handshake's approach elevates contributors from anonymous micro-task labor to trained fellows who receive instruction, assessment, and higher compensation for demanding, high-skill work.
Three Priorities for Model Builders
- Quality: every unit of data must be precise and verifiable to avoid contaminating model behavior.
- Volume: labs need thousands to millions of units across focused hypothesis-driven experiments.
- Speed: rapid iteration allows researchers to test hypotheses and expand only the data pipelines that show gains.
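The third priority—expanding only the pipelines that show gains—can be sketched as a simple gating rule. The pipeline names, scores, and threshold below are invented for illustration; the point is the pattern of hypothesis-driven expansion:

```python
# Hypothetical sketch of the hypothesis-driven loop: pilot a small batch of
# units, measure the eval score before and after, and expand only the data
# pipelines whose gain clears a threshold. All numbers are illustrative.

def should_expand(baseline: float, with_new_data: float, min_gain: float = 0.01) -> bool:
    """Scale a pipeline up only if the measured eval gain is meaningful."""
    return (with_new_data - baseline) >= min_gain

# (baseline eval score, score after training on the pilot batch)
pipelines = {
    "organic_chemistry": (0.612, 0.641),
    "contract_law": (0.574, 0.576),
}

expanded = [name for name, (base, new) in pipelines.items()
            if should_expand(base, new)]
print(expanded)  # only pipelines showing real gains are scaled up
```

In practice a lab would use statistical tests rather than a fixed threshold, but the gating structure—pilot, measure, expand or kill—is what lets quality, volume, and speed coexist.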
Scaling a New Business Inside an Old One
Launching a disruptive product from within a mature company requires structure and separation. Garrett outlines how Handshake spun up a distinct organization with dedicated engineering, product, design, and operations teams focused solely on the AI data business. Early hires were entrepreneurial and comfortable with ambiguity, processes were metrics-driven, and the culture emphasized extreme ownership and rapid customer feedback.
Competitive Moat: Audience Over Ads
Many competitors buy users through expensive ads and recruiter outreach. Handshake’s decade-long relationships with 1,600 universities and high brand affinity mean near-zero acquisition cost and higher conversion and retention for expert contributors. That audience access becomes the primary moat in human-sourced training data.
The Broader Impact On Careers And Research
Rather than displacing graduates, accessible AI tools combined with paid assessment work can accelerate career outcomes. Young people who are AI-native gain outsized productivity advantages, while PhD fellows earn substantial per-hour rates by doing specialized labeling and evaluation work that simultaneously informs their research and classrooms. For employers and universities, this model promises better talent matching, improved educational design, and measurable benefits to the labor market.
Types Of Data To Expect Next
Future datasets will grow beyond text: CAD files, scientific instrument telemetry, multimodal video trajectories, annotated tool interactions, and richer audio corpora. Synthetic data has a role in verifiable domains, but domain-specific human data will remain essential for many years as labs chase narrow, high-value capability gains.
Handshake’s pivot illustrates how an established network and deep domain trust can be repurposed into a high-velocity human data engine for AI labs pursuing post-training improvements. The result is faster model progress, new work pathways for experts, and a business whose growth shows how audience access and expert quality can define competitive advantage.
Key points
- Handshake leveraged 18 million students and alumni, including 500,000 PhDs and 3 million master’s students.
- Handshake launched a post-training data business that reached $50M ARR within four months.
- Post-training work focuses on supervised fine-tuning, RLHF, trajectory capture, and rubric-based evals.
- Model builders prioritize quality, volume, and speed when buying human-created training data.
- Experts can earn $100–$200 an hour performing high-value labeling and reasoning tasks.
- Handshake’s competitive advantage is near-zero acquisition cost via university partnerships.
- Successful internal spinouts require separate teams, metrics cadence, and entrepreneurial hires.
- Human-in-the-loop labeling remains critical for domain-specific gains for at least the next decade.