Emergent Intelligence: Sanjay Krishnan On AI’s Rapid Evolution
Why AI suddenly feels useful: data, compute, and usage explained
Large language models (LLMs) reached practical usefulness because three forces converged: massive human-generated datasets, abundant compute (GPUs and cloud clusters), and high-volume real-world usage. These combined resources let models learn linguistic patterns from broad corpora and begin to approximate human-like responses across topics. Usage data—billions of queries per day—acts as a grounding signal that converts probabilistic text prediction into real-world value.
How usage makes models practical: grounding and feedback loops
Raw language prediction becomes actionable when systems are used interactively. Usage signals tell developers which outputs are helpful, and when connected to real services (calendar, food ordering, APIs) models start taking decisions that affect people. That bridge from text prediction to decision-making is where threats and benefits both intensify.
New cybersecurity threats for an AI-powered internet
AI agents introduce novel attack surfaces not seen with static websites or databases. Two key risks are context poisoning (manipulative prompts or framing) and training-data poisoning (maliciously injected examples that alter future model behavior). Because models are highly adaptive, small, well-crafted inputs can disproportionately shift outputs—raising concerns for enterprise reputation, physical safety, and service integrity.
Detecting and mitigating poisoning: provenance, behavior, and vetting
Practical defenses blend traditional cybersecurity with new controls: behavior-based monitoring (IP anomalies, repetitive content), data provenance tracking, watermarking or source tagging, and mandatory human review for customer-facing outputs. Purely automated “ignore bad data” approaches are insufficient—organizations must combine organizational safeguards, logging, and human-in-the-loop checks to reduce risk.
Breakthroughs to watch: multimodal reasoning and coding assistants
Two rapid advances surprised researchers: the leap in coding assistants that accelerate software tasks, and multimodal reasoning that links text, images, and other inputs. These changes unlock new product categories (image-aware agents, retrieval-augmented generation, and collaborative model ensembles) while shifting how educators, developers, and enterprises adapt workflows.
Future direction: marketplaces of specialized models and ensemble intelligence
Rather than a single dominant model, the next wave will emphasize specialized models and networks of models (databases + creativity models + domain experts) that interoperate. Enterprises will combine retrieval-augmented generation, consensus architectures, and model-routing strategies to balance creativity, factual accuracy, and safety.
- Keywords covered: large language models, training data poisoning, data provenance, multimodal AI, agentic internet, retrieval-augmented generation.