Language Generation in the Limit - Jon Kleinberg

Computer Science/Discrete Mathematics Seminar I
10:30am | Simonyi Hall 101 and Remote Access
Topic: Language Generation in the Limit
Speaker: Jon Kleinberg
Affiliation: Cornell University
Date: April 21, 2025

Although current large language models are complex, the most basic specifications of the underlying language generation problem itself are simple to state: given a finite set of training samples from an unknown language, produce valid new strings from the language that don't already appear in the training data. Here we ask what we can conclude about language generation using only this specification, without any further properties or distributional assumptions. In particular, we consider models in which an adversary enumerates the strings of an unknown target language that is known only to come from a possibly infinite list of candidate languages, and we show that it is possible to give certain non-trivial guarantees for language generation in this setting.

The resulting guarantees contrast dramatically with negative results due to Gold and Angluin in a well-studied model of language learning where the goal is to identify an unknown language from samples; the difference between these results suggests that identifying a language is a fundamentally different problem than generating from it.

The talk will cover joint work with Sendhil Mullainathan and with Fan Wei.
