Representation-driven Option Discovery in Reinforcement Learning, Marlos C. Machado

DS4DM Coffee Talk
Marlos C. Machado – University of Alberta, Canada
Aug 23, 2023

The ability to reason at multiple levels of temporal abstraction is a fundamental aspect of intelligence. In reinforcement learning, this attribute is often modeled through temporally extended courses of action called options. Despite the popularity of options as a research topic, they are seldom included as an explicit component in traditional solutions within the field. In this talk, I will try to explain why this is the case and emphasize the vital role options can play in continual learning.

Rather than assuming a predetermined set of options, I will introduce a general framework for option discovery, which uses the agent's representation to discover useful options. By leveraging these options to generate a rich stream of experience, the agent can improve its representations and learn more effectively. This representation-driven option discovery approach creates a virtuous cycle of refinement, continuously improving both the representation and the options, and it is particularly effective for problems that require agents to operate at different levels of abstraction to succeed.
