Microsoft Chief AI Officer: How to Fix Broken LLMs | Dr. Paul Rodrigues

Dr. Paul Rodrigues, Chief AI Officer for Microsoft's National Security Group, reveals the hidden layers of risk within large language models and how to defend them. In this comprehensive briefing, learn why the technology industry is facing unprecedented security threats from releasing highly advanced AI models before we fully understand how their inner layers interact. Dr. Rodrigues explains the "AI Onion", a structural model that maps vulnerabilities from core weight alignment and reinforcement learning up to user experience applications. Discover the engineering challenge of locating hidden objectives inside neural networks, identifying dataset bias introduced during reinforcement learning from human feedback (RLHF), and establishing bulletproof defense infrastructures. This talk outlines the operational blueprints used by the Microsoft AI Red Team to holistically stress-test systems, bypass guardrails safely, and implement point-and-click Retrieval Augmented Generation (RAG) frameworks like GraphRAG to keep data securely grounded.

⏱️ Timestamps
0:00 - Intro: Dr. Paul Rodrigues, Chief AI Officer
0:35 - Mapping Vulnerabilities Across LLM Layers
2:23 - The History and Evolution of NLP Models
7:55 - AI Alignment and Human Preference Risks
11:37 - Uncovering Hidden Objectives and Ethics
14:02 - Platform Security: Guardrails, RAG, & Red Teaming
21:29 - Q&A: Automation Paradox and Dataset Diversity

Key Takeaways
The Hidden Core: Inspecting an artificial intelligence model for hidden motivations or malicious alignment remains a labor-intensive, manual process involving interactive interrogation.
Dataset Value Bias: When organizations establish specialized annotation guidelines for human preference pools, they bake explicit cultural, organizational, or ethical perspectives into the model's downstream operations.
Holistic Defense: Security cannot focus solely on the base neural network; safety teams must audit the entire integrated application infrastructure, including guardrails, grounding rules, meta-prompting, and systemic data logging.

#Cybersecurity #generativeai #largelanguagemodels #aisecurity #techbriefing #datasovereignty
