Declarative MLOps - Streamlining Model Serving on Kubernetes // Rahul Parundekar // MLOps Meetup #123

MLOps Community Meetup #123! Last Wednesday, we talked to Rahul Parundekar, Founder of A.I. Hero, Inc.

// Abstract
Data scientists prefer Jupyter Notebooks for experimenting with and training ML models. Serving these models in production can benefit from a more streamlined approach that guarantees repeatability, scalability, and high velocity. Kubernetes provides such an environment. And while third-party solutions make serving models easier, this talk demystifies how native K8s operators can be used to deploy models, along with best practices for containerizing your own model and for CI/CD using GitOps.

// Bio
Rahul has 13+ years of experience building AI solutions and leading teams. He is passionate about building Artificial Intelligence (A.I.) solutions for improving the Human Experience. He is currently the founder of A.I. Hero - a platform to help you fix and enrich your data with ML. At AI Hero, he has also been a big proponent of declarative MLOps - using Kubernetes to operationalize the training and serving lifecycle of ML models - and has published several tutorials on his Medium blog. Before AI Hero, he was the Director of Data Science (ML Engineering) at Figure-Eight (acquired by Appen), a data annotation company, where he built out a data pipeline and an ML model serving architecture serving 36 models (NLP, Computer Vision, Audio, etc.) and traffic of up to 1M predictions per day.

// Jobs board
https://mlops.pallet.xyz/jobs

// Related links
Website: https://aihero.studio
The Declarative MLOps Series:
/ streamlining-machine-learning-operations-w...
/ containerizing-and-serving-an-ml-model-wit...
/ continuous-integration-for-serving-ml-mode...
/ continuous-delivery-of-ml-models-on-kubern...
---------- ✌️Connect With Us ✌️------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, Feature Store, Machine Learning Monitoring, and Blogs: https://mlops.community/
Connect with Demetrios on LinkedIn: / dpbrinkm
Connect with Rahul on LinkedIn: / rparundekar

Timestamps:
[00:00] Musical introduction to Rahul Parundekar
[04:15] LLMs in Production Conference announcement
[04:36] Purchase our Swag shirt!
[06:45] Declarative Paradigm
[08:40] Why now?
[09:31] It's great for scalability
[10:01] Most MLOps tools work well with K8s
[11:00] Easy-deploys with tool-provided CRDs
[11:57] Caveats
[13:46] This talk
[14:09] 3 Ways to Serve ML Models
[14:14] Way 1: Serving a Model with an HTTP Endpoint
[15:08] Way 2: Serving the Model with a Message Queue
[15:43] Way 3: Long-running Task that Performs Batch Processing
[18:17] Build your own container
[20:00] The main predictor (1/2): Singleton with load method
[20:23] The main predictor (2/2): Predict
[20:47] Way 1: 5 steps
[23:54] Way 2: 2 steps
[25:03] Way 3: 2 steps
[26:00] Tests: Sanity check for the model
[26:53] Bringing it together: Entrypoint
[31:49] Continuous Integration (CI)
[34:35] Create docker-compose.yaml to make it easier for CI
[36:00] On PR: Run tests with Github Actions
[36:38] Branch-protection
[37:51] On PR: Github Actions automatically runs our test
[38:10] On PR: PRs can be then merged on approval
[38:28] Container Repository
[39:15] Continuous Integration (CI)
[39:26] On merge to main
[40:45] Actions that can constraint
[42:38] TODO
[43:17] Continuous Delivery
[45:42] Argo CD
[46:39] Image promotion with Kustomize
