LinkML tutorial at ISMB 2024 (3.5 hours)
Tutorial VT3: Using LinkML (Linked data Modeling Language) to model your data. Monday, July 8, 2024.

Presenters:
Sierra Moxon, software developer, Lawrence Berkeley National Laboratory
Kevin Schaper, software developer, University of North Carolina
Patrick Kalita, software developer, Lawrence Berkeley National Laboratory

Slides: http://bit.ly/LinkML-2024

LinkML (Linked data Modeling Language; linkml.io) is an open, extensible modeling framework that allows computers and people to work cooperatively to model, validate, and distribute data that is reusable and interoperable. It is designed to create interoperable data from the start without the overhead normally required for doing this. LinkML can help even non-techies create better, FAIRer, more reusable data models backed by ontologies.

Collecting and organizing biomedical data for an individual project presents a huge challenge; doing so in a way that allows for later reanalysis and reuse across projects is even harder. Many data standards are not machine-actionable, or are defined in isolation, leading to siloization. The quantity and variety of data being generated in biomedical fields are increasing rapidly, but the data are still often captured in unstructured formats like publications, posters, lab notebooks, or spreadsheets. Researchers at all levels struggle with collecting, managing, and analyzing data and complex knowledge, due to a confusing landscape of schemas, standards, and tools. These challenges impede scientific progress and limit our ability to tailor treatments based on data (precision medicine). AI and ML increasingly enable large-scale data analysis, but lack of data harmonization limits cross-disciplinary applications.

LinkML addresses these issues, weaving together elements of the Semantic Web with aspects of conventional modeling languages to provide a pragmatic way to work with a broad range of data types, maximizing interoperability and computability across sources and domains.
LinkML meets data producers where they are technically, and speaks many different modeling languages. Data models can be authored in a variety of languages including YAML, JSON Schema, or even spreadsheets. LinkML supports all steps of the data analysis workflow: data generation, submission, cleaning, annotation, integration, and dissemination. LinkML enables even non-developers to create data models that are understandable and usable across the layers from data stores to user interfaces, reducing translation issues and increasing efficiency.

LinkML is an easy-to-use framework that both emerging and established data-generating communities can use to generate interoperable, reusable datasets and workflows. It has already seen wide uptake by projects across the biomedical spectrum and beyond, including the German Human Genome-Phenome Archive, Critical Path Institute, iSample project, National Microbiome Data Collaborative, Center for Cancer Data Harmonization, INCLUDE project, NCATS Biomedical Data Translator, Reactome, Alliance of Genome Resources, Open Microscopy Environment (Next Generation File Format), and Genomic Standards Consortium.

In this tutorial, we will discuss best practices for data modeling; introduce LinkML as a modeling framework and tool suite; work together to set up a LinkML project from scratch; develop a model and validate it with test data; and auto-generate model documentation. If time permits, we will discuss the LinkML tool Schema Automator and the use of LLMs with LinkML models.

Learning Objectives
1. Learn how to author a new data model that exercises some of the main LinkML modeling components.
2. Understand common LinkML schema best practices.
3. Generate documentation for the new model, and get familiar with generating the model in different formats.
4. Time permitting, get familiar with LinkML’s bootstrapping tools that help migrate existing models to LinkML.
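
To make the "develop a model and validate it with test data" step more concrete, here is a minimal sketch in plain Python. It does not use the linkml package: the dict only mirrors the general layout of a LinkML YAML schema (classes, attributes, range, required), and the `Sample` class, its slots, the example URL, and the `validate` helper are all hypothetical names chosen for illustration.

```python
# Toy illustration only: this does NOT use the linkml package.
# The dict mirrors the layout of a LinkML YAML schema
# (classes -> attributes -> range/required); all names are hypothetical.
SCHEMA = {
    "id": "https://example.org/sample-schema",
    "name": "sample_schema",
    "classes": {
        "Sample": {
            "attributes": {
                "id": {"range": "string", "identifier": True, "required": True},
                "organism": {"range": "string", "required": True},
                "depth_m": {"range": "float"},
            }
        }
    },
}

# Map the type names used above to Python types for this toy checker.
PYTHON_TYPES = {"string": str, "integer": int, "float": (int, float)}


def validate(record: dict, class_name: str, schema: dict = SCHEMA) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    attributes = schema["classes"][class_name]["attributes"]
    problems = []
    for name, spec in attributes.items():
        if name not in record:
            # Only required slots are reported when absent.
            if spec.get("required"):
                problems.append(f"missing required slot '{name}'")
            continue
        range_name = spec.get("range", "string")
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, PYTHON_TYPES[range_name]):
            problems.append(f"slot '{name}' should be of type {range_name}")
    # This toy checker also rejects slots the class does not declare.
    for name in record:
        if name not in attributes:
            problems.append(f"unknown slot '{name}'")
    return problems


good = {"id": "S1", "organism": "E. coli", "depth_m": 12.5}
bad = {"id": "S2", "depth_m": "deep", "collector": "n/a"}

print(validate(good, "Sample"))
print(validate(bad, "Sample"))
```

In an actual LinkML project the schema would normally live in a YAML file and be checked with LinkML's own validation tooling rather than a hand-written function like this one; the sketch only shows the idea of declaring slots once and testing records against them.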
