Synthetic Data, GenAI, and What Data Services Staff Need to Know
Slides: https://doi.org/10.5281/zenodo.17013216

Synthetic data are used in a wide variety of situations to help researchers work around inaccessible “real data”. This inaccessibility can be due to the expense of data collection, sensitive existing data with limited access, or the need for quick data to test a method or develop code. In recent years, Generative AI (GenAI) tools have both leveraged synthetic data in their training and made the creation of synthetic datasets far more accessible. This presentation will explore some of the ways that synthetic data are talked about and used. We will explore aspects of particular interest to data librarians, such as advice to give researchers interested in using GenAI with or for synthetic data. Finally, we’ll share some tips and caveats for creating synthetic data with GenAI.

Speaker Bios:

Joanna Schroeder is a Data Services Librarian at Boston College. She specializes in finding and using data in creative ways to provide insights into complex problems. Previously, she worked as a data science research specialist at the University of Virginia’s Biocomplexity Institute. She received her MLIS from Drexel University.

Matt Jansen is the Data Analysis Librarian for the University Libraries at the University of North Carolina at Chapel Hill. He provides support on data preparation and analysis across a variety of data formats and research goals. He also serves as liaison to the School of Data Science and Society.

Lorin Bruckner has worked as a Data Visualization Librarian at UNC Chapel Hill since 2016. While assisting researchers at the University Library, she relies on her knowledge and experience in data analysis as well as her 10-year background in visual design. She obtained her MS in Information Science at the University of Illinois at Urbana-Champaign.

Michele Hayslett is the Librarian for Numeric Data Services and Data Management in the University Libraries at UNC at Chapel Hill and served in similar positions at NC State University Libraries and the State Library of North Carolina. She received her MSLS from UNC at Chapel Hill.


