ncibtep@nih.gov

Bioinformatics Training and Education Program

Data Quality for LLMs: Building a Reliable Data Foundation

When: Apr. 24th, 2024, 11:00 am - 12:00 pm ET

Learning Level: Any

To Know

Where:
Online Webinar
Organizer:
CBIIT
Presented By:
Dr. Abhishek Jha (Elucidata)
This class has ended.

About this Class

Please join us on Wednesday, April 24, 2024, when Dr. Abhishek Jha, co-founder and CEO of Elucidata, will present "Data Quality for LLMs: Building a Reliable Data Foundation." The presentation starts at 11:00 a.m. ET and ends at noon.
 
If you use large language models (LLMs) in your cancer research, register for this seminar to hear Elucidata’s Dr. Abhishek Jha discuss how data quality impacts LLM performance.
 
A reliable data foundation, one that is well annotated and accessible to an LLM, plays a major role in the value of the model's results.
 
You’ll see examples of how LLM-powered artificial intelligence (AI) agents return differing results when they query three versions of the same gene expression corpus:

•    unstructured data from the public repository Gene Expression Omnibus.
•    structured data from the Crowd Extracted Expression of Differential Signatures project.
•    clean, linked, and harmonized data.
 
Dr. Jha will use these examples to discuss how differences in quality among these data sources affect LLM performance.