Data Search and Discovery: Building an Amazon.com for Data
When: Oct. 14th, 2022 12:00 pm - 1:00 pm
To Know
Organizer:
Data Sharing and Reuse Seminar Series
About this Class
The call for better data and evidence for decision-making has become very real, as evidenced by the Federal Data Strategy and the passage of both the Foundations for Evidence-Based Policymaking Act (Evidence Act) and the CHIPS+ Act. The challenge is to find out not just what data are produced but how they are used: in essence, to build an Amazon.com for data, so that both governments and researchers can quickly find the data and evidence they need. To paraphrase Lew Platt’s aphorism about HP: “If researchers knew what researchers know, they would be three times more productive.”
This talk will provide an overview of a massive five-year effort to find out how data are being used, what questions they are being used to answer, and who the experts are, by mining text that is hidden in plain sight: the text of scientific publications, government reports, and public documents.
Just as with Amazon, the results are enormously powerful. The pilot, which is sponsored by agencies such as NSF’s National Center for Science and Engineering Statistics (NCSES) and the Department of Education’s National Center for Education Statistics (NCES), has generated a prototype API and a dashboard. With them, for example, agencies can document dataset use for Congress and the public, program managers can rapidly identify investment opportunities, and researchers can more easily build on existing knowledge rather than redoing work from scratch.
Speaker:
Dr. Julia Lane is a Professor at the NYU Wagner Graduate School of Public Service. She is founder or co-founder of many data initiatives that have served the public good, including the Longitudinal Employer-Household Dynamics Program at the Census Bureau; the STAR METRICS/UMETRICS program that led to the establishment of the Institute for Research on Innovation and Science at the University of Michigan; the New Zealand Integrated Data Infrastructure, which holds data from across various sectors; the NORC Data Enclave supporting research access to confidential data; the PatentsView project to increase the usability of patent data; and the Coleridge Initiative to use data more effectively in government decision-making. She currently serves on the Advisory Committee on Data for Evidence Building and the National AI Research Resource Task Force. Her most recent paper was published in Nature and used UMETRICS data.