It's the Data, Stupid….

Carl Kesselman, Ph.D.

William M. Keck Professor of Engineering

Professor, Epstein Department of Industrial and Systems Engineering

Director, Informatics Systems Research Division, Information Sciences Institute

Viterbi School of Engineering

Professor, Department of Population and Public Health Sciences, Keck School of Medicine

Professor, Biomedical Sciences, Ostrow School of Dentistry

 

University of Southern California


Seminar Information

Seminar Date
April 25, 2025 - 2:00 PM

Location
The FUNG Auditorium - PFBH

Abstract

Recent advances in machine learning have yielded astonishing breakthroughs. The 2024 Nobel Prize in Chemistry, awarded in part to the creators of AlphaFold, highlights machine learning's capability to address complex scientific challenges such as protein folding, a fundamental problem in biology that resisted solution for decades. Crucially, AlphaFold's success hinged on the availability of high-quality, meticulously curated data from resources such as the Protein Data Bank.

As machine learning is increasingly used to solve complex problems in scientific discovery, medicine, and engineering, the importance of validating and reproducing these results has intensified. This validation and reproducibility challenge becomes particularly pronounced in smaller-scale, university-based research settings, where resources and curated data may be more limited. One response has been the emergence of data-centric AI, emphasizing the pivotal role of data quality over the mere innovation of algorithmic models.

In this talk, I will discuss our work on reproducible machine learning through data-centric AI, emphasizing methodologies and tools developed within the Deriva platform. Deriva facilitates the systematic creation, management, and use of high-quality data, enabling robust, reproducible machine learning applications even at modest scales. To illustrate our approach, I will describe an ongoing project developing machine learning methods for early and reliable detection of glaucoma, underscoring the transformative potential of data-centric methodologies in biomedical engineering and clinical practice.

Speaker Bio

Carl Kesselman is the William M. Keck Professor of Engineering in the University of Southern California (USC) Viterbi School of Engineering. He is a professor in the Daniel J. Epstein Department of Industrial and Systems Engineering, the Department of Computer Science, the Department of Population and Public Health Sciences in the Keck School of Medicine, and the Ostrow School of Dentistry. He is a USC Information Sciences Institute (ISI) Fellow, where he directs the Informatics Systems Research Division, and is the director of the Center of Excellence for Discovery Informatics in the Michelson Center for Convergent Biosciences.

Carl Kesselman leads ISI's Informatics Systems Research Division. The division was created to understand how to build informatics systems that can help tackle the hardest problems with great societal impact. Its work spans grid computing, information security, service-oriented architectures, socio-technical systems, and reproducibility. His recent work has focused on new methods for collaborative discovery, particularly reproducible methods for applying machine learning to biomedical problems. He has been the principal investigator on collaboration, data management, and analysis infrastructure for numerous large-scale National Institutes of Health (NIH) funded initiatives in areas such as craniofacial development, kidney reconstruction, synaptic mapping, and genito-urinary tract development.

Kesselman’s work in large-scale computational infrastructure provided the computing platform that led to two Nobel Prizes in Physics and a Nobel Peace Prize. He has received numerous honors for his pioneering research, including the Lovelace Medal from the British Computer Society, the Goode Memorial Award from the Institute of Electrical and Electronics Engineers (IEEE) Computer Society, and the IEEE Internet Award. He is a Fellow of the British Computer Society, the IEEE, and the Association for Computing Machinery.

Kesselman joined ISI in 1997 as a research associate professor in the USC Computer Science Department. He received his PhD in computer science from the University of California, Los Angeles, a master's in electrical engineering from the University of Southern California, and a bachelor's in electrical engineering from the University at Buffalo. He has also received an honorary doctorate from the University of Amsterdam.