Data Science

Technology directors:
  • Fatma Kaplan, Ph.D.
  • K. Cameron Schiller, M.S.

Analysis of big data for biologists

Analysis and interpretation of large datasets remain a challenge for many biologists despite the wide availability of the technology. Drawing on our research experience with big data and on our classroom teaching, we will build a new course, “Analysis of big data for biologists”. We will accomplish this with well-prepared lectures and project-based coursework that foster student-centered learning. In-class exercises will use the raw data from our publications. By the end of the course, students will be ready to apply the concepts they have learned to their own projects in molecular biology, ecology, or the biomedical sciences.

Learning Objectives

The overall goal is for students to understand, analyze, and interpret big data, to communicate effectively with statisticians and biostatisticians, and to draft a manuscript for publication. This is not a statistics course in which statistical methods are developed. Instead, students will learn how to form a theoretical framework for analyzing data, an approach that can be applied to any discipline from molecular biology to ecology. Students will then develop biological hypotheses within that framework and test them using existing statistical methods and software. They will also learn how to write queries (algorithms) in Microsoft Access. This is a hands-on course: students will be prepared to apply what they learn to their own research or to any other large dataset.



*The metabolomics workshop curriculum is being prepared in collaboration with Dr. Joachim Kopka and Alexander Erban at the Max Planck Institute of Molecular Plant Physiology, Golm, Germany.