Learning by doing

“Applied Data Science” (class repo) is a project-based learning (PBL) course that integrates the knowledge and skills covered in a statistics curriculum with topics and projects in data science. Programming will be covered using existing tools in R, although students may use tools from other languages. Computing best practices will be taught through test-driven development, version control, and collaboration. Students finish the class with a portfolio on GitHub and a deeper understanding of several core statistical and machine-learning algorithms. Because this is a hands-on, project-based course, no formal instruction on statistics, data science, or machine learning will be given. A new project cycle, built around a mini data project, starts every 2-3 weeks. Groups will be formed randomly, and project products will be peer-reviewed.

About the instructor

Tian Zheng is Professor of Statistics at Columbia University. At Columbia's Data Science Institute (DSI), she served as Associate Director for Education and chair of the Education committee from 2017 to 2020. She is currently the co-chair of DSI’s education working group.

She obtained her PhD from Columbia in 2002. Her research develops novel methods, and improves existing ones, for exploring and analyzing interesting patterns in complex data from different application domains. Read more about Professor Zheng.