OpenDS4All project, hosted by LF AI & Data
OpenDS4All is a project created to accelerate the creation of data science curricula at academic institutions. While a great deal of online material is available for data science, including online courses, we recognize that the best way for many students to learn (and for many institutions to deliver) content is through a combination of lectures, recitation or flipped classroom activities, and hands-on assignments.
OpenDS4All attempts to fill this important niche. Our goal is to provide recommendations, slide sets, sample Jupyter notebooks, and other materials for creating, customizing, and delivering data science and data engineering education.
The project hosts educational modules that may be used as building blocks for a data science curriculum.
Note: The link opends4all-resources takes you to the opends4all curriculum building blocks organized by category.
Note: If you adopt all or some of the content, please add your program's details to the ADOPTERS.csv file.
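One way to add an entry programmatically is sketched below. It is only an illustration: the helper name `append_adopter` is hypothetical, and the sketch assumes that the first row of ADOPTERS.csv is a header row. It reads the column names from the file rather than guessing them, so check the actual file for the fields it expects. Editing the file by hand works just as well.

```python
import csv
from pathlib import Path


def append_adopter(csv_path, details):
    """Append one row to a CSV file, ordered by the file's own header row.

    `details` maps column names to values. Columns missing from `details`
    are left blank; keys that are not in the header are rejected.
    """
    path = Path(csv_path)

    # Read the header (assumption: the first row holds the column names).
    with path.open(newline="", encoding="utf-8") as f:
        header = next(csv.reader(f), None)
    if not header:
        raise ValueError(f"{csv_path} has no header row")

    unknown = set(details) - set(header)
    if unknown:
        raise ValueError(f"Columns not in header: {sorted(unknown)}")

    # Check whether the existing file already ends with a newline.
    text = path.read_text(encoding="utf-8")

    with path.open("a", newline="", encoding="utf-8") as f:
        if not text.endswith("\n"):
            f.write("\n")  # avoid gluing the new row onto the last line
        writer = csv.DictWriter(
            f, fieldnames=header, restval="", lineterminator="\n"
        )
        writer.writerow(details)
```

Call it with the path to your local copy of ADOPTERS.csv and a dictionary keyed by the column names found in that file's header, then submit the change through the project's usual contribution process.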
The initial modules were designed to target a broad, cross-university audience at both the undergraduate and graduate levels. Modules contain instructor notes and comments intended to aid in the delivery of the material; the expectation is that instructors will be generally fluent in basic database and machine learning concepts.
The perspective of the materials largely comes from computer science, with an emphasis on data wrangling and engineering as well as machine learning and validation. However, prior versions of the content have been used to teach students ranging from freshmen to PhD students, across a wide range of fields. The emphasis is largely on core concepts and algorithms with grounding in today's technologies and best practices.
Students are expected to come in with two major prerequisites:
To some extent, students with a limited background can follow along with this material, but they will likely need to supplement extensively.
The following taxonomy shows how content is currently organized by category. This is a living taxonomy that is updated as new content is added to the project. Each category contains modules, and each module consists of one or more of the following components:
Instructor_Notes.md) and guide to files
Note: The PowerPoint slides are not directly viewable on GitHub. After clicking the link to a set of PowerPoint slides, select the Download button to download and view the slide deck. Two viewable extracts from the slide decks can be seen by clicking on the links below:
- INTRODUCTION-Data-Science-basic.pptx
- DATA-WRANGLING-Import-Link-mixed.pptx
There are many ways to interact with this repository:
The project's governance principles clarify the different roles and describe the processes for becoming a contributor, a committer, or a TSC member.
If you are interested in collaborating on the project, please open an issue and one of the members of the TSC will respond to your request. If you do not feel comfortable opening an issue, email [email protected]
License: CC BY 4.0, Copyright Contributors to the LF AI & Data OpenDS4All project.