Data Science with Python

Description

In this lesson, students will learn what data science is, what a data scientist does, and the different types of questions that can be asked about data. Students will learn that statistical questions involve computation or finding a relationship or pattern in data.

Objective

Students will be able to:

  • Recognize and formulate statistical questions
  • Think critically about data and its sources
Description

In this lesson, students will learn about the data cycle and apply its first two steps: asking questions and considering data. Students will start a mini-project that spans the rest of the module by asking a statistical question about a field of interest, then gathering and structuring the data. They will also learn about and consider both quantitative and qualitative data.

Objective

Students will be able to:

  • Explain and apply the data cycle
  • Consider data as either quantitative or qualitative
  • Structure data into tables of rows and columns
Description

In this lesson, students will learn the basics of Python programming in the context of data science. This includes how to define and use variables and lists, how to use comparison and logical operators, and the importance of knowing the different data types used in Python.

Objective

Students will be able to:

  • Use the basics of Python in the context of data science
  • Define and use variables and lists
  • Use comparison and logical operators
  • Understand the importance of the different data types used in Python
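
The objectives above can be sketched in a few lines of plain Python. This is only an illustration: the variable names and values are invented and are not taken from the lesson.

```python
# Hypothetical values chosen only for illustration.
city = "Chicago"                      # str
temperatures = [61.5, 70.2, 58.9]     # list of floats
days_recorded = len(temperatures)     # int

# Comparison operators return booleans.
is_warm = temperatures[1] > 65

# Logical operators combine boolean expressions.
is_mild = temperatures[0] > 60 and temperatures[0] < 70

# Data types matter: "3" + "4" joins text, while 3 + 4 adds numbers.
joined = "3" + "4"
added = int("3") + int("4")

print(type(city), type(temperatures), type(days_recorded))
print(is_warm, is_mild, joined, added)
```
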
Description

In this lesson, students will learn about Python modules and libraries and how to implement and use them within the editor.

Objective

Students will be able to:

  • Import and use Python modules and libraries
  • Explain the importance of documentation
  • Read and use documentation
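
As a small sketch of importing and using modules, the example below relies only on Python's standard library (`math` and `statistics`). The lesson itself may use other libraries; these two are an assumption made for illustration.

```python
import math                      # import a whole module
from statistics import mean      # import one function from a module

radius = 2.0
area = math.pi * radius ** 2     # use a constant from the math module
average = mean([4, 8, 12])       # call the imported function directly

# help(math.sqrt) prints the built-in documentation for a function.
print(round(area, 2), average, math.sqrt(16))
```
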
Description

In this lesson, students will learn how to create and use a Pandas Series. They will also learn about and explore measures of central tendency, including the mean, median, and mode.

Objective

Students will be able to:

  • Create a Series using the Pandas library
  • Compute the mean, median, and mode of a Series
  • Decide whether the mean, median, or mode is the best measure of central tendency for a specific dataset
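
A minimal sketch of these objectives, assuming pandas is installed. The scores are invented for illustration.

```python
import pandas as pd

# Hypothetical quiz scores, invented for illustration.
scores = pd.Series([70, 85, 85, 90, 100])

mean_score = scores.mean()        # arithmetic average
median_score = scores.median()    # middle value of the sorted data
mode_scores = scores.mode()       # a Series, since there can be several modes

print(mean_score, median_score, mode_scores.tolist())
```

Note that `mode()` returns a Series rather than a single number, because a dataset can have more than one most frequent value.
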
Description

In this lesson, students will expand their statistical knowledge to include the spread of a dataset. They will learn about and apply measures of spread including standard deviation, variance, range, and interquartile range.

Objective

Students will be able to:

  • Use functions to compute the standard deviation and variance of a Series
  • Use variables, functions, and operators to determine the range and interquartile range of a Series
  • Use functions to plot a boxplot and histogram
  • Understand what the measures of spread mean for a dataset
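
The measures of spread listed above could be computed roughly as follows. This is a sketch with invented values; pandas uses the sample formulas (`ddof=1`) for `std()` and `var()` by default.

```python
import pandas as pd

# Hypothetical values, invented for illustration.
values = pd.Series([2, 4, 4, 4, 5, 5, 7, 9])

std_dev = values.std()                     # sample standard deviation
variance = values.var()                    # sample variance
data_range = values.max() - values.min()   # range = largest - smallest

q1 = values.quantile(0.25)                 # first quartile
q3 = values.quantile(0.75)                 # third quartile
iqr = q3 - q1                              # interquartile range

# With matplotlib installed, values.plot.box() and values.plot.hist()
# draw a boxplot and a histogram of the same Series.
print(std_dev, variance, data_range, iqr)
```
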
Description

In this lesson, students will learn how to create a data frame using the Pandas library. They will also learn and use functions to explore a data frame further including which data types are included, the shape of the data frame, the descriptive statistics of the data in each column, and more.

Objective

Students will be able to:

  • Create a data frame using Pandas
  • Explore a data frame using key functions
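
As an illustration of creating and exploring a data frame, here is a short sketch. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical data, invented for illustration.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy"],
    "age": [15, 16, 17],
    "gpa": [3.5, 3.8, 3.2],
})

print(df.shape)       # (number of rows, number of columns)
print(df.dtypes)      # data type of each column
print(df.head())      # first rows of the data frame
print(df.describe())  # descriptive statistics for the numeric columns
```
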
Description

In this lesson, students will learn how to filter a data frame by selecting and displaying only specific columns. They will also learn how to filter rows displayed by using conditionals. Lastly, students will learn how to change the index used in a data frame and set it to a column of their choice.

Objective

Students will be able to:

  • Filter a data frame by displaying specific columns
  • Filter a data frame using conditionals
  • Set and reset the indices of a data frame
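
The three filtering and indexing steps above might look like this in pandas. The data frame is a hypothetical example.

```python
import pandas as pd

# Hypothetical data, invented for illustration.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy"],
    "age": [15, 16, 17],
    "gpa": [3.5, 3.8, 3.2],
})

subset = df[["name", "gpa"]]          # keep only specific columns
older = df[df["age"] >= 16]           # keep rows that satisfy a condition

by_name = df.set_index("name")        # use the "name" column as the index
restored = by_name.reset_index()      # move the index back into a column

print(subset, older, by_name, restored, sep="\n\n")
```
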
Description

In this lesson, students will define and use functions, along with values in a dataset, to calculate and create new columns of data.

Objective

Students will be able to:

  • Define and use functions
  • Use existing data values to create new columns of data
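
One way to combine a user-defined function with existing values to build a new column is sketched below. The function `to_celsius` and the column names are hypothetical, and `Series.apply` is only one of several ways to do this.

```python
import pandas as pd

def to_celsius(fahrenheit):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (fahrenheit - 32) * 5 / 9

# Hypothetical readings, invented for illustration.
weather = pd.DataFrame({"temp_f": [32.0, 50.0, 212.0]})

# Apply the function to every value in temp_f and store the result
# as a new column.
weather["temp_c"] = weather["temp_f"].apply(to_celsius)

print(weather)
```
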
Description

In this lesson, students will practice collecting, explaining, and presenting the important data and details of their mini-project.

Objective

Students will be able to:

  • Interpret meaning from data
  • Extrapolate and present important details from a dataset
Description

In this lesson, students review content with a 15-question end-of-module quiz.

Objective

Students will be able to:

  • Demonstrate their understanding of Python, Pandas, and data science basics
Description

In this lesson, students will explore how data is used in the social sector. They will use this information to help formulate at least three problem statements each with two statistical questions.

Objective

Students will be able to:

  • Formulate a problem statement
  • Define a statistical question regarding data in the social sector
Description

In this lesson, students will learn about big data and cognitive biases. They will reflect on their own potential biases and work forward on their project by finding and considering datasets and further decomposing their problem statement.

Objective

Students will be able to:

  • Explain concepts of “Big Data”
  • Recognize and address cognitive biases
Description

In this lesson, students will learn how to import large datasets. They will also learn how to filter a dataset using index-based selection (iloc) and label-based selection (loc).

Objective

Students will be able to:

  • Import a large dataset using a CSV file
  • Filter a dataset using iloc and loc
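
The sketch below shows the difference between `iloc` and `loc`. To stay self-contained it reads a tiny in-memory CSV with invented values; with a real dataset, a file path would be passed to `pd.read_csv` instead.

```python
import io
import pandas as pd

# A small in-memory CSV stands in for a large file (values are invented).
csv_text = """city,state,population
Austin,TX,950000
Boston,MA,650000
Denver,CO,710000
"""
df = pd.read_csv(io.StringIO(csv_text))

# iloc selects by integer position; the end of a slice is excluded.
first_row = df.iloc[0]
first_two_cities = df.iloc[0:2, 0]

# loc selects by label; the end of a label slice is included.
by_city = df.set_index("city")
denver_pop = by_city.loc["Denver", "population"]
label_slice = by_city.loc["Austin":"Boston"]

print(first_row, first_two_cities, denver_pop, label_slice, sep="\n\n")
```
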
Description

In this lesson, students will learn how to conditionally filter a dataset using label-based selection (loc) and comparison operators.

Objective

Students will be able to:

  • Filter a dataset using conditions and loc
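
Conditional filtering with `loc` can be sketched as follows, using an invented data frame. Each condition produces a boolean Series, and `loc` keeps the rows where it is `True`.

```python
import pandas as pd

# Hypothetical data, invented for illustration.
df = pd.DataFrame({
    "city": ["Austin", "Boston", "Denver"],
    "state": ["TX", "MA", "CO"],
    "population": [950000, 650000, 710000],
})

# Rows where the condition is True, and only the listed columns.
large = df.loc[df["population"] > 700000, ["city", "population"]]

# Combine conditions with & (and) or | (or); wrap each in parentheses.
mid = df.loc[(df["population"] > 600000) & (df["population"] < 900000), "city"]

print(large, mid, sep="\n\n")
```
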
Description

In this lesson, students will learn the importance of data cleaning and how to do it. Data cleaning deals with fixing or removing incorrect or missing values.

Objective

Students will be able to:

  • Use functions to explore the completeness of a dataset
  • Decide whether to drop, fix, or replace missing or incorrect data
  • Perform imputation, the process of replacing missing or incorrect values in a dataset with substitute values
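
A rough sketch of exploring completeness and handling a missing value, using invented data. Filling with the column mean is just one common choice, assumed here for illustration; the right strategy depends on the dataset.

```python
import pandas as pd

# Hypothetical data with one missing score.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Di"],
    "score": [90.0, None, 70.0, 80.0],
})

missing_per_column = df.isna().sum()   # count missing values per column
dropped = df.dropna()                  # option 1: drop rows with missing values

# Option 2: fill the missing score with the mean of the known scores.
filled = df.fillna({"score": df["score"].mean()})

print(missing_per_column, dropped, filled, sep="\n\n")
```
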
Description

In this lesson, students will explore datasets using visualizations such as pie charts, boxplots, histograms, and scatterplots.

Objective

Students will be able to:

  • Explore and use data visualization functions
  • Read and interpret data visualizations
Description

In this lesson, students will work on analyzing, explaining, and presenting conclusions found in their data exploration.

Objective

Students will be able to:

  • Interpret meaning from data
  • Extrapolate and present important details from a dataset
Description

In this lesson, students review content with a 10-question end-of-module quiz.

Objective

Students will be able to:

  • Demonstrate their understanding of selection, filtering, and data cleaning functions
Description

In this lesson, students will learn how to use data to support and add to a story. The data story will combine visuals with a compelling narrative to help audiences understand the importance of the data being explained. This story will be told through the lens of promoting change, convincing people to take action, or compelling the readers or consumers of the data story to start a movement.

Objective

Students will be able to:

  • Create a visually appealing infographic that displays important data visualizations
  • Critically examine and reflect on various data visualizations and infographics
  • Choose an appropriate data narrative for their own data story
Description

In this lesson, students work on their module project by finding and cleaning a dataset that will help them tell their data story.

Objective

Students will be able to:

  • Gather and clean a dataset that will help create a data story
Description

In this lesson, students will learn about the importance of data visualization when telling a data story. Students will be using a variety of charts, graphs, images, and other common data visualizations to help to bring meaning and understanding to otherwise complex data.

Objective

Students will be able to:

  • Recognize and define the most common types of data visualizations
  • Debug programs that include data visualizations such as pie charts and bar graphs
Description

In this lesson, students will learn about univariate data and how to visualize and compare datasets using line and bar charts.

Objective

Students will be able to:

  • Plot and interpret a data visualization using a line graph
  • Plot and interpret a data visualization using a bar chart
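
As a sketch of plotting the same univariate data as a line graph and a bar chart, the example below assumes both pandas and matplotlib are installed. The sales figures are invented, and the `Agg` backend is selected only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")            # render without a display; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical quarterly sales, invented for illustration.
sales = pd.Series([120, 150, 90, 180], index=["Q1", "Q2", "Q3", "Q4"])

fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(8, 3))
sales.plot.line(ax=ax_line, title="Sales by quarter (line)")
sales.plot.bar(ax=ax_bar, title="Sales by quarter (bar)")
fig.tight_layout()

# fig.savefig("sales_charts.png") would write the figure to disk.
```
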
Description

In this lesson, students will learn about the normal distribution curve and use it to predict the likelihood of certain events.

Objective

Students will be able to:

  • Plot a histogram and compare it to a normal distribution curve
  • Use normal distribution percentages to determine the likelihood of events
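
The normal distribution percentages can be checked with `statistics.NormalDist` from Python's standard library. This is a sketch under an assumed mean of 100 and standard deviation of 15; the lesson may use a different tool.

```python
from statistics import NormalDist

# Hypothetical test scores: mean 100, standard deviation 15.
scores = NormalDist(mu=100, sigma=15)

# cdf(x) is the probability of a value at or below x.
within_one_sd = scores.cdf(115) - scores.cdf(85)     # about 68%
within_two_sd = scores.cdf(130) - scores.cdf(70)     # about 95%
above_130 = 1 - scores.cdf(130)                      # about 2.3%

print(round(within_one_sd, 4), round(within_two_sd, 4), round(above_130, 4))
```
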
Description

In this lesson, students will apply what they have learned about univariate data visualizations to explore how these may help tell their data story for the module project.

Objective

Students will be able to:

  • Explore univariate data using different data visualizations
  • Compare a histogram to a normal distribution curve
Description

In this lesson, students will learn about correlation and causation. They will reflect on whether a correlation reflects causation or whether a moderating or mediating variable is responsible for the correlation.

Objective

Students will be able to:

  • Use a function to find correlation
  • Determine whether a correlation leads to a causation
  • Reflect on moderating and mediating variables as they relate to correlation
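
Finding a correlation with a function might look like the sketch below. The data is invented, and `corr()` computes the Pearson correlation coefficient by default. A value near 1 or -1 indicates a strong linear relationship, but it does not by itself show that one variable causes the other.

```python
import pandas as pd

# Hypothetical study data, invented for illustration.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [52, 60, 71, 80, 88],
})

r = df["hours_studied"].corr(df["exam_score"])   # correlation of two columns
matrix = df.corr()                                # pairwise correlation table

print(round(r, 3))
print(matrix)
```
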
Description

In this lesson, students will learn and apply aspects of linear regression, such as finding the line of best fit and using a model to predict outcomes for different values.

Objective

Students will be able to:

  • Determine the line of best fit model for a scatterplot
  • Use a model to make predictions based on different values
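
A line of best fit and a prediction can be sketched with `numpy.polyfit`, which performs a least-squares fit. NumPy is an assumption here (the lesson may use a different library), and the data is invented.

```python
import numpy as np

# Hypothetical study data, invented for illustration.
hours = np.array([1, 2, 3, 4, 5])
scores = np.array([52, 60, 71, 80, 88])

# Fit a degree-1 polynomial: scores ≈ slope * hours + intercept.
slope, intercept = np.polyfit(hours, scores, 1)

# Use the model to predict the score for 6 hours of study.
predicted = slope * 6 + intercept

print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```

Predictions far outside the range of the original data (extrapolation) are less reliable than predictions within it.
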
Description

In this lesson, students will apply what they have learned about bivariate data visualizations to explore how these may help tell their data story for the module project.

Objective

Students will be able to:

  • Explore bivariate data using a scatterplot
  • Determine correlation and use linear regression when applicable
Description

In this lesson, students will work on their module projects by applying what they have learned to create a data story.

Objective

Students will be able to:

  • Use data visualizations, analysis, and interpretation to create a data story
Description

In this lesson, students review content with a 10-question end-of-module quiz.

Objective

Students will be able to:

  • Demonstrate their understanding of data storytelling
Description

In this lesson, students will be introduced to the module project. They will look at how data is used in the business world to improve aspects of a business and to predict future outcomes.

Objective

Students will be able to:

  • Explain the benefits of data analytics in the business world
Description

In this lesson, students will learn how to determine the quality of a dataset. They will explore a few raw datasets to assess their quality and completeness.

Objective

Students will be able to:

  • Use functions to determine a dataset’s completeness
  • Use functions and a library to check a dataset’s validity and accuracy
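
The lesson does not name the library used for validity checks, so the sketch below uses plain pandas as an assumption. It measures completeness, counts duplicated rows, and flags values outside a plausible range. The data and the "quantity must not be negative" rule are invented for illustration.

```python
import pandas as pd

# Hypothetical order data with a duplicate row, negative quantities,
# and a missing value.
df = pd.DataFrame({
    "order_id": [1, 2, 2, 4],
    "quantity": [3, -1, -1, None],
})

completeness = df.notna().mean()                 # share of non-missing values
duplicate_rows = df.duplicated().sum()           # rows repeating an earlier row
invalid_quantity = (df["quantity"] < 0).sum()    # values below the allowed range

print(completeness, duplicate_rows, invalid_quantity, sep="\n")
```
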
Description

In this lesson, students will practice aggregating data by using different sort and group functions and parameters.

Objective

Students will be able to:

  • Group and sort datasets and reflect on the results
  • Sort by multiple columns and analyze and interpret the results
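
Grouping and multi-column sorting could be sketched as follows, using invented sales data. `groupby` aggregates rows that share a value, and `sort_values` accepts a list of columns with a matching list of sort directions.

```python
import pandas as pd

# Hypothetical sales data, invented for illustration.
sales = pd.DataFrame({
    "region": ["East", "West", "East", "West", "East"],
    "product": ["A", "A", "B", "B", "A"],
    "revenue": [100, 150, 200, 50, 300],
})

# Total revenue for each region.
total_by_region = sales.groupby("region")["revenue"].sum()

# Sort by region (A to Z), then by revenue within each region (high to low).
sorted_sales = sales.sort_values(["region", "revenue"], ascending=[True, False])

print(total_by_region, sorted_sales, sep="\n\n")
```
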
Description

In this lesson, students will practice combining data by using different concatenation and merging techniques.

Objective

Students will be able to:

  • Concatenate two datasets
  • Explain different merge/join methods and determine which method is best given a scenario
  • Use merge/join functions to combine two datasets
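
The difference between concatenating and merging can be illustrated with a short sketch. The tables are invented; `how="inner"` keeps only rows with a match in both tables, while `how="left"` keeps every row from the left table.

```python
import pandas as pd

# Hypothetical purchase and customer tables, invented for illustration.
jan = pd.DataFrame({"customer_id": [1, 2], "amount": [20, 35]})
feb = pd.DataFrame({"customer_id": [2, 3], "amount": [15, 40]})
customers = pd.DataFrame({"customer_id": [1, 2], "name": ["Ana", "Ben"]})

# Concatenate: stack tables that share the same columns.
stacked = pd.concat([jan, feb], ignore_index=True)

# Merge: combine tables by matching values in a shared column.
inner = stacked.merge(customers, on="customer_id", how="inner")
left = stacked.merge(customers, on="customer_id", how="left")

print(stacked, inner, left, sep="\n\n")
```

Customer 3 has no entry in `customers`, so that purchase is dropped by the inner merge and kept with a missing name by the left merge.
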
Description

In this lesson, students will work on their module projects by gathering and combining data from multiple sources. They will check the quality of the datasets as well as clean, combine and sort them.

Objective

Students will be able to:

  • Assess the quality of data sources and data sets
  • Clean and combine multiple datasets
Description

In this lesson, students will learn different types of bias that can be present and affect data analytics. They will also take time to analyze and interpret their project datasets.

Objective

Students will be able to:

  • Explain and recognize different types of bias that can be present during data analysis and interpretation
Description

In this lesson, students will work on their module projects by creating a business report.

Objective

Students will be able to:

  • Create and present a business report based on data analysis and interpretation
Description

In this lesson, students review content with a 10-question end-of-module quiz.

Objective

Students will be able to:

  • Demonstrate their understanding of data aggregation