Standards in this Framework
Standard | Description |
---|---|
1.1.1 | Demonstrate effective communication skills in both technical and non-technical contexts (e.g., explaining data analysis results to peers, presenting insights to non-technical stakeholders). |
1.1.2 | Demonstrate integrity in data science practices (e.g., citing data sources, ensuring data privacy and security). |
1.1.3 | Develop collaboration and teamwork skills through group data projects (e.g., pair programming, code reviews, collaborative data analysis). |
1.1.4 | Identify and develop traits important for success in data science (e.g., problem-solving, attention to detail, continuous learning, and adaptability). |
1.2.1 | Research various roles within the data science field (e.g., data analyst, machine learning engineer, data engineer, business intelligence analyst). |
1.2.2 | Identify professional certifications relevant to different data science careers (e.g., Google Data Analytics Professional Certificate, IBM Data Science Professional Certificate). |
1.2.3 | Research routes to become a data scientist (e.g., university degree, bootcamps, online courses, internships in data-driven industries). |
2.1.1 | Explain the core concepts of data science (e.g., data collection, cleaning, analysis, and visualization). |
2.1.2 | Describe real-world applications of data science in various fields (e.g., healthcare, finance, marketing, environmental science). |
2.2.1 | Discuss the ethical considerations involved in data collection, storage, and use (e.g., data privacy, data security, informed consent). |
2.2.2 | Evaluate data sources for potential biases (e.g., selection, measurement, confirmation, and reporting) and identify strategies to mitigate their impact. |
2.2.3 | Explain the importance of peer-reviewed and replicated experimental data. |
2.2.4 | Evaluate the credibility of data sources (e.g., author expertise, source reputation, publication date, and objectivity). |
2.3.1 | Differentiate between quantitative (e.g., continuous and discrete) and qualitative (e.g., nominal and ordinal) data. |
2.3.2 | Recognize and classify different types of data (e.g., nominal, ordinal, interval, ratio). |
2.3.3 | Explain the concept and qualities of structured data (e.g., tabular data, JSON, CSV). |
3.1.1 | Enter and format data within spreadsheets for business purposes using spreadsheet software. |
3.1.2 | Use formulas and functions (e.g., AVERAGE, MIN, MAX, COUNT, and SUM) in spreadsheet software, applying logical reasoning and mathematical problem-solving skills to create suitable formulas and draw conclusions. |
3.1.3 | Use spreadsheet software to aid with data analysis and reporting by creating visualizations (e.g., charts). |
3.1.4 | Use advanced functions (e.g., VLOOKUP, COUNTIF, and IFERROR), conditional formatting (e.g., IF, AND, OR, and BETWEEN), and filtering techniques to analyze and manipulate data using spreadsheet software. |
3.1.5 | Use a text-based programming language to create and read simple spreadsheet files (e.g., CSV) using file input and output operations. |
3.1.6 | Create programs using a text-based programming language that perform common business data operations (e.g., average, sum, count, max, min) to analyze spreadsheet data. |
3.2.1 | Compare various types of databases (e.g., relational, NoSQL, and hierarchical). |
3.2.2 | Research different database management systems (e.g., MySQL, SQLite, Microsoft SQL Server, Access, and Oracle Database) to manage and interact with a database. |
3.2.3 | Use a database management system to create basic databases and perform basic operations (e.g., selecting, inserting, updating, and deleting data) on the database. |
3.2.4 | Use a text-based programming language to create basic databases and tables and perform basic operations (e.g., selecting, inserting, updating, and deleting data) on the database. |
3.2.5 | Create and execute statements using a declarative language (e.g., Structured Query Language) to create simple databases and perform basic operations (e.g., selecting, inserting, updating, and deleting data) on the database. |
3.2.6 | Create programs using a text-based programming language to transfer data from a data source (e.g., CSV, spreadsheets, and online data repositories) into a database. |
4.1.1 | Discuss various sampling techniques (e.g., random sampling, stratified sampling, cluster sampling). |
4.1.2 | Develop and implement a text-based algorithm (e.g., random number generator) for sampling from a population to ensure data quality. |
4.1.3 | Compute the number of events in a sample space using combinatorics (i.e., fundamental principle of counting, permutations, and combinations). |
4.1.4 | Define probability in both plain language and mathematical terms using related vocabulary (e.g., trial, sample space, event, outcome, complement). |
4.1.5 | Develop and implement simple simulations (e.g., coin flips, dice rolls, card draws) using a programming language to generate and analyze simulated data. |
4.1.6 | Leverage web scraping libraries (e.g., Beautiful Soup, Selenium) to retrieve data from the Internet. |
4.1.7 | Create programs that leverage libraries (e.g., Requests, urllib) to connect and retrieve data from public APIs (e.g., NOAA climate data, NASA Open APIs). |
4.2.1 | Assess data quality for consistency, completeness, and accuracy. |
4.2.2 | Identify and handle missing values (e.g., imputation techniques, deletion). |
4.2.3 | Identify outliers using statistical methods (e.g., z-score, interquartile range). |
4.2.4 | Perform various methods of addressing outliers (e.g., removal, transformation, imputation). |
4.2.5 | Normalize and standardize data (e.g., min-max scaling, z-score normalization, decimal scaling). |
4.3.1 | Manipulate existing data to create relevant features including combining columns, extracting information from text, and datetime transformations. |
4.3.2 | Aggregate data and convert into different levels of granularity (e.g., daily to monthly, individual to group). |
5.1.1 | Calculate and interpret descriptive statistics (i.e., mean, median, mode, standard deviation, percentiles). |
5.1.2 | Analyze the distribution of data using visualizations (e.g., histograms, scatter plots, box-and-whisker plots). |
5.1.3 | Assess the linear relationship between variables using correlation analysis. |
5.1.4 | Perform and interpret simple linear regression analysis. |
5.1.5 | Calculate various probabilities (i.e., for simple events, for compound events using the Addition Rule and the Multiplication Rule, and conditional probabilities). |
5.1.6 | Discuss and apply the concepts of related events (e.g., independent and dependent events, with and without replacement, mutually exclusive). |
5.2.1 | Leverage data visualization libraries (e.g., Matplotlib, Seaborn) to generate suitable visualizations (e.g., bar charts for comparisons, line charts for trends, scatter plots for relationships, and pie charts for part-to-whole). |
5.2.2 | Interpret insights from visualizations to identify trends, outliers, or patterns in the data. |
5.2.3 | Create related visualizations depicting probability (e.g., Venn Diagram, Two-Way Table). |
6.1.1 | Discuss the importance of visualizations to communicate important information to various audiences. |
6.1.2 | Leverage data visualization tools and libraries (e.g., Seaborn, Plotly, Basemap) to create infographics that support key findings and insights. |
6.2.1 | Create an effective narrative from data analysis. |
6.2.2 | Communicate the relevance of findings using basic storytelling techniques (e.g., using analogies, providing context, highlighting key trends). |
6.2.3 | Identify and communicate potential limitations and uncertainties in the data analysis. |
6.3.1 | Identify a computational problem (e.g., prediction, classification, or regression of data from a scientific, industry-focused, or logistics data set) and create a plan for a student-centered project. |
6.3.2 | Acquire relevant data (e.g., sampling, sensor data, online datasets, web scraping) using appropriate methods and manage it effectively as part of a student-centered project. |
6.3.3 | Identify and address data quality issues to prepare data as part of a student-centered project. |
6.3.4 | Extract meaningful insights from the data using statistical methods and visualization techniques as part of a student-centered project. |
6.3.5 | Communicate clearly and effectively through presentations and written reports to both technical and non-technical audiences at the conclusion of a student-centered project. |
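
As one illustration of the kind of student work standards 4.2.3 and 4.2.5 describe, the sketch below flags outliers with z-scores and rescales values with min-max scaling using only the Python standard library. The function names, the sample readings, and the z-score threshold of 2.0 are assumptions chosen for this example and are not part of the framework.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return the z-score of each value (population standard deviation)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so no value deviates from the mean.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def find_outliers(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds the threshold.

    The default threshold of 2.0 is an assumption for this small sample;
    a threshold of 3.0 is also commonly used on larger data sets.
    """
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Avoid division by zero when every value is the same.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample data: six similar readings and one extreme value.
readings = [10, 12, 11, 13, 12, 11, 95]

outliers = find_outliers(readings)
scaled = min_max_scale(readings)

print("Outliers:", outliers)
print("Scaled:", [round(s, 3) for s in scaled])
```

In a sample this small, a single extreme value inflates the standard deviation, which is why the sketch uses a lower threshold; comparing the result with the interquartile range method named in 4.2.3 would be a natural follow-up exercise.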