21CS644-Data Science and Visualization 2021 scheme
VTU notes for 6th semester Computer Science and Engineering.
Module 1
Introduction to Data Science
Introduction: What is Data Science? Big Data and Data Science hype – and getting past the hype, Why now? – Datafication, Current landscape of perspectives, Skill sets needed.
Statistical Inference: Populations and samples, Statistical modelling, probability distributions, fitting a model.
Module 2
Data Analysis and the Data Science Process
Basic tools (plots, graphs and summary statistics) of EDA, Philosophy of EDA, The Data Science Process, Case Study: Real Direct (online real estate firm). Three Basic Machine Learning Algorithms: Linear Regression, k-Nearest Neighbours (k-NN), k-means.
Module 3
Feature Generation and Feature Selection
Extracting Meaning from Data: Motivating application: user (customer) retention. Feature Generation (brainstorming, role of domain expertise, and place for imagination), Feature Selection algorithms: Filters, Wrappers, Decision Trees, Random Forests.
Recommendation Systems: Building a User-Facing Data Product, Algorithmic ingredients of a Recommendation Engine, Dimensionality Reduction, Singular Value Decomposition, Principal Component Analysis, Exercise: build your own recommendation system.
Module 4
Data Visualization and Data Exploration
Introduction: Data Visualization, Importance of Data Visualization, Data Wrangling, Tools and Libraries for Visualization.
Comparison Plots: Line Chart, Bar Chart and Radar Chart.
Relation Plots: Scatter Plot, Bubble Plot, Correlogram and Heatmap.
Composition Plots: Pie Chart, Stacked Bar Chart, Stacked Area Chart, Venn Diagram.
Distribution Plots: Histogram, Density Plot, Box Plot, Violin Plot; Geo Plots: Dot Map, Choropleth Map, Connection Map; What Makes a Good Visualization.
Module 5
A Deep Dive into Matplotlib
Introduction: Overview of Plots in Matplotlib, Pyplot Basics: Creating Figures, Closing Figures, Format Strings, Plotting, Plotting Using pandas DataFrames, Displaying Figures, Saving Figures.
Basic Text and Legend Functions: Labels, Titles, Text, Annotations, Legends; Basic Plots: Bar Chart, Pie Chart, Stacked Bar Chart, Stacked Area Chart, Histogram, Box Plot, Scatter Plot, Bubble Plot; Layouts: Subplots, Tight Layout, Radar Charts, GridSpec; Images: Basic Image Operations, Writing Mathematical Expressions.