Machine learning is a field of computer science that applies statistics to the data a firm collects in order to learn more about its customers. The searches we make, the surveys we fill in and the reviews we write are the primary source data. Machine learning algorithms use this data to produce statistical results that can determine the rating of a product, the type of content a particular user might like to be notified about, or sites and services similar to the ones the user has chosen. Machine learning thus has its roots in mathematical statistics and shares many concepts with it.
Common terms and their equivalent names in ML and statistics:
Machine Learning       | Statistics
Learning               | Estimation / Fitting
Hypothesis Testing     | Confirmatory Data Analysis
Example / Instance     | Data Point
Weights                | Parameters
Supervised Learning    | Regression / Classification
Unsupervised Learning  | Clustering
Feature                | Covariate
Label                  | Response
Types of tasks and what they do:
Regression:
Predicts a numeric value or outcome from the given independent variables.
Dimensionality Reduction:
Reduces the number of features (dimensions) in a dataset while keeping as much of the useful information as possible, for example by combining correlated features.
Classification:
Predicts the label or category of an item from a set of labelled data points; the model learns which characteristics correspond to which label.
Clustering:
Groups similar data points into clusters without using labels, which can reveal structure in the data and lead to useful insights.
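As a rough illustration of two of the tasks above, the following sketch fits a straight line (regression) and assigns a category using the closest known point (classification). It uses only the Python standard library. The function names `fit_line` and `nearest_neighbour_label` and all of the data are made up for this example, and a real project would more likely use a library.

```python
# Minimal, dependency-free sketches of regression and classification.
# All data and function names here are illustrative assumptions.

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def nearest_neighbour_label(points, labels, query):
    """Return the label of the training point closest to `query` (1-NN)."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    best = min(range(len(points)), key=lambda i: sq_dist(points[i], query))
    return labels[best]


# Regression: predict a numeric outcome from an independent variable.
# The toy data follow y = 2x + 1 exactly.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted = slope * 5 + intercept

# Classification: predict a category from labelled data points.
train_points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9)]
train_labels = ["small", "small", "large", "large"]
label = nearest_neighbour_label(train_points, train_labels, (4.8, 5.1))

print(slope, intercept, predicted, label)
```

The regression returns a number, while the classifier returns one of the labels it was trained on. That difference in output type is the practical distinction between the two tasks.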
Videos to start with:
Introduction to predictive modeling
Types of predictive modeling
Stages of predictive modeling
Understanding hypothesis generation
Data Extraction
Understanding Data Exploration
Functions to read data in Python (Jupyter notebook)
Variable Identification
Univariate analysis for continuous variables
Understanding Univariate Analysis for categorical variables
Understanding Bivariate Analysis
Understanding and treating missing values
Understanding Outlier Treatment
Understanding Variable Transformation
Basics of Model Building
Introduction to Problem Statement
Building first predictive model
Preparing the Dataset
Benchmark regression final
Classification Benchmark
Introduction to Evaluation Metrics
Confusion Matrix
Accuracy
Alternatives of Accuracy
Precision and Recall
Thresholding
AUC ROC
Log loss
Evaluation Metrics for regression Final
R2 and Adjusted R2
Introduction to kNN
Building a kNN model
Determining right value of k
How to calculate distance
Issue with distance based algorithms
Introduction to Overfitting and Underfitting Models
What is Validation
Understanding Hold-Out Validation
Understanding k-fold cross validation
Bias Variance Tradeoff
Introduction to Linear Model
Understanding Gradient Descent
Gradient Descent in Linear Regression
Convexity of cost function
Assumptions of Linear Regression
Introduction to Logistic Regression
Odds Ratio
Multiclass using Logistic Regression
Introduction to Decision Tree
Purity in Decision Trees
Terminologies Related to Decision Trees
How to Select the Best Split Point in Decision Trees
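
For the last topics in the list (purity and split selection in decision trees), a small sketch may help before watching. It assumes Gini impurity as the purity measure, which is one common choice; the videos may use a different measure such as entropy. The function names `gini` and `best_split` and the toy data are hypothetical and written only for this example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 for a pure group, higher for more mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Find the threshold on one numeric feature that gives the purest
    pair of groups, scored by the size-weighted Gini impurity."""
    n = len(values)
    best_threshold, best_score = None, float("inf")
    distinct = sorted(set(values))
    # Candidate thresholds: midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Toy data (invented): customer ages and whether they bought a product.
ages = [22, 25, 30, 35, 40, 45]
bought = ["no", "no", "no", "yes", "yes", "yes"]
threshold, score = best_split(ages, bought)
print(threshold, score)
```

In this toy data every customer up to age 30 did not buy and every older customer did, so a split between 30 and 35 separates the two groups completely and the weighted impurity drops to zero.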