Wednesday, May 30, 2018

IID Assumption & Machine Learning Models



IID stands for independent and identically distributed: the data points are assumed to be independent of each other and drawn from the same distribution. Because the IID assumption holds, we are able to use cross-validation to evaluate models.



Since data points are assumed to be IID, we are able to split the data into training and test sets. That's because we assume both the test and training data come from the same data-generating process.
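
To make the idea concrete, here is a minimal sketch in Python (standard library only) of a random train/test split and of k-fold index generation. The function names `train_test_split` and `k_fold_indices`, the 25% test fraction and the synthetic Gaussian data are assumptions chosen for illustration; they are not taken from the post. Shuffling before splitting is only justified when the IID assumption is reasonable.

```python
import random


def train_test_split(data, test_fraction=0.25, seed=0):
    """Shuffle a copy of `data` and return (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train_idx, test_idx


# Synthetic data drawn from one distribution (illustrative assumption).
data_rng = random.Random(42)
data = [data_rng.gauss(0, 1) for _ in range(100)]

train, test = train_test_split(data)
print(len(train), len(test))

for train_idx, test_idx in k_fold_indices(len(data), k=5):
    print(len(train_idx), len(test_idx))
```

Every point ends up in exactly one test fold, so each observation is used for evaluation once. If the data were ordered in time or grouped by subject, this kind of random shuffling would no longer be appropriate.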







Tuesday, May 29, 2018

9 Types of Machine Learning Problems/Tasks

In this video I discuss nine types of machine learning problems and tasks, covering each of these broad categories in detail.



1- Regression



2- Classification



3- Transcription



4- Machine Translation



5- Structured Output



6- Anomaly detection



7- Missing value Imputation



8- Denoising



9- Probability density or PMF estimation







Monday, May 28, 2018

General Data Protection Regulation (GDPR) and Machine Learning

GDPR stands for General Data Protection Regulation, which came into force on 25 May 2018. It is a regulation to protect the personal data of people in the European Union and the European Economic Area (EEA).



In this video we discuss how GDPR will affect model building due to increased regulation of personal data.










Saturday, May 26, 2018

Law of Large Numbers (R demo) | Statistics & probability | Machine Learning

The Law of Large Numbers is a fundamental concept in statistics and probability. It states that if a random experiment is repeated a large number of times, the average of the observed outcomes will be close to the theoretical (expected) average.



Here we take the example of a dice experiment to show how the empirical average converges to the theoretical average as the number of rolls N becomes very large.
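
The video demo is in R; as a rough equivalent, the following Python sketch simulates rolls of a fair six-sided die and tracks the running average. The function name `running_average_of_dice`, the seed and the number of rolls are assumptions for illustration. The theoretical mean of a fair die is (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5.

```python
import random


def running_average_of_dice(n_rolls, seed=0):
    """Roll a fair die n_rolls times; return the running average after each roll."""
    rng = random.Random(seed)
    total = 0
    averages = []
    for i in range(1, n_rolls + 1):
        total += rng.randint(1, 6)  # randint is inclusive on both ends
        averages.append(total / i)
    return averages


THEORETICAL_MEAN = sum(range(1, 7)) / 6  # 3.5 for a fair die

averages = running_average_of_dice(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    gap = abs(averages[n - 1] - THEORETICAL_MEAN)
    print(f"N={n:>6}  average={averages[n - 1]:.4f}  gap={gap:.4f}")
```

The gap to 3.5 tends to shrink as N grows, although it does not have to decrease at every step.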








Wednesday, May 23, 2018

Stem and Leaf Plot | Data Visualization | Statistics

Stem and leaf plots are similar to histograms but have one major difference: they retain the actual data points on the plot, unlike histograms, which only show ranges.
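
As a small illustration, the Python sketch below builds a text stem and leaf plot for non-negative integers, using the tens as the stem and the units as the leaf. The helper names and the sample `scores` are made up for this example; other stem widths are equally valid.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return {stem: [leaves]} with stem = tens digit(s), leaf = units digit."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)


def format_stem_and_leaf(values):
    """Render the plot as text, one line per stem (empty stems included)."""
    plot = stem_and_leaf(values)
    lines = []
    for stem in range(min(plot), max(plot) + 1):
        leaves = "".join(str(leaf) for leaf in plot.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return "\n".join(lines)


scores = [12, 15, 21, 23, 23, 29, 34, 47, 48]  # illustrative data only
print(format_stem_and_leaf(scores))
```

Each row works like a histogram bar turned on its side, yet every original value can still be read back: stem 2 with leaves 1, 3, 3, 9 stands for 21, 23, 23 and 29.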










Tuesday, May 22, 2018

Box and Whisker Plot | Application in R | Data Visualization

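A box and whisker plot is a way to visualize a continuous variable. It helps in understanding how the data is concentrated. The box spans from the first quartile (Q1) to the third quartile (Q3), whereas the whiskers extend from the minimum of the data to Q1 and from Q3 to the maximum. Many implementations limit the whiskers to 1.5 times the interquartile range and draw points beyond that separately as outliers.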
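
The numbers behind a box and whisker plot are the five-number summary. The video works in R; the sketch below uses Python's standard `statistics` module instead, and the function name `five_number_summary` and the sample values are assumptions for illustration. Quartile conventions differ between tools, so results on small samples may not match other software exactly.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a sequence of numbers."""
    # n=4 asks for the three cut points that divide the data into quarters.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # illustrative data only
low, q1, median, q3, high = five_number_summary(sample)
print("box:", q1, "to", q3, "with median", median)
print("whiskers:", low, "to", q1, "and", q3, "to", high)
```

The box covers the middle half of the data (Q1 to Q3), and the whiskers cover the remaining quarter on each side.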









