Four Days’ Workshop on “Machine Learning Using Python”
Sharadchandra Pawar College of Engineering organized a four-day workshop on “Machine Learning Using Python” from 31st January to 3rd February 2020, in association with the Board of Student’s Development, Savitribai Phule Pune University, Pune.
- Title: Four-Day State-Level Workshop on “Machine Learning Using Python”
- Resource Persons:
- Yogesh P. Murumkar,
Trainee Engineer, LOMTE & DARADE INFOTECH PVT. LTD.
- Santosh Darade,
Developer Engineer, LOMTE & DARADE INFOTECH PVT. LTD.
- Apurva Dighe,
Trainee Engineer, LOMTE & DARADE INFOTECH PVT. LTD.
- Convener: Sunil S. Khatal
(Head, Department of Computer Engineering, SPCOE, Dumbarwadi, Pune)
- Coordinator: Kapil D. Dere
- Co-coordinator: Pooja S. Gholap
- About workshop:
Mr. Yogesh Murumkar, Mr. Santosh Darade and Mrs. Apurva Dighe conducted the workshop to help students understand the concepts of machine learning using Python. They delivered about 28 hours of theoretical and practical sessions.
They started the workshop with a very creative introductory session on 31st January 2020. First of all, using PowerPoint presentations, they explained concepts that are very helpful for applying Python in industry. In the theoretical part, they explained the Anaconda software and its use for the practical implementation of machine learning in Python. They also gave a brief idea of the challenges of Python, along with an introduction to machine learning design principles.
Machine learning involves training a computer on a given data set, then using this training to predict the properties of new data. For example, we can train a computer by feeding it 1000 images of cats and 1000 more images which are not of cats, telling the computer each time whether the picture is a cat or not. If we then show the computer a new image, it should be able to tell, from the above training, whether this new image is a cat or not.
The process of training and prediction involves the use of specialized algorithms. We feed the training data to an algorithm, and the algorithm uses this training data to give predictions on new test data. One such algorithm is K-Nearest-Neighbor classification (KNN classification). It takes a test point and finds the k nearest data points to it in the training data set. Then it takes a majority vote among these k neighbors and gives the majority class as the prediction result.
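The KNN idea described above can be sketched in plain Python. This is a minimal illustration, not the workshop's actual code; the toy data points and labels are made up for demonstration:

```python
import math
from collections import Counter

def knn_predict(train, test_point, k=3):
    """Predict the label of test_point from its k nearest training points.

    train is a list of (features, label) pairs; distance is Euclidean.
    """
    # Sort training points by distance to the test point
    by_distance = sorted(
        train,
        key=lambda item: math.dist(item[0], test_point),
    )
    # Majority vote among the k nearest neighbors
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Toy data: two clusters labelled "cat" and "not-cat"
train = [((1, 1), "cat"), ((1, 2), "cat"), ((2, 1), "cat"),
         ((8, 8), "not-cat"), ((8, 9), "not-cat"), ((9, 8), "not-cat")]

print(knn_predict(train, (1.5, 1.5)))  # → cat
print(knn_predict(train, (8.5, 8.5)))  # → not-cat
```

A test point near the first cluster gets the majority label of its three nearest neighbors, which is exactly the voting rule described above.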
Duration: Four-day workshop, 31st January to 3rd February 2020
- Number of students who attended this workshop: 49
- Topics covered in this Workshop:
- Topic 1: Machine Learning Introduction
- What is Machine Learning?
- Machine Learning Process Flow-Diagram
- Different Categories of Machine Learning – Supervised, Unsupervised and Reinforcement
- Scikit-Learn Overview
- Scikit-Learn cheat-sheet
- Topic 2: Regression
- Linear Regression
- Robust Regression (RANSAC Algorithm)
- Exploratory Data Analysis (EDA)
- Correlation Analysis and Feature Selection
- Performance Evaluation – Residual Analysis, Mean Square Error (MSE), Coefficient of Determination (R²), Mean Absolute Error (MAE), Root Mean Square Error (RMSE)
- Polynomial Regression
- Regularized Regression – Ridge, Lasso and Elastic Net Regression
- Bias-Variance Trade-Off
- Cross Validation – Hold Out and K-Fold Cross Validation
- Data Pre-Processing – Standardization, Min-Max, Normalization and Binarization
- Gradient Descent
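Several of the items above (linear regression, MSE and gradient descent) fit together in one small sketch. The following is a hand-rolled illustration in plain Python, not scikit-learn's implementation; the learning rate, epoch count and data are assumptions chosen for demonstration:

```python
def fit_linear(xs, ys, lr=0.01, epochs=2000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2)
        grad_w = (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Noise-free data generated from y = 2x + 1
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
w, b = fit_linear(xs, ys)
print(round(w, 2), round(b, 2))  # converges close to 2.0 and 1.0
```

Each epoch takes one step against the gradient of the MSE, so the fitted slope and intercept approach the values that generated the data.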
- Topic 3: Classification – Logistic Regression
- Sigmoid function
- Logistic Regression learning using Stochastic Gradient Descent (SGD)
- SGD Classifier
- Measuring accuracy using Cross-Validation, Stratified k-fold
- Confusion Matrix – True Positive (TP), False Positive (FP), False Negative (FN), True Negative (TN)
- Precision, Recall, F1 Score, Precision/Recall Trade-Off
- Receiver Operating Characteristics (ROC) Curve.
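As a sketch of how the sigmoid and SGD items above fit together, here is a minimal hand-written logistic regression trained by stochastic gradient descent. It is an illustration under assumed toy data and hyper-parameters, not scikit-learn's SGDClassifier:

```python
import math
import random

def sigmoid(z):
    """Squash a real score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic_sgd(pairs, lr=0.1, epochs=200, seed=0):
    """Train y ≈ sigmoid(w*x + b) by stochastic gradient descent.

    pairs is a list of (x, y) with y in {0, 1}; one example is used
    per update, which is what makes the method 'stochastic'.
    """
    rng = random.Random(seed)
    pairs = list(pairs)  # avoid shuffling the caller's list
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(pairs)
        for x, y in pairs:
            p = sigmoid(w * x + b)
            # Gradient of the log loss for a single example
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return w, b

# Toy 1-D data: class 0 at small x, class 1 at large x
data = [(1, 0), (2, 0), (3, 0), (7, 1), (8, 1), (9, 1)]
w, b = fit_logistic_sgd(data)
predict = lambda x: 1 if sigmoid(w * x + b) >= 0.5 else 0
print([predict(x) for x in (2, 8)])  # → [0, 1]
```

The 0.5 probability threshold is what turns the continuous sigmoid output into a class decision, which is where the confusion-matrix counts above come from.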
- Topic 4: Classification – k-Nearest Neighbor (KNN)
- Classification and Regression
- Application, Advantages and Disadvantages
- Distance Metric – Euclidean, Manhattan, Chebyshev, Minkowski
- Measuring accuracy using Cross-Validation, Stratified k-fold, Confusion Matrix, Precision, Recall, F1-score.
- Topic 5: Classification – SVM (Support Vector Machine)
- Classification and Regression
- Separating line, Margin and Support Vectors
- Linear SVC Classification
- Polynomial Kernel – Kernel Trick
- Gaussian Radial Basis Function (rbf)
- Grid Search to tune hyper-parameters.
- Support Vector Regression
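A linear SVM of the kind covered above can be sketched by minimizing the hinge loss with stochastic sub-gradient descent. This is a bare-bones illustration with made-up toy data and hyper-parameters, not scikit-learn's LinearSVC:

```python
import random

def fit_linear_svm(pairs, lr=0.01, lam=0.01, epochs=200, seed=0):
    """Fit sign(w·x + b) by sub-gradient descent on the hinge loss.

    pairs is a list of (features, label) with label in {-1, +1};
    lam is the strength of the L2 penalty on w (the margin term).
    """
    rng = random.Random(seed)
    pairs = list(pairs)
    dim = len(pairs[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        rng.shuffle(pairs)
        for x, y in pairs:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score < 1:  # inside the margin: hinge loss is active
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:              # outside the margin: only shrink w
                w = [wi - lr * lam * wi for wi in w]
    return w, b

# Linearly separable toy data in two dimensions
data = [((1, 1), -1), ((2, 1), -1), ((1, 2), -1),
        ((5, 5), 1), ((6, 5), 1), ((5, 6), 1)]
w, b = fit_linear_svm(data)
predict = lambda x: 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
print([predict(x) for x in ((1, 1), (6, 6))])  # → [-1, 1]
```

Points with y·score ≥ 1 lie outside the margin and contribute no loss; only margin violators move the separating line, which is the sense in which the solution depends on the support vectors.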
- Projects
- Breast Cancer Wisconsin (Diagnostic) Project using KNN – https://www.kaggle.com/uciml/breast-cancer-wisconsin-data
- Predicting Boston House Prices – https://www.kaggle.com/schirmerchad/bostonhoustingmlnd
- E-commerce Project – a company wants to decide whether to focus its efforts on the mobile experience or the website experience.
- Outcomes of attending workshop:
- This workshop covered the basic algorithms that help us build and apply prediction functions, with an emphasis on practical applications. At the end of this workshop, attendees will be technically competent in the basics and fundamental concepts of machine learning, such as:
- Understand components of a machine learning algorithm.
- Apply machine learning tools to build and evaluate predictors.
- How machine learning uses computer algorithms to search for patterns in data
- How to uncover hidden themes in large collections of documents using topic modeling.
- How to use data patterns to make decisions and predictions, with real-world examples from healthcare involving genomics and preterm birth
- How to prepare data, deal with missing data and create custom data analysis solutions for different industries
- Basic and frequently used algorithmic techniques including sorting, searching, greedy algorithms and dynamic programming