Supervised machine learning
Supervised machine learning techniques are algorithms that learn a mapping from input data (features) to a target variable (labels) based on labeled training data. These techniques are widely used for classification and regression problems.
Here is a detailed overview of the most common supervised learning techniques:
1. Linear Regression
- Type: Regression
- Use Case: Predicting continuous values (e.g., housing prices).
- How it works: Assumes a linear relationship between the input variables (X) and the output (y).
- Formula: y = β₀ + β₁x₁ + β₂x₂ + ⋯ + βₙxₙ
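A minimal sketch with scikit-learn; the library choice and the toy data (generated so that β₁ ≈ 3 and β₀ ≈ 2) are illustrative assumptions, not part of the original text:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: y ≈ 3x + 2 plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X[:, 0] + 2 + rng.normal(0, 1, size=100)

model = LinearRegression()
model.fit(X, y)  # fits the coefficients by ordinary least squares

print("β1 (slope):    ", model.coef_[0])      # close to 3
print("β0 (intercept):", model.intercept_)    # close to 2
print("prediction at x=5:", model.predict([[5.0]])[0])
```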
2. Logistic Regression
- Type: Classification
- Use Case: Binary or multi-class classification (e.g., spam detection).
- How it works: Models the probability that a given input belongs to a particular class using the logistic (sigmoid) function.
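A minimal sketch, again with scikit-learn and made-up data; the rule used to generate the labels is purely illustrative:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical binary labels: class 1 when the two features sum to a positive value.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X.sum(axis=1) > 0).astype(int)

clf = LogisticRegression()
clf.fit(X, y)

# The sigmoid of the linear score gives P(class 1 | x).
print(clf.predict_proba([[1.0, 1.0]]))  # probabilities for classes 0 and 1
print(clf.predict([[1.0, 1.0]]))        # hard label, here almost certainly 1
```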
3. Decision Trees
- Type: Classification/Regression
- Use Case: Fraud detection, credit scoring, etc.
- How it works: Splits the data into subsets based on feature values using conditions (like yes/no questions), forming a tree structure.
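A sketch using scikit-learn's DecisionTreeClassifier; the synthetic data is only a stand-in for something like transaction records, not a real fraud dataset:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic placeholder for tabular data such as credit or transaction features.
X, y = make_classification(n_samples=300, n_features=4, random_state=0)

tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, y)

# export_text prints the learned yes/no splits as an indented tree.
print(export_text(tree))
```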
4. Random Forest
- Type: Classification/Regression
- Use Case: High-accuracy tasks with large datasets.
- How it works: Ensemble of multiple decision trees; each tree is trained on a random subset of data and features. Final prediction is made by majority vote or averaging.
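A sketch of the same idea with scikit-learn; the dataset is synthetic and the hyperparameters are illustrative defaults:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 trees, each fit on a bootstrap sample with random feature subsets;
# the ensemble predicts by majority vote across trees.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

print("test accuracy:", forest.score(X_test, y_test))
```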
5. Support Vector Machines (SVM)
- Type: Classification/Regression
- Use Case: Image classification, bioinformatics.
- How it works: Finds the optimal hyperplane that separates classes with maximum margin. Can handle non-linear data using the kernel trick.
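A sketch with scikit-learn's SVC; make_circles produces a deliberately non-linear problem so the RBF kernel (one common instance of the kernel trick) has something to do. All values are illustrative:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: not separable by any straight line in the input space.
X, y = make_circles(n_samples=300, noise=0.1, factor=0.4, random_state=0)

# The RBF kernel implicitly maps the data to a higher-dimensional space
# where a maximum-margin hyperplane can separate the rings.
svm = SVC(kernel="rbf", C=1.0)
svm.fit(X, y)

print("training accuracy:", svm.score(X, y))
print("support vectors per class:", svm.n_support_)
```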
6. K-Nearest Neighbors (K-NN)
- Type: Classification/Regression
- Use Case: Pattern recognition, recommendation systems.
- How it works: Stores all training data; a new input is classified by the majority class among its k nearest neighbors (or, for regression, by averaging their values).
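A sketch with scikit-learn's KNeighborsClassifier on the built-in Iris dataset; k = 5 and the query measurements are arbitrary illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X, y)  # "training" just stores the examples

# A query point takes the majority class of its 5 nearest neighbors.
print(knn.predict([[5.1, 3.5, 1.4, 0.2]]))  # hypothetical flower measurements
```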
7. Naïve Bayes
- Type: Classification
- Use Case: Text classification, spam filtering.
- How it works: Applies Bayes' Theorem with strong (naïve) independence assumptions between features.
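A sketch of text classification with scikit-learn; the four-document corpus is invented for illustration and far too small for a real filter:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny hand-made corpus (hypothetical examples).
texts = ["win a free prize now", "meeting at noon tomorrow",
         "free cash click here", "project update attached"]
labels = ["spam", "ham", "spam", "ham"]

# CountVectorizer turns text into word counts; MultinomialNB applies Bayes'
# theorem, treating word occurrences as independent given the class.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

print(model.predict(["free prize inside"]))  # likely ['spam']
```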
8. Gradient Boosting Machines (GBM) / XGBoost / LightGBM / CatBoost
- Type: Classification/Regression
- Use Case: Kaggle competitions, production-level machine learning.
- How it works: Sequentially builds models that correct the errors of previous models, typically using decision trees as base learners.
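A sketch using scikit-learn's GradientBoostingClassifier; XGBoost, LightGBM, and CatBoost expose similar fit/predict interfaces. The dataset and hyperparameters below are illustrative, not from the original text:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each new tree is fit to the residual errors (loss gradients) of the
# ensemble so far; learning_rate scales how much each tree contributes.
gbm = GradientBoostingClassifier(n_estimators=200, learning_rate=0.1,
                                 max_depth=3, random_state=0)
gbm.fit(X_train, y_train)

print("test accuracy:", gbm.score(X_test, y_test))
```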
9. Neural Networks (Shallow and Deep)
- Type: Classification/Regression
- Use Case: Complex problems like image recognition, natural language processing.
- How it works: Composed of layers of neurons that transform input data through learned weights and activation functions.
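A sketch of a shallow network with scikit-learn's MLPClassifier on the built-in digits dataset; the layer size and iteration count are illustrative choices, and deep architectures would normally use a dedicated framework instead:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One hidden layer of 64 ReLU units; weights are learned by backpropagation.
mlp = MLPClassifier(hidden_layer_sizes=(64,), activation="relu",
                    max_iter=500, random_state=0)
mlp.fit(X_train, y_train)

print("test accuracy:", mlp.score(X_test, y_test))
```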