Classification and Regression

07/12/2025

1. What is Classification?

👉 Classification predicts a category or class.

You are answering:
"Which group does this belong to?"

Examples:

  • Spam or Not Spam (2 classes → binary classification)

  • Disease or No Disease

  • Credit Rating: Good / Average / Poor

  • Fruit Type: Apple / Mango / Banana (multi-class classification)

Output Type:

  • Discrete labels (fixed categories)

Algorithms:

  • Logistic Regression

  • Decision Trees

  • Random Forest

  • SVM

  • Naive Bayes

  • Neural Networks (Classifiers)

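Classification performance is typically measured with accuracy, precision, recall, F1-score, the confusion matrix, ROC-AUC, and log-loss. Which metric to use depends on what matters most: overall correctness (accuracy), the type of error (precision, recall, F1-score, confusion matrix), or the quality of the predicted probabilities (ROC-AUC, log-loss).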
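
As a rough illustration of how some of these metrics relate to the confusion matrix, here is a minimal sketch in plain Python. The function names and the spam labels are made up for this example; they do not come from any particular library.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # predicted positive, really positive
        elif predicted == positive:
            fp += 1  # predicted positive, really negative
        elif actual == positive:
            fn += 1  # predicted negative, really positive
        else:
            tn += 1  # predicted negative, really negative
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Derive precision, recall and F1 from the confusion counts."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = spam, 0 = not spam
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_counts(y_true, y_pred))
print(precision_recall_f1(y_true, y_pred))
```

Precision asks "of the emails flagged as spam, how many really were spam?", while recall asks "of the real spam emails, how many were caught?".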

2. What is Regression?

👉 Regression predicts a numerical value.

You are answering:
"How much?" or "What value?"

Examples:

  • Predicting house price

  • Predicting sales for next month

  • Predicting temperature

  • Predicting employee salary

Output Type:

  • Continuous numbers

Algorithms:

  • Linear Regression

  • Ridge / Lasso

  • Decision Tree Regression

  • Random Forest Regression

  • SVR (Support Vector Regression)

Regression performance is commonly measured using:

  • Error-based metrics (MAE, MSE, RMSE, MAPE, Median Absolute Error)

  • Goodness-of-fit metrics (R², Adjusted R², Explained Variance)

Each metric highlights a different aspect of model quality: error size, sensitivity to outliers, or ability to explain variance.
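
To make a few of these metrics concrete, the sketch below computes MAE, MSE, RMSE, and R² by hand in plain Python. The function name and the sample numbers are assumptions for illustration only.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE and R² for paired true/predicted values.

    Assumes y_true is not constant; otherwise R² is undefined
    (division by zero).
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # average absolute error
    mse = sum(e * e for e in errors) / n    # squaring penalises large errors more
    rmse = math.sqrt(mse)                   # same units as the target

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total variation
    r2 = 1 - ss_res / ss_tot

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical true and predicted values
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 11.0]

metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Because MSE squares each error, the single error of 2 contributes more to MSE than the two errors of 1 combined, which is why MSE and RMSE are more sensitive to outliers than MAE.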

Classification Example

Input:

  • Age

  • Income

  • Existing loans

Output:
👉 Will the person default on a loan? Yes/No
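
A minimal sketch of how such a Yes/No prediction could be produced, in the style of logistic regression. The weights, bias, and the 0.5 threshold below are hand-picked, hypothetical values for illustration; a real model would learn its weights from labelled training data.

```python
import math

# Hypothetical, hand-picked parameters (NOT learned from real data).
WEIGHTS = {"age": -0.03, "income_lakhs": -0.20, "existing_loans": 0.90}
BIAS = 0.5


def default_probability(age, income_lakhs, existing_loans):
    """Map the inputs to a probability between 0 and 1 with a sigmoid."""
    z = (BIAS
         + WEIGHTS["age"] * age
         + WEIGHTS["income_lakhs"] * income_lakhs
         + WEIGHTS["existing_loans"] * existing_loans)
    return 1.0 / (1.0 + math.exp(-z))


def predict_default(age, income_lakhs, existing_loans, threshold=0.5):
    """Turn the probability into a discrete label: 'Yes' or 'No'."""
    p = default_probability(age, income_lakhs, existing_loans)
    return "Yes" if p >= threshold else "No"


# Two made-up applicants
print(predict_default(age=45, income_lakhs=12, existing_loans=0))
print(predict_default(age=25, income_lakhs=3, existing_loans=3))
```

The key point is the last step: whatever score the model computes internally, the final output is one of a fixed set of labels.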

Regression Example

Input:

  • Number of rooms

  • Area in sq. ft

  • Location rating

Output:
👉 Expected House Price: ₹56,40,000
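
A minimal sketch of the regression case, using simple linear regression fitted by ordinary least squares. To keep it short, it uses only one of the inputs (area) and a tiny made-up dataset with prices in lakhs of rupees; the function name and all numbers are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: area in sq. ft -> price in lakhs (₹)
areas = [600, 800, 1000, 1200, 1400]
prices = [30.0, 40.0, 50.0, 60.0, 70.0]

slope, intercept = fit_line(areas, prices)

# Predict the price of a 1100 sq. ft house
predicted = slope * 1100 + intercept
print(f"Predicted price: about {predicted:.1f} lakhs")
```

Unlike the classification example, the output here is a number on a continuous scale, so any value between the training prices (or beyond them) can be predicted.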

MCQs on Classification vs Regression

1. Which type of machine learning problem predicts a category?

A. Regression
B. Classification
C. Clustering
D. Optimization

Answer: B. Classification
Explanation: Classification outputs discrete labels such as "spam/not spam."

2. Predicting house prices is an example of:

A. Binary classification
B. Regression
C. Multi-class classification
D. Clustering

Answer: B. Regression
Explanation: House prices are continuous numerical values.

3. Which algorithm is commonly used for regression tasks?

A. Logistic Regression
B. Support Vector Regression (SVR)
C. Naive Bayes
D. KNN Classification

Answer: B. Support Vector Regression (SVR)
Explanation: SVR is the regression variant of the support vector machine and predicts continuous values. Despite its name, Logistic Regression is a classification algorithm.

4. Email spam detection is an example of:

A. Regression
B. Multi-output regression
C. Binary classification
D. Dimensionality reduction

Answer: C. Binary classification
Explanation: The output is either "spam" or "not spam."

5. Which type of model output is continuous?

A. Regression
B. Binary classification
C. Multi-class classification
D. Clustering

Answer: A. Regression
Explanation: Regression produces real-valued outputs.

6. Accuracy is mainly used as an evaluation metric for:

A. Regression
B. Classification
C. Clustering
D. Association rules

Answer: B. Classification
Explanation: Accuracy is the fraction of class predictions that are correct.

7. Mean Squared Error (MSE) is used to evaluate:

A. Classification models
B. Regression models
C. Clustering models
D. Reinforcement learning agents

Answer: B. Regression models
Explanation: MSE measures numerical prediction error.

8. Predicting whether a customer will default on a loan is:

A. Regression
B. Classification
C. Forecasting
D. Reinforcement learning

Answer: B. Classification
Explanation: The output is a category: default or not.

9. Which of the following is NOT a classification algorithm?

A. Random Forest Classifier
B. Logistic Regression
C. Naive Bayes
D. Linear Regression

Answer: D. Linear Regression
Explanation: Linear Regression is used for predicting continuous values.

10. When predicting temperature for the next hour, the problem is:

A. Binary classification
B. Multi-class classification
C. Regression
D. Clustering

Answer: C. Regression
Explanation: Temperature is a continuous numeric value.

11. If the model predicts a probability between 0 and 1, it is most likely used in:

A. Regression
B. Clustering
C. Classification
D. Reinforcement learning

Answer: C. Classification
Explanation: Classifiers such as logistic regression output a class probability, which is then compared with a threshold (commonly 0.5) to choose a label.

12. Which of the following questions represents a regression problem?

A. Will the customer buy the product?
B. Which product category does the customer prefer?
C. How much will the customer spend?
D. Is the transaction fraudulent?

Answer: C. How much will the customer spend?
Explanation: It asks for a numeric value.

13. Multi-class classification deals with:

A. More than two categories
B. Only one category
C. Numerical values
D. Clustering groups

Answer: A. More than two categories
Explanation: For example, classifying a fruit as apple, mango, or banana.

14. Predicting sales numbers for next month is typically solved using:

A. Regression
B. Binary classification
C. Multi-class classification
D. Dimensionality reduction

Answer: A. Regression
Explanation: Sales quantity is numeric.

15. Which of the following is a regression evaluation metric?

A. Precision
B. Recall
C. R² Score
D. F1 Score

Answer: C. R² Score
Explanation: R² measures the proportion of variance in the true values that the regression model explains. Precision, recall, and F1 score are classification metrics.