Important Machine Learning Concepts Part - 1

Features

Input data/variables used by the ML model.

Feature Engineering

Transforming input features to make them more useful to the model, e.g., mapping categories to buckets, normalizing values to between -1 and 1, or removing nulls.
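
The three transformations mentioned above can be sketched in plain Python. This is only an illustration: the helper names, the color-to-bucket mapping, and the sample values are hypothetical and not taken from any particular library.

```python
def drop_nulls(values):
    """Remove missing (None) entries."""
    return [v for v in values if v is not None]

def scale_to_minus_one_one(values):
    """Linearly rescale numeric values so the min maps to -1 and the max to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no range to scale.
        return [0.0] * len(values)
    return [2 * (v - lo) / (hi - lo) - 1 for v in values]

# Hypothetical mapping from fine-grained categories to coarser buckets.
COLOR_BUCKETS = {"red": "warm", "orange": "warm", "blue": "cool", "green": "cool"}

def to_bucket(category):
    """Map a category to its bucket; unknown categories fall into 'other'."""
    return COLOR_BUCKETS.get(category, "other")

raw_ages = [22, None, 35, 58, None, 41]
ages = drop_nulls(raw_ages)                 # [22, 35, 58, 41]
scaled_ages = scale_to_minus_one_one(ages)  # 22 maps to -1.0, 58 maps to 1.0
color_buckets = [to_bucket(c) for c in ["red", "blue", "purple"]]
```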

Train/Eval/Test

Training data is used to optimize the model, evaluation (validation) data is used to assess the model on unseen data during training, and test data is used to report the final result.
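
A minimal sketch of such a three-way split, assuming a simple random shuffle is appropriate for the data. The function name, the 80/10/10 proportions, and the seed are illustrative choices, not a prescribed recipe.

```python
import random

def split_dataset(rows, eval_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle rows and split them into train, eval, and test lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # fixed seed keeps the split reproducible
    n_test = int(len(rows) * test_frac)
    n_eval = int(len(rows) * eval_frac)
    test_rows = rows[:n_test]
    eval_rows = rows[n_test:n_test + n_eval]
    train_rows = rows[n_test + n_eval:]
    return train_rows, eval_rows, test_rows

train_rows, eval_rows, test_rows = split_dataset(range(100))
```

The important property is that the three sets do not overlap, so the test set really is data the model has never seen.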

Classification/Regression

Regression helps predict a continuous quantity (e.g., housing price). Classification predicts discrete class labels (e.g., predicting red/blue/green).

Linear Regression

Predicts an output as a weighted sum of the input features plus a bias term.
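
As a small illustration, the prediction step can be written as below. The weights, bias, and feature values are made-up numbers; in practice the weights and bias are learned from the training data.

```python
def predict_linear(features, weights, bias):
    """Weighted sum of the features plus a bias term."""
    return sum(w * x for w, x in zip(weights, features)) + bias

# Hypothetical housing example: [bedrooms, square meters].
price = predict_linear([3.0, 120.0], weights=[10.0, 2.0], bias=50.0)
# 3*10 + 120*2 + 50 = 320.0
```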

Logistic Regression

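Similar to linear regression, but passes the weighted sum through a sigmoid function so that the output is a probability between 0 and 1. Typically used for binary classification.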
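
A sketch of how a linear score is commonly turned into a probability, using the sigmoid function. The function names and the example weights are illustrative assumptions.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(features, weights, bias):
    """Linear score followed by a sigmoid."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

# The score here is 2*0.5 + 1*(-1) + 0 = 0, and sigmoid(0) = 0.5.
p = predict_probability([2.0, 1.0], weights=[0.5, -1.0], bias=0.0)
```

A probability above a chosen threshold (often 0.5) is then read as the positive class.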

Overfitting

Model performs well on the training data but poorly on the test data. Common remedies include dropout, early stopping, and reducing the number of nodes or layers.
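
Early stopping can be sketched as follows: watch the loss on the evaluation set and stop once it has not improved for a few epochs. The function name, the patience value, and the loss values are hypothetical, and the sketch assumes a non-empty list of per-epoch losses.

```python
def early_stopping(val_losses, patience=2):
    """Return (stop_epoch, best_epoch) for a list of per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            waited = 0  # improvement, so reset the counter
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    # Never ran out of patience: training ran to the last epoch.
    return len(val_losses) - 1, best_epoch

# Validation loss improves until epoch 2, then rises (a sign of overfitting).
stop_epoch, best_epoch = early_stopping([0.9, 0.7, 0.6, 0.65, 0.7, 0.8])
# stop_epoch is 4, best_epoch is 2
```

In practice the model weights saved at the best epoch are the ones kept.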

Underfitting

Model performs well on neither the training data nor the test data: the error rate is high on both the training set and unseen data.
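
One rough way to tell the two failure modes apart is to compare the training and test error. The thresholds below are arbitrary placeholders chosen for illustration; sensible values depend on the problem.

```python
def diagnose_fit(train_error, test_error, max_error=0.10, max_gap=0.05):
    """Rough heuristic with hypothetical thresholds, not a formal test."""
    if train_error > max_error:
        # The model cannot even fit the data it was trained on.
        return "underfitting"
    if test_error - train_error > max_gap:
        # Good on training data, much worse on unseen data.
        return "overfitting"
    return "ok"

print(diagnose_fit(0.30, 0.32))
print(diagnose_fit(0.02, 0.25))
print(diagnose_fit(0.05, 0.07))
```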

Bias/Variance

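Bias is error caused by overly simple assumptions in the model; variance is how much the model's predictions change when the training data changes. High variance often indicates overfitting, while high bias often indicates underfitting.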

Regularization

A variety of approaches to reduce overfitting, including adding a penalty on the size of the weights to the loss function (e.g., L1 or L2) and randomly dropping nodes during training (dropout).
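
Both ideas can be sketched in a few lines. The first function adds an L2 penalty (the sum of squared weights, scaled by a strength `lam`) to a loss value; the second applies "inverted" dropout to a list of activations, which is one common formulation and assumes a drop rate below 1. All names and numbers are illustrative assumptions.

```python
import random

def l2_regularized_loss(data_loss, weights, lam=0.01):
    """Add an L2 penalty on the weights to the data loss."""
    penalty = lam * sum(w * w for w in weights)
    return data_loss + penalty

def dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; rescale the rest (rate < 1)."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

# 0.5 + 0.5 * (1 + 4 + 9) = 7.5
loss = l2_regularized_loss(0.5, [1.0, -2.0, 3.0], lam=0.5)

rng = random.Random(0)
dropped = dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=rng)
```

Larger weights increase the penalty, so the optimizer is nudged toward smaller weights. Dropout is applied only during training, not at prediction time.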

Nomidl