implementing ML algorithms from scratch (without sklearn)
machine learning work often relies on libraries like sklearn, but understanding the math behind the models is essential. in this blog, let’s build ML algorithms from scratch in python, without using sklearn.
simple linear regression
simple linear regression is a method to model the relationship between two variables using a straight line. the equation is:
y = m * x + b
where:
- m is the slope (how much y changes for a unit change in x)
- b is the intercept (the value of y when x is 0)
mathematical intuition
the goal of linear regression is to find the best-fitting line for the given data points by minimizing the sum of squared errors. this means we try to find values of m and b that minimize the mean squared error:
mse = (1/n) * sum((y_true - y_predicted) ** 2)
using calculus, we derive the formulas for m and b using the least squares method:
m = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
b = mean_y - m * mean_x
these formulas give the line that minimizes the total squared error between the predicted and actual values.
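to see the formulas in action, here is the arithmetic worked out for the toy data used in the example below (x = [1, 2, 3, 4, 5], y = [2, 4, 5, 4, 5]):
mean_x = (1 + 2 + 3 + 4 + 5) / 5 = 3
mean_y = (2 + 4 + 5 + 4 + 5) / 5 = 4
m = ((-2)(-2) + (-1)(0) + (0)(1) + (1)(0) + (2)(1)) / ((-2)^2 + (-1)^2 + 0^2 + 1^2 + 2^2) = 6 / 10 = 0.6
b = 4 - 0.6 * 3 = 2.2
so the fitted line is y = 0.6 * x + 2.2, which is exactly what the code below should report.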
implementing in python
here’s the complete code:
import numpy as np

class SimpleLinearRegression:
    def __init__(self):
        self.m = 0  # slope of the line
        self.b = 0  # y-intercept of the line

    def fit(self, X, Y):
        # calculate the means of X and Y
        X_mean, Y_mean = np.mean(X), np.mean(Y)
        # calculate the slope (m) using the least squares method
        numerator = np.sum((X - X_mean) * (Y - Y_mean))
        denominator = np.sum((X - X_mean) ** 2)
        self.m = numerator / denominator
        # calculate the intercept (b)
        self.b = Y_mean - self.m * X_mean

    def predict(self, X):
        # apply the linear equation to predict y values
        return self.m * X + self.b

    def mean_squared_error(self, Y_true, Y_pred):
        # calculate the mean squared error for performance evaluation
        return np.mean((Y_true - Y_pred) ** 2)

# example usage
X = np.array([1, 2, 3, 4, 5])  # input features
Y = np.array([2, 4, 5, 4, 5])  # target values

model = SimpleLinearRegression()
model.fit(X, Y)  # train the model
predictions = model.predict(X)  # generate predictions

print(f"slope (m): {model.m:.3f}, intercept (b): {model.b:.3f}")
print(f"mean squared error: {model.mean_squared_error(Y, predictions):.3f}")
Multiple Linear Regression
Multiple linear regression is an extension of simple linear regression that allows us to model the relationship between two or more features (independent variables) and a target variable (dependent variable) using a linear equation.
The equation for multiple linear regression is:
y = m1 * x1 + m2 * x2 + ... + mn * xn + b
Where:
- m1, m2, ..., mn are the coefficients (slopes) for each feature (x1, x2, ..., xn).
- b is the intercept, the value of y when all x values are 0.
Mathematical Intuition
The goal of multiple linear regression is to find the best-fitting hyperplane (a line in 2D, a plane in 3D, and a hyperplane in higher dimensions) that minimizes the error between the predicted and actual values. The error is usually minimized using the Least Squares Method.
The Matrix Formulation
Let’s define the elements in matrix form:
- X is the matrix of input features, shaped (m, n), where m is the number of data points and n is the number of features.
- y is the vector of target values, shaped (m, 1).
- m is the vector of coefficients (slopes), shaped (n, 1) (here m denotes the coefficient vector, not the number of data points).
- b is the intercept.
To estimate the coefficients (m and b), we can use the following formula derived from the Least Squares Method:
m = (X^T * X)^(-1) * X^T * y
Where:
- X^T is the transpose of matrix X (shaped (n, m)).
- X^T * X is a square matrix (shaped (n, n)).
- (X^T * X)^(-1) is the inverse of that matrix (shaped (n, n)).
- X^T * y is the vector of dot products of X^T and the target y (shaped (n, 1)).
Now, we can also calculate the intercept b using the following equation:
b = mean(y) - sum(m * mean(X))
Where mean(X) is the vector of per-feature means of X, and the sum runs over all features. In the implementation below, we instead fold b into the normal equation by appending a column of ones to X, so the intercept comes out as the first entry of the solved coefficient vector.
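Before wrapping this into a class, here is a minimal sketch of the normal equation applied directly in NumPy. It uses np.linalg.solve on the system (X^T * X) * m = X^T * y, which evaluates the same expression as the inverse-based formula above but is the numerically preferred way to do it. The toy data encodes y = x1 + 2*x2 + 3, so the recovered coefficients should come out close to those values:

import numpy as np

# design matrix with a leading column of ones so the intercept is learned
# alongside the slopes (the same trick used in the class implementation below)
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y = np.array([8.0, 7.0, 14.0, 13.0, 18.0])
X_b = np.c_[np.ones((X.shape[0], 1)), X]

# normal equation: coef = (X_b^T * X_b)^(-1) * X_b^T * y, evaluated as a linear solve
coef = np.linalg.solve(X_b.T @ X_b, X_b.T @ y)
b, m = coef[0], coef[1:]
print(b, m)  # approximately 3.0 and [1.0, 2.0]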
Implementing in Python
Let’s implement Multiple Linear Regression from scratch in Python:
import numpy as np

class MultipleLinearRegression:
    def __init__(self):
        self.m = None  # coefficients (slopes)
        self.b = None  # intercept

    def fit(self, X, y):
        # Add a column of ones to X to account for the intercept (b)
        X_b = np.c_[np.ones((X.shape[0], 1)), X]
        # Apply the normal equation to compute the coefficients
        # m = (X^T * X)^(-1) * X^T * y
        coefficients = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
        # The first entry corresponds to the ones column, i.e. the intercept
        self.b = coefficients[0]
        self.m = coefficients[1:]

    def predict(self, X):
        # Add a column of ones to X for the intercept (b)
        X_b = np.c_[np.ones((X.shape[0], 1)), X]
        # Calculate predictions
        return X_b.dot(np.r_[self.b, self.m])

    def mean_squared_error(self, y_true, y_pred):
        # Calculate mean squared error (MSE) for performance evaluation
        return np.mean((y_true - y_pred) ** 2)
# Example usage
# Independent variables (features); the two columns are deliberately not
# collinear, otherwise X^T * X would be singular and the inverse would fail
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]])
# Dependent variable (target): y = x1 + 2*x2 + 3
y = np.array([8, 7, 14, 13, 18])

# Instantiate the model
model = MultipleLinearRegression()
model.fit(X, y)  # Train the model
predictions = model.predict(X)  # Generate predictions

# Output model coefficients and performance metrics
print(f"Intercept (b): {model.b:.3f}")
print(f"Coefficients (m): {model.m}")
print(f"Mean Squared Error: {model.mean_squared_error(y, predictions):.3f}")
Explanation:
- Data Preprocessing: we start by adding a column of ones to the feature matrix X to account for the intercept b. This transforms X into X_b.
- Model Fitting: the model is fitted by solving the normal equation m = (X^T * X)^(-1) * X^T * y. This provides the coefficients of the linear model, including the intercept.
- Prediction: to make predictions, we multiply X_b (the modified feature matrix) with the computed coefficients.
- Evaluation: the mean squared error (MSE) is used to evaluate the model’s performance.
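A practical caveat: np.linalg.inv fails when X^T * X is singular (for example, when one feature is a linear combination of the others) and becomes unreliable when it is nearly so. Below is a minimal sketch of an alternative fitting routine based on the Moore-Penrose pseudo-inverse; the function name fit_with_pinv is illustrative, not part of the class above.

import numpy as np

def fit_with_pinv(X, y):
    # Same normal-equation idea, but np.linalg.pinv (the Moore-Penrose
    # pseudo-inverse) yields a well-defined least squares solution even when
    # X^T * X is singular or badly conditioned
    X_b = np.c_[np.ones((X.shape[0], 1)), X]
    coefficients = np.linalg.pinv(X_b).dot(y)
    return coefficients[0], coefficients[1:]  # intercept b, slope vector m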