machine_learning.linear_regression
Linear regression is the most basic type of regression and is commonly used for predictive analysis. The idea is simple: we have a dataset with features associated with it. Features should be chosen carefully, as they determine how well the model can make future predictions. We adjust the weights of these features over many iterations so that they best fit the dataset. In this particular code, I used a CSGO dataset (ADR vs Rating). We fit a line through the dataset and estimate its parameters.
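Functions
collect_dataset() | Collect dataset of CSGO
main() | Driver function
mean_absolute_error(predicted_y, original_y) | Return mean absolute error for error calculation
run_linear_regression(data_x, data_y) | Implement linear regression over the dataset
run_steep_gradient_descent(data_x, data_y, len_data, alpha, theta) | Run steep gradient descent and update the feature vector accordingly
sum_of_square_error(data_x, data_y, len_data, theta) | Return sum of square error for error calculation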
Module Contents
- machine_learning.linear_regression.collect_dataset()
Collect dataset of CSGO. The dataset contains ADR vs Rating of a player.
:return: dataset obtained from the link, as a matrix
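The loading step can be illustrated without any network access. The following is a minimal sketch, assuming the downloaded file is two-column CSV text with a header row. The name `parse_dataset` and the sample values are hypothetical and are not taken from the real dataset or from the module itself.

```python
import io

import numpy as np

# Hypothetical CSV text standing in for the downloaded file. The values are
# made up for illustration only.
SAMPLE_CSV = """ADR,Rating
60.0,0.9
75.5,1.05
90.2,1.3
"""


def parse_dataset(csv_text: str) -> np.ndarray:
    """Parse CSV text (one header row, numeric rows) into a 2-D float array."""
    # skip_header=1 drops the "ADR,Rating" line before parsing numbers.
    return np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)


dataset = parse_dataset(SAMPLE_CSV)
print(dataset.shape)  # one row per player, one column per field
```

Keeping parsing separate from downloading makes the matrix-building step easy to test on its own.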
- machine_learning.linear_regression.main()
Driver function
- machine_learning.linear_regression.mean_absolute_error(predicted_y, original_y)
Return mean absolute error for error calculation
:param predicted_y: contains the output of prediction (result vector)
:param original_y: contains values of expected outcome
:return: mean absolute error computed from the given values
>>> predicted_y = [3, -0.5, 2, 7]
>>> original_y = [2.5, 0.0, 2, 8]
>>> mean_absolute_error(predicted_y, original_y)
0.5
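To make the metric concrete, here is a minimal sketch of a mean absolute error computation that reproduces the doctest value above. The name `mean_absolute_error_sketch` and the length check are assumptions added for illustration, so the module's own implementation may differ.

```python
def mean_absolute_error_sketch(predicted_y, original_y):
    """Average of |predicted - original| over all samples (illustrative)."""
    # Assumption: both sequences must have the same, non-zero length.
    if len(predicted_y) != len(original_y) or not original_y:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - o) for p, o in zip(predicted_y, original_y))
    return total / len(original_y)


mae = mean_absolute_error_sketch([3, -0.5, 2, 7], [2.5, 0.0, 2, 8])
print(mae)  # (0.5 + 0.5 + 0 + 1) / 4
```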
- machine_learning.linear_regression.run_linear_regression(data_x, data_y)
Implement linear regression over the dataset
:param data_x: contains our dataset
:param data_y: contains the output (result vector)
:return: feature for line of best fit (feature vector)
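The fitting loop can be sketched as repeated gradient descent updates starting from a zero feature vector. This is an illustrative sketch, not the module's implementation: the name `fit_line`, the learning rate, the iteration count, the leading column of ones for the intercept, and the synthetic data are all assumptions.

```python
import numpy as np


def fit_line(data_x, data_y, alpha=0.5, iterations=1000):
    """Fit weights by batch gradient descent (hypothetical hyperparameters)."""
    len_data = len(data_x)
    theta = np.zeros(data_x.shape[1])  # start from all-zero weights
    for _ in range(iterations):
        residual = data_x @ theta - data_y        # prediction error per row
        gradient = residual @ data_x / len_data   # averaged gradient
        theta = theta - alpha * gradient
    return theta


# Synthetic, noise-free data on the line y = 2 + 3x (made up for illustration).
x = np.linspace(0.0, 1.0, 20)
data_x = np.column_stack((np.ones_like(x), x))  # column of ones = intercept
data_y = 2.0 + 3.0 * x

theta = fit_line(data_x, data_y)
print(theta)  # should be close to the intercept 2 and slope 3
```

On real data the learning rate usually needs tuning: too large a value makes the updates diverge, while too small a value needs many more iterations.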
- machine_learning.linear_regression.run_steep_gradient_descent(data_x, data_y, len_data, alpha, theta)
Run steep gradient descent and update the feature vector accordingly
:param data_x: contains the dataset
:param data_y: contains the output associated with each data entry
:param len_data: length of the data
:param alpha: learning rate of the model
:param theta: feature vector (weights for our model)
:return: updated features, using curr_features - alpha * gradient (w.r.t. feature)
>>> import numpy as np
>>> data_x = np.array([[1, 2], [3, 4]])
>>> data_y = np.array([5, 6])
>>> len_data = len(data_x)
>>> alpha = 0.01
>>> theta = np.array([0.1, 0.2])
>>> run_steep_gradient_descent(data_x, data_y, len_data, alpha, theta)
array([0.196, 0.343])
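A single update step can be sketched as follows. The doctest values above are consistent with a gradient that is summed over the rows and then divided by `len_data`. That averaging is inferred from the numbers rather than stated in the docstring, so treat it as an assumption. The name `gradient_step` is hypothetical.

```python
import numpy as np


def gradient_step(data_x, data_y, len_data, alpha, theta):
    """One batch gradient descent update (illustrative sketch)."""
    residual = data_x @ theta - data_y   # [0.5, 1.1] - [5, 6] in the example
    sum_grad = residual @ data_x         # gradient summed over all rows
    # Assumption: the step is scaled by alpha / len_data (averaged gradient).
    return theta - (alpha / len_data) * sum_grad


data_x = np.array([[1, 2], [3, 4]])
data_y = np.array([5, 6])
new_theta = gradient_step(data_x, data_y, len(data_x), 0.01, np.array([0.1, 0.2]))
print(new_theta)  # approximately [0.196, 0.343], matching the doctest
```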
- machine_learning.linear_regression.sum_of_square_error(data_x, data_y, len_data, theta)
Return sum of square error for error calculation
:param data_x: contains our dataset
:param data_y: contains the output (result vector)
:param len_data: length of the dataset
:param theta: contains the feature vector
:return: sum of square error computed from the given features
Example:
>>> import numpy as np
>>> vc_x = np.array([[1.1], [2.1], [3.1]])
>>> vc_y = np.array([1.2, 2.2, 3.2])
>>> round(sum_of_square_error(vc_x, vc_y, 3, np.array([1])), 3)
np.float64(0.005)
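The example value suggests that the error is the sum of squared residuals divided by `2 * len_data`: three residuals of -0.1 give 0.03, and 0.03 / 6 = 0.005. The sketch below implements that reading. The scaling is inferred from the doctest and the name `half_mean_squared_error` is hypothetical, so the module's own code may differ in detail.

```python
import numpy as np


def half_mean_squared_error(data_x, data_y, len_data, theta):
    """Sum of squared residuals divided by 2 * len_data (illustrative)."""
    residual = data_x @ theta - data_y
    # Assumption: the 1 / (2 * len_data) factor is inferred from the doctest.
    return np.sum(residual**2) / (2 * len_data)


vc_x = np.array([[1.1], [2.1], [3.1]])
vc_y = np.array([1.2, 2.2, 3.2])
err = half_mean_squared_error(vc_x, vc_y, 3, np.array([1.0]))
print(round(float(err), 3))
```

The factor of 2 in the denominator is a common convention because it cancels when the squared term is differentiated, leaving the averaged gradient used in the update step.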