machine_learning.local_weighted_learning.local_weighted_learning

Locally weighted linear regression, also called local regression, is a type of non-parametric linear regression that prioritizes data closest to a given prediction point. The algorithm estimates the vector of model coefficients β using weighted least squares regression:

β = (XᵀWX)⁻¹(XᵀWy),

where X is the design matrix, y is the response vector, and W is the diagonal weight matrix.
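
As a rough illustration of the formula above (a sketch only, not this module's code), the weighted least squares solution can be computed with NumPy. The function name `weighted_least_squares` and the sample data are hypothetical, and the sketch passes only the diagonal of W as a 1-D array rather than building the full matrix.

```python
import numpy as np


def weighted_least_squares(x: np.ndarray, y: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Solve (XᵀWX) β = XᵀWy for β, where w holds the diagonal of W."""
    # Scaling the columns of Xᵀ by w is equivalent to Xᵀ @ diag(w),
    # without allocating the m x m weight matrix.
    xtw = x.T * w
    # Solving the linear system avoids forming an explicit inverse.
    return np.linalg.solve(xtw @ x, xtw @ y)


# Hypothetical data: a bias column plus one feature, with y = 1 + 2x exactly.
x = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
w = np.array([0.1, 0.5, 1.0, 0.5])  # arbitrary positive sample weights

beta = weighted_least_squares(x, y, w)
```

Because the sample response is exactly linear, any set of positive weights recovers the same coefficients (intercept 1, slope 2); the weights only change the fit when the data deviate from a single line.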

This implementation calculates wᵢ, the weight of the ith training sample, using the Gaussian weight:

wᵢ = exp(-‖xᵢ - x‖²/(2τ²)),

where xᵢ is the ith training sample, x is the prediction point, τ is the “bandwidth”, and ‖·‖ denotes the Euclidean norm (also called the 2-norm or the L² norm). The bandwidth τ controls how quickly the weight of a training sample decreases as its distance from the prediction point increases. One can think of the Gaussian weight as a bell curve centered on the prediction point: a training sample receives a lower weight the farther it lies from the center, and τ controls the spread of the bell curve.
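
To make the role of τ concrete, the sketch below evaluates the Gaussian weight for a few one-dimensional training samples at two bandwidths. The helper `gaussian_weights` and the sample values are assumptions made for illustration and are not taken from the module.

```python
import numpy as np


def gaussian_weights(point: np.ndarray, x_train: np.ndarray, tau: float) -> np.ndarray:
    """Return exp(-‖xᵢ - x‖² / (2τ²)) for every row xᵢ of x_train."""
    sq_dist = np.sum((x_train - point) ** 2, axis=1)  # squared Euclidean distances
    return np.exp(-sq_dist / (2.0 * tau**2))


# Hypothetical training samples at increasing distance from the prediction point.
x_train = np.array([[0.0], [1.0], [2.0], [4.0]])
point = np.array([0.0])

narrow = gaussian_weights(point, x_train, tau=0.5)  # weights fall off quickly
wide = gaussian_weights(point, x_train, tau=2.0)    # weights fall off slowly
```

A sample that coincides with the prediction point gets weight 1 at either bandwidth, while every other sample is weighted more heavily under the larger τ.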

Other types of locally weighted regression, such as locally estimated scatterplot smoothing (LOESS), typically use different weight functions.

Attributes

predictions

Functions

load_data(→ tuple[numpy.ndarray, numpy.ndarray, ...)

Load data from seaborn and split it into x and y points

local_weight(→ numpy.ndarray)

Calculate the local weights at a given prediction point using the weight matrix for that point

local_weight_regression(→ numpy.ndarray)

Calculate predictions for each point in the training data

plot_preds(→ None)

Plot predictions and display the graph

weight_matrix(→ numpy.ndarray)

Calculate the weight of every point in the training data around a given prediction point

Module Contents

machine_learning.local_weighted_learning.local_weighted_learning.load_data(dataset_name: str, x_name: str, y_name: str) tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray]

Load data from seaborn and split it into x and y points

>>> pass  # No doctests, function is for demo purposes only

machine_learning.local_weighted_learning.local_weighted_learning.local_weight(point: numpy.ndarray, x_train: numpy.ndarray, y_train: numpy.ndarray, tau: float) numpy.ndarray

Calculate the local weights at a given prediction point using the weight matrix for that point

Args:

point: x-value at which the prediction is being made
x_train: ndarray of x-values for training
y_train: ndarray of y-values for training
tau: bandwidth value, controls how quickly the weight of training values decreases as the distance from the prediction point increases

Returns:

ndarray of local weights

>>> local_weight(
...     np.array([1., 1.]),
...     np.array([[16.99, 10.34], [21.01,23.68], [24.59,25.69]]),
...     np.array([[1.01, 1.66, 3.5]]),
...     0.6
... )
array([[0.00873174],
       [0.08272556]])
machine_learning.local_weighted_learning.local_weighted_learning.local_weight_regression(x_train: numpy.ndarray, y_train: numpy.ndarray, tau: float) numpy.ndarray

Calculate predictions for each point in the training data

Args:

x_train: ndarray of x-values for training
y_train: ndarray of y-values for training
tau: bandwidth value, controls how quickly the weight of training values decreases as the distance from the prediction point increases

Returns:

ndarray of predictions

>>> local_weight_regression(
...     np.array([[16.99, 10.34], [21.01, 23.68], [24.59, 25.69]]),
...     np.array([[1.01, 1.66, 3.5]]),
...     0.6
... )
array([1.07173261, 1.65970737, 3.50160179])
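
The overall procedure (compute Gaussian weights around each training point, solve the weighted least squares problem, then evaluate the fitted line at that point) might be sketched as follows. This is a simplified stand-in, not the module's implementation: `lwr_predict` is a hypothetical name, `y_train` is assumed to be a 1-D array, and the data are invented so that the result is easy to check.

```python
import numpy as np


def lwr_predict(x_train: np.ndarray, y_train: np.ndarray, tau: float) -> np.ndarray:
    """Fit one locally weighted linear model per training point and predict there."""
    preds = np.empty(x_train.shape[0])
    for i, point in enumerate(x_train):
        # Gaussian weight of every training sample relative to this point
        sq_dist = np.sum((x_train - point) ** 2, axis=1)
        w = np.exp(-sq_dist / (2.0 * tau**2))
        # Weighted least squares: solve (XᵀWX) β = XᵀWy
        xtw = x_train.T * w
        beta = np.linalg.solve(xtw @ x_train, xtw @ y_train)
        preds[i] = point @ beta
    return preds


# Hypothetical data: bias column plus one feature, with y = 2 + 3x exactly.
x_train = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y_train = np.array([2.0, 5.0, 8.0, 11.0])

preds = lwr_predict(x_train, y_train, tau=1.0)
```

With perfectly linear data every local fit coincides with the global line, so the predictions match `y_train` up to floating-point error. On noisy data the predictions would instead follow local trends, more closely as τ shrinks.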
machine_learning.local_weighted_learning.local_weighted_learning.plot_preds(x_train: numpy.ndarray, preds: numpy.ndarray, x_data: numpy.ndarray, y_data: numpy.ndarray, x_name: str, y_name: str) None

Plot predictions and display the graph

>>> pass  # No doctests, function is for demo purposes only

machine_learning.local_weighted_learning.local_weighted_learning.weight_matrix(point: numpy.ndarray, x_train: numpy.ndarray, tau: float) numpy.ndarray

Calculate the weight of every point in the training data around a given prediction point

Args:

point: x-value at which the prediction is being made
x_train: ndarray of x-values for training
tau: bandwidth value, controls how quickly the weight of training values decreases as the distance from the prediction point increases

Returns:

m x m weight matrix around the prediction point, where m is the size of the training set

>>> weight_matrix(
...     np.array([1., 1.]),
...     np.array([[16.99, 10.34], [21.01,23.68], [24.59,25.69]]),
...     0.6
... )
array([[1.43807972e-207, 0.00000000e+000, 0.00000000e+000],
       [0.00000000e+000, 0.00000000e+000, 0.00000000e+000],
       [0.00000000e+000, 0.00000000e+000, 0.00000000e+000]])
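
One way such a diagonal weight matrix could be built is sketched below, assuming the Gaussian weight described at the top of this page. `diagonal_weight_matrix` is a hypothetical helper rather than the module's function; the inputs are those of the example above.

```python
import numpy as np


def diagonal_weight_matrix(point: np.ndarray, x_train: np.ndarray, tau: float) -> np.ndarray:
    """Place the Gaussian weight of each training sample on the diagonal of an m x m matrix."""
    sq_dist = np.sum((x_train - point) ** 2, axis=1)
    return np.diag(np.exp(-sq_dist / (2.0 * tau**2)))


weights = diagonal_weight_matrix(
    np.array([1.0, 1.0]),
    np.array([[16.99, 10.34], [21.01, 23.68], [24.59, 25.69]]),
    0.6,
)
```

All three training samples lie far from the prediction point relative to τ = 0.6, so the first weight is vanishingly small and the other two underflow to exactly zero in double precision.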
machine_learning.local_weighted_learning.local_weighted_learning.predictions