machine_learning.k_nearest_neighbours
k-Nearest Neighbours (kNN) is a simple non-parametric supervised learning algorithm used for classification. Given labelled training data, a query point is classified by its k nearest neighbours under some distance metric: the label that occurs most often among those neighbours becomes the label of the query point. In effect, the label is decided by a majority vote.
This implementation uses the Euclidean distance, a common choice, but other distance metrics can also be used.
Reference: https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
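The majority-vote procedure and the note about swapping distance metrics can be sketched as follows. This is a minimal illustration, not the module's own code: the names `knn_vote`, `euclidean` and `manhattan` are hypothetical and exist only for this example, and the training data is borrowed from the `classify` doctest on this page.

```python
from collections import Counter

import numpy as np


def euclidean(a: np.ndarray, b: np.ndarray) -> float:
    """Straight-line (L2) distance between two points."""
    return float(np.linalg.norm(a - b))


def manhattan(a: np.ndarray, b: np.ndarray) -> float:
    """Sum of absolute coordinate differences (L1 distance)."""
    return float(np.sum(np.abs(a - b)))


def knn_vote(train_data, train_target, point, k=5, metric=euclidean):
    """Hypothetical helper: most common target among the k nearest training points."""
    # Pair each training point's distance to `point` with its target.
    distances = [(metric(x, point), int(y)) for x, y in zip(train_data, train_target)]
    # Keep the targets of the k smallest distances.
    nearest = [y for _, y in sorted(distances, key=lambda pair: pair[0])[:k]]
    # Majority vote over those targets.
    return Counter(nearest).most_common(1)[0][0]


train_X = np.array([[0, 0], [1, 0], [0, 1], [0.5, 0.5], [3, 3], [2, 3], [3, 2]])
train_y = np.array([0, 0, 0, 0, 1, 1, 1])
point = np.array([1.2, 1.2])

label_l2 = knn_vote(train_X, train_y, point, k=5, metric=euclidean)
label_l1 = knn_vote(train_X, train_y, point, k=5, metric=manhattan)
```

For this particular point both metrics select target 0, but in general a different metric can change which neighbours count as nearest and therefore the vote.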
Attributes
Classes
Module Contents
- class machine_learning.k_nearest_neighbours.KNN(train_data: numpy.ndarray[float], train_target: numpy.ndarray[int], class_labels: list[str])
- static _euclidean_distance(a: numpy.ndarray[float], b: numpy.ndarray[float]) -> float
Calculate the Euclidean distance between two points.
>>> KNN._euclidean_distance(np.array([0, 0]), np.array([3, 4]))
5.0
>>> KNN._euclidean_distance(np.array([1, 2, 3]), np.array([1, 8, 11]))
10.0
- classify(pred_point: numpy.ndarray[float], k: int = 5) -> str
Classify a given point using the kNN algorithm.
>>> train_X = np.array(
...     [[0, 0], [1, 0], [0, 1], [0.5, 0.5], [3, 3], [2, 3], [3, 2]]
... )
>>> train_y = np.array([0, 0, 0, 0, 1, 1, 1])
>>> classes = ['A', 'B']
>>> knn = KNN(train_X, train_y, classes)
>>> point = np.array([1.2, 1.2])
>>> knn.classify(point)
'A'
- data
- labels
- machine_learning.k_nearest_neighbours.iris
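The `k` parameter of `classify` controls how many neighbours take part in the vote, and the predicted label can depend on it. The sketch below illustrates this with the training data from the `classify` doctest. `classify_sketch` is a hypothetical stand-in written for this example. It assumes Euclidean distance and a plain majority vote, and it is not the `KNN` class itself.

```python
import numpy as np


def classify_sketch(train_data, train_target, class_labels, pred_point, k=5):
    """Hypothetical stand-in for KNN.classify: majority vote among the k nearest points."""
    # Euclidean distance from pred_point to every training row.
    distances = np.linalg.norm(train_data - pred_point, axis=1)
    # Indices of the k closest training points.
    nearest = np.argsort(distances, kind="stable")[:k]
    # Count votes per class index and pick the most frequent one.
    votes = np.bincount(train_target[nearest], minlength=len(class_labels))
    return class_labels[int(np.argmax(votes))]


train_X = np.array([[0, 0], [1, 0], [0, 1], [0.5, 0.5], [3, 3], [2, 3], [3, 2]])
train_y = np.array([0, 0, 0, 0, 1, 1, 1])
classes = ["A", "B"]
point = np.array([2.0, 2.0])

# Classify the same point with three different neighbourhood sizes.
results = {k: classify_sketch(train_X, train_y, classes, point, k) for k in (3, 5, 7)}
print(results)
```

With this data, k = 3 and k = 5 yield 'B' because the three closest points all belong to class B. With k = 7 every training point votes, and the four class A points outnumber the three class B points, so the result is 'A'.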