Implementation of the K-Nearest Neighbors algorithm (https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm).

#include <algorithm>
#include <cassert>
#include <cmath>
#include <iostream>
#include <numeric>
#include <unordered_map>
#include <vector>

Include dependency graph for k_nearest_neighbors.cpp:

Classes
class	machine_learning::k_nearest_neighbors::Knn
	K-Nearest Neighbors (Knn) class using Euclidean distance as the distance metric.

Namespaces
namespace	machine_learning
	Machine learning algorithms

namespace	k_nearest_neighbors
	Functions for the K-Nearest Neighbors algorithm (https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm) implementation.

Functions
template<typename T >
double	machine_learning::k_nearest_neighbors::euclidean_distance (const std::vector< T > &a, const std::vector< T > &b)
	Compute the Euclidean distance between two vectors.

static void	test ()
	Self-test implementations.

int	main (int argc, char *argv[])
	Main function.

Detailed Description

Implementation of the K-Nearest Neighbors algorithm (https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm).

Author: Luiz Carlos Cosmi Filho

The K-nearest neighbors algorithm, also known as KNN or k-NN, is a supervised learning classifier that uses proximity to make classifications: a sample is assigned the label most common among the K training points closest to it. This implementation uses the Euclidean distance as the distance metric to find the K nearest neighbors.
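The Knn class body is not reproduced on this page, so the following is only a minimal sketch of the idea described above, not the class's actual code. The function name `knn_predict_sketch` and its tie-breaking rule are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper (not the documented Knn class): classify `sample` by
// majority vote among its k nearest training points, using Euclidean distance.
int knn_predict_sketch(const std::vector<std::vector<double>>& X,
                       const std::vector<int>& Y,
                       const std::vector<double>& sample, std::size_t k) {
    assert(X.size() == Y.size() && k >= 1 && k <= X.size());

    // Pair each training point's distance to the sample with its label.
    std::vector<std::pair<double, int>> dist_label;
    dist_label.reserve(X.size());
    for (std::size_t i = 0; i < X.size(); ++i) {
        assert(X[i].size() == sample.size());
        double sum = 0.0;
        for (std::size_t j = 0; j < sample.size(); ++j) {
            const double d = X[i][j] - sample[j];
            sum += d * d;
        }
        dist_label.emplace_back(std::sqrt(sum), Y[i]);
    }

    // Move the k smallest distances to the front, in ascending order.
    std::partial_sort(dist_label.begin(), dist_label.begin() + k,
                      dist_label.end());

    // Count votes nearest-first. Assumption: on a tie, the label that first
    // reached the winning count is kept.
    std::unordered_map<int, int> votes;
    int best_label = dist_label.front().second;
    int best_count = 0;
    for (std::size_t i = 0; i < k; ++i) {
        const int label = dist_label[i].second;
        const int count = ++votes[label];
        if (count > best_count) {
            best_count = count;
            best_label = label;
        }
    }
    return best_label;
}
```

Because every training point is compared against the sample, a single prediction in this sketch costs time proportional to the number of training points times the number of features.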

Function Documentation

◆ euclidean_distance()

template<typename T >

double machine_learning::k_nearest_neighbors::euclidean_distance	(	const std::vector< T > &	a,
		const std::vector< T > &	b )

Compute the Euclidean distance between two vectors.

Template Parameters

T	element type of the vectors

Parameters

a	first one-dimensional vector
b	second one-dimensional vector

Returns: double scalar representing the Euclidean distance between the provided vectors

template <typename T>
double euclidean_distance(const std::vector<T>& a, const std::vector<T>& b) {
    std::vector<double> aux;
    std::transform(a.begin(), a.end(), b.begin(), std::back_inserter(aux),
                   [](T x1, T x2) { return std::pow((x1 - x2), 2); });
    aux.shrink_to_fit();
    return std::sqrt(std::accumulate(aux.begin(), aux.end(), 0.0));
}
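The listing above reads one element of b for every element of a, so it relies on b being at least as long as a. As an illustration only, the same quantity can be computed in a single pass without the temporary vector. The name `euclidean_distance_sketch` and the added size check are assumptions and are not part of the documented file.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical variant of euclidean_distance(): sums the squared differences
// with std::inner_product instead of filling a temporary vector first.
template <typename T>
double euclidean_distance_sketch(const std::vector<T>& a,
                                 const std::vector<T>& b) {
    assert(a.size() == b.size());  // assumed precondition: equal lengths
    const double sum_sq = std::inner_product(
        a.begin(), a.end(), b.begin(), 0.0, std::plus<double>(),
        [](T x1, T x2) {
            const double d =
                static_cast<double>(x1) - static_cast<double>(x2);
            return d * d;
        });
    return std::sqrt(sum_sq);
}
```

For example, calling it on the integer vectors {0, 0} and {3, 4} gives the hypotenuse of a 3-4-5 right triangle, that is, 5.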

◆ main()

int main	(	int	argc,
		char *	argv[] )

Main function.

Parameters

argc	command-line argument count (ignored)
argv	command-line array of arguments (ignored)

Returns: int 0 on exit

int main(int argc, char* argv[]) {
    test();  // run self-test implementations
    return 0;
}

◆ test()

static void test ( )

Self-test implementations.

Returns: void

static void test() {
    std::cout << "------- Test 1 -------" << std::endl;
    std::vector<std::vector<double>> X1 = {{0.0, 0.0}, {0.25, 0.25},
                                           {0.0, 0.5}, {0.5, 0.5},
                                           {1.0, 0.5}, {1.0, 1.0}};
    std::vector<int> Y1 = {1, 1, 1, 1, 2, 2};
    auto model1 = machine_learning::k_nearest_neighbors::Knn(X1, Y1);
    std::vector<double> sample1 = {1.2, 1.2};
    std::vector<double> sample2 = {0.1, 0.1};
    std::vector<double> sample3 = {0.1, 0.5};
    std::vector<double> sample4 = {1.0, 0.75};
    assert(model1.predict(sample1, 2) == 2);
    assert(model1.predict(sample2, 2) == 1);
    assert(model1.predict(sample3, 2) == 1);
    assert(model1.predict(sample4, 2) == 2);
    std::cout << "... Passed" << std::endl;
    std::cout << "------- Test 2 -------" << std::endl;
    std::vector<std::vector<double>> X2 = {
        {0.0, 0.0, 0.0}, {0.25, 0.25, 0.0}, {0.0, 0.5, 0.0}, {0.5, 0.5, 0.0},
        {1.0, 0.5, 0.0}, {1.0, 1.0, 0.0},   {1.0, 1.0, 1.0}, {1.5, 1.5, 1.0}};
    std::vector<int> Y2 = {1, 1, 1, 1, 2, 2, 3, 3};
    auto model2 = machine_learning::k_nearest_neighbors::Knn(X2, Y2);
    std::vector<double> sample5 = {1.2, 1.2, 0.0};
    std::vector<double> sample6 = {0.1, 0.1, 0.0};
    std::vector<double> sample7 = {0.1, 0.5, 0.0};
    std::vector<double> sample8 = {1.0, 0.75, 1.0};
    assert(model2.predict(sample5, 2) == 2);
    assert(model2.predict(sample6, 2) == 1);
    assert(model2.predict(sample7, 2) == 1);
    assert(model2.predict(sample8, 2) == 3);
    std::cout << "... Passed" << std::endl;
    std::cout << "------- Test 3 -------" << std::endl;
    std::vector<std::vector<double>> X3 = {{0.0}, {1.0}, {2.0}, {3.0},
                                           {4.0}, {5.0}, {6.0}, {7.0}};
    std::vector<int> Y3 = {1, 1, 1, 1, 2, 2, 2, 2};
    auto model3 = machine_learning::k_nearest_neighbors::Knn(X3, Y3);
    std::vector<double> sample9 = {0.5};
    std::vector<double> sample10 = {2.9};
    std::vector<double> sample11 = {5.5};
    std::vector<double> sample12 = {7.5};
    assert(model3.predict(sample9, 3) == 1);
    assert(model3.predict(sample10, 3) == 1);
    assert(model3.predict(sample11, 3) == 2);
    assert(model3.predict(sample12, 3) == 2);
    std::cout << "... Passed" << std::endl;
}
