machine_learning.sequential_minimum_optimization¶
Sequential minimal optimization (SMO) for support vector machines (SVM)
Sequential minimal optimization (SMO) is an algorithm for solving the quadratic programming (QP) problem that arises during the training of SVMs. It was invented by John Platt in 1998.
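For context, the QP problem that SMO solves is the standard SVM dual; this is the textbook formulation, not quoted from the module itself:

\max_{\alpha}\; W(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} y_i y_j \, K(x_i, x_j)\, \alpha_i \alpha_j
\quad \text{subject to} \quad 0 \le \alpha_i \le C, \qquad \sum_{i=1}^{n} y_i \alpha_i = 0 .

Rather than solving this QP numerically as a whole, SMO repeatedly picks two multipliers and optimizes them analytically while holding the rest fixed, which is what the alpha-selection and update methods listed under SmoSVM below implement.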
- Input:
type: numpy.ndarray. The first column of the ndarray must contain the sample tags, which must be 1 or -1; each row of the ndarray represents one sample.
- Usage:
- Command:
python3 sequential_minimum_optimization.py
- Code:
from sequential_minimum_optimization import SmoSVM, Kernel
import numpy as np

kernel = Kernel(kernel='poly', degree=3.0, coef0=1.0, gamma=0.5)
init_alphas = np.zeros(train.shape[0])
SVM = SmoSVM(train=train, alpha_list=init_alphas, kernel_func=kernel, cost=0.4,
             b=0.0, tolerance=0.001)
SVM.fit()
predict = SVM.predict(test_samples)
- Reference:
https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/smo-book.pdf
https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-98-14.pdf
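The usage snippet above leaves train and test_samples undefined. A minimal sketch of how they could be prepared, following the Input description; the data values are hypothetical, and it is assumed here that predict expects feature rows without the tag column:

import numpy as np

# Hypothetical 2-D samples: first column is the tag (1 or -1), remaining columns are features
features = np.array([[1.0, 2.0], [2.0, 3.0], [8.0, 9.0], [9.0, 8.0]])
tags = np.array([1, 1, -1, -1])
train = np.column_stack((tags, features))  # shape (n_samples, 1 + n_features)

# Assumed here: test samples carry features only, no tag column
test_samples = np.array([[1.5, 2.5], [8.5, 8.5]])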
Attributes¶
- CANCER_DATASET_URL
Classes¶
- Kernel
- SmoSVM
Functions¶
- count_time
- plot_partition_boundary: We cannot get the optimal w of our kernel SVM model, which is different from a linear SVM.
- test_cancer_data
- test_demonstration
- test_linear_kernel
- test_rbf_kernel
Module Contents¶
- class machine_learning.sequential_minimum_optimization.Kernel(kernel, degree=1.0, coef0=0.0, gamma=1.0)¶
- __call__(v1, v2)¶
- __repr__()¶
- _check()¶
- _get_kernel(kernel_name)¶
- _linear(v1, v2)¶
- _polynomial(v1, v2)¶
- _rbf(v1, v2)¶
- _kernel¶
- _kernel_name¶
- coef0¶
- degree¶
- gamma¶
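The kernel methods above are listed without documentation. A minimal sketch of what _linear, _polynomial, and _rbf would compute, assuming they follow the standard kernel definitions parameterized by gamma, coef0, and degree (this is an assumption, not quoted from the module):

import numpy as np

def linear(v1, v2):
    # <v1, v2>
    return np.inner(v1, v2)

def polynomial(v1, v2, gamma=0.5, coef0=1.0, degree=3.0):
    # (gamma * <v1, v2> + coef0) ** degree
    return (gamma * np.inner(v1, v2) + coef0) ** degree

def rbf(v1, v2, gamma=0.5):
    # exp(-gamma * ||v1 - v2||^2)
    return np.exp(-gamma * np.linalg.norm(np.asarray(v1) - np.asarray(v2)) ** 2)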
- class machine_learning.sequential_minimum_optimization.SmoSVM(train, kernel_func, alpha_list=None, cost=0.4, b=0.0, tolerance=0.001, auto_norm=True)¶
- _calculate_k_matrix()¶
- _check_obey_kkt(index)¶
- _choose_a1()¶
Choose the first alpha. Steps (a sketch follows this entry):
- 1: First, loop over all samples.
- 2: Then, loop over all non-bound samples until no non-bound sample violates the KKT condition.
- 3: Repeat these two passes until no sample violates the KKT condition after the full loop.
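A minimal sketch of this outer-loop selection; all_indices, non_bound_indices, and violates_kkt are hypothetical stand-ins for the module's internal state and its _check_obey_kkt method, not its actual API:

def choose_a1(all_indices, non_bound_indices, violates_kkt):
    """Yield candidate first alphas: one full pass, then non-bound passes, repeated."""
    examine_all = True
    while True:
        indices = all_indices() if examine_all else non_bound_indices()
        violators = [i for i in indices if violates_kkt(i)]
        yield from violators              # the caller optimizes each yielded alpha
        if examine_all:
            if not violators:             # a clean full pass ends the search
                return
            examine_all = False           # switch to sweeping the non-bound samples
        elif not violators:
            examine_all = True            # non-bound set is clean: do a full pass again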
- _choose_a2(i1)¶
Choose the second alpha using a heuristic algorithm. Steps (a sketch of step 1 follows this entry):
- 1: Choose the alpha2 that maximizes the step size (|E1 - E2|).
- 2: Starting from a random point, loop over all non-bound samples until alpha1 and alpha2 are optimized.
- 3: Starting from a random point, loop over all samples until alpha1 and alpha2 are optimized.
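A minimal sketch of the |E1 - E2| heuristic in step 1, assuming a hypothetical cache of error values for the non-bound samples (illustrative only, not the module's code):

def choose_a2_by_step_size(i1, e1, non_bound_indices, errors):
    """Pick the non-bound index whose cached error is farthest from E1."""
    candidates = [i for i in non_bound_indices if i != i1]
    if not candidates:
        return None
    # maximizing |E1 - E2| takes the largest possible optimization step
    return max(candidates, key=lambda i: abs(e1 - errors[i]))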
- _choose_alphas()¶
- _e(index)¶
- Two cases (the formula below spells out the notation):
  1: sample[index] is non-bound: fetch the error from the cached list _error.
  2: sample[index] is bound: use the predicted value minus the true value, g(xi) - yi.
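In both cases the quantity is the prediction error of the current model; in the usual SMO notation (an assumption about this module's conventions, not quoted from it):

E_i = g(x_i) - y_i, \qquad g(x) = \sum_{j=1}^{n} \alpha_j y_j K(x_j, x) + b .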
- _get_new_alpha(i1, i2, a1, a2, e1, e2, y1, y2)¶
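The name suggests the standard two-variable analytic update of SMO; as a hedged reference, the textbook formulas (not quoted from this module) are:

\eta = K(x_1, x_1) + K(x_2, x_2) - 2 K(x_1, x_2), \qquad
\alpha_2^{\text{new}} = \alpha_2 + \frac{y_2 (E_1 - E_2)}{\eta},

with \alpha_2^{\text{new}} clipped to [L, H], where L = \max(0, \alpha_2 - \alpha_1) and H = \min(C, C + \alpha_2 - \alpha_1) if y_1 \ne y_2, or L = \max(0, \alpha_1 + \alpha_2 - C) and H = \min(C, \alpha_1 + \alpha_2) otherwise, and finally \alpha_1^{\text{new}} = \alpha_1 + y_1 y_2 (\alpha_2 - \alpha_2^{\text{new, clipped}}).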
- _is_support(index)¶
- _is_unbound(index)¶
- _k(i1, i2)¶
- _norm(data)¶
- _predict(sample)¶
- fit()¶
- predict(test_samples, classify=True)¶
- Kernel¶
- _K_matrix¶
- _all_samples¶
- _auto_norm¶
- _b¶
- _c¶
- _eps = 0.001¶
- _error¶
- _init = True¶
- _tol¶
- _unbound = []¶
- alphas¶
- choose_alpha¶
- property length¶
- samples¶
- property support¶
- tags¶
- property unbound¶
- machine_learning.sequential_minimum_optimization.count_time(func)¶
- machine_learning.sequential_minimum_optimization.plot_partition_boundary(model, train_data, ax, resolution=100, colors=('b', 'k', 'r'))¶
We cannot get the optimal w of our kernel SVM model, which is different from a linear SVM. For this reason, we generate densely, randomly distributed points and calculate their predicted values using the trained model. These predicted values are then used to draw a contour map, and that contour map represents the SVM's partition boundary.
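A minimal sketch of that dense-grid approach, assuming a matplotlib axes object is passed in as ax and that predict(..., classify=False) returns the raw decision values (both assumptions; the grid construction details are illustrative):

import numpy as np

def sketch_partition_boundary(model, train_data, ax, resolution=100, colors=("b", "k", "r")):
    # Dense grid over the feature range of the training data (tags sit in column 0)
    xs, ys = train_data[:, 1], train_data[:, 2]
    xrange_ = np.linspace(xs.min(), xs.max(), resolution)
    yrange_ = np.linspace(ys.min(), ys.max(), resolution)
    grid = np.array([[x, y] for y in yrange_ for x in xrange_])

    # Predict a value for every grid point with the trained model
    values = np.array(model.predict(grid, classify=False)).reshape(resolution, resolution)

    # Contours at -1, 0 and +1 trace the margins and the decision boundary
    ax.contour(xrange_, yrange_, values, levels=(-1, 0, 1), colors=colors)
    ax.scatter(xs, ys, c=train_data[:, 0], cmap="bwr", s=10)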
- machine_learning.sequential_minimum_optimization.test_cancer_data()¶
- machine_learning.sequential_minimum_optimization.test_demonstration()¶
- machine_learning.sequential_minimum_optimization.test_linear_kernel(ax, cost)¶
- machine_learning.sequential_minimum_optimization.test_rbf_kernel(ax, cost)¶
- machine_learning.sequential_minimum_optimization.CANCER_DATASET_URL = 'https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/wdbc.data'¶