machine_learning.principle_component_analysis

Principal Component Analysis (PCA) is a dimensionality reduction technique used in machine learning. It transforms high-dimensional data into a lower-dimensional representation while retaining as much variance as possible.

This implementation follows best practices, including:

  • Standardizing the dataset.

  • Computing principal components using Singular Value Decomposition (SVD).

  • Returning the transformed data and the explained variance ratio.

Functions

apply_pca(→ tuple[numpy.ndarray, numpy.ndarray])

Applies Principal Component Analysis (PCA) to reduce dimensionality.

collect_dataset(→ tuple[numpy.ndarray, numpy.ndarray])

Collects the dataset (Iris dataset) and returns feature matrix and target values.

main(→ None)

Driver function to execute PCA and display results.

Module Contents

machine_learning.principle_component_analysis.apply_pca(data_x: numpy.ndarray, n_components: int) → tuple[numpy.ndarray, numpy.ndarray]

Applies Principal Component Analysis (PCA) to reduce dimensionality.

Parameters:
  • data_x – Original dataset (features)

  • n_components – Number of principal components to retain

Returns:

Tuple containing transformed dataset and explained variance ratio

Example:

  >>> X, _ = collect_dataset()
  >>> transformed_X, variance = apply_pca(X, 2)
  >>> transformed_X.shape
  (150, 2)
  >>> len(variance) == 2
  True
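A minimal sketch of how apply_pca could be implemented with NumPy alone, following the standardize-then-SVD steps described above. This is an illustration of the technique, not the module's actual source; the function body here is an assumption.

```python
import numpy as np


def apply_pca(data_x: np.ndarray, n_components: int) -> tuple[np.ndarray, np.ndarray]:
    """Sketch: reduce data_x to n_components dimensions via SVD (assumed implementation)."""
    # Standardize each feature to zero mean and unit variance
    x = (data_x - data_x.mean(axis=0)) / data_x.std(axis=0)
    # SVD of the standardized matrix; rows of vt are the principal directions
    _, s, vt = np.linalg.svd(x, full_matrices=False)
    # Project onto the first n_components principal directions
    transformed = x @ vt[:n_components].T
    # Explained variance ratio from the singular values
    explained_variance = (s ** 2) / (x.shape[0] - 1)
    variance_ratio = explained_variance[:n_components] / explained_variance.sum()
    return transformed, variance_ratio
```

Using the SVD of the standardized data avoids forming the covariance matrix explicitly, which is numerically more stable than an eigendecomposition of X.T @ X.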

machine_learning.principle_component_analysis.collect_dataset() → tuple[numpy.ndarray, numpy.ndarray]

Collects the dataset (Iris dataset) and returns feature matrix and target values.

Returns:

Tuple containing feature matrix (X) and target labels (y)

Example:

  >>> X, y = collect_dataset()
  >>> X.shape
  (150, 4)
  >>> y.shape
  (150,)
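A plausible sketch of collect_dataset using scikit-learn's built-in Iris loader, which yields exactly the (150, 4) feature matrix and (150,) target vector shown in the example. Whether the module actually uses scikit-learn is an assumption.

```python
import numpy as np
from sklearn import datasets


def collect_dataset() -> tuple[np.ndarray, np.ndarray]:
    """Sketch: load the Iris dataset and return (features, targets) (assumed source)."""
    iris = datasets.load_iris()
    # iris.data: (150, 4) feature matrix; iris.target: (150,) class labels
    return iris.data, iris.target
```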

machine_learning.principle_component_analysis.main() → None

Driver function to execute PCA and display results.