machine_learning.t_stochastic_neighbour_embedding
=================================================

.. py:module:: machine_learning.t_stochastic_neighbour_embedding

.. autoapi-nested-parse::

   t-distributed stochastic neighbor embedding (t-SNE)

   For more details, see:
   https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding


Functions
---------

.. autoapisummary::

   machine_learning.t_stochastic_neighbour_embedding.apply_tsne
   machine_learning.t_stochastic_neighbour_embedding.collect_dataset
   machine_learning.t_stochastic_neighbour_embedding.compute_low_dim_affinities
   machine_learning.t_stochastic_neighbour_embedding.compute_pairwise_affinities
   machine_learning.t_stochastic_neighbour_embedding.main


Module Contents
---------------

.. py:function:: apply_tsne(data_matrix: numpy.ndarray, n_components: int = 2, learning_rate: float = 200.0, n_iter: int = 500) -> numpy.ndarray

   Apply t-SNE for dimensionality reduction.

   Args:
       data_matrix: Original dataset (features).
       n_components: Target dimension (2D or 3D).
       learning_rate: Step size for gradient descent.
       n_iter: Number of iterations.

   Returns:
       ndarray: Low-dimensional embedding of the data.

   >>> features, _ = collect_dataset()
   >>> embedding = apply_tsne(features, n_components=2, n_iter=50)
   >>> embedding.shape
   (150, 2)


.. py:function:: collect_dataset() -> tuple[numpy.ndarray, numpy.ndarray]

   Load the Iris dataset and return features and labels.

   Returns:
       tuple[ndarray, ndarray]: Feature matrix and target labels.

   >>> features, targets = collect_dataset()
   >>> features.shape
   (150, 4)
   >>> targets.shape
   (150,)


.. py:function:: compute_low_dim_affinities(embedding_matrix: numpy.ndarray) -> tuple[numpy.ndarray, numpy.ndarray]

   Compute low-dimensional affinities (Q matrix) using a Student-t distribution.

   Args:
       embedding_matrix: Low-dimensional embedding of shape (n_samples, n_components).

   Returns:
       tuple[ndarray, ndarray]: (Q probability matrix, numerator matrix).

   >>> y = np.array([[0.0, 0.0], [1.0, 0.0]])
   >>> q_matrix, numerators = compute_low_dim_affinities(y)
   >>> q_matrix.shape
   (2, 2)


.. py:function:: compute_pairwise_affinities(data_matrix: numpy.ndarray, sigma: float = 1.0) -> numpy.ndarray

   Compute high-dimensional affinities (P matrix) using a Gaussian kernel.

   Args:
       data_matrix: Input data of shape (n_samples, n_features).
       sigma: Gaussian kernel bandwidth.

   Returns:
       ndarray: Symmetrized probability matrix.

   >>> x = np.array([[0.0, 0.0], [1.0, 0.0]])
   >>> probabilities = compute_pairwise_affinities(x)
   >>> float(round(probabilities[0, 1], 3))
   0.25


.. py:function:: main() -> None

   Run t-SNE on the Iris dataset and display the first 5 embeddings.

   >>> main()  # doctest: +ELLIPSIS
   t-SNE embedding (first 5 points):
   [[...
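
The affinity functions documented above can be illustrated with a minimal NumPy sketch. It is not the module's implementation: the names ``gaussian_affinities``, ``student_t_affinities`` and ``tsne_gradient`` are hypothetical, and the normalisation of the P matrix is an assumption chosen because it reproduces the ``0.25`` value in the ``compute_pairwise_affinities`` doctest. The gradient is the standard t-SNE gradient of the KL divergence; how ``apply_tsne`` actually performs its update (for example, whether it uses momentum or early exaggeration) is not stated in this reference.

```python
import numpy as np


def gaussian_affinities(x: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Symmetrized Gaussian affinities (P) with one fixed bandwidth."""
    n_samples = x.shape[0]
    # Squared Euclidean distance between every pair of rows.
    diff = x[:, None, :] - x[None, :, :]
    sq_dist = np.sum(diff**2, axis=-1)
    p = np.exp(-sq_dist / (2.0 * sigma**2))
    np.fill_diagonal(p, 0.0)  # a point is not its own neighbour
    p /= p.sum()
    # Assumed normalisation: picked because it reproduces the 0.25 doctest.
    return (p + p.T) / (2.0 * n_samples)


def student_t_affinities(y: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Student-t affinities (Q) and their unnormalised numerators."""
    diff = y[:, None, :] - y[None, :, :]
    numerators = 1.0 / (1.0 + np.sum(diff**2, axis=-1))
    np.fill_diagonal(numerators, 0.0)
    return numerators / numerators.sum(), numerators


def tsne_gradient(
    p: np.ndarray, q: np.ndarray, numerators: np.ndarray, y: np.ndarray
) -> np.ndarray:
    """Standard gradient: 4 * sum_j (p_ij - q_ij) * num_ij * (y_i - y_j)."""
    weights = (p - q) * numerators
    return 4.0 * (np.diag(weights.sum(axis=1)) - weights) @ y


points = np.array([[0.0, 0.0], [1.0, 0.0]])
p_matrix = gaussian_affinities(points)
q_matrix, numerators = student_t_affinities(points)
print(round(float(p_matrix[0, 1]), 3))  # 0.25, as in the doctest above
print(q_matrix.shape)  # (2, 2)

# One plain gradient-descent update using the documented default step size.
embedding = points - 200.0 * tsne_gradient(p_matrix, q_matrix, numerators, points)
```

When P and Q coincide the gradient is zero, so the embedding stops moving once the low-dimensional affinities match the high-dimensional ones. A single fixed ``sigma`` mirrors the documented signature; t-SNE as usually described tunes a per-point bandwidth from a perplexity target instead.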