machine_learning.apriori_algorithm
==================================

.. py:module:: machine_learning.apriori_algorithm

.. autoapi-nested-parse::

   The Apriori algorithm is an association rule mining technique. Association rule mining,
   also known as market basket analysis, aims to discover interesting relationships or
   associations among a set of items in a transactional or relational database.

   For example, a rule found this way might state: "If a customer buys item A and item B,
   then they are likely to buy item C." This rule suggests a relationship between items A,
   B, and C: customers who purchased A and B are more likely to also purchase item C.
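
   One common way to quantify such a rule is to compare support counts: how many
   transactions contain A and B, and how many of those also contain C. The sketch below is
   illustrative only and is not part of this module; the sample transactions and the helper
   name ``support_count`` are assumptions made for the example.

   .. code-block:: python

      def support_count(transactions, items):
          """Count the transactions that contain every item in ``items``."""
          wanted = set(items)
          return sum(1 for transaction in transactions if wanted <= set(transaction))


      # Hypothetical sample data, not the dataset returned by load_data().
      transactions = [["A", "B", "C"], ["A", "B"], ["A", "B", "C"], ["B", "C"]]

      both = support_count(transactions, ["A", "B"])  # baskets with A and B
      all_three = support_count(transactions, ["A", "B", "C"])  # ...that also hold C
      confidence = all_three / both  # share of A-and-B baskets that contain C

   Here two of the three baskets containing A and B also contain C, so the rule would have
   a confidence of about 0.67 on this sample.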

   WIKI: https://en.wikipedia.org/wiki/Apriori_algorithm
   Examples: https://www.kaggle.com/code/earthian/apriori-association-rules-mining


Attributes
----------

.. autoapisummary::

   machine_learning.apriori_algorithm.frequent_itemsets


Functions
---------

.. autoapisummary::

   machine_learning.apriori_algorithm.apriori
   machine_learning.apriori_algorithm.load_data
   machine_learning.apriori_algorithm.prune


Module Contents
---------------

.. py:function:: apriori(data: list[list[str]], min_support: int) -> list[tuple[list[str], int]]

   Returns a list of frequent itemsets and their support counts.

   >>> data = [['A', 'B', 'C'], ['A', 'B'], ['A', 'C'], ['A', 'D'], ['B', 'C']]
   >>> apriori(data, 2)
   [(['A', 'B'], 1), (['A', 'C'], 2), (['B', 'C'], 2)]

   >>> data = [['1', '2', '3'], ['1', '2'], ['1', '3'], ['1', '4'], ['2', '3']]
   >>> apriori(data, 3)
   []
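
   For readers who want to see the general level-wise idea behind Apriori, the following is
   a minimal sketch under stated assumptions, not this module's implementation. The name
   ``apriori_sketch`` is hypothetical, and the sketch reports itemsets of every size, so
   its output is not expected to match the doctest results above.

   .. code-block:: python

      from itertools import combinations


      def apriori_sketch(transactions, min_support):
          """Return (sorted itemset, support count) pairs, smallest itemsets first."""
          baskets = [frozenset(transaction) for transaction in transactions]
          # Level 1: every distinct item is a candidate.
          candidates = {frozenset([item]) for basket in baskets for item in basket}
          frequent = []
          size = 1
          while candidates:
              # Count how many baskets contain each candidate itemset.
              counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
              current = {c: n for c, n in counts.items() if n >= min_support}
              frequent.extend((sorted(c), n) for c, n in current.items())
              size += 1
              # Join step: merge frequent itemsets into candidates one item larger.
              joined = {a | b for a in current for b in current if len(a | b) == size}
              # Prune step: keep candidates whose smaller subsets are all frequent.
              candidates = {
                  c
                  for c in joined
                  if all(frozenset(s) in current for s in combinations(c, size - 1))
              }
          return sorted(frequent, key=lambda pair: (len(pair[0]), pair[0]))

   The loop stops as soon as a level produces no candidates, which is what keeps the search
   from enumerating every possible combination of items.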


.. py:function:: load_data() -> list[list[str]]

   Returns a sample transaction dataset.

   >>> load_data()
   [['milk'], ['milk', 'butter'], ['milk', 'bread'], ['milk', 'bread', 'chips']]


.. py:function:: prune(itemset: list, candidates: list, length: int) -> list

   Prune candidate itemsets that are not frequent.

   The goal of pruning is to filter out candidate itemsets that cannot be frequent. This is
   done by checking whether all the (k-1)-item subsets of a candidate itemset are present
   in the frequent itemsets of the previous iteration.

   >>> itemset = ['X', 'Y', 'Z']
   >>> candidates = [['X', 'Y'], ['X', 'Z'], ['Y', 'Z']]
   >>> prune(itemset, candidates, 2)
   [['X', 'Y'], ['X', 'Z'], ['Y', 'Z']]

   >>> itemset = ['1', '2', '3', '4']
   >>> candidates = ['1', '2', '4']
   >>> prune(itemset, candidates, 3)
   []
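
   The subset check described above can be sketched as follows. This is an illustrative
   sketch, not the body of ``prune``; the helper name ``all_subsets_frequent`` and the
   sample itemsets are assumptions made for the example.

   .. code-block:: python

      from itertools import combinations


      def all_subsets_frequent(candidate, previous_frequent):
          """Return True if every (k-1)-item subset of ``candidate`` was frequent."""
          known = {frozenset(itemset) for itemset in previous_frequent}
          return all(
              frozenset(subset) in known
              for subset in combinations(candidate, len(candidate) - 1)
          )


      # Hypothetical frequent 2-itemsets from a previous iteration.
      previous = [["X", "Y"], ["X", "Z"], ["Y", "Z"]]

      kept = all_subsets_frequent(["X", "Y", "Z"], previous)  # every pair is known
      dropped = all_subsets_frequent(["X", "Y", "W"], previous)  # pairs with W missing

   A candidate that fails this check can be discarded without counting its support,
   because an itemset cannot be frequent if one of its subsets is not.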


.. py:data:: frequent_itemsets
   :value: []