machine_learning.apriori_algorithm

The Apriori algorithm is an association rule mining technique. Association rule mining, also known as market basket analysis, aims to discover interesting relationships or associations among a set of items in a transactional or relational database.

For example, an association rule found this way might state: “If a customer buys item A and item B, then they are likely to buy item C.” This rule suggests a relationship between items A, B, and C: customers who purchase A and B are more likely to also purchase C.
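
To make the rule above concrete, the following minimal sketch measures how often C appears in baskets that already contain A and B. The helper names `support_count` and `confidence` and the transaction list are hypothetical illustrations; they are not part of this module.

```python
def support_count(transactions: list[list[str]], items: set[str]) -> int:
    """Count the transactions that contain every item in `items`."""
    return sum(1 for t in transactions if items.issubset(t))


def confidence(
    transactions: list[list[str]], antecedent: set[str], consequent: set[str]
) -> float:
    """Estimate how often the consequent appears when the antecedent does."""
    base = support_count(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never applies, so report zero confidence
    return support_count(transactions, antecedent | consequent) / base


# Made-up baskets, used only for illustration.
transactions = [
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "B"],
    ["A", "C"],
    ["B", "D"],
]

# Confidence of the rule {A, B} -> {C}.
rule_confidence = confidence(transactions, {"A", "B"}, {"C"})
```

Here three baskets contain both A and B, and two of those also contain C, so the rule holds in two out of three applicable baskets.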

WIKI: https://en.wikipedia.org/wiki/Apriori_algorithm
Examples: https://www.kaggle.com/code/earthian/apriori-association-rules-mining

Attributes

frequent_itemsets

Functions

apriori(→ list[tuple[list[str], int]])

Returns a list of frequent itemsets and their support counts.

load_data(→ list[list[str]])

Returns a sample transaction dataset.

prune(→ list)

Prune candidate itemsets that are not frequent.

Module Contents

machine_learning.apriori_algorithm.apriori(data: list[list[str]], min_support: int) → list[tuple[list[str], int]]

Returns a list of frequent itemsets and their support counts.

>>> data = [['A', 'B', 'C'], ['A', 'B'], ['A', 'C'], ['A', 'D'], ['B', 'C']]
>>> apriori(data, 2)
[(['A', 'B'], 1), (['A', 'C'], 2), (['B', 'C'], 2)]
>>> data = [['1', '2', '3'], ['1', '2'], ['1', '3'], ['1', '4'], ['2', '3']]
>>> apriori(data, 3)
[]
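
For orientation, a textbook level-wise search can be sketched as follows. This is a minimal illustration under stated assumptions, not this module's implementation: the name `apriori_sketch` is hypothetical, `min_support` is treated as an absolute transaction count, and single-item sets are reported as well, so its output is not expected to match the doctests above.

```python
from collections import Counter
from itertools import combinations


def apriori_sketch(
    data: list[list[str]], min_support: int
) -> list[tuple[list[str], int]]:
    """Textbook level-wise search (hypothetical helper, not this module's code)."""
    transactions = [frozenset(t) for t in data]

    # Level 1: count single items and keep those meeting the threshold.
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    current = {s: c for s, c in counts.items() if c >= min_support}

    result: list[tuple[list[str], int]] = []
    k = 1
    while current:
        result.extend((sorted(s), c) for s, c in current.items())
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-item subset must itself be frequent.
        candidates = {
            c
            for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Count step: keep candidates that meet the support threshold.
        current = {}
        for c in candidates:
            count = sum(1 for t in transactions if c <= t)
            if count >= min_support:
                current[c] = count

    # Order by itemset size, then alphabetically, for stable output.
    return sorted(result, key=lambda pair: (len(pair[0]), pair[0]))


# Illustrative baskets, defined inline.
sample = [["milk"], ["milk", "butter"], ["milk", "bread"], ["milk", "bread", "chips"]]
found = apriori_sketch(sample, 2)
```

The loop stops as soon as a level produces no frequent itemsets, since no larger itemset can be frequent once every itemset of the current size has failed the threshold.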

machine_learning.apriori_algorithm.load_data() → list[list[str]]

Returns a sample transaction dataset.

>>> load_data()
[['milk'], ['milk', 'butter'], ['milk', 'bread'], ['milk', 'bread', 'chips']]

machine_learning.apriori_algorithm.prune(itemset: list, candidates: list, length: int) → list

Prune candidate itemsets that are not frequent. A candidate itemset of size k is kept only if every one of its (k-1)-item subsets is present in the frequent itemsets of the previous iteration (that is, the subsets are valid subsequences of the frequent itemsets from the previous iteration); all other candidates are filtered out.

>>> itemset = ['X', 'Y', 'Z']
>>> candidates = [['X', 'Y'], ['X', 'Z'], ['Y', 'Z']]
>>> prune(itemset, candidates, 2)
[['X', 'Y'], ['X', 'Z'], ['Y', 'Z']]
>>> itemset = ['1', '2', '3', '4']
>>> candidates = ['1', '2', '4']
>>> prune(itemset, candidates, 3)
[]
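
The subset check described for pruning can be sketched on its own as follows. This is an illustration under assumptions rather than this module's code: the name `prune_by_subsets` and its parameter order are hypothetical and differ from the `prune` signature documented above.

```python
from itertools import combinations


def prune_by_subsets(
    previous_frequent: list[list[str]], candidates: list[list[str]], k: int
) -> list[list[str]]:
    """Keep k-item candidates whose (k-1)-item subsets are all frequent."""
    # Use frozensets so the membership test ignores item order.
    frequent = {frozenset(s) for s in previous_frequent}
    kept = []
    for candidate in candidates:
        subsets = combinations(candidate, k - 1)
        if all(frozenset(sub) in frequent for sub in subsets):
            kept.append(candidate)
    return kept


# Frequent 2-itemsets from a hypothetical previous iteration.
previous_frequent = [["X", "Y"], ["X", "Z"], ["Y", "Z"], ["X", "W"]]
candidates = [["X", "Y", "Z"], ["X", "Y", "W"]]
survivors = prune_by_subsets(previous_frequent, candidates, 3)
```

In this example `["X", "Y", "W"]` is dropped because its subset `["Y", "W"]` is not among the frequent 2-itemsets, while `["X", "Y", "Z"]` survives because all three of its 2-item subsets are.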

machine_learning.apriori_algorithm.frequent_itemsets = []