machine_learning.frequent_pattern_growth

The Frequent Pattern Growth algorithm (FP-Growth) is a widely used data mining technique for discovering frequent itemsets in large transaction databases.

It overcomes some of the limitations of traditional methods such as Apriori by efficiently constructing the FP-Tree

WIKI: https://athena.ecs.csus.edu/~mei/associationcw/FpGrowth.html

Examples: https://www.javatpoint.com/fp-growth-algorithm-in-data-mining

Attributes

data_set

Classes

TreeNode

A node in a Frequent Pattern tree.

Functions

ascend_tree(→ None)

Ascend the FP-Tree from a leaf node to its root, adding item names to the prefix

create_tree(→ tuple[TreeNode, dict])

Create Frequent Pattern tree

find_prefix_path(→ dict)

Find the conditional pattern base for a given base pattern.

mine_tree(→ None)

Mine the FP-Tree recursively to discover frequent itemsets.

update_header(→ TreeNode)

Update the header table with a node link.

update_tree(→ None)

Update the FP-Tree with a transaction.

Module Contents

class machine_learning.frequent_pattern_growth.TreeNode

A node in a Frequent Pattern tree.

Args:

name: The name of this node. num_occur: The number of occurrences of the node. parent_node: The parent node.

Example: >>> parent = TreeNode(“Parent”, 1, None) >>> child = TreeNode(“Child”, 2, parent) >>> child.name ‘Child’ >>> child.count 2

__repr__() str
disp(ind: int = 1) None
inc(num_occur: int) None
children: dict[str, TreeNode]
count: int
name: str
parent: TreeNode | None = None
machine_learning.frequent_pattern_growth.ascend_tree(leaf_node: TreeNode, prefix_path: list[str]) None

Ascend the FP-Tree from a leaf node to its root, adding item names to the prefix path.

Args:

leaf_node: The leaf node to start ascending from. prefix_path: A list to store the item as they are ascended.

Example: >>> data_set = [ … [‘A’, ‘B’, ‘C’], … [‘A’, ‘C’], … [‘A’, ‘B’, ‘E’], … [‘A’, ‘B’, ‘C’, ‘E’], … [‘B’, ‘E’] … ] >>> min_sup = 2 >>> fp_tree, header_table = create_tree(data_set, min_sup)

>>> path = []
>>> ascend_tree(fp_tree.children['A'], path)
>>> path # ascending from a leaf node 'A'
['A']
machine_learning.frequent_pattern_growth.create_tree(data_set: list, min_sup: int = 1) tuple[TreeNode, dict]

Create Frequent Pattern tree

Args:

data_set: A list of transactions, where each transaction is a list of items. min_sup: The minimum support threshold. Items with support less than this will be pruned. Default is 1.

Returns:

The root of the FP-Tree. header_table: The header table dictionary with item information.

Example: >>> data_set = [ … [‘A’, ‘B’, ‘C’], … [‘A’, ‘C’], … [‘A’, ‘B’, ‘E’], … [‘A’, ‘B’, ‘C’, ‘E’], … [‘B’, ‘E’] … ] >>> min_sup = 2 >>> fp_tree, header_table = create_tree(data_set, min_sup) >>> fp_tree TreeNode(‘Null Set’, 1, None) >>> len(header_table) 4 >>> header_table[“A”] [[4, None], TreeNode(‘A’, 4, TreeNode(‘Null Set’, 1, None))] >>> header_table[“E”][1] # doctest: +NORMALIZE_WHITESPACE TreeNode(‘E’, 1, TreeNode(‘B’, 3, TreeNode(‘A’, 4, TreeNode(‘Null Set’, 1, None)))) >>> sorted(header_table) [‘A’, ‘B’, ‘C’, ‘E’] >>> fp_tree.name ‘Null Set’ >>> sorted(fp_tree.children) [‘A’, ‘B’] >>> fp_tree.children[‘A’].name ‘A’ >>> sorted(fp_tree.children[‘A’].children) [‘B’, ‘C’]

machine_learning.frequent_pattern_growth.find_prefix_path(base_pat: frozenset, tree_node: TreeNode | None) dict

Find the conditional pattern base for a given base pattern.

Args:

base_pat: The base pattern for which to find the conditional pattern base. tree_node: The node in the FP-Tree.

Example: >>> data_set = [ … [‘A’, ‘B’, ‘C’], … [‘A’, ‘C’], … [‘A’, ‘B’, ‘E’], … [‘A’, ‘B’, ‘C’, ‘E’], … [‘B’, ‘E’] … ] >>> min_sup = 2 >>> fp_tree, header_table = create_tree(data_set, min_sup) >>> fp_tree TreeNode(‘Null Set’, 1, None) >>> len(header_table) 4 >>> base_pattern = frozenset([‘A’]) >>> sorted(find_prefix_path(base_pattern, fp_tree.children[‘A’])) []

machine_learning.frequent_pattern_growth.mine_tree(in_tree: TreeNode, header_table: dict, min_sup: int, pre_fix: set, freq_item_list: list) None

Mine the FP-Tree recursively to discover frequent itemsets.

Args:

in_tree: The FP-Tree to mine. header_table: The header table dictionary with item information. min_sup: The minimum support threshold. pre_fix: A set of items as a prefix for the itemsets being mined. freq_item_list: A list to store the frequent itemsets.

Example: >>> data_set = [ … [‘A’, ‘B’, ‘C’], … [‘A’, ‘C’], … [‘A’, ‘B’, ‘E’], … [‘A’, ‘B’, ‘C’, ‘E’], … [‘B’, ‘E’] … ] >>> min_sup = 2 >>> fp_tree, header_table = create_tree(data_set, min_sup) >>> fp_tree TreeNode(‘Null Set’, 1, None) >>> frequent_itemsets = [] >>> mine_tree(fp_tree, header_table, min_sup, set([]), frequent_itemsets) >>> expe_itm = [{‘C’}, {‘C’, ‘A’}, {‘E’}, {‘A’, ‘E’}, {‘E’, ‘B’}, {‘A’}, {‘B’}] >>> all(expected in frequent_itemsets for expected in expe_itm) True

machine_learning.frequent_pattern_growth.update_header(node_to_test: TreeNode, target_node: TreeNode) TreeNode

Update the header table with a node link.

Args:

node_to_test: The node to be updated in the header table. target_node: The node to link to.

Example: >>> data_set = [ … [‘A’, ‘B’, ‘C’], … [‘A’, ‘C’], … [‘A’, ‘B’, ‘E’], … [‘A’, ‘B’, ‘C’, ‘E’], … [‘B’, ‘E’] … ] >>> min_sup = 2 >>> fp_tree, header_table = create_tree(data_set, min_sup) >>> fp_tree TreeNode(‘Null Set’, 1, None) >>> node1 = TreeNode(“A”, 3, None) >>> node2 = TreeNode(“B”, 4, None) >>> node1 TreeNode(‘A’, 3, None) >>> node1 = update_header(node1, node2) >>> node1 TreeNode(‘A’, 3, None) >>> node1.node_link TreeNode(‘B’, 4, None) >>> node2.node_link is None True

machine_learning.frequent_pattern_growth.update_tree(items: list, in_tree: TreeNode, header_table: dict, count: int) None

Update the FP-Tree with a transaction.

Args:

items: List of items in the transaction. in_tree: The current node in the FP-Tree. header_table: The header table dictionary with item information. count: The count of the transaction.

Example: >>> data_set = [ … [‘A’, ‘B’, ‘C’], … [‘A’, ‘C’], … [‘A’, ‘B’, ‘E’], … [‘A’, ‘B’, ‘C’, ‘E’], … [‘B’, ‘E’] … ] >>> min_sup = 2 >>> fp_tree, header_table = create_tree(data_set, min_sup) >>> fp_tree TreeNode(‘Null Set’, 1, None) >>> transaction = [‘A’, ‘B’, ‘E’] >>> update_tree(transaction, fp_tree, header_table, 1) >>> fp_tree TreeNode(‘Null Set’, 1, None) >>> fp_tree.children[‘A’].children[‘B’].children[‘E’].children {} >>> fp_tree.children[‘A’].children[‘B’].children[‘E’].count 2 >>> header_table[‘E’][1].name ‘E’

machine_learning.frequent_pattern_growth.data_set: list[frozenset]