strings.top_k_frequent_words

Finds the top K most frequent words from the provided word list.

This implementation shows how to solve the problem using the Heap class already present in this repository. Computing order statistics is, in fact, a typical use of heaps.

This is mostly shown for educational purposes, since the problem can be solved in a few lines using collections.Counter from the Python standard library:

from collections import Counter

def top_k_frequent_words(words, k_value):
    return [x[0] for x in Counter(words).most_common(k_value)]

Classes

WordCount

Functions

top_k_frequent_words(words: list[str], k_value: int) → list[str]

Returns the k_value most frequently occurring words, in non-increasing order of occurrence.

Module Contents

class strings.top_k_frequent_words.WordCount(word: str, count: int)

__eq__(other: object) → bool

>>> WordCount('a', 1).__eq__(WordCount('b', 1))
True
>>> WordCount('a', 1).__eq__(WordCount('a', 1))
True
>>> WordCount('a', 1).__eq__(WordCount('a', 2))
False
>>> WordCount('a', 1).__eq__(WordCount('b', 2))
False
>>> WordCount('a', 1).__eq__(1)
NotImplemented

__lt__(other: object) → bool

>>> WordCount('a', 1).__lt__(WordCount('b', 1))
False
>>> WordCount('a', 1).__lt__(WordCount('a', 1))
False
>>> WordCount('a', 1).__lt__(WordCount('a', 2))
True
>>> WordCount('a', 1).__lt__(WordCount('b', 2))
True
>>> WordCount('a', 2).__lt__(WordCount('a', 1))
False
>>> WordCount('a', 2).__lt__(WordCount('b', 1))
False
>>> WordCount('a', 1).__lt__(1)
NotImplemented

count

word
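
The doctests for __eq__ and __lt__ suggest that comparisons depend only on count and ignore word, and that comparing against a non-WordCount value yields NotImplemented. A minimal sketch of a class with that behavior is given below. The name WordCountSketch and the use of functools.total_ordering are assumptions made for illustration and are not taken from this page.

```python
from functools import total_ordering


@total_ordering
class WordCountSketch:
    """Hypothetical stand-in for WordCount; ordering ignores the word."""

    def __init__(self, word: str, count: int) -> None:
        self.word = word
        self.count = count

    def __eq__(self, other: object) -> bool:
        # Defer to the other operand when the types do not match.
        if not isinstance(other, WordCountSketch):
            return NotImplemented
        return self.count == other.count

    def __lt__(self, other: object) -> bool:
        if not isinstance(other, WordCountSketch):
            return NotImplemented
        return self.count < other.count
```

Because only the count takes part in comparisons, a heap of such objects orders words by frequency, and two different words with the same count compare as equal.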

strings.top_k_frequent_words.top_k_frequent_words(words: list[str], k_value: int) → list[str]

Returns the k_value most frequently occurring words, in non-increasing order of occurrence. In this context, a word is defined as an element in the provided list.

If k_value is greater than the number of distinct words, it is treated as equal to the number of distinct words.

>>> top_k_frequent_words(['a', 'b', 'c', 'a', 'c', 'c'], 3)
['c', 'a', 'b']
>>> top_k_frequent_words(['a', 'b', 'c', 'a', 'c', 'c'], 2)
['c', 'a']
>>> top_k_frequent_words(['a', 'b', 'c', 'a', 'c', 'c'], 1)
['c']
>>> top_k_frequent_words(['a', 'b', 'c', 'a', 'c', 'c'], 0)
[]
>>> top_k_frequent_words([], 1)
[]
>>> top_k_frequent_words(['a', 'a'], 2)
['a']
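
The heap-based approach described for this function can be sketched with the standard library's heapq module, used here in place of the repository's Heap class, whose interface is not shown on this page. The function name top_k_frequent_words_sketch is hypothetical, and breaking ties by first appearance is an assumption of this sketch, not documented behavior.

```python
import heapq
from collections import Counter


def top_k_frequent_words_sketch(words: list[str], k_value: int) -> list[str]:
    """Heap-based top-k sketch using heapq (not the repository's Heap class)."""
    counts = Counter(words)
    # heapq implements a min-heap, so counts are negated to pop the largest first.
    # The index breaks ties by first appearance (an assumption of this sketch).
    heap = [
        (-count, index, word)
        for index, (word, count) in enumerate(counts.items())
    ]
    heapq.heapify(heap)
    # Clamp k to the number of distinct words, as described above.
    return [heapq.heappop(heap)[2] for _ in range(min(k_value, len(heap)))]
```

Building the heap takes time linear in the number of distinct words, and each of the k extractions costs logarithmic time, which is the usual motivation for using a heap instead of sorting every count.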
