Algorithms_in_C++ 1.0.0
Set of algorithms implemented in C++.
Loading...
Searching...
No Matches
windowed_median.cpp File Reference

An implementation of a median calculation of a sliding window along a data stream. More...

#include <cassert>
#include <cstdlib>
#include <ctime>
#include <list>
#include <set>
#include <vector>
Include dependency graph for windowed_median.cpp:

Classes

class  probability::windowed_median::WindowedMedian
 A class to calculate the median of a leading sliding window at the back of a stream of integer values. More...
 

Namespaces

namespace  probability
 Probability algorithms.
 
namespace  windowed_median
 Functions for the Windowed Median algorithm implementation.
 

Typedefs

using probability::windowed_median::Window = std::list<int>
 
using probability::windowed_median::size_type = Window::size_type
 

Functions

static void test (const std::vector< int > &vals, int windowSize)
 Self-test implementations.
 
int main (int argc, const char *argv[])
 Main function.
 

Detailed Description

An implementation of a median calculation of a sliding window along a data stream.

Given a stream of integers, the algorithm calculates the median of a fixed size window at the back of the stream. The leading time complexity of this algorithm is O(log(N), and it is inspired by the known algorithm to [find median from (infinite) data stream](https://www.tutorialcup.com/interview/algorithm/find-median-from-data-stream.htm), with the proper modifications to account for the finite window size for which the median is requested

Algorithm

The sliding window is managed by a list, which guarantees O(1) for both pushing and popping. Each new value is pushed to the window back, while a value from the front of the window is popped. In addition, the algorithm manages a multi-value binary search tree (BST), implemented by std::multiset. For each new value that is inserted into the window, it is also inserted to the BST. When a value is popped from the window, it is also erased from the BST. Both insertion and erasion to/from the BST are O(logN) in time, with N the size of the window. Finally, the algorithm keeps a pointer to the root of the BST, and updates its position whenever values are inserted or erased to/from BST. The root of the tree is the median! Hence, median retrieval is always O(1)

Time complexity: O(logN). Space complexity: O(N). N - size of window

Author
Yaniv Hollander

Function Documentation

◆ main()

int main ( int argc,
const char * argv[] )

Main function.

Parameters
argccommand line argument count (ignored)
argvcommand line array of arguments (ignored)
Returns
0 on exit

A few fixed test cases

Array of sorted values; odd window size

Array of sorted values - decreasing; odd window size

Even window size

Array with repeating values

Array with same values except one

Array that includes repeating values including negatives

Array with large values - sum of few pairs exceeds MAX_INT. Window size is even - testing calculation of average median between two middle values

Random test cases

Array size in the range [5, 20]

Window size in the range [3, 10]

Random array values (positive/negative)

Testing randomized test

198 {
199 /// A few fixed test cases
200 test({1, 2, 3, 4, 5, 6, 7, 8, 9},
201 3); /// Array of sorted values; odd window size
202 test({9, 8, 7, 6, 5, 4, 3, 2, 1},
203 3); /// Array of sorted values - decreasing; odd window size
204 test({9, 8, 7, 6, 5, 4, 5, 6}, 4); /// Even window size
205 test({3, 3, 3, 3, 3, 3, 3, 3, 3}, 3); /// Array with repeating values
206 test({3, 3, 3, 3, 7, 3, 3, 3, 3}, 3); /// Array with same values except one
207 test({4, 3, 3, -5, -5, 1, 3, 4, 5},
208 5); /// Array that includes repeating values including negatives
209
210 /// Array with large values - sum of few pairs exceeds MAX_INT. Window size
211 /// is even - testing calculation of average median between two middle
212 /// values
213 test({470211272, 101027544, 1457850878, 1458777923, 2007237709, 823564440,
214 1115438165, 1784484492, 74243042, 114807987},
215 6);
216
217 /// Random test cases
218 std::srand(static_cast<unsigned int>(std::time(nullptr)));
219 std::vector<int> vals;
220 for (int i = 8; i < 100; i++) {
221 const auto n =
222 1 + std::rand() /
223 ((RAND_MAX + 5u) / 20); /// Array size in the range [5, 20]
224 auto windowSize =
225 1 + std::rand() / ((RAND_MAX + 3u) /
226 10); /// Window size in the range [3, 10]
227 vals.clear();
228 vals.reserve(n);
229 for (int i = 0; i < n; i++) {
230 vals.push_back(
231 rand() - RAND_MAX); /// Random array values (positive/negative)
232 }
233 test(vals, windowSize); /// Testing randomized test
234 }
235 return 0;
236}
T clear(T... args)
static void test()
Self-test implementations.
Definition generate_parentheses.cpp:82
T push_back(T... args)
T rand(T... args)
T reserve(T... args)
T srand(T... args)
T time(T... args)
Here is the call graph for this function:

◆ test()

static void test ( const std::vector< int > & vals,
int windowSize )
static

Self-test implementations.

Parameters
valsStream of values
windowSizeSize of sliding window

Comparing medians: efficient function vs. Naive one

182 {
183 probability::windowed_median::WindowedMedian windowedMedian(windowSize);
184 for (const auto val : vals) {
185 windowedMedian.insert(val);
186
187 /// Comparing medians: efficient function vs. Naive one
188 assert(windowedMedian.getMedian() == windowedMedian.getMedianNaive());
189 }
190}
A class to calculate the median of a leading sliding window at the back of a stream of integer values...
Definition windowed_median.cpp:56
Here is the call graph for this function: