Scaling Timing Bloom Filter

class fuggetaboutit.ScalingTimingBloomFilter(capacity, decay_time, error=0.005, error_tightening_ratio=0.5, growth_factor=2, min_fill_factor=0.2, max_fill_factor=0.8, insert_order='converge', ioloop=None)

A bloom filter that will decay old values and scale up capacity as needed. This bloom filter will automatically decay values such that an item added will be removed from the bloom after decay_time seconds. The underlying bloom has initial capacity capacity and maximum error error. In addition, the bloom will automatically scale using Almeida’s method from “Scalable Bloom Filters” using an error tightening ratio and growth factor given by error_tightening_ratio and growth_factor. A bloom filter will be scaled up when approximatly max_fill_factor percent of the capacity of the bloom filter are in use. Conversely, the bloom will be scaled down if there is one bloom left and it has a fill percentage less than min_fill_factor. Together, these two fill factor conditionals attempt to keep the scaling bloom at the right size for the current stream of data.

Parameters:
  • capacity (integer) – initial capacity
  • decay_time (integer) – time, in seconds, for an item to decay
  • error (float) – maximum error rate
  • error_tightening_ratio (float) – reduction factor for the error rate of scaled blooms
  • growth_factor (float or None) – increasing factor for the capacity of scaled blooms
  • max_fill_factor (0 < float < max_fill_factor or None) – maximum fill factor of a bloom to be considered full
  • max_fill_factor – minimum fill factor of a bloom to be considered active
  • insert_order (‘compact’ or ‘converge’) – Whether to insert in order to optimize compactness or convergence
  • ioloop (tornado.ioloop.IOLoop or None) – an instance of an IOLoop to attatch the periodic decay operation to
add(key, timestamp=None)

Add key to the bloom filter and scale if necissary. The key will be inserted at the current timestamp if parameter timestamp is not provided.

Parameters:
  • key (str) – key to be added
  • timestamp (int) – timestamp of the item
contains(key)

Check if key is contained in the bloom filter

Parameters:key (str) – key to be added
Return type:bool
decay()

Decay the bloom filter and remove items that are older than decay_time. This will also remove empty bloom filters.

expected_error()

Return the current expected error rate from the blooms. This should always be smaller than the maximum error given in the constructor.

Return type:float
classmethod fromfile(f)

Reads the bloom from the given file f.

Parameters:f (file) – File to save the bloom filter to
Return type:ScalingTimingBloomFilter
set_ioloop(ioloop=None)

Set a new IOLoop to attatch the decay operation to. stop() should be called before changing the ioloop.

Parameters:ioloop (tornado.ioloop.IOLoop or None) – an instance of an IOLoop to attatch the periodic decay operation to
size()

Returns the approximate size of the current bloom state.

Return type:float
start()

Start a periodic callback on the IOLoop to decay the bloom at every half decay_time.

stop()

Stop the periodic decay operation

tofile(f)

Saves the current bloom to the file object f.

Parameters:f (file) – File to save the bloom filter to

Previous topic

Installing

Next topic

Examples

This Page