Burst Search in FRETBursts

This section describes details and conventions used to implement burst search in FRETBursts. For a more general explanation of burst search concepts see (Ingargiola PLOS ONE 2016). For usage examples see the μs-ALEX notebook. An analysis of implementation performances of the low-level burst search can be found in this blog post: Optimizing burst search in python.

Defining the rate estimator

Before describing FRETBursts implementation let me introduce an expression for computing rates of random events that will be used later on. A general expression, used by FRETBursts (since version 0.5.6), for estimating the rate using m consecutive timestamps is:

(1)\[\hat{\lambda} = \frac{m - 1 - c}{t_{i + m - 1} - t_{i}}\]

where \(c\) is a parameter that can be passed to all FRETBursts functions that deal with photon rates. Note that \(m\) is the number of photons and \(m - 1\) is the number of inter-photon delays. For example, using \(c=1\), yields an unbiased estimator of the rate for events generated by a stationary Poisson process. See this notebook for a discussion of the different estimator properties as a function of \(c\). In practice, the choice of \(c\) is just a convention and it is provided for flexibility and to match results of other software that may use a different definition.

In FRETBursts version 0.5.5 or earlier, there is no c parameter and the rate is always computed as \(\hat{\lambda} = m / (t_{i + m - 1} - t_{i})\) (equivalent to \(c=-1\)).

The Core Algorithm

The different types of burst search described in the previous sections are implemented calling the same low-level burst search function which implements the core “sliding window” algorithm. Here we explain in details this core algorithm.

The low-level burst search takes as an input the array of (monotonically increasing) photon timestamps, as well as two other arguments m (the number of timestamps) and T (the time window duration). Starting from the the first element of the array, we consider all the m-tuple of timestamps [0..m-1], [1..m], etc.

Point 1. For each m-tuple if the timestamps are contained in a time window smaller or equal to T we mark a burst start at the position of the first timestamp in the current m-tuple. Otherwise we take the next m-tuple and repeat the check.

Once a burst starts, we keep “sliding” the m-tuple one timestamp a time. If the current m-tuple is still contained in a window of duration T the burst continues. When the current m-tuple is contained in a window larger than T the burst ends. When this happens, the last timestamp in a burst is the (m-1)-th timestamp of current m-tuple (i.e. the last timestamps of the previous m-tuple which was still contained in a window T). After the burst ends, we continue as in point 1 checking the next m-tuple. This is shifted by only one timestamp (i.e. there is no jump when the burst ends).

At this point, it can happen that the current m-tuple is contained in T and a new burst starts right away. In this situation the new bursts will have m-2 timestamps overlapping with the previous one.

At the end of the timestamp array, if a burst is currently started we end it by marking the last timestamp as burst stop. The set of bursts obtained in this way has the minimum-rate property, i.e. all the m-tuple of consecutive timestamps in any burst are guaranteed to be contained in a windows T or smaller. Conversely, a few bursts will overlap and thus share some timestamps. If the user wants to avoid overlapping bursts a burst fusion steps must be applied as described in next section. Note, however, that after fusing overlapping bursts at least one m-tuple inside each fused burst will not have the minimum-rate property, i.e. the m-tuple is contained in a window larger than T.

The previous function is implemented in phtools.burstsearch.bsearch_py() (pure python version) and in phtools.burstsearch_c.bsearch_c() (optimized cython version). Several tests make sure that the two functions return numerically identical results. An analysis of performance of of different implementations can be found in this blog post: Optimizing burst search in python.

Burst Fusion

Burst fusion is an operation which fuses consecutive bursts if the start of the second bursts minus the end of the first burst (called burst separation) is <= of a fusion time \(t_f\). When bursts are overlapping (see previous section) the burst separation is negative. Therefore, to avoid overlapping bursts, we need to apply fusion with separation of 0. Note that with this condition, if a bursts ends on a timestamp which is the start of the next burst (i.e. 1 overlapping photon) the two bursts will be fused. Conversely if one burst ends and the next burst starts one photon later (0 overlapping photons) the two bursts will be kept separated. In the latter case, there will be no timestamp between the end of the previous burst and the start of the next one.

To perform burst fusion use the method Data.fuse_bursts().

Low-level burst search functions

The module phtools.burstsearch provides the low-level (or core) burst search and photon counting functions. This module also provides Bursts, a container for a set of bursts. Bursts provides attributes for the main burst quatitites (istart, istop, start, stop, counts, width, etc…). It implements the iterator interface (iterate burst by burst). Moreover Bursts can be indexed ([], i.e. getitem interface) supporting the same indexing as a numpy 1-D array.

The burst search functions return a 2-D array (burst array) of shape Nx4, where N is the number of bursts. This array can used to build a Bursts object using:

Bursts(bursts_array)

As an example, let assume having a burst array bursts. To take a slice of only the first 10 bursts you can do:

bursts10 = bursts[:10]   # new Bursts object with the first 10 bursts

To obtain the burst start of all the bursts:

bursts.start

To obtain the burst counts (number of photons) for the 10-th to 20-th burst:

bursts[10:20].counts

For efficiency, when iterating over Bursts the returned burst is a named tuple Burst, which implements the same attributes as Bursts (istart, istop, start, stop, counts and width). This results in faster iteration and attribute access than using Bursts objects with only one burst.

Three methods allow to transform Bursts to refer to a new timestamps array:

Finally, in order to support fusion of consecutive bursts, we provide the class BurstsGap (and single-burst version BurstGap) which add the attributes gap and gap_counts that contains the duration and the number of photons in gaps inside a burst. The attribute width is the total burst duration minus gap, while counts is the total number of photons minus photons falling inside gaps (gaps are open intervals, do not include edges).

class fretbursts.phtools.burstsearch.Burst(istart, istop, start, stop)

Container for a single burst.

property counts

Number of photons in the burst.

property ph_rate

Photon rate in the burst (total photon counts/duration).

property width

Burst duration in timestamps unit.

class fretbursts.phtools.burstsearch.BurstGap(istart, istop, start, stop, gap, gap_counts)
property counts

Number of photons in the burst, minus gap_counts.

static from_burst(burst)

Build a BurstGap from a Burst object.

property width

Burst duration in timestamps unit, minus gap time.

class fretbursts.phtools.burstsearch.Bursts(burstarray)

A container for burst data.

This class provides a container for burst data. It has a set of attributes (start, stop, istart, istop, counts, width, ph_rate, separation) that can be accessed to obtain burst data. Only a few fundamental attributes are stored, the others are comuputed on-fly using python properties.

Other attributes are dataframe (a pandas.DataFrame with the complete burst data), num_bursts (the number of bursts).

Bursts objects can be built from a list of single Burst objects by using the method Bursts.from_list(), or from 2D arrays containing bursts data (one row per burst; columns: istart, istop, start, stop) such as the ones returned by burst search functions (e.g. bsearch_py()).

Bursts objects are iterable, yielding one burst a time (Burst objects). Bursts can be compared for equality (with ==) and copied (Bursts.copy()).

Additionally basic methods for burst manipulation are provided:

  • recompute_times recompute start and stop times using the current start and stop index and a new timestamps array passed as argument.

  • recompute_index_* recompute start and stop indexes to refer to an expanded or reduced timestamp selection.

Other methods are:

  • and_gate computing burst intersection with a second set of bursts. Used to implement the dual-channel burst search (DCBS).

Methods that may be implemented in the future:

  • or_gate: computing union with a second set of bursts.

  • fuse_bursts: fuse nearby bursts.

and_gate(bursts2)

From 2 burst arrays return bursts defined as intersection (AND rule).

The two input burst-arrays come from 2 different burst searches. Returns new bursts representing the overlapping bursts in the 2 inputs with start and stop defined as intersection (or AND) operator.

Both input and output are Bursts objects.

Parameters

bursts_a (Bursts object) – second set of bursts to be intersected with bursts in self. The number of bursts in self and bursts_a can be different.

Returns

Bursts object containing intersections (AND) of overlapping bursts.

copy()

Return a new copy of current Bursts object.

property counts

Number of photons in each burst.

property dataframe

A pandas.DataFrame containing burst data, one row per burst.

classmethod empty(num_bursts=0)

Return an empty Bursts() object.

classmethod from_list(bursts_list)

Build a new Bursts() object from a list of Burst.

property istart

Index of 1st ph in each burst

property istop

Index of last ph in each burst

join(bursts, sort=False)

Join the current Bursts object with another one. Returns a copy.

classmethod merge(list_of_bursts, sort=False)

Merge Bursts in list_of_bursts, returning a new Bursts object.

property num_bursts

Number of bursts.

property ph_rate

Photon rate in burst (tot size/duration)

recompute_index_expand(mask, out=None)

Recompute istart and istop from selection mask to full timestamps.

This method returns a new Bursts object with recomputed istart and istop. Old istart, istop are assumed to be index of a reduced array timestamps[mask]. New istart, istop are computed to be index of a “full” timestamps array of size mask.size.

This is useful when performing burst search on a timestamps selection and we want to transform the burst data to use the index of the “full” timestamps array.

Parameters
  • mask (bool array) – boolean mask defining the timestamps selection on which the old istart and istop were computed.

  • out (None or Bursts) – if None (default), do computations on a copy of the current object. Otherwise, modify the Bursts object passed (can be used for in-place operations).

Returns

Bursts object with recomputed istart/istop.

recompute_index_reduce(times_reduced, out=None)

Recompute istart and istop on reduced timestamps times_reduced.

This method returns a new Bursts object with same start and stop times and recomputed istart and istop. Old istart, istop are assumed to be index of a “full” timestamps array of size mask.size. New istart, istop are computed to be index of the reduced timestamps array timestamps_reduced.

Note: it is required that all the start and stop times are also contained in the reduced timestamps selection.

This method is the inverse of recompute_index_expand().

Parameters
  • times_reduced (array) – array of selected timestamps used to compute the new istart and istop. This array needs to be a sub-set of the original timestamps array.

  • out (None or Bursts) – if None (default), do computations on a copy of the current object. Otherwise, modify the Bursts object passed (can be used for in-place operations).

Returns

Bursts object with recomputed istart/istop times.

recompute_times(times, out=None)

Recomputes start, stop times using timestamps from a new array.

This method computes burst start, stop using the index of timestamps from the current object and timestamps from the passed array times.

This is useful, for example, when burst search is computed on a “compacted” timestamps array (i.e. removing the gaps outside the alternation period in usALEX experiments), and afterwards the “real” start and stop times needs to be recomputed.

Parameters
  • times (array) – array of photon timestamps

  • out (None or Bursts) – if None (default), do computations on a copy of the current object. Otherwise, modify the Bursts object passed (can be used for in-place operations).

Returns

Bursts object with recomputed start/stop times.

property separation

Separation between nearby bursts

property size

Number of bursts. Used for compatibility with ndarray.size. Use Bursts.num_bursts preferentially.

property start

Time of 1st ph in each burst

property stop

Time of last ph in each burst

property width

Burst duration in timestamps units.

class fretbursts.phtools.burstsearch.BurstsGap(burstarray)

A container for bursts with optional gaps.

This class extend Bursts adding the attributes/properties gap (a duration) and gap_counts (counts in gap) that allow accounting for gaps inside bursts.

property counts

Number of photons in each burst, minus the gap_counts.

property dataframe

A pandas.DataFrame containing burst data, one row per burst.

classmethod from_list(bursts_list)

Build a new BurstsGap() from a list of BurstGap.

property gap

Time gap inside a burst

property gap_counts

Number of photons falling inside gaps of each burst.

property width

Burst duration in timestamps units, minus the gap time.

fretbursts.phtools.burstsearch.bsearch_py(times, L, m, T, slice_=None, label='Burst search', verbose=True)

Sliding window burst search. Pure python implementation.

Finds bursts in the array time (int64). A burst starts when the photon rate is above a minimum threshold, and ends when the rate falls below the same threshold. The rate-threshold is defined by the ratio m/T (m photons in a time interval T). A burst is discarded if it has less than L photons.

Parameters
  • times (array, int64) – array of timestamps on which to perform the search

  • L (int) – minimum number of photons in a bursts. Bursts with size (or counts) < L are discarded.

  • m (int) – number of consecutive photons used to compute the rate.

  • T (float) – max time separation of m photons to be inside a burst

  • slice_ (tuple) – 2-element tuple used to slice times

  • label (string) – a label printed when the function is called

  • verbose (bool) – if False, the function does not print anything.

Returns

Array of burst data Nx4, type int64. Column order is: istart, istop, start, stop.

fretbursts.phtools.burstsearch.count_ph_in_bursts(bursts, mask)

Counts number of photons in each burst counting only photons in mask.

This function takes a Bursts object and a boolean mask (photon selection) and computes the number of photons selected by the mask. It is used, for example, to count donor and acceptor photons in each burst.

For a multi-channel version see mch_count_ph_in_bursts_py().

Parameters
  • bursts (Bursts object) – the bursts used as input

  • mask (1D boolean array) – the photon mask. The boolean mask must be of the same size of the timestamp array used for burst search.

Returns

A 1D array containing the number of photons in each burst counting only photons in the selection mask.

fretbursts.phtools.burstsearch.mch_count_ph_in_bursts_py(Mburst, Mask)

Counts number of photons in each burst counting only photons in Mask.

This multi-channel function takes a list of a Bursts objects and photon masks and computes the number of photons selected by the mask in each channel.

It is used, for example, to count donor and acceptor photons in each burst.

For a single-channel version see count_ph_in_bursts_py().

Parameters
  • Mburst (list Bursts objects) – a list of bursts collections, one per ch.

  • Mask (list of 1D boolean arrays) – a list of photon masks (one per ch), For each channel, the boolean mask must be of the same size of the timestamp array used for burst search.

Returns

A list of 1D array, each containing the number of photons in each burst counting only photons in the selection mask.