EventAnalysis  1.3.0
Static Public Member Functions | List of all members
levenshteinSSE::LevenshteinIterationBase< Vec1, Vec2, Iterator1, Iterator2 > Struct Template Reference

#include <levenshtein-sse.hpp>

Inheritance diagram for levenshteinSSE::LevenshteinIterationBase< Vec1, Vec2, Iterator1, Iterator2 >:
levenshteinSSE::LevenshteinIteration< Vec1, Vec2, Iterator1, Iterator2 >

Static Public Member Functions

static void perform (const Iterator1 &a, const Iterator2 &b, std::size_t &i, std::size_t j, std::size_t bLen, Vec1 &diag, const Vec2 &diag2)
 

Detailed Description

template<typename Vec1, typename Vec2, typename Iterator1, typename Iterator2>
struct levenshteinSSE::LevenshteinIterationBase< Vec1, Vec2, Iterator1, Iterator2 >

Some notes on the algorithm used here:

The standard Levenshtein metric is usually calculated using the Wagner-Fischer algorithm 1. This essentially means calculating Levenshtein(a[:n], b[:m]) for all prefixes a[:n] and b[:m] of a and b, storing the results in a table (and possibly "forgetting" the previous rows to save memory). Normally, this is performed row by row, but this approach has the disadvantage that it is not easily converted to using SIMD instructions.

The "diagonal" algorithm variants in this file calculate the minor diagonals one by one. There, the index i always corresponds to the table row in which we currently are and j always denotes the column. Note that the tables have a trivial i=0 row and j=0 column; The table entry [i,j] corresponds to the comparison a[i-1] == b[j-1]. A diagonal is defined by its index k, yielding the inner loop invariant k == i + j.

The outer loop iterates over all diagonals from k = 0 to k = len(a)+len(b) (compare with the table in 1 to verify), meaning that the length of the diagonals may vary between iterations. The current diagonal and the previous diagonal are kept in memory, referred to as diag and diag2 respectively.

In the inner loop, the rows are treated in descending order, i.e. i decreases from iteration to iteration, to avoid overwriting data needed in the next iteration.

Trivial implementation of one inner loop iteration.

Note that i is passed as a reference; The SIMD version may decrease it by more than 1.

Member Function Documentation

◆ perform()

template<typename Vec1, typename Vec2, typename Iterator1, typename Iterator2>
static void levenshteinSSE::LevenshteinIterationBase< Vec1, Vec2, Iterator1, Iterator2 >::perform ( const Iterator1 &  a,
const Iterator2 &  b,
std::size_t &  i,
std::size_t  j,
std::size_t  bLen,
Vec1 &  diag,
const Vec2 &  diag2 
)
inlinestatic

The documentation for this struct was generated from the following file: