digest
digest::Syncmer< P, T > Class Template Reference

This class inherits from WindowMin for implementation reasons, but the two represent very different things. A Syncmer is a large window in which the minimal hash among all k-mers belongs to either the leftmost or the rightmost k-mer. Parameters without a description are the same as the parameters in the Digester parent class; they are simply passed up to the parent constructor.

#include <syncmer.hpp>

Inheritance diagram for digest::Syncmer< P, T >:
digest::WindowMin< P, T > digest::Digester< P >

Public Member Functions

 Syncmer (const char *seq, size_t len, unsigned k, unsigned large_window, size_t start=0, MinimizedHashType minimized_h=MinimizedHashType::CANON)
 
 Syncmer (const std::string &seq, unsigned k, unsigned large_window, size_t start=0, MinimizedHashType minimized_h=MinimizedHashType::CANON)
 
void roll_minimizer (unsigned amount, std::vector< uint32_t > &vec) override
 Appends the positions of up to amount syncmers to vec. A large window counts as a syncmer if the smallest hash in the window is at its leftmost or rightmost position.
 
void roll_minimizer (unsigned amount, std::vector< std::pair< uint32_t, uint32_t > > &vec) override
 Appends the positions and hashes of up to amount syncmers to vec. A large window counts as a syncmer if the smallest hash in the window is at its leftmost or rightmost position.
 
- Public Member Functions inherited from digest::WindowMin< P, T >
 WindowMin (const char *seq, size_t len, unsigned k, unsigned large_window, size_t start=0, MinimizedHashType minimized_h=MinimizedHashType::CANON)
 
 WindowMin (const std::string &seq, unsigned k, unsigned large_window, size_t start=0, MinimizedHashType minimized_h=MinimizedHashType::CANON)
 
void roll_minimizer (unsigned amount, std::vector< uint32_t > &vec) override
 Appends the positions of up to amount minimizers to vec. A k-mer is a minimizer if its hash is the smallest in the large window; the rightmost index wins ties.
 
void roll_minimizer (unsigned amount, std::vector< std::pair< uint32_t, uint32_t > > &vec) override
 Appends the positions and hashes of up to amount minimizers to vec. A k-mer is a minimizer if its hash is the smallest in the large window; the rightmost index wins ties.
 
void new_seq (const char *seq, size_t len, size_t start) override
 replaces the current sequence with the new one. It's like starting over with a completely new sequence
 
void new_seq (const std::string &seq, size_t pos) override
 replaces the current sequence with the new one. It's like starting over with a completely new sequence
 
unsigned get_large_wind_kmer_am ()
 
size_t get_ds_size ()
 gets the size of the internal rmq data structure being used. Mainly used to help with tests (so you probably shouldn't use it).
 
bool get_is_minimized ()
 checks if we have generated the first minimizer. Mainly used to help with tests (so you probably shouldn't use it).
 
- Public Member Functions inherited from digest::Digester< P >
 Digester (const char *seq, size_t len, unsigned k, size_t start=0, MinimizedHashType minimized_h=MinimizedHashType::CANON)
 
 Digester (const std::string &seq, unsigned k, size_t start=0, MinimizedHashType minimized_h=MinimizedHashType::CANON)
 
bool get_is_valid_hash ()
 
unsigned get_k ()
 
size_t get_len ()
 
bool roll_one ()
 moves the internal pointer to the next valid k-mer.
Time Complexity: O(1)
 
size_t get_pos ()
 
uint64_t get_chash ()
 
uint64_t get_fhash ()
 
uint64_t get_rhash ()
 
void append_seq (const char *seq, size_t len)
 Simulates appending a new sequence to the end of the old one. The old sequence is no longer stored, but the rolling hashes proceed as if the two sequences were concatenated. This can only be called once you have reached the end of the current sequence. For example, if your current sequence is ACTGAC, you have reached its end, and you call append_seq with CCGGCCGG, then the minimizers produced after the call, together with those produced while going through ACTGAC, are equivalent to the minimizers you would get from rolling across ACTGACCCGGCCGG.
 
void append_seq (const std::string &seq)
 Simulates appending a new sequence to the end of the old one. The old sequence is no longer stored, but the rolling hashes proceed as if the two sequences were concatenated. This can only be called once you have reached the end of the current sequence. For example, if your current sequence is ACTGAC, you have reached its end, and you call append_seq with CCGGCCGG, then the minimizers produced after the call, together with those produced while going through ACTGAC, are equivalent to the minimizers you would get from rolling across ACTGACCCGGCCGG.
 
MinimizedHashType get_minimized_h ()
 
const char * get_sequence ()
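
The equivalence that append_seq promises can be illustrated without the library. The sketch below is a hypothetical stand-in (the class name ChunkedKmerStream and its append method are not part of digest, and it enumerates k-mers rather than rolling hashes): it keeps only the last k-1 characters of earlier chunks, yet yields the same k-mers as a single pass over the concatenated sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration, not digest's implementation: emits every k-mer of
// a sequence that is supplied in chunks, while storing only the last k-1
// characters of the chunks seen so far.
class ChunkedKmerStream {
 public:
  explicit ChunkedKmerStream(std::size_t k) : k_(k) { assert(k > 0); }

  // Appends to `out` the k-mers that become available once `chunk` is added.
  void append(const std::string& chunk, std::vector<std::string>& out) {
    const std::string buffer = tail_ + chunk;
    for (std::size_t i = 0; i + k_ <= buffer.size(); ++i) {
      out.push_back(buffer.substr(i, k_));
    }
    // Keep at most k-1 trailing characters so that k-mers spanning the next
    // chunk boundary can still be formed.
    const std::size_t keep = buffer.size() < k_ - 1 ? buffer.size() : k_ - 1;
    tail_ = buffer.substr(buffer.size() - keep);
  }

 private:
  std::size_t k_;
  std::string tail_;
};
```

Feeding ACTGAC and then CCGGCCGG to one stream gives the same k-mers, in the same order, as feeding ACTGACCCGGCCGG to a fresh stream.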
 

Detailed Description

template<BadCharPolicy P, class T>
class digest::Syncmer< P, T >

This class inherits from WindowMin for implementation reasons, but the two represent very different things. A Syncmer is a large window in which the minimal hash among all k-mers belongs to either the leftmost or the rightmost k-mer. Parameters without a description are the same as the parameters in the Digester parent class; they are simply passed up to the parent constructor.
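
As a minimal sketch of the definition above, the following standalone function checks each large window directly. It is not the library's code: syncmer_window_starts is a hypothetical helper, it takes precomputed k-mer hashes as input, it scans each window naively instead of using a range minimum query structure, and it assumes that a tie between an end k-mer and an interior k-mer still counts as a syncmer and that a syncmer is reported by the index of its leftmost k-mer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of digest: returns the start index of every
// window of `large_window` consecutive k-mer hashes whose minimal hash sits in
// the first or last slot of the window.
std::vector<std::size_t> syncmer_window_starts(
    const std::vector<std::uint32_t>& kmer_hashes, std::size_t large_window) {
  assert(large_window > 0);  // stand-in for the documented BadWindowException
  std::vector<std::size_t> starts;
  for (std::size_t i = 0; i + large_window <= kmer_hashes.size(); ++i) {
    // Naive scan for the minimal hash in the window [i, i + large_window).
    std::uint32_t min_hash = kmer_hashes[i];
    for (std::size_t j = i + 1; j < i + large_window; ++j) {
      min_hash = std::min(min_hash, kmer_hashes[j]);
    }
    // Assumption: an end slot that ties with the minimum still qualifies.
    const bool at_left = kmer_hashes[i] == min_hash;
    const bool at_right = kmer_hashes[i + large_window - 1] == min_hash;
    if (at_left || at_right) {
      starts.push_back(i);
    }
  }
  return starts;
}
```

For the hashes 5, 3, 8, 1, 9, 7, 2 with a large window of 3, the windows starting at indices 1, 3 and 4 qualify, because their minima (1, 1 and 2) sit at an end of the window.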

Template Parameters
P
T The data structure to use for performing range minimum queries to find the minimal hash value.
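
To make the role of T concrete, here is one well-known way to answer sliding-window minimum queries: a monotonic deque. This is only an illustration of the kind of query T has to answer; it is not one of the data structures shipped with digest, and the function name window_min_indices is hypothetical. Ties go to the rightmost index, matching the rule documented for WindowMin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical sketch, not one of digest's data structures: for each window of
// `w` consecutive values, returns the index of the smallest value, with the
// rightmost index winning ties.
std::vector<std::size_t> window_min_indices(
    const std::vector<std::uint32_t>& values, std::size_t w) {
  assert(w > 0);
  std::vector<std::size_t> result;
  // Candidate indices; their values strictly increase from front to back.
  std::deque<std::size_t> candidates;
  for (std::size_t i = 0; i < values.size(); ++i) {
    // Discard candidates that the new value beats or ties.
    while (!candidates.empty() && values[candidates.back()] >= values[i]) {
      candidates.pop_back();
    }
    candidates.push_back(i);
    // Discard the front once it falls outside the window ending at i.
    if (candidates.front() + w <= i) {
      candidates.pop_front();
    }
    // The first full window ends at index w - 1.
    if (i + 1 >= w) {
      result.push_back(candidates.front());
    }
  }
  return result;
}
```

Each index enters and leaves the deque at most once, so the whole pass is linear in the number of values.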

Constructor & Destructor Documentation

◆ Syncmer() [1/2]

template<BadCharPolicy P, class T >
digest::Syncmer< P, T >::Syncmer ( const char *  seq,
size_t  len,
unsigned  k,
unsigned  large_window,
size_t  start = 0,
MinimizedHashType  minimized_h = MinimizedHashType::CANON 
)
inline
Parameters
seq
len
k
large_window the number of k-mers in the large window, i.e. the number of k-mers to be considered during the range minimum query.
start
minimized_h
Exceptions
BadWindowException Thrown when large_window is passed in as 0

◆ Syncmer() [2/2]

template<BadCharPolicy P, class T >
digest::Syncmer< P, T >::Syncmer ( const std::string &  seq,
unsigned  k,
unsigned  large_window,
size_t  start = 0,
MinimizedHashType  minimized_h = MinimizedHashType::CANON 
)
inline
Parameters
seq
k
large_window the number of k-mers in the large window, i.e. the number of k-mers to be considered during the range minimum query.
start
minimized_h
Exceptions
BadWindowException Thrown when large_window is passed in as 0

Member Function Documentation

◆ roll_minimizer() [1/2]

template<BadCharPolicy P, class T >
void digest::Syncmer< P, T >::roll_minimizer ( unsigned  amount,
std::vector< std::pair< uint32_t, uint32_t > > &  vec 
)
inline override virtual

Appends the positions and hashes of up to amount syncmers to vec. A large window counts as a syncmer if the smallest hash in the window is at its leftmost or rightmost position.

Parameters
amount
vec

Implements digest::Digester< P >.

◆ roll_minimizer() [2/2]

template<BadCharPolicy P, class T >
void digest::Syncmer< P, T >::roll_minimizer ( unsigned  amount,
std::vector< uint32_t > &  vec 
)
inline override virtual

Appends the positions of up to amount syncmers to vec. A large window counts as a syncmer if the smallest hash in the window is at its leftmost or rightmost position.

Parameters
amount
vec

Implements digest::Digester< P >.
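
The "up to amount" wording of both overloads implies that a call may append fewer entries than requested once the sequence runs out, and that a later call resumes where the previous one stopped. The sketch below models that behaviour with a hypothetical SyncmerCursor class over precomputed k-mer hashes; it is not digest's implementation. It assumes each entry pairs the index of the window's leftmost k-mer with the window's minimal hash, and that ties with an end k-mer qualify.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for the rolling state, not digest's class.
class SyncmerCursor {
 public:
  SyncmerCursor(std::vector<std::uint32_t> hashes, std::size_t large_window)
      : hashes_(std::move(hashes)), w_(large_window) {
    assert(w_ > 0);  // stand-in for the documented BadWindowException
  }

  // Appends up to `amount` (window start, minimal hash) pairs to `vec` and
  // remembers where to resume on the next call.
  void roll(unsigned amount,
            std::vector<std::pair<std::uint32_t, std::uint32_t>>& vec) {
    unsigned added = 0;
    while (added < amount && next_ + w_ <= hashes_.size()) {
      std::uint32_t min_hash = hashes_[next_];
      for (std::size_t j = next_ + 1; j < next_ + w_; ++j) {
        min_hash = std::min(min_hash, hashes_[j]);
      }
      // Assumption: the window qualifies if either end holds the minimum.
      if (hashes_[next_] == min_hash || hashes_[next_ + w_ - 1] == min_hash) {
        vec.emplace_back(static_cast<std::uint32_t>(next_), min_hash);
        ++added;
      }
      ++next_;
    }
  }

 private:
  std::vector<std::uint32_t> hashes_;
  std::size_t w_;
  std::size_t next_ = 0;  // start index of the next window to examine
};
```

Because entries are appended rather than overwritten, repeated calls with the same vector accumulate results across the whole sequence.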


The documentation for this class was generated from the following file: