digest
digest::WindowMin< P, T > Class Template Reference

Child class of Digester that defines a minimizer as a k-mer whose hash is minimal among those in the large window. Parameters without a description are the same as the corresponding parameters of the Digester parent class; they are simply passed up to the parent constructor.

#include <window_minimizer.hpp>

Inheritance diagram for digest::WindowMin< P, T >:
Inherits digest::Digester< P >. Inherited by digest::Syncmer< P, T >.

Public Member Functions

 WindowMin (const char *seq, size_t len, unsigned k, unsigned large_window, size_t start=0, MinimizedHashType minimized_h=MinimizedHashType::CANON)
 
 WindowMin (const std::string &seq, unsigned k, unsigned large_window, size_t start=0, MinimizedHashType minimized_h=MinimizedHashType::CANON)
 
void roll_minimizer (unsigned amount, std::vector< uint32_t > &vec) override
 adds up to amount minimizer positions to vec. A k-mer is considered a minimizer if its hash is the smallest in the large window. The rightmost index wins ties.
 
void roll_minimizer (unsigned amount, std::vector< std::pair< uint32_t, uint32_t > > &vec) override
 adds up to amount minimizer positions, together with their hashes, to vec. A k-mer is considered a minimizer if its hash is the smallest in the large window. The rightmost index wins ties.
 
void new_seq (const char *seq, size_t len, size_t start) override
 replaces the current sequence with the new one. It's like starting over with a completely new sequence
 
void new_seq (const std::string &seq, size_t pos) override
 replaces the current sequence with the new one. It's like starting over with a completely new sequence
 
unsigned get_large_wind_kmer_am ()
 
size_t get_ds_size ()
 gets the size of the internal RMQ (range minimum query) data structure being used. Mainly used to help with tests (so you probably shouldn't use it).
 
bool get_is_minimized ()
 checks if we have generated the first minimizer. Mainly used to help with tests (so you probably shouldn't use it).
 
Public Member Functions inherited from digest::Digester< P >
 Digester (const char *seq, size_t len, unsigned k, size_t start=0, MinimizedHashType minimized_h=MinimizedHashType::CANON)
 
 Digester (const std::string &seq, unsigned k, size_t start=0, MinimizedHashType minimized_h=MinimizedHashType::CANON)
 
bool get_is_valid_hash ()
 
unsigned get_k ()
 
size_t get_len ()
 
bool roll_one ()
 moves the internal pointer to the next valid k-mer.
Time Complexity: O(1)
 
size_t get_pos ()
 
uint64_t get_chash ()
 
uint64_t get_fhash ()
 
uint64_t get_rhash ()
 
void append_seq (const char *seq, size_t len)
 simulates appending a new sequence to the end of the old sequence. The old sequence is no longer stored, but the rolling hashes proceed as if the two sequences were concatenated. Can only be called once you have reached the end of the current sequence. For example, if your current sequence is ACTGAC, you have reached its end, and you call append_seq with CCGGCCGG, then the minimizers obtained from ACTGAC together with those obtained after the call are equivalent to the minimizers you would get from rolling across ACTGACCCGGCCGG.
 
void append_seq (const std::string &seq)
 simulates appending a new sequence to the end of the old sequence. The old sequence is no longer stored, but the rolling hashes proceed as if the two sequences were concatenated. Can only be called once you have reached the end of the current sequence. For example, if your current sequence is ACTGAC, you have reached its end, and you call append_seq with CCGGCCGG, then the minimizers obtained from ACTGAC together with those obtained after the call are equivalent to the minimizers you would get from rolling across ACTGACCCGGCCGG.
 
MinimizedHashType get_minimized_h ()
 
const char * get_sequence ()
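
The append_seq behaviour described above can be illustrated at the k-mer level with a minimal sketch. `ToyKmerStream` is a hypothetical class written for this example and is not part of digest; it only shows why keeping the last k - 1 characters is enough to produce the same k-mers as the concatenated sequence. The real class also carries rolling-hash and window state, which this sketch omits.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy, not digest's implementation: yields k-mers from a
// stream of chunks while storing only the last k - 1 characters.
class ToyKmerStream {
public:
    explicit ToyKmerStream(std::size_t k) : k_(k) { assert(k > 0); }

    // Feed the next chunk and append every newly completed k-mer to `out`.
    void append(const std::string& chunk, std::vector<std::string>& out) {
        const std::string buf = tail_ + chunk;
        for (std::size_t i = 0; i + k_ <= buf.size(); ++i) {
            out.push_back(buf.substr(i, k_));
        }
        // Keep just enough context to bridge into the next chunk. The tail
        // is shorter than k, so no k-mer is ever emitted twice.
        const std::size_t keep = buf.size() < k_ - 1 ? buf.size() : k_ - 1;
        tail_ = buf.substr(buf.size() - keep);
    }

private:
    std::size_t k_;
    std::string tail_;
};
```

Feeding ACTGAC and then CCGGCCGG with k = 4 yields the same k-mers, in the same order, as scanning ACTGACCCGGCCGG directly, so any minimizer scheme applied to those k-mers sees identical input.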
 

Detailed Description

template<BadCharPolicy P, class T>
class digest::WindowMin< P, T >

Child class of Digester that defines a minimizer as a kmer whose hash is minimal among those in the large window. Parameters without a description are the same as the parameters in the Digester parent class. They are simply passed up to the parent constructor.

Template Parameters
P
T  The data structure to use for performing range minimum queries to find the minimal hash value.
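
To make the role of T concrete, here is a minimal sketch of one possible range-minimum structure: a monotonic deque that tracks the smallest hash among the last `window` inserted k-mers. `SlidingMin` and its members are hypothetical names for this example; the data structures that ship with digest are not shown on this page and may work differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <utility>

// Hypothetical stand-in for the template parameter T, not part of digest.
// Positions must be inserted in increasing order.
class SlidingMin {
public:
    explicit SlidingMin(std::size_t window) : window_(window) {
        assert(window > 0);
    }

    // Insert the hash of the k-mer that starts at position `pos`.
    void insert(std::uint32_t pos, std::uint32_t hash) {
        // Evict candidates that can never be the minimum again. Using >=
        // also evicts equal hashes, so the rightmost index wins ties.
        while (!dq_.empty() && dq_.back().second >= hash) {
            dq_.pop_back();
        }
        dq_.emplace_back(pos, hash);
        // Evict entries that have slid out of the window ending at `pos`.
        while (dq_.front().first + window_ <= pos) {
            dq_.pop_front();
        }
    }

    // (position, hash) of the minimum in the current window.
    std::pair<std::uint32_t, std::uint32_t> minimum() const {
        assert(!dq_.empty());
        return dq_.front();
    }

private:
    std::size_t window_;
    std::deque<std::pair<std::uint32_t, std::uint32_t>> dq_;
};
```

Each k-mer is pushed and popped at most once, so this sketch answers every window query in amortized constant time.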

Constructor & Destructor Documentation

◆ WindowMin() [1/2]

template<BadCharPolicy P, class T >
digest::WindowMin< P, T >::WindowMin ( const char *  seq,
size_t  len,
unsigned  k,
unsigned  large_window,
size_t  start = 0,
MinimizedHashType  minimized_h = MinimizedHashType::CANON 
)
inline
Parameters
seq
len
k
large_window  the number of k-mers in the large window, i.e. the number of k-mers to be considered during the range minimum query.
start
minimized_h
Exceptions
BadWindowException  thrown when large_window is passed in as 0
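
Because large_window counts k-mers rather than bases, it can help to see how the two relate. The sketch below uses a hypothetical helper, `window_span_bases`, which is not part of digest; it also mimics the documented rejection of a zero window, with `std::invalid_argument` standing in for BadWindowException.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper, not part of digest: number of bases covered by a
// large window that holds `large_window` consecutive k-mers.
inline std::size_t window_span_bases(unsigned k, unsigned large_window) {
    assert(k > 0);
    // Stand-in for the documented BadWindowException on a zero window.
    if (large_window == 0) {
        throw std::invalid_argument("large_window must be greater than 0");
    }
    // Consecutive k-mers overlap by k - 1 bases, so each additional k-mer
    // extends the window by exactly one base.
    return static_cast<std::size_t>(large_window) + k - 1;
}
```

For instance, a window of 3 k-mers with k = 4 spans 6 bases.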

◆ WindowMin() [2/2]

template<BadCharPolicy P, class T >
digest::WindowMin< P, T >::WindowMin ( const std::string &  seq,
unsigned  k,
unsigned  large_window,
size_t  start = 0,
MinimizedHashType  minimized_h = MinimizedHashType::CANON 
)
inline
Parameters
seq
k
large_window  the number of k-mers in the large window, i.e. the number of k-mers to be considered during the range minimum query.
start
minimized_h
Exceptions
BadWindowException  thrown when large_window is passed in as 0

Member Function Documentation

◆ get_ds_size()

template<BadCharPolicy P, class T >
size_t digest::WindowMin< P, T >::get_ds_size ( )
inline

gets the size of the internal RMQ (range minimum query) data structure being used. Mainly used to help with tests (so you probably shouldn't use it).

Returns
size_t, the size of the internal rmq data structure object

◆ get_is_minimized()

template<BadCharPolicy P, class T >
bool digest::WindowMin< P, T >::get_is_minimized ( )
inline

checks if we have generated the first minimizer. Mainly used to help with tests (so you probably shouldn't use it).

Returns
bool, true if a minimizer has already been obtained

◆ get_large_wind_kmer_am()

template<BadCharPolicy P, class T >
unsigned digest::WindowMin< P, T >::get_large_wind_kmer_am ( )
inline
Returns
unsigned, the value of large_window

◆ new_seq() [1/2]

template<BadCharPolicy P, class T >
void digest::WindowMin< P, T >::new_seq ( const char *  seq,
size_t  len,
size_t  start 
)
inline override virtual

replaces the current sequence with the new one. It's like starting over with a completely new sequence

Parameters
seq  const char pointer to new sequence to be hashed
len  length of the new sequence
start  position in new sequence to start from
Exceptions
BadConstructionException  thrown if the starting position is greater than the length of the string

Reimplemented from digest::Digester< P >.

◆ new_seq() [2/2]

template<BadCharPolicy P, class T >
void digest::WindowMin< P, T >::new_seq ( const std::string &  seq,
size_t  pos 
)
inline override virtual

replaces the current sequence with the new one. It's like starting over with a completely new sequence

Parameters
seq  const std::string reference to the new sequence to be hashed
pos  position in new sequence to start from
Exceptions
BadConstructionException  thrown if the starting position is greater than the length of the string

Reimplemented from digest::Digester< P >.

◆ roll_minimizer() [1/2]

template<BadCharPolicy P, class T >
void digest::WindowMin< P, T >::roll_minimizer ( unsigned  amount,
std::vector< std::pair< uint32_t, uint32_t > > &  vec 
)
inline override virtual

adds up to amount minimizer positions, together with their hashes, to vec. A k-mer is considered a minimizer if its hash is the smallest in the large window. The rightmost index wins ties.

Parameters
amount
vec

Implements digest::Digester< P >.
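
The "up to amount" wording can be illustrated with a small stateful sketch. `ToyWindowRoller` is a hypothetical class for this example only; it works on precomputed hashes instead of a sequence, rescans every window naively, and assumes that a later call resumes where the previous one stopped and that a position is reported only once while it stays minimal. None of this is taken from digest's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical toy, not digest's class: rolls over one hash per k-mer and
// remembers its place between calls.
class ToyWindowRoller {
public:
    ToyWindowRoller(std::vector<std::uint32_t> hashes, std::size_t large_window)
        : hashes_(std::move(hashes)), window_(large_window) {
        assert(large_window > 0);
    }

    // Appends at most `amount` (position, hash) pairs to `vec`.
    void roll(unsigned amount,
              std::vector<std::pair<std::uint32_t, std::uint32_t>>& vec) {
        unsigned added = 0;
        while (added < amount && next_start_ + window_ <= hashes_.size()) {
            std::size_t best = next_start_;
            for (std::size_t i = next_start_ + 1; i < next_start_ + window_; ++i) {
                if (hashes_[i] <= hashes_[best]) best = i;  // rightmost wins ties
            }
            ++next_start_;
            // Assumption: skip a minimizer that was already reported.
            if (!has_prev_ || best != prev_) {
                vec.emplace_back(static_cast<std::uint32_t>(best), hashes_[best]);
                prev_ = best;
                has_prev_ = true;
                ++added;
            }
        }
    }

private:
    std::vector<std::uint32_t> hashes_;
    std::size_t window_;
    std::size_t next_start_ = 0;
    std::size_t prev_ = 0;
    bool has_prev_ = false;
};
```

Calling `roll(1, vec)` and then `roll(10, vec)` on the same object appends one pair and then the remaining pairs, and a further call on an exhausted input appends nothing.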

◆ roll_minimizer() [2/2]

template<BadCharPolicy P, class T >
void digest::WindowMin< P, T >::roll_minimizer ( unsigned  amount,
std::vector< uint32_t > &  vec 
)
inline override virtual

adds up to amount minimizer positions to vec. A k-mer is considered a minimizer if its hash is the smallest in the large window. The rightmost index wins ties.

Parameters
amount
vec

Implements digest::Digester< P >.
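
The tie-breaking rule is easiest to see in a brute-force reference scan. The function below, `naive_window_minimizers`, is a hypothetical helper for this example and not how digest computes minimizers; it takes one precomputed hash per k-mer and assumes a position is reported once even if it stays minimal across several consecutive windows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference scan, not digest's implementation: returns the
// positions of the window minimizers for the given per-k-mer hashes.
std::vector<std::uint32_t> naive_window_minimizers(
    const std::vector<std::uint32_t>& hashes, std::size_t large_window) {
    assert(large_window > 0);
    std::vector<std::uint32_t> out;
    for (std::size_t start = 0; start + large_window <= hashes.size(); ++start) {
        std::size_t best = start;
        for (std::size_t i = start + 1; i < start + large_window; ++i) {
            // <= (rather than <) lets a later, equal hash replace the
            // current best, so the rightmost index wins ties.
            if (hashes[i] <= hashes[best]) best = i;
        }
        // Assumption: report each minimizer position only once.
        if (out.empty() || out.back() != best) {
            out.push_back(static_cast<std::uint32_t>(best));
        }
    }
    return out;
}
```

With hashes 5, 3, 3, 9, 8, 7 and a window of 3 k-mers, the first window contains two k-mers with hash 3; the scan picks position 2 rather than position 1, and the last window contributes position 5.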


The documentation for this class was generated from the following file: