|
| WindowMin (const char *seq, size_t len, unsigned k, unsigned large_window, size_t start=0, MinimizedHashType minimized_h=MinimizedHashType::CANON) |
|
| WindowMin (const std::string &seq, unsigned k, unsigned large_window, size_t start=0, MinimizedHashType minimized_h=MinimizedHashType::CANON) |
|
void | roll_minimizer (unsigned amount, std::vector< uint32_t > &vec) override |
| Appends the positions of up to amount minimizers to vec. A k-mer is a minimizer if its hash is the smallest in the large window; in a tie, the rightmost index wins.
|
|
void | roll_minimizer (unsigned amount, std::vector< std::pair< uint32_t, uint32_t > > &vec) override |
| Appends the positions and hashes of up to amount minimizers to vec. A k-mer is a minimizer if its hash is the smallest in the large window; in a tie, the rightmost index wins.
|
|
void | new_seq (const char *seq, size_t len, size_t start) override |
| Replaces the current sequence with a new one, as if starting over with a completely new sequence.
|
|
void | new_seq (const std::string &seq, size_t pos) override |
| Replaces the current sequence with a new one, as if starting over with a completely new sequence.
|
|
unsigned | get_large_wind_kmer_am () |
|
size_t | get_ds_size () |
| Gets the size of the internal RMQ (range minimum query) data structure. Intended mainly for tests, so you probably shouldn't use it.
|
|
bool | get_is_minimized () |
| Checks whether the first minimizer has been generated. Intended mainly for tests, so you probably shouldn't use it.
|
|
| Digester (const char *seq, size_t len, unsigned k, size_t start=0, MinimizedHashType minimized_h=MinimizedHashType::CANON) |
|
| Digester (const std::string &seq, unsigned k, size_t start=0, MinimizedHashType minimized_h=MinimizedHashType::CANON) |
|
bool | get_is_valid_hash () |
|
unsigned | get_k () |
|
size_t | get_len () |
|
bool | roll_one () |
| moves the internal pointer to the next valid k-mer.
Time Complexity: O(1)
|
|
size_t | get_pos () |
|
uint64_t | get_chash () |
|
uint64_t | get_fhash () |
|
uint64_t | get_rhash () |
|
void | append_seq (const char *seq, size_t len) |
| Simulates appending a new sequence to the end of the old one. The old sequence is no longer stored, but the rolling hashes proceed as if the two sequences had been concatenated. Can only be called once you have reached the end of the current sequence. For example, if your current sequence is ACTGAC, you have reached its end, and you call append_seq with CCGGCCGG, then the minimizers obtained from ACTGAC together with those obtained after the call are equivalent to the minimizers you would get from rolling across ACTGACCCGGCCGG.
|
|
void | append_seq (const std::string &seq) |
| Simulates appending a new sequence to the end of the old one. The old sequence is no longer stored, but the rolling hashes proceed as if the two sequences had been concatenated. Can only be called once you have reached the end of the current sequence. For example, if your current sequence is ACTGAC, you have reached its end, and you call append_seq with CCGGCCGG, then the minimizers obtained from ACTGAC together with those obtained after the call are equivalent to the minimizers you would get from rolling across ACTGACCCGGCCGG.
|
|
MinimizedHashType | get_minimized_h () |
|
const char * | get_sequence () |
|
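The equivalence that append_seq promises can be illustrated with a small self-contained sketch. It is not the library's implementation: toy_hash, all_kmer_hashes and ChunkedKmerHasher are hypothetical names, the polynomial hash is only a stand-in for the real rolling hash, and each hash is recomputed from scratch for clarity rather than rolled. The only idea it shows is that keeping the last k-1 characters of the old data is enough to produce the same k-mer hashes as hashing the concatenated sequence.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy polynomial k-mer hash (base 131). A stand-in only, NOT the hash digest uses.
inline uint64_t toy_hash(const std::string &s, size_t pos, unsigned k) {
    uint64_t h = 0;
    for (unsigned i = 0; i < k; ++i)
        h = h * 131 + static_cast<unsigned char>(s[pos + i]);
    return h;
}

// Hashes of every k-mer of seq, left to right.
inline std::vector<uint64_t> all_kmer_hashes(const std::string &seq, unsigned k) {
    assert(k > 0);
    std::vector<uint64_t> out;
    for (size_t i = 0; i + k <= seq.size(); ++i)
        out.push_back(toy_hash(seq, i, k));
    return out;
}

// Hypothetical streaming hasher: stores at most k-1 trailing characters
// of everything seen so far, never the full old sequence.
struct ChunkedKmerHasher {
    unsigned k;
    std::string carry;             // last (up to) k-1 characters seen
    std::vector<uint64_t> hashes;  // every k-mer hash emitted so far

    explicit ChunkedKmerHasher(unsigned k_) : k(k_) { assert(k > 0); }

    void append(const std::string &chunk) {
        // carry is shorter than k, so every k-mer of buf uses at least one
        // new character and no k-mer is emitted twice.
        const std::string buf = carry + chunk;
        for (size_t i = 0; i + k <= buf.size(); ++i)
            hashes.push_back(toy_hash(buf, i, k));
        const size_t keep = buf.size() < k - 1 ? buf.size() : k - 1;
        carry = buf.substr(buf.size() - keep);
    }
};
```

Feeding ACTGAC and then CCGGCCGG to a ChunkedKmerHasher with k = 4 yields the same hash list as all_kmer_hashes("ACTGACCCGGCCGG", 4). Since minimizers are selected from these hashes, the same argument carries over to minimizer positions.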
template<BadCharPolicy P, class T>
class digest::WindowMin< P, T >
Child class of Digester that defines a minimizer as a k-mer whose hash is minimal among those in the large window. Parameters without a description are the same as in the Digester parent class; they are simply passed up to the parent constructor.
- Template Parameters
-
P | |
T | The data structure to use for performing range minimum queries to find the minimal hash value. |
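The minimizer rule described for this class (smallest hash in the large window, rightmost index winning ties) can be sketched on a plain list of precomputed k-mer hashes. This is an illustrative sketch under stated assumptions, not the library's code: window_min_positions is a hypothetical helper, a monotonic deque stands in for whatever range-minimum structure T provides, the window size w is assumed to be counted in k-mers, and a position is assumed to be reported only once while it stays the minimum of consecutive windows.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <vector>

// Positions of window minimizers over precomputed k-mer hashes.
// w = number of consecutive k-mers per window (assumed meaning of the large window).
inline std::vector<uint32_t> window_min_positions(const std::vector<uint64_t> &hashes,
                                                  unsigned w) {
    assert(w > 0);
    std::vector<uint32_t> out;
    std::deque<uint32_t> dq;  // candidate indices; their hashes strictly increase front to back
    for (uint32_t i = 0; i < hashes.size(); ++i) {
        // Pop on >= so an equal hash further right replaces the older one:
        // this is what makes the rightmost index win ties.
        while (!dq.empty() && hashes[dq.back()] >= hashes[i]) dq.pop_back();
        dq.push_back(i);
        // Drop the front once it has slid out of the window [i - w + 1, i].
        if (dq.front() + w <= i) dq.pop_front();
        // The first full window ends at i = w - 1; report the minimum only if it changed.
        if (i + 1 >= w && (out.empty() || out.back() != dq.front()))
            out.push_back(dq.front());
    }
    return out;
}
```

For the hashes {5, 3, 3, 7, 1, 9} with w = 3, the first window contains two copies of the minimum 3 at indices 1 and 2, and the sketch reports index 2; the remaining windows add index 4. The deque gives amortized constant time per k-mer; other range-minimum structures trade that for different memory or query behaviour.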