Switchable tokenizer #776

Merged
KitaitiMakoto merged 20 commits from switchable-tokenizer into master 4 years ago

20 Commits (main)

Author SHA1 Message Date
Kitaiti Makoto f0c8eb836c Fix typos 4 years ago
Kitaiti Makoto ed6a41d131 Make Lindera tokenizer optional 4 years ago
Kitaiti Makoto 8a3a878015 Run cargo fmt 4 years ago
Kitaiti Makoto efcf797ee4 Define SEARCH_LANG env to specify tokenizers set 4 years ago
Kitaiti Makoto 97e59f940b Add test for Lindera tokenizer 4 years ago
Kitaiti Makoto 43aea8339e Add LowerCase filter to Lindera tokenizer 4 years ago
Kitaiti Makoto 5206bbb13b Pass tokenizer config to Searcher methods 4 years ago
Kitaiti Makoto af1f7af4f7 Use determine_tokenizer in SearchTokenizerConfig::init() 4 years ago
Kitaiti Makoto fb98984e5e Define SearchTokenierConfig::determine_tokenizer() 4 years ago
Kitaiti Makoto 917e6ced3a Rename SearchTokenizer to TokenizerKind 4 years ago
Kitaiti Makoto 94d40918a1 Move SearchTokenizer from plume-models to plume-models::search::tokenizer 4 years ago
Kitaiti Makoto 2b6d04f047 Use as_deref() instead of guard 4 years ago
Kitaiti Makoto 001fd67f86 Use guard instead of duplicate default values 4 years ago
Kitaiti Makoto 21ac40caf4 Use enum to hold tokenizer config instead of initializing on config phase 4 years ago
Kitaiti Makoto e300714a46 Use CONFIG for tokenizers 4 years ago
Kitaiti Makoto b78a1aed2f Add search tokenizers to config option 4 years ago
Kitaiti Makoto 077b7f6487 Add SearchTokenizerConfig struct 4 years ago
Kitaiti Makoto 7fddfa0d92 Install lindera-tantivy 4 years ago
Kitaiti Makoto 1e1a5b2db2 Add lindera-tantivy to plume-model's dependencies 4 years ago
Kitaiti Makoto ec65481e9d [REFACTORING] Rename whitespace_tokenizer to tag_tokenizer for registration. Name representing its purpose is preferred. 4 years ago