Add a search engine into Plume #324
Add support for search through tantivy
fix #149
This is mostly done, current issues are:
- the query parser might be a bit intolerant, we should probably implement our own
- closing Plume leaves the database locked; currently you have to run plm search unlock -p to unlock it
- advanced search is working, but not yet documented (i.e. filtering by date could be done if you know enough tantivy, but I don't expect the common user to)
After upgrading to this, you will have to run
plm search init -p <path to plume working dir>
to initialize the search database, otherwise Plume will panic at launch.
Small todo list of things that are not strictly needed atm, but should still be done:
- split plume-models/src/search.rs into multiple files, under a new directory: it's "only" 670 lines, but they are very dense, and although all of it is related to search, having a sort of database connection, a query parser, and a tokenizer in the same file seems too much
- […] user1@plume1.org or user2@plume2.org, a post from user2@plume1.org would match.
@fdb-hiroshima We have a dangling PR to add a lenient mode to search.
@fdb-hiroshima Let me know if you have any question about tantivy.
@fulmicoton Both "issues" are fixed (you can see they are crossed out). I've implemented a parser which fits our exact needs (we know more about what each field represents than Tantivy can).
Concerning dates, I don't have any questions; it's just that how we give dates to tantivy is not evident to the end user, but tantivy can't change that. We store the date as the result of num_days_from_ce(), basically the number of days since some point. This is because Tantivy's RangeQuery "iterate over the terms within the range", so I tried to have the smallest ranges possible, with as few useless/empty elements as possible (iterating over 30 days vs 2,592,000 seconds, one seems faster than the other).
I really like your crate, it's very complete yet relatively easy to use. The only issue I had with it was about BooleanQuery: I had a hard time understanding how to build them. I was looking for some kind of function, and it took me a lot of time to figure out that it implements From. Maybe it could simply be mentioned in the documentation of BooleanQuery::new_multiterms_query, "to create a BooleanQuery from Box<Query> use From<Vec<(Occur, Box<Query>)>> instead" or something like that.
Opened tantivy-search/tantivy#446 to make creation of BooleanQuery easier.
Your insight about range is correct. Eventually tantivy will be a bit smarter than what it does right now, but to be honest this is pretty low priority.
Looking forward to seeing Plume grow.
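The day-count reasoning above can be sketched with the standard library only. This is not Plume's code: the function names are hypothetical, and the hand-rolled conversion is a stand-in for chrono's num_days_from_ce(), assuming the proleptic Gregorian calendar with 0001-01-01 counted as day 1. The dates are made up for illustration.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date
/// (the well-known "days from civil" conversion).
fn days_from_unix_epoch(year: i64, month: i64, day: i64) -> i64 {
    // Treat March as the first month so the leap day falls at the end of the year.
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400); // 400-year cycle index
    let yoe = y.rem_euclid(400); // year within the cycle, 0..=399
    let mp = (month + 9) % 12; // March = 0, ..., February = 11
    let doy = (153 * mp + 2) / 5 + day - 1; // day within the March-based year
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day within the cycle
    era * 146_097 + doe - 719_468
}

/// Hypothetical stand-in for chrono's `num_days_from_ce()`:
/// 0001-01-01 is day 1, so 1970-01-01 is day 719_163.
fn num_days_from_ce(year: i64, month: i64, day: i64) -> i64 {
    days_from_unix_epoch(year, month, day) + 719_163
}

fn main() {
    // A 30-day search window, expressed in the unit stored in the index.
    let after = num_days_from_ce(2018, 12, 1);
    let before = num_days_from_ce(2018, 12, 31);
    let span_days = before - after;
    let span_seconds = span_days * 86_400;

    println!("stored terms: after={} before={}", after, before);
    println!("distinct day values spanned: {}", span_days);
    println!("distinct second values spanned: {}", span_seconds);
}
```

With one term per day, a range query over this window has at most a few dozen candidate terms to walk, whereas second-resolution timestamps would spread the same posts over millions of possible values.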
Seems to work well! Thank you! (and thank you @fulmicoton for tantivy as well)
It is not clear which operator has the priority there: is it (token.contains('@') && field_name=="author") || field_name=="blog" or token.contains('@') && (field_name=="author" || field_name=="blog")?
What is this condition for?
@ -0,0 +14,4 @@
if (input.name === '') {
input.name = input.id
}
if (input.name && !input.value) {
You can add this condition before to fix your issue:
It's the shortest way I thought of to avoid a special case for the first token; otherwise it would start with an "else if" and fail. But actually I just thought of a better way, I'll test whether it works and modify accordingly.
I'll do it then
Usually && has the priority, which means this probably does not do what I expect. I'll verify whether it's working as intended.
I confirm that searching for a blog with no domain panics at the next line.
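The precedence question raised in this review can be checked in isolation. The sketch below is not the PR's code: the function names and sample tokens are hypothetical, and only the boolean expression is taken from the discussion. In Rust, && binds tighter than ||, so the unparenthesized condition behaves like the first grouping.

```rust
/// Grouping Rust applies by default: `&&` is evaluated before `||`.
fn and_first(token: &str, field_name: &str) -> bool {
    (token.contains('@') && field_name == "author") || field_name == "blog"
}

/// The other grouping asked about in the review.
fn or_first(token: &str, field_name: &str) -> bool {
    token.contains('@') && (field_name == "author" || field_name == "blog")
}

/// The condition with no parentheses at all.
fn unparenthesized(token: &str, field_name: &str) -> bool {
    token.contains('@') && field_name == "author" || field_name == "blog"
}

fn main() {
    // Hypothetical (token, field) pairs; a blog token without a domain
    // is the case where the two groupings disagree.
    let cases = [
        ("admin@plume1.org", "author"),
        ("myblog@plume1.org", "blog"),
        ("myblog", "blog"),
        ("rust", "tag"),
    ];
    for &(token, field) in &cases {
        println!(
            "{:>18} / {:<6} unparenthesized={} and_first={} or_first={}",
            token,
            field,
            unparenthesized(token, field),
            and_first(token, field),
            or_first(token, field),
        );
    }
}
```

Writing the parentheses explicitly, whichever grouping is intended, removes the ambiguity for reviewers and makes the no-domain blog case visible.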
@BaptisteGelez in what order do we merge search, ructe and the dependency upgrade?
@fdb-hiroshima I think we could merge the search first, then rebase ructe on master and convert the search page to Ructe, and finally the dependencies? I don't know if there is a better way to avoid conflicts, but it seems to be the most logical way to do it for me…
I'll do so
There is one thing I forgot: there is no link to /search anywhere.
And there are also no instructions to run plm beforehand.
You're right, but I'll add a link to the header when converting the template to ructe. And updating the docs can be done in another PR.
If you want, you can make a small form with an <input name="q">; it is enough to make a simple query, and would probably fit better with the ergonomics people expect (being able to do a basic search from anywhere, and an advanced search if the results were inconclusive).