WIP: Extract DbConn, Searcher & Worker into Riker Actors #813

草稿
igalic 想從 igalic/Plume:refactor/extract-actors 合併 18 個提交至 main
所有者

This PR addresses #799 and #797.

In #805 and #809 we have learned that Searcher and DbConn are intimately connected, especially in Posts.

We want to replace PlumeRocket as the main context being passed around with a Riker ActorSystem, so we'll be able to access all the systems managed by the actor system. This includes the Context of our federation AsObjects.

This PR addresses #799 *and* #797. In #805 and #809 we have learned that `Searcher` and `DbConn` are intimately connected, especially in `Post`s. We want to replace `PlumeRocket` as the main context being passed around with a Riker ActorSystem, so we'll be able to access all the systems managed by the actor system. This includes the Context of our federation `AsObject`s.
所有者

Can you compile this branch? I couldn't...

Can you compile this branch? I couldn't...
所有者

Note: SearcherActor cannot commit indices due to EnterError(https://github.com/tantivy-search/tantivy/issues/898). This issue might be related: https://github.com/rust-lang/futures-rs/issues/2090.

This occurs because perhaps both Riker and Tantivy use futures::excutor::LocalPool and Tantivy's block_on is called in Riker's LocalPool.

Possible options (based on my poor understanding of async) are:

  • SearcherActor doesn't commit indices. Raw Searcher does periodically as current implementation does.
  • Write patches to Riker so that we can choose other async runtime like async-std and use it.
  • Do the same to Tantivy.

Note that we cannot search new posts until their documents are commited not just added.

Note: SearcherActor cannot commit indices due to `EnterError`(https://github.com/tantivy-search/tantivy/issues/898). This issue might be related: https://github.com/rust-lang/futures-rs/issues/2090. This occurs because perhaps both Riker and Tantivy use `futures::excutor::LocalPool` and Tantivy's `block_on` is called in Riker's `LocalPool`. Possible options (based on my poor understanding of async) are: * SearcherActor doesn't commit indices. Raw Searcher does periodically as current implementation does. * Write patches to Riker so that we can choose other async runtime like async-std and use it. * Do the same to Tantivy. Note that we cannot search new posts until their documents are commited not just added.
所有者

I respect your great research and implementation, and completely agree with introducing Riker. In addition, I make a suggestion: use Riker's Channels as a global variable. There are two points:

  • The primary is to make it global, and
  • The secondary is use of pub/sub model using channels

What is solved?

At current attempting (its strategy and current implementation: 8aa4ae4302), we need touch searcher more than ever before. Also, we need flow, some agents (like searcher actor or post ids) through multiple structs.

See updating search document procedure. When a user updates a post, a request handler does two (or more) things: it

  • updates posts and related records in database
  • sends a UpdateDocument message to SearcherActor

Developers need always do them together. This is needed for not only request handlers but AsObject and FromId implementations for models. As such, developers need write more code than now. And they might have forgotten sending message to the actor.

To make matters worse, this introduces more complexity. Assume you're deleting a user. routes::user::delete() request handler is called. It needs to know all blogs and posts that the deleting user owns in order to send DeleteDocument messeges corresponding to deleted posts to searcher actor. How?

One way is: Post::delete() returns itself to Blog, Blog::delete() returns deleted Posts to User, User::delete() returns deleted Blogs' posts to the handler, and fanally the handler sends messeges to the actor. If some posts could not be deleted, the handler need make sure to send DeleteDocument for each only deleted post. That's complex.

The another is: the handler passes the searcher actor to User::delete() as an argument, User::delete() passes it to Blog::delete(), Blog::delete() calls Post::delete() and send a DeleteDocument message to the actor several times. This is essentially same to above.

Introducing a global variable solves this. Using it, Post::insert(), update() and delete() need to neither accept a searcher as an agument nor return back itself (or its ID) to the caller. What it needs to do is getting global searcher (or so) and sending a messege that it is deleted. That's it.

There are options what struct should be global like a searcher actor, an actor system or a channel. I suggest Riker's channel because...

More benefit

Use of channel introduces pub/sub model. In the case above, Post publishes "delete" event to a channel and SearcherActor subscribes to the event and deletes the search index corresponding to the post. More than that, we can add other actors which concern post's deletion without touching any code of request handlers and models. Examples are:

  • an actor (say, "outbox") which subscribes to post's publishing, updating, deleting event and sends ActivityPub requests to other instances
  • an actor ("notifier") which subscribes to post's events and updates notification page of followers and mentioned users
  • similar to the above but sends notification mails
  • and so on.

A proof of concept is here: https://git.joinplu.me/KitaitiMakoto/Plume/compare/igalic/refactor/extract-actors...channel. Pay your attention to

// a `SearcherActor` subscribes to any event (`"*"`)

POST_CHAN.get().unwrap().tell(Subscribe { actor: Box::new(searcher_actor), topic: "*".into() }, None);
// `Post` publishes an event without the need knowing who accepts it

        if post.published {
            crate::POST_CHAN.get().unwrap().tell(Publish { msg: PostUpdated(post.clone()), topic: "post.updated".into() }, None);
        }

Rebasing

What do you think? I you aggree with me, I'm thiking it might be better to restart from current 0.6.0+ code (of course, I will do it). Any questions and discussions are welcome!

I respect your great research and implementation, and completely agree with introducing Riker. In addition, I make a suggestion: use Riker's [Channels][] as a global variable. There are two points: * The primary is to make it global, and * The secondary is use of pub/sub model using channels What is solved? ---------- At current attempting (its strategy and current implementation: 8aa4ae4302), we need touch searcher more than ever before. Also, we need flow, some agents (like searcher actor or post ids) through multiple structs. See updating search document procedure. When a user updates a post, a request handler does two (or more) things: it * updates posts and related records in database * sends a `UpdateDocument` message to `SearcherActor` Developers need always do them together. This is needed for not only request handlers but `AsObject` and `FromId` implementations for models. As such, developers need write more code than now. And they might have forgotten sending message to the actor. To make matters worse, this introduces more complexity. Assume you're deleting a user. `routes::user::delete()` request handler is called. It needs to know all blogs and posts that the deleting user owns in order to send `DeleteDocument` messeges corresponding to deleted posts to searcher actor. How? One way is: `Post::delete()` returns itself to `Blog`, `Blog::delete()` returns deleted `Post`s to `User`, `User::delete()` returns deleted `Blog`s' posts to the handler, and fanally the handler sends messeges to the actor. If some posts could not be deleted, the handler need make sure to send `DeleteDocument` for each only deleted post. That's complex. The another is: the handler passes the searcher actor to `User::delete()` as an argument, `User::delete()` passes it to `Blog::delete()`, `Blog::delete()` calls `Post::delete()` and send a `DeleteDocument` message to the actor several times. This is essentially same to above. Introducing a global variable solves this. Using it, `Post::insert()`, `update()` and `delete()` need to neither accept a searcher as an agument nor return back itself (or its ID) to the caller. What it needs to do is getting global searcher (or so) and sending a messege that it is deleted. That's it. There are options what struct should be global like a searcher actor, an actor system or a channel. I suggest Riker's channel because... More benefit -------- Use of channel introduces pub/sub model. In the case above, `Post` publishes "delete" event to a channel and `SearcherActor` subscribes to the event and deletes the search index corresponding to the post. More than that, we can add other actors which concern post's deletion without touching any code of request handlers and models. Examples are: * an actor (say, "outbox") which subscribes to post's publishing, updating, deleting event and sends ActivityPub requests to other instances * an actor ("notifier") which subscribes to post's events and updates notification page of followers and mentioned users * similar to the above but sends notification mails * and so on. A proof of concept is here: https://git.joinplu.me/KitaitiMakoto/Plume/compare/igalic/refactor/extract-actors...channel. Pay your attention to ```rust // a `SearcherActor` subscribes to any event (`"*"`) POST_CHAN.get().unwrap().tell(Subscribe { actor: Box::new(searcher_actor), topic: "*".into() }, None); ``` ```rust // `Post` publishes an event without the need knowing who accepts it if post.published { crate::POST_CHAN.get().unwrap().tell(Publish { msg: PostUpdated(post.clone()), topic: "post.updated".into() }, None); } ``` Rebasing ------ What do you think? I you aggree with me, I'm thiking it might be better to restart from current 0.6.0+ code (of course, I will do it). Any questions and discussions are welcome! [Channels]: https://riker.rs/channels/
作者
所有者

that's a good idea, i hope I'll have time to do that soon.
but i do feel like every time i attempt this i have learned something in between, and so it becomes a little bit easier

that's a good idea, i hope I'll have time to do that soon. but i do feel like every time i attempt this i have learned something in between, and so it becomes a little bit easier
所有者

Thanks! I will start later.

Thanks! I will start later.
此合併請求有變更和目標分支衝突。
  • Cargo.lock
  • Cargo.toml
  • plume-cli/src/main.rs
  • plume-models/Cargo.toml
  • plume-models/src/lib.rs
  • plume-models/src/plume_rocket.rs
  • plume-models/src/posts.rs
  • plume-models/src/search/mod.rs
  • plume-models/src/search/searcher.rs
  • src/main.rs
檢視 命令列指示

簽出

從您的專案儲存庫 checkout 一個新的分支來測試這些更改。
git fetch -u refactor/extract-actors:igalic-refactor/extract-actors
git checkout igalic-refactor/extract-actors
登入 才能加入這對話。
沒有審核者
沒有里程碑
未選擇專案
沒有負責人
2 位參與者
通知
截止日期
截止日期無效或超出範圍,請使用「yyyy-mm-dd」的格式。

未設定截止日期。

先決條件

未設定先決條件。

參考: Plume/Plume#813
沒有提供敘述。