Credit to @phiresky for this idea, originally posted in comments of #2994

This PR adds community_id to post_aggregates (& a new index on post_aggregates) to enable joining community directly to post_aggregates when querying posts.

On lemm.ee, this optimization speeds up the query for front page of subscribed posts ~1000x, from several seconds to to just milliseconds. You can check a before/after of query plans here: https://gist.github.com/sunaurus/856e03165bb0c0010505afeebde45230

  • issue_tracking_bot@lemm.eeOPB
    link
    fedilink
    arrow-up
    1
    ·
    1 year ago

    New commits contain the following changes

    1. Remove the unused index
    2. Added creator_id to post_aggregates
    3. Made post_aggregates the “main” table for PostQuery

    Originally posted by sunaurus in #3653

  • issue_tracking_bot@lemm.eeOPB
    link
    fedilink
    arrow-up
    1
    ·
    1 year ago

    At this point, the difference is very small, but indeed it is a bit faster to treat post_aggregates as the main table completely. Will push additional commits soon.

    Originally posted by sunaurus in #3653

  • issue_tracking_bot@lemm.eeOPB
    link
    fedilink
    arrow-up
    1
    ·
    1 year ago

    post_aggregates should be the “main” table that the others are joined to, and primarily filter from.

    Is there a specific reason the post_aggregates and post table are spilt? In general I think it would make sense to have everything in the same table. (same for comments /person…). 1:1 relations aren’t that great in postgresql in general imo, tables with a huge amount of columns work pretty well. It would increase contention a bit because every change in votes would lock the post row for a bit (just writes, not reads) but I don’t think that justifies the split. Or is there other reasons for the spilt?

    Originally posted by phiresky in #3653