In case anyone was wondering: yes, we were down for about 2 hours. I apologize for the inconvenience.
We had a botched upgrade path from 0.17.4 -> 0.18.0. I spent some time debugging but eventually gave up and restored a snapshot (taken on Saturday Jun 24, 2023 @ 11:00 UTC).
We’ll likely stick to 0.17.4 till I can figure out a safe path to upgrade to a bigger (and up-to-date) instance and carry over all the user data. Any help/advice welcome. Hopefully this doesn’t occur again!
Well, the lemmy container kept running into:
lemmy | 2023-06-24T23:28:52.716586Z INFO lemmy_server::code_migrations: Running apub_columns_2021_02_02
lemmy | 2023-06-24T23:28:52.760510Z INFO lemmy_server::code_migrations: Running instance_actor_2021_09_29
lemmy | 2023-06-24T23:28:52.763723Z INFO lemmy_server::code_migrations: Running regenerate_public_keys_2022_07_05
lemmy | 2023-06-24T23:28:52.801409Z INFO lemmy_server::code_migrations: Running initialize_local_site_2022_10_10
lemmy | 2023-06-24T23:28:52.803303Z INFO lemmy_server::code_migrations: No Local Site found, creating it.
lemmy | thread 'main' panicked at 'couldnt create local user: DatabaseError(UniqueViolation, "duplicate key value violates unique constraint \"local_user_person_id_key\"")', crates/db_schema/src/impls/local_user.rs:157:8
despite the fact that:
lemmy=# select id, site_id from local_site;
 id | site_id
----+---------
  1 |       1
(1 row)
So you can see that it was unconditionally trying to create a local_site and running into a DB constraint error. I further narrowed it down to this piece of code:
///
/// If a site already exists, the DB migration should generate a local_site row.
/// This will only be run for brand new sites.
async fn initialize_local_site_2022_10_10(
  pool: &DbPool,
  settings: &Settings,
) -> Result<(), LemmyError> {
  info!("Running initialize_local_site_2022_10_10");

  // Check to see if local_site exists
  if LocalSite::read(pool).await.is_ok() {
    return Ok(());
  }
  info!("No Local Site found, creating it.");
At this point I gave up because I couldn’t really tell why LocalSite::read(pool).await.is_ok() was, well…not ok.
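One general property of that check is worth spelling out, though it is not a confirmed diagnosis of this outage: `is_ok()` collapses every kind of failure into `false`, so a read that errors for any reason other than "no row" still falls through to the creation branch. Below is a minimal sketch of that behaviour. `ReadError`, `Action`, `decide_with_is_ok` and `decide_strict` are hypothetical names made up for illustration, and the `LocalSite` struct here is a two-field stand-in, not Lemmy's real type.

```rust
/// Hypothetical stand-in for the errors a database read can return.
#[derive(Debug, PartialEq)]
enum ReadError {
    /// The query ran, but no row matched.
    NotFound,
    /// Anything else, e.g. a connection failure or a row that
    /// cannot be mapped onto the struct.
    Other(String),
}

/// Hypothetical two-field stand-in for a local_site row.
#[derive(Debug, PartialEq)]
struct LocalSite {
    id: i32,
    site_id: i32,
}

/// What the startup code decides to do next.
#[derive(Debug, PartialEq)]
enum Action {
    Skip,
    Create,
}

/// Same shape as the quoted check: every Err is treated as "no row".
fn decide_with_is_ok(read: &Result<LocalSite, ReadError>) -> Action {
    if read.is_ok() {
        Action::Skip
    } else {
        Action::Create
    }
}

/// Alternative: only NotFound leads to creation; any other error
/// is returned to the caller instead of being swallowed.
fn decide_strict(read: Result<LocalSite, ReadError>) -> Result<Action, ReadError> {
    match read {
        Ok(_) => Ok(Action::Skip),
        Err(ReadError::NotFound) => Ok(Action::Create),
        Err(other) => Err(other),
    }
}
```

With the `is_ok()` version, a read that fails with `ReadError::Other(..)` yields `Action::Create` even though a row may exist. The strict version would surface that error instead, which is the kind of message that would have helped with debugging here.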