Migrate issuecomment table to i64 ids#348
Migrate issuecomment table to i64 ids#348Mark-Simulacrum wants to merge 1 commit intorust-lang:masterfrom
Conversation
2 years after the stopgap fix in f9d84fa, this writes a one-way migration to move to bigint columns for all the issuecomment IDs. Presuming we like this approach, it would be a good idea to apply the same to other GitHub IDs (issues, users, etc.). The migration has been tested on a data export from production postgres, locally it takes ~7 seconds to run. I'm not sure if it's safe to run it while rfcbot is actively running (in particular, what happens if the diesel schema in the running rfcbot disagrees with the database). We may want to try to stop the app, run the migration, and then start it again (downtime should be comparatively short though not zero). rfcbot isn't currently responding to any new comments I think so it's in practice nearly down anyway.
|
Checking other IDs:
So indeed I think it's safe to defer the other IDs basically indefinitely. |
teor2345
left a comment
There was a problem hiding this comment.
Looks good to me!
Changing column data types before dropping foreign keys will temporarily create foreign key constraints with mismatched types. This is non-standard, but there's no documentation prohibiting it, and apparently it works fine:
https://stackoverflow.com/questions/32267778/setting-up-foreign-key-with-different-datatype/68491233#68491233
|
|
||
| alter table rfc_feedback_request drop constraint rfc_feedback_request_fk_feedback_comment_fkey; | ||
|
|
||
| -- Then update all the values to be u64'd, not i32 storing u32 value. |
There was a problem hiding this comment.
Nitpick: Technically the representation is unsigned, but the range is 0..=i64::MAX:
https://www.postgresql.org/docs/current/datatype-numeric.html#DATATYPE-INT
(But the difference is unlikely to ever matter.)
| -- Then update all the values to be u64'd, not i32 storing u32 value. | |
| -- Then update all the values to be unsigned in an i64 field, not i32 storing u32 value. |
| @@ -0,0 +1,4 @@ | |||
| -- TODO: is there a reasonable rollback procedure? I think we could undo the | |||
| -- data type changes, but it would likely mean deleting some comments from the | |||
| -- database that can no longer be represented. Is that a reasonable tradeoff? | |||
There was a problem hiding this comment.
For rollback deleting out of range comment IDs seems to be the only option.
But we could keep a copy of those comments (in the migrated but un-rolled-back database), and move them across once we have a successful future migration.
| let url = format!( | ||
| "{}/repos/{}/issues/comments/{}", | ||
| BASE_URL, repo, comment_num as u32 | ||
| BASE_URL, repo, comment_num as u64 |
There was a problem hiding this comment.
As a precaution, would try_from() be better, just in case any negative numbers make it through?
| BASE_URL, repo, comment_num as u64 | |
| BASE_URL, repo, u64::try_from(comment_num).expect("id is a non-negative i64") |
| }, | ||
| number = issue.number, | ||
| id = comment_id as u32, | ||
| id = comment_id as u64, |
2 years after the stopgap fix in f9d84fa, this writes a one-way migration to move to bigint columns for all the issuecomment IDs. Presuming we like this approach, it would be a good idea to apply the same to other GitHub IDs (issues, users, etc.), but that can probably be done in follow up work (and maybe can be deferred again -- we should check how close we are to limits on each of those ID spaces), maybe we can move away from this database entirely before we hit those limits.
The migration has been tested on a data export from production postgres, locally it takes ~7 seconds to run. I'm not sure if it's safe to run it while rfcbot is actively running (in particular, what happens if the diesel schema in the running rfcbot disagrees with the database). We may want to try to stop the app, run the migration, and then start it again (downtime should be comparatively short though not zero). rfcbot isn't currently responding to any new comments I think so it's in practice nearly down anyway.