Skip to content

Conversation

@gregorbg
Copy link
Member

This is an attempted fix at an (apparent) race condition in our StripeRecords table.

We have observed that two StripeRecord entities exist with the same exact stripe_id and stripe_record_type. This normally should not happen, because the two aforementioned properties are exactly the two key elements in StripeRecord.create_or_update_from_api. This method internally uses find_or_initialize_by, though, and since the created_at timestamps of the two duplicate rows are very close together, we suspect that the following happened:

  1. An organizer initiated a refund over our Stripe dashboard [We know for certain that this happened because we talked to the organizer. They confirmed that they issued refunds and that they don't even have access to the Stripe Dashboard in the first place]
  2. Our controller endpoint for the admin UI talks to the Stripe API
  3. [SUSPECTED] The response from the Stripe API took so long, that the Webhook fired while we were still waiting for that response
  4. On another thread, the webhook hit find_or_initialize_by and caused a race condition [footnote 1]
  5. The result is that we got two rows in StripeRecord, which (per our business logic) correspond to the same refund logically, but have two different primary key IDs technically.
  6. The code that checks for original_payment.refunding_registration_payments.where(receipt: stored_record).any? and happily concludes that the refund was not recorded yet because of the different primary key IDs [footnote 2]

The suggested solution is to be more strict by introducing a uniqueness constraint on database level. This is the most certain, fool-proof approach to make sure that within the constraints of Stripe's business specific logic, duplicates never happen ("business-specific" means: PayPal may have completely different ways of identifying the uniqueness of a payment). The new create_or_find_by approach induces one more database write than before (once for creating, then immediately updating), but this is worth the reduced risk.

The uniqueness constraint is handled on a low level directly by Rails. From the documentation of create_or_find_by:

# Attempts to create a record with the given attributes in a table that has a unique database constraint
# on one or several of its columns. If a row already exists with one or several of these
# unique constraints, the exception such an insertion would normally raise is caught,
# and the existing record with those attributes is found using #find_by!.

Specifically, the internal Rails AR code catches rescue ActiveRecord::RecordNotUnique, which is thrown by the database driver iff the uniqueness constraint is violated. Ironically enough, we do not want model-level uniqueness validations in this case, because they would raise exceptions that we (as the application) would then need to catch/handle. (Also, it is unclear how to define this constraint: Should one stripe_id be unique per stripe_record_type? Or should one stripe_record_type be unique per stripe_id? These choices seem too arbitrary).

Footnotes

1 -- The race condition happens because find_or_initialize_by creates new records in an unpersisted state if they are not found. Only the subsequent update! call actually writes them. This means that another thread from another request might have just barely enough time to trigger another find_or_initialize_by while the original thread is waiting for the udpate! to actually write.

2 -- Why not check for stripe_id and/or stripe_record_type directly in the "has this refund already been issued" check? Because we (want to) support multiple, diverse payment systems. The relation refund_receipt is intentionally polymorphic, and we don't know whether PayPal or WeChat Pay or IndiaPaymentGateway123 or YourCoolPaymentSolution identifies uniqueness of payments / payment identifiers in the same way that Stripe does. So we don't want to be Stripe-specific at this point in time. and StripeRecord should be the one taking care of uniqueness in itself (see above)

@gregorbg
Copy link
Member Author

As a small "bonus", I added a ! suffix to the nomenclature of create_or_update_from_api!. It had always (even before this PR) been using update!, so this seemed overdue.

@gregorbg
Copy link
Member Author

This also needs some form of intervention on database level, at the very least we need to clear the one duplicate record that caused this investigation. But my hunch tells me that there are also a small number of other cases which either went by unnoticed or had less unpleasant consequences.

@gregorbg gregorbg marked this pull request as draft January 13, 2026 11:50
@dunkOnIT
Copy link
Contributor

dunkOnIT commented Jan 13, 2026

This also needs some form of intervention on database level, at the very least we need to clear the one duplicate record that caused this investigation. But my hunch tells me that there are also a small number of other cases which either went by unnoticed or had less unpleasant consequences.

This query:

SELECT `stripe_id`, COUNT(*) AS `occurrences` FROM `stripe_records` GROUP BY `stripe_id` HAVING COUNT(*) > 1 ORDER BY `occurrences` DESC; 

Yields the [redacted] list of duplicate stripe_id's

@dunkOnIT
Copy link
Contributor

dunkOnIT commented Jan 13, 2026

Looks good! Do we want to introduce a validation at the StripeRecord level while we're at it, which asserts the uniqueness of stripe_id? They'll basically do the same thing, of course, but seems nice to have this embodied in the Rails code as well.

The db index lets us make sure this doesn't happen in future, but the Rails code will catch 99.99...% of the instances, while letting us take advantage of .valid? etc to avoid needing to do error handling if we ever introduce that sort of code

@gregorbg
Copy link
Member Author

I specifically addressed this in my wall of text above. We do not want a Rails validation error in this case, we want the hard SQL level error so that create_or_find_by can do its thing.

@dunkOnIT
Copy link
Contributor

My bad for missing that!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants