
Event-Driven Architecture: Solving Payment Retry and Late Success Cases

When building modern digital platforms, handling payments reliably is one of the most critical challenges. Payment gateways provide robust APIs, but real-world scenarios still introduce complications. A common one: a payment initially fails, the provider retries it on its own side, and the retry eventually succeeds. This confuses both users and businesses, because the user was told the payment failed and yet the charge goes through afterwards.

This problem stems from how asynchronous payment systems work. Payment providers may retry transactions on behalf of the user, and if the retry succeeds, they later send a notification (webhook/event) to the merchant. If the merchant system is not designed to handle these late success events, users end up being charged without getting their product.

This is exactly where Event-Driven Architecture (EDA) becomes a game-changer.


🏗️ What is Event-Driven Architecture?

Event-Driven Architecture is a design approach where services communicate through events rather than direct API calls. Instead of polling or waiting for responses, systems emit events when something happens (e.g., "Payment Succeeded"), and other systems react to those events.

Key Building Blocks

  1. Event Producers – Systems or services that emit events (e.g., a payment provider sending a payment_succeeded webhook).
  2. Event Consumers – Services that listen and act on those events (e.g., your order service updating the database).
  3. Event Broker/Bus – Middleware that routes events reliably (e.g., Kafka, RabbitMQ, AWS SNS/SQS, or even a webhook handler).

[Infographic: Event-Driven Architecture and the payment retry flow]
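
To make the three building blocks concrete, here is a minimal sketch in Node.js. It assumes an in-memory EventEmitter as a stand-in for a real broker such as Kafka or SQS; the `orders` map, the `publishPaymentSucceeded` helper, and the event fields are hypothetical names chosen for illustration.

```javascript
const { EventEmitter } = require("node:events");

// In-memory stand-in for a broker such as Kafka or SQS (illustration only).
const bus = new EventEmitter();

// Hypothetical in-memory order store.
const orders = new Map([["order-1", { status: "payment_failed" }]]);

// Consumer: reacts to the event instead of being called directly.
bus.on("payment_succeeded", (event) => {
  const order = orders.get(event.orderId);
  if (order) order.status = "paid";
});

// Producer: announces what happened and knows nothing about its consumers.
function publishPaymentSucceeded(orderId) {
  bus.emit("payment_succeeded", { orderId, occurredAt: new Date().toISOString() });
}

publishPaymentSucceeded("order-1");
console.log(orders.get("order-1").status); // prints "paid"
```

An in-process emitter delivers events synchronously and loses them on a crash, so it only illustrates the roles. Durable delivery is what a real broker adds.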

💡 How EDA Solves the Payment Retry Problem

Let's map the payment retry scenario into an event-driven workflow:

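1️⃣ User Initiates Payment

The user clicks "Pay". The first attempt fails at the payment gateway, so the UI shows "Payment Failed".

2️⃣ Payment Provider Retries

The provider's internal retry mechanism tries again, and the second attempt succeeds a few seconds later. The provider then sends an asynchronous payment_succeeded webhook.

3️⃣ Your Event-Driven System Reacts

The webhook receiver publishes the event to the broker. The payment service consumes it, marks the order's payment as successful in the DB, and triggers order fulfillment.

4️⃣ User Experience Fixed

The product or subscription is delivered and the notification service sends a confirmation email/SMS, so the user is no longer charged without receiving what they paid for.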
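
The workflow above could be sketched as a consumer that accepts a late success for an order already marked as failed. This is a sketch under assumptions, not a definitive implementation: the in-memory stores, the event fields (`id`, `orderId`), and the returned status strings are all hypothetical. The duplicate check is an addition that the article does not spell out, included because providers and brokers may deliver the same event more than once.

```javascript
// Hypothetical in-memory stores; a real system would use a database.
const orders = new Map();
const processedEventIds = new Set();

function recordInitialFailure(orderId) {
  orders.set(orderId, { status: "payment_failed", fulfilled: false });
}

// Handles a payment_succeeded event, whether it arrives immediately or late.
function onPaymentSucceeded(event) {
  // The same event may be delivered more than once; process it only once.
  if (processedEventIds.has(event.id)) return "duplicate_ignored";
  processedEventIds.add(event.id);

  const order = orders.get(event.orderId);
  if (!order) return "unknown_order";
  if (order.status === "paid") return "already_paid";

  // The earlier failure is superseded by the later success.
  order.status = "paid";
  order.fulfilled = true; // stand-in for triggering fulfillment
  return "order_paid";
}

recordInitialFailure("order-42");
const lateEvent = { id: "evt-1", type: "payment_succeeded", orderId: "order-42" };
const first = onPaymentSucceeded(lateEvent);
const second = onPaymentSucceeded(lateEvent);
console.log(first, second, orders.get("order-42").status);
// prints "order_paid duplicate_ignored paid"
```

The key design choice is that "payment_failed" is treated as a state that a later event may overwrite, not as a final outcome.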


🚀 Benefits of Using Event-Driven Architecture for Payments

  1. Asynchronous Reliability – No need to rely on immediate responses from the payment provider.
  2. Scalability – Multiple services can consume the same payment event (e.g., accounting system, notifications, analytics).
  3. Auditability – Every payment event can be logged and replayed for debugging.
  4. Improved User Trust – Users don't feel "robbed" when charged without delivery.
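
The auditability point can be illustrated with an append-only event log that is replayed to reconstruct order state. This is a minimal sketch: the log lives in memory, and the `payment_failed` event type, the `record` and `replay` helpers, and the status values are assumptions made for the example.

```javascript
// Append-only log of every payment event received (in memory for illustration).
const eventLog = [];

function record(event) {
  // Freeze a copy so logged events cannot be modified afterwards.
  eventLog.push(Object.freeze({ ...event }));
}

// Rebuilds the latest status per order by replaying the log from the start.
function replay(log) {
  const statusByOrder = new Map();
  for (const event of log) {
    if (event.type === "payment_failed") statusByOrder.set(event.orderId, "payment_failed");
    if (event.type === "payment_succeeded") statusByOrder.set(event.orderId, "paid");
  }
  return statusByOrder;
}

record({ type: "payment_failed", orderId: "order-7" });
record({ type: "payment_succeeded", orderId: "order-7" });

console.log(replay(eventLog).get("order-7")); // prints "paid"
console.log(eventLog.map((e) => e.type).join(" -> "));
// prints "payment_failed -> payment_succeeded"
```

Because the log keeps both the failure and the later success, a support engineer can see exactly why a user who saw "Payment Failed" was still charged.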

🔧 Implementation Example

Let's say you're building this with Node.js + Kafka (or AWS SQS):

Architecture Components

  1. Webhook Receiver (Producer) – Listens to events from payment gateways and publishes them to your event bus.
  2. Payment Service (Consumer) – Subscribes to payment_succeeded and updates your order DB.
  3. Notification Service (Consumer) – Sends confirmation email/SMS to user.
  4. Analytics Service (Consumer) – Logs the transaction for reporting.

This decoupled architecture ensures that even if one service fails, others can still process events independently.
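
As a rough illustration of these components, the sketch below wires a webhook receiver to three consumers and shows that a failing notification consumer does not stop the others. It assumes a small in-memory bus in place of Kafka or SQS; `createBus`, `receiveWebhook`, and the payload shape (`type`, `orderId`) are hypothetical, and real providers define their own webhook schemas.

```javascript
// Minimal in-memory bus standing in for Kafka or SQS. Each subscriber is
// invoked in isolation so one failure does not block the others.
function createBus() {
  const subscribers = new Map(); // topic -> [{ name, handler }]
  return {
    subscribe(topic, name, handler) {
      if (!subscribers.has(topic)) subscribers.set(topic, []);
      subscribers.get(topic).push({ name, handler });
    },
    publish(topic, event) {
      const failures = [];
      for (const { name, handler } of subscribers.get(topic) ?? []) {
        try {
          handler(event);
        } catch (err) {
          failures.push(name); // a real broker would redeliver or dead-letter
        }
      }
      return failures;
    },
  };
}

// Webhook receiver (producer): parses the provider's payload and publishes it.
function receiveWebhook(bus, rawBody) {
  const payload = JSON.parse(rawBody);
  return bus.publish(payload.type, payload);
}

const bus = createBus();
const handled = [];
bus.subscribe("payment_succeeded", "payment", (e) => handled.push(`payment:${e.orderId}`));
bus.subscribe("payment_succeeded", "notification", () => {
  throw new Error("email provider unavailable");
});
bus.subscribe("payment_succeeded", "analytics", (e) => handled.push(`analytics:${e.orderId}`));

const failures = receiveWebhook(
  bus,
  JSON.stringify({ type: "payment_succeeded", orderId: "order-9" })
);
console.log(handled);  // payment and analytics both processed the event
console.log(failures); // only the notification consumer failed
```

With a real broker, each consumer would typically read from its own subscription or consumer group, which gives the same isolation across processes.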


📊 System Architecture Diagram

+-----------------------+
|      USER/CLIENT      |
+-----------------------+
            |
            v
+-------------------------------------------------+
| UI & BACKEND (INITIAL PAYMENT CALL)             |
|-------------------------------------------------|
| - User clicks "Pay"                             |
| - UI shows "Payment Failed"                     |
| - User sees failure message                     |
+-------------------------------------------------+
            | (Request)
            v
+-----------------------------------------------------------------+
| PAYMENT GATEWAY                                                 |
|-----------------------------------------------------------------|
| - Receives initial payment request                              |
| - First attempt FAILS                                           |
| - ---------------------------------                             |
| - **INTERNAL RETRY MECHANISM**                                  |
| - Second attempt SUCCEEDS a few seconds later                   |
| - ---------------------------------                             |
| - **ASYNC WEBHOOK** sends `payment_succeeded` event             |
+-----------------------------------------------------------------+
            |
            | (Event: `payment_succeeded`)
            v
+-----------------------------------------------------------------+
| EVENT BROKER (Kafka, RabbitMQ)                                  |
|-----------------------------------------------------------------|
| - Receives the webhook event from the Payment Gateway           |
| - Puts the event on a dedicated queue                           |
| - Decouples the sender and receiver                             |
+-----------------------------------------------------------------+
            |
            | (Event: `payment_succeeded`)
            v
+-----------------------------------------------------------------+
| PAYMENT SERVICE (Consumer)                                      |
|-----------------------------------------------------------------|
| - **Listens** for `payment_succeeded` events                    |
| - Updates the Order in the DB (mark payment as successful)      |
| - Triggers order fulfillment                                    |
+-----------------------------------------------------------------+
           |                                   |
           | (Event: `order_fulfilled`)        |
           v                                   v
+---------------------+      +-----------------------------------+
| ORDER FULFILLMENT   |      | NOTIFICATION SERVICE              |
| SERVICE             |      | (Consumer)                        |
|---------------------|      |-----------------------------------|
| - Listens for       |      | - Listens for payment events      |
|   fulfillment events|      | - Sends confirmation email/SMS    |
| - Delivers product  |      |   to the user                     |
|   or subscription   |      +-----------------------------------+
+---------------------+
           |
           | (Event: `order_fulfilled`)
           v
+------------------------------------+
| ANALYTICS SERVICE (Consumer)       |
|------------------------------------|
| - Listens for all payment events   |
| - Logs transactions for reporting  |
+------------------------------------+

🎯 Closing Thoughts

Payment retries and delayed confirmations are inevitable in real-world payment systems. Instead of treating them as errors, we should design our systems to handle them gracefully.

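With Event-Driven Architecture, a late payment_succeeded event is handled like any other event: the order is updated, fulfillment is triggered, and the user is notified.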

If you're building any system that involves payments, subscriptions, or external APIs with retries, adopting EDA can save you from frustrated users and endless support tickets.


Takeaway: In payments, the first response is not always the final truth. Events are.
