This is a guest post by Prashant Kumar, Software Engineer at Cashfree Payments.
In modern payment processing systems, managing high-throughput transaction requests efficiently is crucial. Various technologies are available to enhance performance by effectively handling asynchronous requests, with Kafka being a prominent solution for optimizing data flow.
At Cashfree Payments, we use Kill Bill as a critical tool for our pricing engine, supporting a wide range of billing needs for merchants. It excels at subscription plan management, versioning, and plan modifications, and it supports daily, weekly, monthly, quarterly, biannual, and annual billing cycles. It also features role-based authorisation for secure, controlled access. To learn more, read "What is Kill Bill".
We use Kill Bill to charge our merchants based on how much they use our services. This is called metered usage billing. We offer different pricing levels or tiers for this.
For example, we have three tiers for API usage:
- Tier 1: Up to 1,000 API calls at Rs. 5 each.
- Tier 2: Between 1,001 and 1,00,000 API calls at Rs. 2 each.
- Tier 3: Over 1,00,000 API calls at Rs. 1 each.
Our system tracks how many API calls a merchant makes and Kill Bill calculates their bill based on the tier they fall into. Other parts of our business send information about API usage to Kill Bill, which generates the final invoice. There are many metered usage billing use cases for which we use Kill Bill.
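To make the tier arithmetic concrete, here is a minimal sketch in Java. It is not Kill Bill's catalog format or its billing code; the class and method names are hypothetical. The post does not say whether the whole usage is billed at the rate of the tier reached (volume pricing) or whether each slice is billed at its own tier's rate (graduated pricing), so the sketch shows both readings of the three-tier example.

```java
import java.util.List;

/** Illustrative only: tier limits and rates come from the three-tier example above. */
final class TieredPricing {

    /** One tier: applies up to and including maxCalls, at ratePerCall rupees per call. */
    static final class Tier {
        final long maxCalls;
        final long ratePerCall;

        Tier(long maxCalls, long ratePerCall) {
            this.maxCalls = maxCalls;
            this.ratePerCall = ratePerCall;
        }
    }

    static final List<Tier> TIERS = List.of(
            new Tier(1_000, 5),
            new Tier(100_000, 2),
            new Tier(Long.MAX_VALUE, 1)); // last tier is unbounded

    /** Volume reading: every call is billed at the rate of the tier the total falls into. */
    static long volumePrice(long calls) {
        for (Tier tier : TIERS) {
            if (calls <= tier.maxCalls) {
                return calls * tier.ratePerCall;
            }
        }
        throw new IllegalStateException("unreachable: last tier is unbounded");
    }

    /** Graduated reading: each slice of usage is billed at the rate of its own tier. */
    static long graduatedPrice(long calls) {
        long total = 0;
        long previousLimit = 0;
        for (Tier tier : TIERS) {
            if (calls <= previousLimit) {
                break; // no usage reaches this tier
            }
            long callsInTier = Math.min(calls, tier.maxCalls) - previousLimit;
            total += callsInTier * tier.ratePerCall;
            previousLimit = tier.maxCalls;
        }
        return total;
    }

    public static void main(String[] args) {
        long calls = 1_500;
        System.out.println("volume:    Rs. " + volumePrice(calls));    // 1,500 x 2
        System.out.println("graduated: Rs. " + graduatedPrice(calls)); // 1,000 x 5 + 500 x 2
    }
}
```

For 1,500 calls the two readings give Rs. 3,000 and Rs. 6,000 respectively, which is why it matters to confirm which policy a given plan uses.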
Cashfree Payments relies on Kill Bill to process metered usage data for our merchants. This data is pushed to Kill Bill via the Record Usage for Subscription API.
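As a rough sketch of what one such push might look like, the Java snippet below builds the JSON body and the HTTP request for a single usage record. The endpoint path, header names, and field names follow the public Kill Bill API documentation as we understand it and should be checked against your Kill Bill version. The class name, the credentials, and the `api-calls` unit type are hypothetical placeholders, and authentication is omitted for brevity.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.LocalDate;

/** Hypothetical helper: builds, but does not send, a usage-recording request. */
final class UsageRequestSketch {

    /**
     * Builds the JSON body for one usage record.
     * Assumes the arguments contain no characters that need JSON escaping.
     */
    static String usageJson(String subscriptionId, String trackingId,
                            String unitType, LocalDate recordDate, long amount) {
        return String.format(
                "{\"subscriptionId\":\"%s\",\"trackingId\":\"%s\","
                        + "\"unitUsageRecords\":[{\"unitType\":\"%s\","
                        + "\"usageRecords\":[{\"recordDate\":\"%s\",\"amount\":%d}]}]}",
                subscriptionId, trackingId, unitType, recordDate, amount);
    }

    /** Wraps the body in a POST request; tenant key and secret are placeholders. */
    static HttpRequest buildRequest(String baseUrl, String json) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/1.0/kb/usages"))
                .header("Content-Type", "application/json")
                .header("X-Killbill-ApiKey", "demo-key")        // placeholder tenant key
                .header("X-Killbill-ApiSecret", "demo-secret")  // placeholder tenant secret
                .header("X-Killbill-CreatedBy", "usage-demo")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        String json = usageJson("sub-1", "trk-1", "api-calls", LocalDate.of(2024, 5, 1), 42);
        HttpRequest request = buildRequest("http://127.0.0.1:8080", json);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(json);
    }
}
```

The `trackingId` is meant to be unique per push, which lets the receiving side recognise a record it has already seen.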
To enhance Kill Bill’s reliability and performance, we have implemented a Kafka plugin. Like any production service, Kill Bill can fail for various reasons. Although quick rollbacks and restorations are possible, any downtime poses significant challenges, especially for the services pushing metrics via the API. The plugin acts as a buffer, protecting Kill Bill from unexpected failures and high traffic loads.
Critical scenarios include:
- High-traffic periods: Multiple services pushing events to Kill Bill APIs can overwhelm the service, causing downtime.
- Dependence on Kill Bill: Kill Bill is a priority service, so downtime is unacceptable because it impacts other Cashfree internal services.
- Operational disruptions: Inability to access Kill Bill during downtime leads to operational issues for developers and businesses, affecting product services and business operations.
In essence, ensuring Kill Bill’s uptime is crucial to maintaining the stability and reliability of Cashfree’s internal services and overall business operations.
When Kill Bill is down, every Cashfree service that pushes usage to it directly is disrupted as well, and the usage it tried to report during the outage is at risk.
To mitigate this issue, we placed Kafka between our services and Kill Bill. Each service publishes its usage metrics to a Kafka topic, and Kill Bill consumes these messages through the Kafka Consumer Plugin. Even if Kill Bill experiences downtime, services can continue to send their usage metrics to Kafka, and once Kill Bill is back online it processes the messages that accumulated during the outage. This approach improves fault tolerance and resilience, reducing disruptions to business operations.
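The buffering behaviour can be sketched without a broker. The Java example below is a toy model, not the plugin's code: `Topic` is an in-memory stand-in for a Kafka topic (an append-only log), `UsageConsumer` stands in for Kill Bill with the consumer plugin, and all names are hypothetical. It shows producers continuing to publish during an outage and the consumer resuming from its last read position afterwards.

```java
import java.util.ArrayList;
import java.util.List;

final class OutageBufferDemo {

    /** Stand-in for a Kafka topic: an append-only log that keeps messages until they are read. */
    static final class Topic {
        private final List<String> log = new ArrayList<>();

        void publish(String event) {
            log.add(event);
        }

        /** Returns every message at or after the given offset. */
        List<String> readFrom(int offset) {
            return new ArrayList<>(log.subList(offset, log.size()));
        }
    }

    /** Stand-in for Kill Bill plus its consumer plugin. */
    static final class UsageConsumer {
        private final List<String> processed = new ArrayList<>();
        private int nextOffset = 0; // position of the next unread message
        private boolean available = true;

        void setAvailable(boolean available) {
            this.available = available;
        }

        /** Processes all unread messages, or nothing while the consumer is down. */
        void poll(Topic topic) {
            if (!available) {
                return;
            }
            List<String> batch = topic.readFrom(nextOffset);
            processed.addAll(batch);
            nextOffset += batch.size();
        }

        List<String> processed() {
            return processed;
        }
    }

    /** Runs the outage scenario and returns what the consumer ended up processing. */
    static List<String> runScenario() {
        Topic topic = new Topic();
        UsageConsumer killBill = new UsageConsumer();

        topic.publish("merchant-1:10");
        killBill.poll(topic);             // consumed right away

        killBill.setAvailable(false);     // outage begins
        topic.publish("merchant-2:25");   // producers are unaffected
        topic.publish("merchant-1:5");
        killBill.poll(topic);             // no-op while down

        killBill.setAvailable(true);      // recovery
        killBill.poll(topic);             // backlog is processed in publish order
        return killBill.processed();
    }

    public static void main(String[] args) {
        System.out.println(runScenario());
    }
}
```

The key design point is that the producer never calls the consumer directly, so the consumer's availability only affects when a message is processed, not whether it can be sent.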
Kafka’s ability to partition data enables parallel processing, significantly boosting performance and fault tolerance. By dividing data into multiple partitions and assigning different Kill Bill instances to each, we can process usage data concurrently. However, increasing the number of partitions and Kill Bill instances to maximize parallelism might not always be feasible due to resource constraints in some services.
To enable parallel consumption within a single instance, we’ve implemented a multithreaded Kafka consumer. A single Kill Bill instance runs one consumer thread for each partition assigned to it. For example, if an instance is assigned three partitions of a topic, it uses three threads, one per partition, to consume them in parallel.
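The thread-per-partition idea can be sketched as follows. This is a simplified illustration, not the plugin's implementation: partitions are plain in-memory lists of usage amounts rather than Kafka partitions, and the class and method names are hypothetical. With the real Kafka client, a `KafkaConsumer` instance is not safe to share between threads, so each thread would typically own its own consumer or receive records handed over by a single polling thread.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class PartitionWorkers {

    /**
     * Processes each partition on its own thread and returns the total usage per partition.
     * Assumes at least one partition; records within a partition are handled in order.
     */
    static Map<Integer, Long> consumeAll(List<List<Long>> partitions) {
        Map<Integer, Long> totals = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(partitions.size());

        for (int p = 0; p < partitions.size(); p++) {
            final int partition = p;
            pool.submit(() -> {
                long sum = 0;
                for (long amount : partitions.get(partition)) {
                    sum += amount; // stand-in for recording one usage event
                }
                totals.put(partition, sum);
            });
        }

        pool.shutdown(); // accept no new tasks, let the submitted ones finish
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("partition workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<List<Long>> partitions = List.of(
                List.of(1L, 2L),
                List.of(10L),
                List.of(5L, 5L, 5L));
        System.out.println(consumeAll(partitions));
    }
}
```

Because each partition has exactly one worker, ordering within a partition is preserved while different partitions progress independently.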
Kill Bill supports plugin development through the OSGi framework, a Java-based modular system that enables dynamic component management. Plugins interact with Kill Bill without altering its core functionality, so they can be added or modified while staying loosely coupled and preserving the integrity of the main system. For details on configuring and using the plugin, refer to the repository at https://github.com/cashfree/killbill-kafka-consumer-plugin, which also serves as a useful reference for adding your own custom processing.