
hlf-connector GitHub

PR #72 Increment version after release
auto-version-increment Automated changes by the create-pull-request GitHub action
Created At 2022-12-26 08:39:26 +0000 UTC
PR #71 Process Data-ingestion messages through Batched async mechanism to improve throughput.
**Current Problem:** Presently, only a single Listener container is created to consume messages from the Connector's ingestion Topic (across all partitions). This Listener sequentially processes each record returned by the internal `poll()` call, which limits the overall throughput of the Connector, because the downstream `TransactionConsumer#listen` makes a blocking call to write transactions that can take a few seconds. For example, if `TransactionConsumer#listen` takes 2 seconds to complete, processing 100 incoming records fetched by `poll()` takes around 200 seconds.

**Proposed Fix:** Assign a dedicated Listener container to each partition in the Topic, per connector instance (capped at a maximum of 6 Listeners, to avoid spawning a large number of Listeners for highly partitioned Topics).

- Each Listener gets a batch of messages from the partition it is assigned to. The batch is processed asynchronously by submitting it to a task executor in one go.
- The Listener thread defers the next poll until all records in the batch have been processed in parallel.
- Once the batch is processed, the Listener gets the next batch from `poll()`.
- If one of the records encounters an exception during parallel processing, a partial batch commit is performed, and the failed and unprocessed records are delivered again in the next `poll()`.
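
The flow described in this PR can be illustrated with a minimal, plain-Java sketch. It is not the connector's actual code: `BatchedAsyncSketch`, `listenerCount`, and `processBatch` are hypothetical names, the Kafka listener and `TransactionConsumer#listen` are replaced by a generic `Consumer<T>` handler, and the "partial commit" is modelled simply as the number of leading records that completed without a failure. Everything after that point would be expected to come back in the next poll.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical sketch of batched async processing with a partial commit.
// None of these names come from the hlf-connector code base.
final class BatchedAsyncSketch {

    // One listener per partition, capped (the PR description uses a cap of 6).
    static int listenerCount(int partitions, int cap) {
        return Math.min(partitions, cap);
    }

    /**
     * Submits every message of one polled batch to the executor, waits for all
     * of them, and returns how many leading messages are safe to commit, i.e.
     * the length of the longest prefix in which no message failed.
     */
    static <T> int processBatch(List<T> batch, Consumer<T> handler, ExecutorService executor) {
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (T message : batch) {
            futures.add(CompletableFuture.runAsync(() -> handler.accept(message), executor));
        }

        int committable = 0;
        boolean prefixIntact = true;
        // Joining every future models "defer the next poll until the whole
        // batch has been processed".
        for (CompletableFuture<Void> future : futures) {
            try {
                future.join();
                if (prefixIntact) {
                    committable++;
                }
            } catch (CompletionException e) {
                // First failure ends the committable prefix; later messages
                // are treated as unprocessed and left for redelivery.
                prefixIntact = false;
            }
        }
        return committable;
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<Integer> batch = List.of(0, 1, 2, 3, 4, 5);
            int committed = processBatch(batch, n -> {
                if (n == 3) {
                    throw new IllegalStateException("simulated write failure");
                }
            }, executor);
            System.out.println("committed " + committed + " of " + batch.size()
                    + ", listeners for 10 partitions: " + listenerCount(10, 6));
        } finally {
            executor.shutdown();
        }
    }
}
```

In this sketch the messages after the failed one may have run successfully but are still not committed, so a real handler would need to tolerate seeing them again. How the connector itself deals with such redelivery is not stated in the PR description.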
Created At 2022-12-26 08:12:47 +0000 UTC