Large number of API calls - processing time reduction ideas

I have a large number of transactions (~20k) that need to be submitted via API calls.
All transactions are loaded in a datatable.
Currently the processing takes about 4hrs.

I would like to shrink this time a bit, so I am approaching this community for advice.
I already tried “parallel for each”, but it does not seem to be a good idea.
Splitting the job across multiple robots is not an option.

Does anyone have an idea?

Cheers

@J0ska

Do you have any bulk API option? If multi-bot or parallel is not an option, then we need to check on the API side whether any bulk add options are present.

cheers

Hi,
unfortunately there is no bulk API option. :frowning:
Cheers

@J0ska

Check if any delays are specified, since there is no bulk option.
Other than that I don’t see alternatives, because the run time depends on the responses we receive from the API, unless you have added delays.

Thanks,
Srini

There are no delays; the long run time is simply the result of the large number of transactions.

I managed to get the “parallel for each” working. It required that all working variables be defined inside the “parallel for each” scope.

However, the “parallel for each” approach didn’t bring any benefit, as it failed with a “System.OutOfMemoryException”, probably a result of the large number of parallel threads allocated.

It would probably require splitting the batch into smaller chunks, but I am not sure whether this would bring any performance benefit…

So unless another idea appears, I will need to stay with sequential processing.

Cheers

You can try the following approaches to optimize the performance of your API calls and reduce the processing time:

  1. Batch processing: Instead of making an API call for each transaction, check if the API supports batch processing. This will allow you to send multiple transactions in a single API call, reducing the overhead of initiating and closing individual connections.
  2. Asynchronous processing: Use asynchronous methods for making API calls. This allows your program to issue the next request while waiting for a response, instead of blocking on each call in turn. You can use libraries like asyncio and aiohttp in Python, or Task and HttpClient in C#, for async operations (see the first sketch after this list).
  3. Throttling and rate limiting: Make sure you are aware of any rate limits on the API. If you exceed them, your requests may be throttled, which slows down processing. Adjust your code to respect the limits and spread the calls over a longer period if necessary. Also consider implementing retries with exponential backoff to handle temporary API issues (a backoff sketch follows the list).
  4. Connection pooling and reuse: Reusing connections avoids the overhead of establishing a new connection for each API call. If you’re using an HTTP client library, like requests in Python or HttpClient in C#, make sure you reuse the same client instance instead of creating a new one per request; these libraries provide connection pooling and keep-alive by default (the first sketch below shares a single static HttpClient).
  5. Optimize serialization and deserialization: If you’re spending a significant amount of time serializing or deserializing the data sent to and received from the API, consider using a more efficient data format or optimizing your serialization/deserialization code.
  6. Compress data: If the API allows compressed payloads, you may want to compress your data before making the API call. This can help reduce the amount of data sent over the network and improve the transfer time.
  7. Profile and optimize code: Use profiling tools to identify bottlenecks in your code and optimize them. There might be issues in other parts of your code besides the API calls that can contribute to the overall execution time.
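
To make points 2 and 4 concrete, here is a minimal C# sketch (UiPath runs on .NET, so the same pattern fits an Invoke Code activity or a custom activity). The endpoint URL and payload shape are placeholders, not your actual API:

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class AsyncApiCaller
{
    // One shared HttpClient reuses connections (keep-alive, pooling)
    // instead of paying the TCP/TLS handshake cost on every request.
    private static readonly HttpClient Client = new HttpClient();

    // Sends one transaction; the await frees the thread while the
    // request is in flight, so other calls can proceed meanwhile.
    static async Task<string> SubmitAsync(string endpoint, string json)
    {
        var content = new StringContent(json, Encoding.UTF8, "application/json");
        var response = await Client.PostAsync(endpoint, content);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }

    static async Task Main()
    {
        var payloads = new List<string> { /* one JSON string per transaction */ };
        var tasks = new List<Task<string>>();
        foreach (var json in payloads)
            tasks.Add(SubmitAsync("https://example.com/api/transactions", json));

        // All requests run concurrently; WhenAll awaits every response.
        string[] results = await Task.WhenAll(tasks);
        Console.WriteLine($"Submitted {results.Length} transactions.");
    }
}
```

Note that firing all 20k requests at once would reproduce the memory problem from earlier in this thread, so in practice this pattern should be combined with the concurrency cap shown in the last sketch below.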

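For point 3, here is a sketch of retries with exponential backoff. Treating HTTP 429 and 5xx responses as retryable is an assumption about how your API signals throttling; adjust it to the API's actual contract:

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

static class RetryHelper
{
    // Retries a request up to maxAttempts times, doubling the wait
    // after each failure (1s, 2s, 4s, ...) to back off from a
    // rate-limited or temporarily failing API.
    public static async Task<HttpResponseMessage> SendWithBackoffAsync(
        HttpClient client, Func<HttpRequestMessage> makeRequest, int maxAttempts = 5)
    {
        var delay = TimeSpan.FromSeconds(1);
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                // An HttpRequestMessage cannot be sent twice,
                // so build a fresh one for each attempt.
                var response = await client.SendAsync(makeRequest());
                // Assumption: 429 (Too Many Requests) and 5xx are retryable.
                if (response.StatusCode != (HttpStatusCode)429 &&
                    (int)response.StatusCode < 500)
                    return response;
                if (attempt == maxAttempts) return response;
            }
            catch (HttpRequestException) when (attempt < maxAttempts)
            {
                // Network-level failure: fall through to the backoff wait.
            }
            await Task.Delay(delay);
            delay = TimeSpan.FromTicks(delay.Ticks * 2); // exponential growth
        }
    }
}
```
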
Remember that parallel processing is not always a bad idea. The key to making it work is carefully controlling the number of parallel executions so you don’t overwhelm the API or your own system. You can use semaphores, thread pools, or task schedulers to limit the concurrency; a throttled sketch follows.
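
Here is a minimal sketch of the semaphore approach, reusing the placeholder endpoint from above. SemaphoreSlim caps the number of in-flight requests, so the run never allocates thousands of concurrent calls (the likely cause of the OutOfMemoryException mentioned earlier):

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

class ThrottledSubmitter
{
    private static readonly HttpClient Client = new HttpClient();

    static async Task Main()
    {
        var payloads = Enumerable.Range(0, 20000)
            .Select(i => $"{{\"id\":{i}}}")  // placeholder JSON payloads
            .ToList();

        // Allow at most 16 requests in flight at once; the remaining
        // tasks wait cheaply on the semaphore instead of allocating
        // buffers and connections all at the same time.
        using var gate = new SemaphoreSlim(16);

        var tasks = payloads.Select(async json =>
        {
            await gate.WaitAsync();          // wait for a free slot
            try
            {
                var content = new StringContent(json, Encoding.UTF8, "application/json");
                var response = await Client.PostAsync(
                    "https://example.com/api/transactions", content);
                response.EnsureSuccessStatusCode();
            }
            finally
            {
                gate.Release();              // free the slot for the next call
            }
        });

        await Task.WhenAll(tasks);
        Console.WriteLine("All transactions submitted.");
    }
}
```

Tuning the semaphore size (here 16) against the API's rate limits and your machine's capacity is the main design knob; start low and raise it while watching memory and error rates.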
