Hello, I’d like to ask for help with a performance issue: a robot that runs a Python ETL is significantly slower when run from Orchestrator than when run from Studio.
For context, and as far as I understand, we access Orchestrator over the internet and the virtual machine runs on-premises. The machine should have 8 GB of RAM and around 80 GB of storage, and it runs Windows 10. I don’t currently have details about the CPU.
The robot invokes PowerShell to run the Python ETL. This is a temporary solution while we work on implementing a proper architecture for Python ETL execution.
During the extraction step we need to get around 1 million rows from a database. To avoid high memory usage, I’ve coded the query to fetch data in batches, keeping RAM usage at around 400 MB.
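To illustrate the batching approach, here is a minimal sketch (not our production code). It assumes a DB-API 2.0 driver; `sqlite3`, the `demo` table, and the `fetch_in_batches` helper are stand-ins chosen only so the example runs on its own.

```python
import sqlite3


def fetch_in_batches(cursor, query, batch_size=10_000):
    """Yield lists of rows so only one batch is held in memory at a time."""
    cursor.execute(query)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# Hypothetical in-memory table standing in for the real source database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER)")
conn.executemany("INSERT INTO demo VALUES (?)", [(i,) for i in range(25)])

cur = conn.cursor()
batch_sizes = [len(batch) for batch in fetch_in_batches(cur, "SELECT id FROM demo", batch_size=10)]
print(batch_sizes)  # [10, 10, 5]
```

The batch size trades memory against the number of round trips to the database, so on a slow or high-latency connection a batch that is too small could add noticeable time.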
Comparison:
Run via Studio directly on the machine:
- The query takes about 5 minutes.
- The entire ETL finishes in about 1 hour.
Run via Orchestrator:
- The query alone takes around 2 hours.
Since we can’t log in to the machine to check Task Manager, we don’t know how the machine is handling the load.
I suspect the issue is related to running the robot through Orchestrator, as we’ve seen this slowdown consistently for almost a year whenever the robot is triggered this way.
I’ve found several posts about similar Orchestrator-related slowdowns, but none with a solution that applies to our case.
- Has anyone had a similar problem and can shed some light on how to solve it?
- Is there any documentation, or are there blogs or other resources, that discuss Orchestrator’s performance impact on machines? Understanding this better would help us address the problem.
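One thing we could try without logging in is having the ETL write its own diagnostics to a file. The sketch below is only an idea, not something we run today: `LOG_PATH`, `log_environment`, and `timed_step` are hypothetical names, and it uses only the standard library. Comparing wall-clock time with CPU time per step would hint at whether the process is waiting (network, database, disk) or is starved for CPU.

```python
import logging
import os
import sys
import tempfile
import time
from contextlib import contextmanager

# Hypothetical log location; any path the robot's user can write to would do.
LOG_PATH = os.path.join(tempfile.gettempdir(), "etl_diagnostics.log")
logging.basicConfig(filename=LOG_PATH, level=logging.INFO,
                    format="%(asctime)s %(message)s")
logger = logging.getLogger("etl_diagnostics")


def log_environment():
    """Record who and where the script runs, to compare Studio vs Orchestrator runs."""
    env = {
        "user": os.environ.get("USERNAME") or os.environ.get("USER", "unknown"),
        "cwd": os.getcwd(),
        "python": sys.executable,
        "cpu_count": os.cpu_count(),
    }
    logger.info("environment: %s", env)
    return env


@contextmanager
def timed_step(name):
    """Log wall-clock and CPU time for a step; a large gap means the step mostly waited."""
    timings = {}
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        yield timings
    finally:
        timings["wall"] = time.perf_counter() - wall_start
        timings["cpu"] = time.process_time() - cpu_start
        logger.info("%s: wall=%.2fs cpu=%.2fs", name, timings["wall"], timings["cpu"])


env = log_environment()
with timed_step("extract") as extract_timings:
    time.sleep(0.05)  # placeholder for the real extraction query
```

Running the same script from Studio and from Orchestrator and comparing the two log files should show whether the user, working directory, or Python interpreter differs between the two launch paths.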