Code may behave differently than its developer expects. I don't mean correctness here, but performance. A performance bottleneck is often easy to find by running a profiler, and cProfile is the standard way to do that in Python.

Goal

I am going to run my binary pipeline code twice under the profiler. The first time on my local computer, where the main bottleneck is my internet connection. The second time on an AWS ECS node, where the bottleneck should be the CPU.

Benchmark

I ran the script locally as I usually do, but wrapped it in cProfile:
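A minimal sketch of such a wrapper is shown here. The pipeline script itself is not shown, so `run_pipeline()` is a hypothetical stand-in for its entry point, and `profile_to_file` and the output path are names chosen only for illustration.

```python
import cProfile


def run_pipeline() -> int:
    """Hypothetical stand-in for the real pipeline entry point."""
    return sum(i * i for i in range(100_000))


def profile_to_file(func, stats_path: str):
    """Run func under cProfile, save the raw stats to stats_path, return func's result."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func()
    finally:
        # Stop profiling and persist the stats even if func raised.
        profiler.disable()
        profiler.dump_stats(stats_path)
    return result


# Example call (assumed file name):
# profile_to_file(run_pipeline, "pipeline.prof")
```

If the script needs no code changes at all, running it as `python -m cProfile -o pipeline.prof script.py` produces the same kind of stats file.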

And here are the local results:

Figure: local run before any changes

The ECS run was changed so that it is also wrapped with cProfile and writes the profiler output to S3.
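One way this could be done is to dump the stats to a temporary file and upload that file when the run finishes. The sketch below assumes a boto3 S3 client is passed in (only its `upload_file` method is used); the function name, bucket, and key are hypothetical.

```python
import cProfile
import os
import tempfile


def profile_and_upload(func, s3_client, bucket: str, key: str):
    """Profile func, then upload the stats file through s3_client.upload_file."""
    profiler = cProfile.Profile()
    try:
        # runcall enables the profiler, calls func, and disables it again.
        result = profiler.runcall(func)
    finally:
        # Write the stats to a temp file and upload it, even if func raised.
        fd, path = tempfile.mkstemp(suffix=".prof")
        os.close(fd)
        try:
            profiler.dump_stats(path)
            s3_client.upload_file(path, bucket, key)
        finally:
            os.remove(path)
    return result


# Example call (assumed names, requires boto3 and AWS credentials):
# import boto3
# profile_and_upload(run_pipeline, boto3.client("s3"), "my-bucket", "profiles/ecs-run.prof")
```

Passing the client in as an argument keeps the function easy to try locally with a stub object before running it on ECS.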

Figure: ECS run before any changes

The results show that, as expected, the slowest part of the local run was reading and writing (networking). On ECS, however, the bottleneck turned out to be JSON parsing, which is also visible in the local run.
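A saved stats file can also be inspected as text with the standard `pstats` module, which makes hot spots such as JSON parsing easy to pick out. This is a small sketch; `top_functions` is a hypothetical helper and the file name in the example call is assumed.

```python
import io
import pstats


def top_functions(stats_path: str, limit: int = 10) -> str:
    """Return a text report of the `limit` entries with the highest cumulative time."""
    buffer = io.StringIO()
    stats = pstats.Stats(stats_path, stream=buffer)
    # strip_dirs shortens file paths; "cumulative" sorts by time spent in a
    # function including everything it calls.
    stats.strip_dirs().sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


# Example call (assumed file name):
# print(top_functions("pipeline.prof", limit=15))
```

Sorting by cumulative time shows which high-level calls dominate the run. Sorting by `"tottime"` instead shows where time is spent inside the functions themselves.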

After a few minutes of looking for a better alternative to Python's built-in JSON reader, I found orjson. A few lines of changes and a rerun later, I got the following results:

Figure: local run after JSON changes
Figure: ECS run after JSON changes

The change did not bring an overall performance boost in the local run, although the numbers improved a bit. The ECS run improved substantially: the overall run took 832 seconds instead of 1209 (about 31% less), and JSON parsing took only 45 seconds instead of 412.
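The kind of change involved can be sketched as a thin wrapper around the JSON calls. The pipeline's actual code is not shown, so this is only an illustration; the `loads`/`dumps` helper names and the fallback to the built-in `json` module when orjson is not installed are assumptions added here.

```python
try:
    import orjson

    def loads(data):
        """Parse JSON from str or bytes using orjson."""
        return orjson.loads(data)

    def dumps(obj) -> bytes:
        """Serialize to JSON; orjson.dumps returns bytes, not str."""
        return orjson.dumps(obj)

except ImportError:
    # Assumed fallback so the sketch still runs without orjson installed.
    import json

    def loads(data):
        """Parse JSON from str or bytes using the standard library."""
        return json.loads(data)

    def dumps(obj) -> bytes:
        """Serialize to JSON and encode to bytes to match orjson's return type."""
        return json.dumps(obj).encode("utf-8")
```

The main thing to watch when swapping is that `orjson.dumps` returns `bytes`, so code that expected a `str` from `json.dumps` needs a `.decode()` or has to write in binary mode.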

Conclusion

It is worth investing a bit of time in understanding your bottlenecks instead of simply moving to a higher service tier. I am pretty sure there is still room to make the code perform much better.
