Profiling CPU in Python
Code may behave differently from what the developer expects. I do not mean correctness here, but performance. A performance bottleneck is often easy to find by running a profiler, and cProfile is the built-in way to do that in Python.
I am going to run my binary pipeline code twice under the profiler: first on my local computer, where the main bottleneck is my internet connection, and then on an AWS ECS node, where the bottleneck should be the CPU.
I ran the script locally as I usually do, but wrapped in cProfile:
pipenv run python -m cProfile -s tottime index.py
And here are the local results:
The ECS run was changed in the same way: the script is wrapped with cProfile and the profiler output is written to S3.
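One way to wrap a run programmatically, rather than through the command line, is sketched below. It is only an illustration of the idea, not the code used for the ECS run: run_pipeline is a hypothetical stand-in for the real entry point, and the S3 upload is left out, since the exact mechanism is not described here. The report is built as a string so it can be sent wherever it is needed.

```python
import cProfile
import io
import pstats


def run_pipeline():
    # Hypothetical stand-in for the real pipeline entry point.
    return sum(i * i for i in range(100_000))


def profile_to_text(func, limit=20):
    """Run func under cProfile; return (result, report sorted by tottime)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func()
    finally:
        profiler.disable()

    # Render the top `limit` entries, ordered like `-s tottime` on the CLI.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("tottime").print_stats(limit)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_to_text(run_pipeline)
    # In a remote run the report would be stored (for example in S3)
    # instead of printed; that step is omitted in this sketch.
    print(report)
```

Returning the report as text keeps the profiling step independent of where the output ends up, so the same function works locally and on a remote node.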
The results show that, as expected, the slowest part of the local run was reading and writing (networking), while on ECS the bottleneck turned out to be JSON parsing, which also shows up in the local run.
After a few minutes of searching for a faster alternative to Python's built-in JSON parser, I found orjson. A few lines of changes and a rerun later, I got the following results:
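The kind of change involved can be sketched as follows. This is a minimal illustration, not the actual diff: parse_record and serialize_record are hypothetical helper names, and the fallback to the standard json module is an assumption added so the snippet still runs where orjson is not installed. One difference to keep in mind is that orjson.dumps returns bytes, while json.dumps returns str.

```python
import json

try:
    import orjson  # third-party package: pip install orjson
except ImportError:
    orjson = None  # fall back to the standard library parser


def parse_record(raw):
    """Parse one JSON document given as bytes or str."""
    if orjson is not None:
        return orjson.loads(raw)
    return json.loads(raw)


def serialize_record(record):
    """Serialize a record to UTF-8 encoded JSON bytes."""
    if orjson is not None:
        return orjson.dumps(record)  # already bytes
    return json.dumps(record).encode("utf-8")
```

Keeping the parser behind two small functions means the rest of the pipeline does not need to know which library is in use.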
The change did not bring an overall performance boost in the local run, although the numbers improved a bit. The ECS run improved a lot: the whole run took 832 seconds (instead of 1209), and JSON parsing took only 45 seconds (instead of 412).
It is worth investing a little time in understanding your bottlenecks instead of simply moving to a higher service tier. I am fairly sure the code could be made to perform even better.