FTP download and big files in Python

Recently I was playing with downloading some files and even uploading them to S3. It worked pretty well with files around 300 MB in size. Today I wanted to transfer 3.5 GB between FTP and S3, and the transfer never completed.

Background

The FTP protocol uses two types of connections: control and transfer (the latter is usually called the data connection). You send commands and receive responses over the control connection. When you store or retrieve data, you additionally use a transfer connection.

What happened?

The transfer was initiated over the control connection (RETR command), and from then on all traffic went over the transfer connection. After a few minutes, either the server or the client timed out the control connection because it was idle.
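
For illustration, here is a minimal sketch of the kind of download that runs into this, assuming the standard library's ftplib is used. The function name naive_download is hypothetical and not taken from my original code. retrbinary sends RETR once and then only reads from the transfer connection, so nothing travels over the control connection until the final reply is read.

```python
import ftplib
from typing import BinaryIO


def naive_download(ftp: ftplib.FTP, remote_path: str, local_file: BinaryIO,
                   blocksize: int = 64 * 1024) -> str:
    """Download remote_path in one retrbinary call (hypothetical helper)."""
    # RETR goes out on the control connection. After that, retrbinary only
    # reads from the transfer connection until the server closes it, and
    # only then reads the final reply from the control connection.
    # For a multi-gigabyte file the control connection sits idle that whole time.
    return ftp.retrbinary("RETR " + remote_path, local_file.write,
                          blocksize=blocksize)
```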

Solution

Generally you should avoid timeouts. To do that, you need to keep communicating over the control connection so that it stays open. FTP supports the NOOP command, which I send after every 32 MB of received data.
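
The idea can be sketched roughly as follows, assuming ftplib again. This is an illustration under stated assumptions, not my exact code: the function name download_with_keepalive and the chunk size are made up for the example. It also assumes the server answers every NOOP, possibly only after the transfer has finished, so the replies are read at the end. putcmd and getresp exist in ftplib but are not covered in its official documentation.

```python
import ftplib
from typing import BinaryIO

NOOP_INTERVAL = 32 * 1024 * 1024  # send NOOP after every 32 MB received
CHUNK_SIZE = 64 * 1024            # assumed read size, not from the article


def download_with_keepalive(ftp: ftplib.FTP, remote_path: str,
                            local_file: BinaryIO,
                            noop_interval: int = NOOP_INTERVAL,
                            chunk_size: int = CHUNK_SIZE) -> int:
    """Download remote_path, sending NOOP periodically; return bytes received."""
    ftp.voidcmd("TYPE I")  # binary mode
    total = 0
    since_noop = 0
    pending_noops = 0
    # transfercmd sends RETR on the control connection and returns
    # the socket of the transfer connection.
    conn = ftp.transfercmd("RETR " + remote_path)
    try:
        while True:
            chunk = conn.recv(chunk_size)
            if not chunk:  # server closed the transfer connection
                break
            local_file.write(chunk)
            total += len(chunk)
            since_noop += len(chunk)
            if since_noop >= noop_interval:
                # putcmd only sends the command. It does not wait for the
                # reply, which some servers delay until the transfer ends.
                ftp.putcmd("NOOP")
                pending_noops += 1
                since_noop = 0
    finally:
        conn.close()
    # Assumption: one reply for the finished transfer plus one per NOOP.
    for _ in range(pending_noops + 1):
        ftp.getresp()
    return total
```

Calling it might look like download_with_keepalive(ftp, "big.bin", f), where ftp is a logged-in ftplib.FTP instance, f is a file opened with mode "wb", and the file name is hypothetical. How servers reply to NOOP during a transfer varies, so the reply handling at the end may need adjusting for a particular server.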
