FTP download and big files in Python

Adrian Macal
Nov 11, 2020

Recently I was playing with downloading some files and even uploading them to S3. It worked pretty well with files of around 300 MB. Today I wanted to transfer 3.5 GB between FTP and S3, and the transfer did not complete.

Background

The FTP protocol uses two types of connections: control and transfer (the FTP specification calls the latter the data connection). You send commands and receive responses over the control connection. When you store or retrieve data, you additionally use a transfer connection.
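To make the split between the two connections concrete, here is a minimal sketch using Python's standard ftplib. The function name `read_remote_file` and its parameters are made up for illustration, and it is only meant for small files, since it keeps the whole content in memory.

```python
import ftplib


def read_remote_file(ftp: ftplib.FTP, remote_path: str, blocksize: int = 8192):
    """Return (content, final_reply) for a small remote file."""
    # Control connection: the command goes out, the reply comes back.
    ftp.voidcmd("TYPE I")  # switch to binary mode

    # RETR is sent over the control connection; transfercmd also opens
    # the separate transfer connection and returns its socket.
    conn = ftp.transfercmd("RETR " + remote_path)

    chunks = []
    try:
        # Transfer connection: only file bytes travel here.
        while True:
            data = conn.recv(blocksize)
            if not data:  # the server closed the transfer connection
                break
            chunks.append(data)
    finally:
        conn.close()

    # Back on the control connection: the server confirms completion.
    final_reply = ftp.voidresp()
    return b"".join(chunks), final_reply
```

Note that between the RETR command and the final reply nothing is sent over the control connection, no matter how long the loop runs.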

What happened?

The transfer was initiated over the control connection (the RETR command), and from then on all the traffic went over the transfer connection. After a few minutes either the server or the client timed out the control connection, because it was sitting idle.

Solution

Generally, you should avoid timeouts. To do so, you need to keep communicating over the control channel so that the connection stays open. The FTP protocol supports the NOOP command, which I send after every 32 MB of received data.
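The idea can be sketched with ftplib roughly as follows. Treat it as one possible shape under stated assumptions, not a definitive implementation: the function name `download_with_keepalive` and the `write_chunk` callback are hypothetical, and `putcmd` and `getresp` are ftplib helpers that exist but are not part of its documented interface. The sketch also assumes the server answers every NOOP with a success reply, either during the transfer or right after it.

```python
import ftplib

NOOP_INTERVAL = 32 * 1024 * 1024  # 32 MB between keep-alive commands


def download_with_keepalive(ftp: ftplib.FTP, remote_path: str, write_chunk,
                            noop_every: int = NOOP_INTERVAL,
                            blocksize: int = 64 * 1024) -> int:
    """Stream remote_path to write_chunk, sending NOOP on the control
    connection after every noop_every received bytes. Returns the total
    number of bytes received."""
    ftp.voidcmd("TYPE I")  # binary mode
    conn = ftp.transfercmd("RETR " + remote_path)

    total = 0
    since_noop = 0
    noops_sent = 0
    try:
        while True:
            data = conn.recv(blocksize)
            if not data:  # end of file
                break
            write_chunk(data)
            total += len(data)
            since_noop += len(data)
            if since_noop >= noop_every:
                # Send NOOP without waiting for the reply (assumption:
                # some servers answer only once the transfer is over,
                # so waiting here could stall the download).
                ftp.putcmd("NOOP")
                noops_sent += 1
                since_noop = 0
    finally:
        conn.close()

    # Drain the control connection: one reply for the finished RETR
    # plus one per NOOP. getresp raises on 4xx and 5xx replies.
    for _ in range(noops_sent + 1):
        ftp.getresp()
    return total
```

With a connected and logged-in `ftplib.FTP` object and a file opened in binary write mode, a call could look like `download_with_keepalive(ftp, "big.bin", f.write)`, where `big.bin` is a placeholder name. Because `write_chunk` is just a callable, the same loop could feed an S3 multipart upload instead of a local file.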

Written by Adrian Macal

Software Developer, Data Engineer with solid knowledge of Business Intelligence. Passionate about programming.
