AWS S3 Last Modified date

File systems tend to expose creation and modification dates for each entry. AWS S3 is not a file system, but it exposes a “Last Modified” date, which is a bit confusing because an S3 object cannot be modified in place; it can only be overwritten.

Goal

The goal of the experiment is to figure out how the “Last Modified” value really behaves, especially in the multipart upload scenario. I have an output file that I am going to write in 128 MB parts over my slow internet connection. Uploading 1 GB is going to take several minutes.
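A setup like this could be sketched roughly as follows. This is not the code used in the experiment, only a minimal illustration: the function names `read_parts` and `timed_multipart_upload` are hypothetical, and `client` is assumed to be a boto3 S3 client (for example `boto3.client("s3")`). Every timestamp is taken from the local clock, so it can differ slightly from the clock S3 uses.

```python
from datetime import datetime, timezone

PART_SIZE = 128 * 1024 * 1024  # 128 MB, the part size used in the experiment


def read_parts(stream, part_size=PART_SIZE):
    """Yield consecutive chunks of at most part_size bytes from a binary stream."""
    while True:
        chunk = stream.read(part_size)
        if not chunk:
            return
        yield chunk


def timed_multipart_upload(client, bucket, key, stream, part_size=PART_SIZE):
    """Upload stream as a multipart object; return a list of (event, UTC time)."""
    events = []

    def mark(name):
        # Local clock, recorded right after (or right before) each S3 call.
        events.append((name, datetime.now(timezone.utc)))

    upload_id = client.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    mark("multipart upload created")

    parts = []
    for number, chunk in enumerate(read_parts(stream, part_size), start=1):
        mark(f"part {number} started")
        response = client.upload_part(
            Bucket=bucket, Key=key, UploadId=upload_id,
            PartNumber=number, Body=chunk,
        )
        mark(f"part {number} completed")
        parts.append({"PartNumber": number, "ETag": response["ETag"]})

    client.complete_multipart_upload(
        Bucket=bucket, Key=key, UploadId=upload_id,
        MultipartUpload={"Parts": parts},
    )
    mark("multipart upload completed")
    return events
```

Passing the client in as an argument keeps the sketch independent of credentials and makes it easy to try against a stub. Recording a "started" event for each part is a deliberate choice, since that is exactly the timestamp that is easy to forget to log.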

Log

The process created the multipart upload at 14:00, the first part completed at 14:01, and the multipart upload was completed at 14:07.

Inconclusive

When I checked the object with the AWS CLI, I got the following output:

The value did not match any of the logged timestamps, but my logs do not capture everything: they do not record when the first part started being sent.

Again

I changed the code to sleep an extra 30 seconds between creating the multipart upload and starting to send the first part, and got the following logs and object metadata.

The value is still not an exact match, but I am pretty sure it corresponds to creating the multipart upload rather than to the start of uploading the first part.
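Matching the “Last Modified” value against a log by eye is error-prone, so the comparison could be done by a small helper like the one below. It is only a sketch: `closest_event` is a hypothetical name, the events are assumed to be (name, timezone-aware datetime) pairs, and the timestamps in the usage part are made up for illustration. They are not the logs from this experiment.

```python
from datetime import datetime, timezone


def closest_event(last_modified, events):
    """Return (name, time, offset) for the logged event nearest to last_modified.

    offset is last_modified minus the event time, so a positive offset means
    that S3 reported a time later than the locally logged one.
    """
    name, when = min(events, key=lambda event: abs(event[1] - last_modified))
    return name, when, last_modified - when


if __name__ == "__main__":
    # Made-up timestamps, for illustration only.
    utc = timezone.utc
    events = [
        ("multipart upload created", datetime(2024, 1, 1, 14, 0, 5, tzinfo=utc)),
        ("part 1 started", datetime(2024, 1, 1, 14, 0, 35, tzinfo=utc)),
        ("part 1 completed", datetime(2024, 1, 1, 14, 1, 40, tzinfo=utc)),
    ]
    last_modified = datetime(2024, 1, 1, 14, 0, 6, tzinfo=utc)
    print(closest_event(last_modified, events))
```

An exact match should not be expected: the local clock and the S3 clock are not synchronized, and the request itself takes time, so the nearest event with a small offset is the most that such a comparison can show.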

One shot

I am a bit confused, because I do not know what to expect from a single upload made by calling the put object method. I ran another test to verify it.

The outcome suggests that “Last Modified” roughly represents the upload start timestamp. At least, that is very consistent with the previous experiments.
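A single-upload check of this kind could look roughly like the sketch below. Again, this is an illustration rather than the code behind the test: `timed_put_object` is a hypothetical name and `client` is assumed to be a boto3 S3 client. The function records the local time just before and just after `put_object`, then reads `LastModified` back with `head_object`.

```python
from datetime import datetime, timezone


def timed_put_object(client, bucket, key, body):
    """Upload body in one request and relate LastModified to the local timings."""
    started = datetime.now(timezone.utc)
    client.put_object(Bucket=bucket, Key=key, Body=body)
    finished = datetime.now(timezone.utc)

    # With boto3, LastModified is expected to be a timezone-aware datetime.
    last_modified = client.head_object(Bucket=bucket, Key=key)["LastModified"]
    return {
        "started": started,
        "finished": finished,
        "last_modified": last_modified,
        # Small value here: LastModified sits near the start of the upload.
        "after_start": last_modified - started,
        # Small value here: LastModified sits near the end of the upload.
        "before_finish": finished - last_modified,
    }
```

For the result to be readable, the upload has to take noticeably longer than any clock difference between the local machine and S3, which a large file on a slow connection provides.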

Lesson learned

I now know the meaning of the “Last Modified” value, and I should not use it to assume that one file became available earlier than another. The column can only indicate when the upload of the file started.

Software Developer, Data Engineer with solid knowledge of Business Intelligence. Passionate about programming.