Ever used Dropbox or Google Drive to upload files?
But have you ever wondered how an upload actually happens?
1. Most file-sharing systems also support uploads of big files, 2 GB or more. Now, suppose you have to build a system that supports such uploads. A simple approach would be to read the file as a stream of bytes, load it entirely into the server's memory, and then finally store it in the desired location.
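The naive approach above can be sketched in a few lines. This is a hypothetical handler, not any real service's code; the point is that `stream.read()` pulls the entire file into server memory before anything is written out:

```python
import io

def naive_upload(stream: io.BufferedIOBase, dest_path: str) -> int:
    """Naive upload: buffer the whole file in memory, then persist it.

    Fine for small files, but a 2 GB upload means ~2 GB of RAM held
    for the duration of the request.
    """
    data = stream.read()  # entire file loaded into server memory at once
    with open(dest_path, "wb") as f:
        f.write(data)
    return len(data)
```

With many concurrent uploads, this memory cost multiplies per request, which is exactly the problem described next.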
2. There is an issue with this approach though: you are loading a very big file into memory. Suppose multiple users are uploading these big files at the same time; the infrastructure required to build such a system would be very expensive. Uploading big files also takes more time, because multithreading cannot be applied: the whole file has to be uploaded in a single thread. How do we prevent this?
3. One simple solution that comes to mind is to break the file into multiple small parts. But who will do this job? Will it happen on the frontend or the backend? And if on the frontend, how will the parts be uploaded to the backend?
4. The job of breaking the file into parts is usually done by the client, which is either a desktop client or a web app. A hash of each part is also computed, and this metadata is sent to a message queuing service like SQS, which in turn updates it in the metadata database.
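The client-side chunking and hashing step could look something like the sketch below. The 4 MB chunk size and SHA-256 are assumptions for illustration; real services pick their own block size and hash function:

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # assumed 4 MB parts; real systems vary

def split_and_hash(path: str, chunk_size: int = CHUNK_SIZE):
    """Yield (index, sha256_hex, chunk_bytes) for each part of the file.

    The (index, hash) pairs form the metadata the client would publish
    to a queue like SQS; the chunk bytes go to the upload API.
    """
    with open(path, "rb") as f:
        index = 0
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield index, hashlib.sha256(chunk).hexdigest(), chunk
            index += 1
```

Note the file is read one chunk at a time, so the client never holds more than one part in memory.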
5. The client then directly calls the upload API of Google Drive and uploads each chunk to the cloud using a streaming upload API, which can be a simple REST API.
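A minimal sketch of that per-chunk upload call is shown below. The URL, header names, and transport are all hypothetical (real services like Google Drive define their own resumable-upload protocol); the transport function is injected so it could be a `requests.Session.put` in practice:

```python
def upload_chunk(send, upload_url: str, chunk: bytes, index: int, sha256_hex: str):
    """Upload one chunk via an injected HTTP transport.

    `send(url, data=..., headers=...)` is assumed to behave like a
    requests-style PUT/POST call. Header names here are illustrative,
    not any real API's contract.
    """
    headers = {
        "Content-Range": f"part={index}",   # assumed part-numbering scheme
        "X-Chunk-SHA256": sha256_hex,       # lets the server verify integrity
    }
    return send(upload_url, data=chunk, headers=headers)
```

Sending the hash alongside each part lets the server reject corrupted chunks and ask the client to retry only that part.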
6. Desktop clients, like Dropbox's, run a background process that keeps watching a particular file for updates. When the file changes, the client can determine which chunk was updated, update only that chunk's entry in the metadata database, and re-upload only that chunk to storage.
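Determining which chunk changed falls out naturally from the per-chunk hashes: re-hash the file chunk by chunk and compare against the stored metadata. A sketch, again assuming SHA-256 and a fixed chunk size:

```python
import hashlib

def changed_chunks(old_hashes: list, path: str, chunk_size: int = 4 * 1024 * 1024):
    """Return the indices of chunks whose hash differs from stored metadata.

    `old_hashes` is the list of per-chunk hashes recorded at the last
    sync. Only the returned indices need to be re-uploaded.
    """
    changed = []
    with open(path, "rb") as f:
        i = 0
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            h = hashlib.sha256(chunk).hexdigest()
            if i >= len(old_hashes) or old_hashes[i] != h:
                changed.append(i)  # new or modified chunk
            i += 1
    return changed
```

So editing a few bytes in the middle of a 2 GB file costs one chunk's re-upload, not the whole file. (A caveat: inserting bytes shifts every later chunk boundary, which is why some systems use content-defined chunking instead of fixed-size blocks.)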