Resumable Uploads Using tus
Large file uploads are supported automatically in skynet-nodejs. Any file over 40MB automatically uses the built-in tus upload client.
The basic upload endpoint is
Upon making a POST request to the above endpoint, the response will contain a location header giving the URL to which PATCH requests are made. It should address a specific folder and have an upload identifier like this:
When the upload is complete, make a HEAD request to the location address mentioned above. The response should have a skynet-skylink header containing the upload's skylink.
We recommend using a chunk size matching the following value, which works out to 40 MiB:
const TUS_CHUNK_SIZE = (1 << 22) * 10
All requests should have the Tus-Resumable header set to the value of
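The request sequence described above can be sketched as three small builders. This is a minimal sketch, not the skynet-nodejs implementation: the function names, the `endpoint` argument, and any URLs passed in are hypothetical, and the `Upload-Length`, `Upload-Offset`, and `application/offset+octet-stream` details are taken from the tus 1.0.0 specification rather than from this page. The builders only describe requests; sending them is left to whichever HTTP client you use.

```javascript
// Protocol version sent with every request (tus 1.0.0).
const TUS_VERSION = "1.0.0";

// Step 1: POST to the upload endpoint to create the upload.
// `endpoint` is a placeholder; substitute your portal's tus endpoint.
function buildCreateRequest(endpoint, fileSize) {
  return {
    method: "POST",
    url: endpoint,
    headers: {
      "Tus-Resumable": TUS_VERSION,
      "Upload-Length": String(fileSize),
    },
  };
}

// Step 2: PATCH one chunk to the URL returned in the location header.
function buildPatchRequest(location, offset, chunk) {
  return {
    method: "PATCH",
    url: location,
    headers: {
      "Tus-Resumable": TUS_VERSION,
      "Upload-Offset": String(offset),
      "Content-Type": "application/offset+octet-stream",
    },
    body: chunk,
  };
}

// Step 3: HEAD the same URL once the upload is complete; the response
// should carry the skynet-skylink header.
function buildSkylinkRequest(location) {
  return {
    method: "HEAD",
    url: location,
    headers: { "Tus-Resumable": TUS_VERSION },
  };
}
```

Keeping the builders free of network calls makes the sequence easy to inspect and to plug into any client, for example by passing each object's `method`, `headers`, and `body` to `fetch`.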
Chris has written a full description of the tus protocol on Skynet, reproduced below.
Unstable connections are a major concern when uploading files to a server, especially when uploading large files over HTTP, which has no built-in support for resuming an upload that failed halfway through. You might be uploading a 50GB recording to share with your friends, only to have your connection drop for a second after 49GB and be forced to start all over again. Many companies offering data storage have their own solution to this problem built into their APIs, SDKs, or sync clients. For Skynet, we have implemented TUS, an open protocol for resumable uploads.
TUS is an open, minimalistic, and extensible protocol that any client or server can implement. It splits features into so-called extensions. Apart from the core protocol, clients and servers can implement any number of extensions but are encouraged to implement as many as possible.
Each upload has a unique ID used by the core protocol to get information about the upload from the server in case the upload was interrupted. The following example from the official documentation demonstrates how the core protocol resumes uploads. The example assumes that a 100-byte upload was created and its ID obtained using the Creation extension but was interrupted after 70 bytes.
The first request after the interruption is a HEAD request using the upload’s id. It also contains the version of the protocol used by the client.
HEAD /files/24e533e02ec3bc40c387f1a0e460e216 HTTP/1.1
Tus-Resumable: 1.0.0
The server responds with the last known offset of the upload and its version.
HTTP/1.1 200 OK
Upload-Offset: 70
Tus-Resumable: 1.0.0
Afterward, the client uses a PATCH request to send the remaining 30 bytes starting at offset 70. If the offset doesn’t match the server’s expectation, it will return an error.
PATCH /files/24e533e02ec3bc40c387f1a0e460e216 HTTP/1.1
Upload-Offset: 70
Tus-Resumable: 1.0.0

[remaining 30 bytes]
Upon success, the server responds with the new offset and a status code 204.
HTTP/1.1 204 No Content
Upload-Offset: 100
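The resume step from the example above can be sketched as a single function that takes the offset reported by the HEAD response and builds the follow-up PATCH request. This is an illustrative sketch only: `buildResumeRequest` and its parameters are hypothetical names, and the header set follows the tus 1.0.0 specification. It assumes the whole file is available in memory as a `Buffer` or `Uint8Array`.

```javascript
// Build the PATCH request that resumes an interrupted upload.
// `data` is the complete file; `serverOffset` is the numeric value of the
// Upload-Offset header returned by the HEAD request.
function buildResumeRequest(uploadUrl, data, serverOffset) {
  if (
    !Number.isInteger(serverOffset) ||
    serverOffset < 0 ||
    serverOffset > data.length
  ) {
    throw new RangeError(
      `invalid offset ${serverOffset} for ${data.length} bytes`
    );
  }
  // Only the bytes the server has not yet stored are sent.
  const remaining = data.subarray(serverOffset);
  return {
    method: "PATCH",
    url: uploadUrl,
    headers: {
      "Tus-Resumable": "1.0.0",
      "Upload-Offset": String(serverOffset),
      "Content-Type": "application/offset+octet-stream",
      "Content-Length": String(remaining.length),
    },
    body: remaining,
  };
}
```

For the 100-byte upload interrupted after 70 bytes, calling `buildResumeRequest(url, data, 70)` yields a request whose body holds the last 30 bytes and whose `Upload-Offset` header is `70`.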
Usually, a successful upload will consist of multiple sequential PATCH calls until the upload is finished. The amount of data uploaded with each call is called the chunk size. The larger the chunk size, the fewer PATCH calls are needed to finish the upload, but as they get larger, the chance that a chunk gets interrupted and needs to be reuploaded increases as well. Different services may place additional restrictions on the chunk size or the maximum allowed file size. For example, an object storage provider might enforce a minimum chunk size.
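To make the relationship between chunk size and the number of PATCH calls concrete, here is a small sketch that splits an upload into sequential chunks. The `planChunks` helper is hypothetical and not part of any tus client; it only computes offsets and lengths.

```javascript
// Split an upload of `totalSize` bytes into sequential PATCH-sized chunks.
// Returns one { offset, length } entry per PATCH call.
function planChunks(totalSize, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks = [];
  for (let offset = 0; offset < totalSize; offset += chunkSize) {
    // The final chunk may be shorter than chunkSize.
    chunks.push({ offset, length: Math.min(chunkSize, totalSize - offset) });
  }
  return chunks;
}
```

With a 100-byte upload, a chunk size of 30 needs four PATCH calls (the last one carrying 10 bytes), while a chunk size of 100 needs only one.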
As mentioned above, a server can implement a set of extensions to add features. Not every extension may be suitable for every service. The following extensions exist at the time of writing.
- Creation: Creates a new upload
- Creation with Upload: Creates a new upload with initial payload
- Expiration: Set expiration for unfinished uploads
- Checksum: Consistency check for PATCH requests
- Termination: Client-side termination of completed/unfinished uploads
- Concatenation: Combination of uploads to enable parallel uploading
Right now, Skynet implements the Creation and Creation with Upload extensions without deferrable upload sizes. While it doesn’t currently support the full Expiration extension, it will prune failed uploads right away and uploads after 20 minutes.
Since Skynet uses a special way of uploading and requesting files, we introduced some adaptations to the protocol.
1. Since Skylinks are not known until after the upload, we use a temporary upload ID for all uploads. After the upload is complete, the Skylink can be obtained from the “Upload-Metadata” response header that contains all the TUS-related metadata. The key is “Skylink”, and the value is the base64-encoded Skylink. It is base64-encoded because that’s what TUS requires for its metadata.
2. Skynet uploads data in chunks. The size of these chunks depends on the erasure coding settings specified for the fanout and on the specified encryption type. The formula for the size of these chunks is
chunkSize := (4MiB - encryptionOverhead) * fanoutDataPieces
By default, data uploaded to Skynet uses 10 data pieces for its fanout and Threefish for encryption, which doesn’t have any overhead. As a result, the default chunk size is 40MiB. Since portals have limited amounts of RAM, they can’t keep these chunks in memory while waiting for users to resume their uploads. That’s why the chunk size specified in TUS needs to be a multiple of the Skynet chunk size. As long as they match, the portal can upload the chunks and free up memory while waiting for more data.
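Both adaptations can be sketched in a few lines of Node.js. This is a hedged illustration, not Skynet's own code: `parseUploadMetadata`, `skynetChunkSize`, and `isValidTusChunkSize` are hypothetical helper names, the metadata parsing assumes the tus 1.0.0 format of comma-separated `key base64value` pairs, and `encryptionOverhead` is assumed to be given in bytes.

```javascript
// Parse a tus Upload-Metadata header into a plain object of decoded values.
// Assumes the tus format: "key1 base64value1,key2 base64value2".
function parseUploadMetadata(header) {
  const metadata = {};
  for (const pair of header.split(",")) {
    const [key, value = ""] = pair.trim().split(" ");
    if (key) {
      metadata[key] = Buffer.from(value, "base64").toString("utf8");
    }
  }
  return metadata;
}

const MIB = 1 << 20;

// chunkSize := (4MiB - encryptionOverhead) * fanoutDataPieces
// Defaults mirror the text: no overhead (Threefish) and 10 data pieces.
function skynetChunkSize(encryptionOverhead = 0, fanoutDataPieces = 10) {
  return (4 * MIB - encryptionOverhead) * fanoutDataPieces;
}

// A TUS chunk size is usable if it is a positive multiple of the
// Skynet chunk size.
function isValidTusChunkSize(tusChunkSize, skynetChunk = skynetChunkSize()) {
  return tusChunkSize > 0 && tusChunkSize % skynetChunk === 0;
}
```

With the defaults, `skynetChunkSize()` equals `(1 << 22) * 10`, that is 40 MiB, so TUS chunk sizes of 40 MiB or 80 MiB pass the check while 10 MiB does not. After an upload completes, `parseUploadMetadata(headerValue).Skylink` gives the decoded Skylink.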
Adapting the TUS protocol for resumable uploads allows users and developers to upload much larger files to Skynet without worrying about sudden interruptions causing their uploads to fail. Due to the open protocol, developers can leverage a wide ecosystem of existing SDKs and applications for building their Skapps or uploading their collections of large files to Skynet.