This page discusses resumable uploads in Cloud Storage. Resumable uploads are the recommended method for uploading large files, because you don't have to restart them from the beginning if there is a network failure while the upload is underway.
Introduction
A resumable upload lets you resume data transfer operations to Cloud Storage after a communication failure has interrupted the flow of data. Resumable uploads work by sending multiple requests, each of which contains a portion of the object you're uploading. This is different from a single-request upload, which contains all of the object's data in a single request and must restart from the beginning if it fails part way through.
Use a resumable upload if you are uploading large files or uploading over a slow connection. For example file-size cutoffs that indicate when to use resumable uploads, see upload size considerations.
- A resumable upload must be completed within a week of being initiated, but can be cancelled at any time.
- Only a completed resumable upload appears in your bucket and, if applicable, replaces an existing object with the same name. The creation time for the object is based on when the upload completes.
- Object metadata set by the user is specified in the initial request and is applied to the object once the upload completes. The JSON API also supports setting custom metadata in the final request if you include headers prefixed with X-Goog-Meta- in that request.
- A completed resumable upload is considered one Class A operation.
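As a sketch of the final-request metadata option described above, the following Python snippet builds the headers a client might attach to the final PUT of a JSON API resumable upload. The helper function and the example metadata values are hypothetical; only the X-Goog-Meta- prefix and the Content-Range format come from the API.

```python
def final_put_headers(total_size, custom_metadata):
    """Headers for the final PUT of a JSON API resumable upload.

    Custom metadata entries become X-Goog-Meta-* headers, and the
    Content-Range covers the whole object, e.g. bytes 0-1023/1024.
    """
    headers = {
        "Content-Length": str(total_size),
        "Content-Range": f"bytes 0-{total_size - 1}/{total_size}",
    }
    for key, value in custom_metadata.items():
        headers[f"X-Goog-Meta-{key}"] = value
    return headers

# Hypothetical example: a 1 KiB object with one custom metadata entry.
headers = final_put_headers(1024, {"color": "blue"})
```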
How tools and APIs use resumable uploads
Depending on how you interact with Cloud Storage, resumable uploads might be managed automatically on your behalf. This section describes the resumable upload behavior for different tools and provides guidance on configuring the appropriate buffer size for your application.
Console
The Google Cloud console manages resumable uploads automatically on your behalf. However, if you refresh or navigate away from the Google Cloud console while an upload is underway, the upload is cancelled.
Command line
The gcloud CLI uses resumable uploads in the gcloud storage cp and gcloud storage rsync commands when uploading data to Cloud Storage. If your upload is interrupted, you can resume it by running the same command that you used to start the upload. When resuming an upload that includes multiple files, use the --no-clobber flag to prevent re-uploading files that already completed successfully.
Client libraries
When performing resumable uploads, client libraries function as wrappers around the Cloud Storage JSON API.
C++
Functions in storage::Client have different upload behavior:
- Client::WriteObject() always performs a resumable upload.
- Client::InsertObject() always performs a simple or multipart upload.
- Client::UploadFile() can perform a resumable upload, simple upload, or multipart upload.
By default, UploadFile() performs a resumable upload when the object is larger than 20 MiB. Otherwise, it performs a simple upload or multipart upload. You can configure this threshold by setting MaximumSimpleUploadsSizeOption when creating a storage::Client. 8 MiB is the default buffer size, which you can modify with the UploadBufferSizeOption option.
The C++ client library uses a buffer size that's equal to the chunk size. The buffer size must be a multiple of 256 KiB (256 x 1024 bytes). When using WriteObject() and UploadFile(), you might want to consider the tradeoffs between upload speed and memory usage. Using small buffers to upload large objects can make the upload slow. For more information on the relationship between upload speed and buffer size for C++, see the detailed analysis in GitHub.
C#
When uploading, the C# client library always performs resumable uploads. You can initiate a resumable upload with CreateObjectUploader. The C# client library uses a buffer size that's equal to the chunk size. The default buffer size is 10 MB, and you can change this value by setting ChunkSize on UploadObjectOptions. The buffer size must be a multiple of 256 KiB (256 x 1024 bytes). Larger buffer sizes typically make uploads faster, but note that there's a tradeoff between speed and memory usage.
Go
By default, resumable uploads occur automatically when the file is
larger than 16 MiB. You change the cutoff for performing resumable
uploads with Writer.ChunkSize
. Resumable uploads are
always chunked when using the Go client library.
Multipart uploads occur when the object is smaller than
Writer.ChunkSize
or when Writer.ChunkSize
is set to 0, where
chunking becomes disabled. The Writer
is
unable to retry requests if ChunkSize
is set to 0.
The Go client library uses a buffer size that's equal to the chunk size.
The buffer size must be a multiple of 256 KiB (256 x 1024 bytes). Larger
buffer sizes typically make uploads faster, but note that there's a
tradeoff between speed and memory usage. If you're running several
resumable uploads concurrently, you should set Writer.ChunkSize
to a
value that's smaller than 16 MiB to avoid memory bloat.
Note that the object is not finalized in Cloud Storage until
you call Writer.Close()
and receive a success
response. Writer.Close
returns an error if the request isn't
successful.
Java
The Java client library has separate methods for multipart and resumable uploads. The following methods always perform a resumable upload:
Storage#createFrom(BlobInfo, java.io.InputStream, Storage.BlobWriteOption...)
Storage#createFrom(BlobInfo, java.io.InputStream, int, Storage.BlobWriteOption...)
Storage#createFrom(BlobInfo, java.nio.file.Path, Storage.BlobWriteOption...)
Storage#createFrom(BlobInfo, java.nio.file.Path, int, Storage.BlobWriteOption...)
Storage#writer(BlobInfo, Storage.BlobWriteOption...)
Storage#writer(java.net.URL)
The default buffer size is 15 MiB. You can set the buffer size either by using the WriteChannel#setChunkSize(int) method, or by passing in a bufferSize parameter to the Storage#createFrom method. The buffer size has a hard minimum of 256 KiB. When you call WriteChannel#setChunkSize(int), the buffer size is internally shifted to a multiple of 256 KiB.
Buffering for resumable uploads functions as a minimum flush threshold, where writes smaller than the buffer size are buffered until a write pushes the number of buffered bytes above the buffer size.
If uploading smaller amounts of data, consider using Storage#create(BlobInfo, byte[], Storage.BlobTargetOption...) or Storage#create(BlobInfo, byte[], int, int, Storage.BlobTargetOption...).
Node.js
Resumable uploads occur automatically. You can turn off resumable uploads by setting resumable on UploadOptions to false. Resumable uploads are automatically managed when using the createWriteStream method.

There is no default buffer size, and chunked uploads must be manually invoked by setting the chunkSize option on CreateResumableUploadOptions. If chunkSize is specified, the data is sent in separate HTTP requests, each with a payload of size chunkSize. If no chunkSize is specified and the library is performing a resumable upload, all data is streamed into a single HTTP request.
The Node.js client library uses a buffer size that's equal to the chunk size. The buffer size must be a multiple of 256 KiB (256 x 1024 bytes). Larger buffer sizes typically make uploads faster, but note that there's a tradeoff between speed and memory usage.
PHP
By default, resumable uploads occur automatically when the object size is larger than 5 MB. Otherwise, multipart uploads occur. This threshold cannot be changed. You can force a resumable upload by setting the resumable option in the upload function.

The PHP client library uses a buffer size that's equal to the chunk size. 256 KiB is the default buffer size for a resumable upload, and you can change the buffer size by setting the chunkSize property. The buffer size must be a multiple of 256 KiB (256 x 1024 bytes). Larger buffer sizes typically make uploads faster, but note that there's a tradeoff between speed and memory usage.
Python
Resumable uploads occur when the object is larger than 8 MiB, and multipart uploads occur when the object is smaller than 8 MiB. This threshold cannot be changed. The Python client library uses a buffer size that's equal to the chunk size. 100 MiB is the default buffer size used for a resumable upload, and you can change the buffer size by setting the blob.chunk_size property.

To always perform a resumable upload regardless of object size, use the class storage.BlobWriter or the method storage.Blob.open(mode='w'). For these methods, the default buffer size is 40 MiB. You can also use Resumable Media to manage resumable uploads.
The chunk size must be a multiple of 256 KiB (256 x 1024 bytes). Larger chunk sizes typically make uploads faster, but note that there's a tradeoff between speed and memory usage.
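To illustrate the 256 KiB constraint, here is a small helper, not part of the client library, that rounds a requested chunk size up to a valid multiple. With the google-cloud-storage library, you would assign the result to the blob.chunk_size property before uploading.

```python
CHUNK_QUANTUM = 256 * 1024  # chunk sizes must be multiples of 256 KiB

def valid_chunk_size(requested_bytes):
    """Round a requested chunk size up to the nearest 256 KiB multiple."""
    if requested_bytes <= 0:
        raise ValueError("chunk size must be positive")
    quanta = -(-requested_bytes // CHUNK_QUANTUM)  # ceiling division
    return quanta * CHUNK_QUANTUM

# Asking for 1 MB (1,000,000 bytes) rounds up to 4 quanta: 1,048,576 bytes.
chunk_size = valid_chunk_size(1_000_000)
```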
Ruby
The Ruby client library treats all uploads as non-chunked resumable uploads.
REST APIs
JSON API
The Cloud Storage JSON API uses a POST Object request that includes the query parameter uploadType=resumable to initiate the resumable upload. This request returns a session URI that you then use in one or more PUT Object requests to upload the object data.
For a step-by-step guide to building your own logic for resumable
uploading, see Performing resumable uploads.
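For orientation, the request that initiates a JSON API resumable upload can be assembled as in the sketch below. The bucket and object names are placeholders, and the X-Upload-Content-Type and X-Upload-Content-Length headers are optional headers that declare the content type and size of the object data to come.

```python
from urllib.parse import quote, urlencode

def initiation_request(bucket, object_name, content_type, total_size):
    """Describe the POST that starts a JSON API resumable upload.

    The response to this POST carries the session URI (in its
    Location header), which subsequent PUT requests use to send data.
    """
    query = urlencode({"uploadType": "resumable", "name": object_name})
    return {
        "method": "POST",
        "url": (f"https://storage.googleapis.com/upload/storage/v1/"
                f"b/{quote(bucket)}/o?{query}"),
        "headers": {
            "Content-Length": "0",  # no object data in the initial request
            "X-Upload-Content-Type": content_type,
            "X-Upload-Content-Length": str(total_size),
        },
    }

req = initiation_request("my-bucket", "my-file.jpg", "image/jpeg", 1024)
```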
XML API
The Cloud Storage XML API uses a POST Object request that includes the header x-goog-resumable: start to initiate the resumable upload. This request returns a session URI that you then use in one or more PUT Object requests to upload the object data.
For a step-by-step guide to building your own logic for resumable
uploading, see Performing resumable uploads.
Resumable uploads of unknown size
The resumable upload mechanism supports transfers where the file size is not known in advance. This can be useful for cases like compressing an object on-the-fly while uploading, since it's difficult to predict the exact file size for the compressed file at the start of a transfer. The mechanism is useful either if you want to stream a transfer that can be resumed after being interrupted, or if chunked transfer encoding does not work for your application.
For more information, see Streaming uploads.
Upload performance
Choosing session regions
Resumable uploads are pinned in the region where you initiate them. For example, if you initiate a resumable upload in the US and give the session URI to a client in Asia, the upload still goes through the US. To reduce cross-region traffic and improve performance, you should keep a resumable upload session in the region in which it was created.
If you use a Compute Engine instance to initiate a resumable upload, the instance should be in the same location as the Cloud Storage bucket you upload to. You can then use a geo IP service to pick the Compute Engine region to which you route customer requests, which helps keep traffic localized to a geo-region.
Uploading in chunks
If possible, avoid breaking a transfer into smaller chunks and instead upload the entire content in a single chunk. Avoiding chunking removes the added latency costs and operations charges of querying the persisted offset of each chunk, and it improves throughput. However, you should consider uploading in chunks when:
Your source data is being generated dynamically and you want to limit how much of it you need to buffer client-side in case the upload fails.
Your clients have request size limitations, as is the case for many browsers.
If you're using the JSON or XML API and your client receives an error, they can query the server for the persisted offset and resume uploading remaining bytes from that offset. The Google Cloud console, Google Cloud CLI, and client libraries handle this automatically on your behalf. See How tools and APIs use resumable uploads for more guidance on chunking for specific client libraries.
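If you do chunk an upload, each chunk is sent in its own PUT request with a Content-Range header describing which bytes it carries. The hypothetical helper below computes those ranges for an object of known total size:

```python
def chunk_ranges(total_size, chunk_size):
    """Yield (first_byte, last_byte, Content-Range header) per chunk PUT."""
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1
        yield start, end, f"bytes {start}-{end}/{total_size}"
        start = end + 1

# A 600,000-byte object sent in 256 KiB chunks requires three PUT requests.
ranges = list(chunk_ranges(600_000, 256 * 1024))
```

For an upload of unknown total size, intermediate requests use * in place of the total (for example, bytes 0-262143/*) until the final chunk.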
Considerations
This section is useful if you are building your own client that sends resumable upload requests directly to the JSON or XML API.
Session URIs
When you initiate a resumable upload, Cloud Storage returns a session URI, which you use in subsequent requests to upload the actual data. An example of a session URI in the JSON API is:
https://storage.googleapis.com/upload/storage/v1/b/my-bucket/o?uploadType=resumable&name=my-file.jpg&upload_id=ABg5-UxlRQU75tqTINorGYDgM69mX06CzKO1NRFIMOiuTsu_mVsl3E-3uSVz65l65GYuyBuTPWWICWkinL1FWcbvvOA
An example of a session URI in the XML API is:
https://storage.googleapis.com/my-bucket/my-file.jpg?upload_id=ABg5-UxlRQU75tqTINorGYDgM69mX06CzKO1NRFIMOiuTsu_mVsl3E-3uSVz65l65GYuyBuTPWWICWkinL1FWcbvvOA
This session URI acts as an authentication token: requests that use it don't need to be signed, and anyone who has it can upload data to the target bucket without any further authentication. Because of this, be judicious in sharing the session URI and only share it over HTTPS.
A session URI expires after one week but can be cancelled prior to expiring. If you make a request using a session URI that is no longer valid, you receive one of the following errors:
- A 410 Gone status code if it's been less than a week since the upload was initiated.
- A 404 Not Found status code if it's been more than a week since the upload was initiated.
In both cases, you have to initiate a new resumable upload, obtain a new session URI, and start the upload from the beginning using the new session URI.
Integrity checks
We recommend that you request an integrity check of the final uploaded object to be sure that it matches the source file. You can do this by calculating the MD5 digest of the source file and adding it to the Content-MD5 request header.
Checking the integrity of the uploaded file is particularly important if you are uploading a large file over a long period of time, because there is an increased likelihood of the source file being modified over the course of the upload operation.
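Note that the Content-MD5 header value is the base64 encoding of the binary MD5 digest, not the hexadecimal digest string. A minimal way to compute it in Python:

```python
import base64
import hashlib

def content_md5(data: bytes) -> str:
    """Return the Content-MD5 request header value:
    the base64-encoded binary MD5 digest of the object data."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")

# content_md5(b"hello world") == "XrY7u+Ae7tCTyyK7j1rNww=="
```

For a large file, you would compute the digest incrementally by feeding the file to hashlib.md5 in blocks rather than reading it into memory at once.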
Retries and resending data
Once Cloud Storage persists bytes in a resumable upload, those bytes cannot be overwritten, and Cloud Storage ignores attempts to do so. Because of this, you should not send different data when rewinding to an offset that you sent previously.
For example, say you're uploading a 100,000 byte object, and your connection is interrupted. When you check the status, you find that 50,000 bytes were successfully uploaded and persisted. If you attempt to restart the upload at byte 40,000, Cloud Storage ignores the bytes you send from 40,000 to 50,000. Cloud Storage begins persisting the data you send at byte 50,001.
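In JSON and XML API terms, the status check in this example is a PUT to the session URI with the header Content-Range: bytes */100000; a 308 response then carries a Range header such as bytes=0-49999 naming the persisted bytes. A small, hypothetical helper for turning that header into the next zero-based offset to send:

```python
def next_offset(range_header):
    """Return the next byte offset to upload, given the Range header
    from a 308 status-check response (e.g. 'bytes=0-49999').
    A missing Range header means no bytes were persisted yet."""
    if not range_header:
        return 0
    last_persisted = int(range_header.rsplit("-", 1)[-1])
    return last_persisted + 1

# With 'bytes=0-49999' persisted, resume at zero-based offset 50000
# (the 50,001st byte of the object).
```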
What's next
- Perform a resumable upload.
- Learn about retrying requests to Cloud Storage.
- Read about other types of uploads in Cloud Storage.