s5cmd

Parallel S3 and local filesystem execution tool.

Overview

s5cmd is a very fast S3 and local filesystem execution tool. It comes with support for a multitude of operations, including tab completion and wildcard support for files, which can be very handy in your object storage workflow when working with a large number of files.

There are already other utilities for working with S3 and similar object storage services, so it is natural to wonder what s5cmd has to offer that others don't.

In short, s5cmd is very fast. Thanks to Joshua Robinson for his study and experimentation with s5cmd; to quote his Medium post:

For uploads, s5cmd is 32x faster than s3cmd and 12x faster than aws-cli. For downloads, s5cmd can saturate a 40Gbps link (~4.3 GB/s), whereas s3cmd and aws-cli can only reach 85 MB/s and 375 MB/s respectively.

If you would like to know more about s5cmd's performance and the reasons for its speed, refer to the Benchmarks section.

Features

s5cmd supports a wide range of object management tasks, both for cloud storage services and local filesystems.

  • List buckets and objects
  • Upload, download or delete objects
  • Move, copy or rename objects
  • Set Server Side Encryption using AWS Key Management Service (KMS)
  • Set Access Control List (ACL) for objects/files on upload, copy, and move
  • Print object contents to stdout
  • Create buckets
  • Summarize object sizes, grouping by storage class
  • Wildcard support for all operations
  • Multiple arguments support for delete operation
  • Command file support to run commands in batches at very high execution speeds
  • Dry run support
  • S3 Transfer Acceleration support
  • Google Cloud Storage (and any other S3 API compatible service) support
  • Structured logging for querying command outputs
  • Shell auto-completion

Installation

Binaries

The Releases page provides pre-built binaries for Linux and macOS.

Homebrew

For macOS, a Homebrew tap is provided:

brew tap peak/s5cmd https://github.com/peak/s5cmd
brew install s5cmd

Build from source

You can build s5cmd from source if you have Go 1.13+ installed.

go get github.com/peak/s5cmd

⚠️ Please note that building from master is not guaranteed to be stable, since development happens on the master branch.

Docker

Hub

$ docker pull peakcom/s5cmd
$ docker run --rm -v ~/.aws:/root/.aws peakcom/s5cmd <S3 operation>
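
For example, to list a bucket with credentials mounted from ~/.aws (the bucket name below is an illustrative placeholder):

$ docker run --rm -v ~/.aws:/root/.aws peakcom/s5cmd ls s3://your-bucket/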

Build

$ git clone https://github.com/peak/s5cmd && cd s5cmd
$ docker build -t s5cmd .
$ docker run --rm -v ~/.aws:/root/.aws s5cmd <S3 operation>

Usage

s5cmd supports multiple-level wildcards for all S3 operations. This is achieved by listing all S3 objects with the prefix up to the first wildcard, then filtering the results in memory. For example, for the following command:

s5cmd cp 's3://bucket/logs/2020/03/*' .

first a ListObjects request is sent, then the copy operation is executed against each matching object, in parallel.

Examples

Download a single S3 object

s5cmd cp s3://bucket/object.gz .

Download multiple S3 objects

Suppose we have the following objects:

s3://bucket/logs/2020/03/18/file1.gz
s3://bucket/logs/2020/03/19/file2.gz
s3://bucket/logs/2020/03/19/originals/file3.gz

s5cmd cp 's3://bucket/logs/2020/03/*' logs/

s5cmd will match the given wildcards and arguments by doing an efficient search against the given prefixes. All matching objects will be downloaded in parallel. s5cmd will create the destination directory if it is missing.

logs/ directory content will look like:

$ tree
.
└── logs
    ├── 18
    │   └── file1.gz
    └── 19
        ├── file2.gz
        └── originals
            └── file3.gz

4 directories, 3 files

ℹ️ s5cmd preserves the source directory structure by default. If you want to flatten the source directory structure, use the --flatten flag.

s5cmd cp --flatten 's3://bucket/logs/2020/03/*' logs/

logs/ directory content will look like:

$ tree
.
└── logs
    ├── file1.gz
    ├── file2.gz
    └── file3.gz

1 directory, 3 files

Upload a file to S3

s5cmd cp object.gz s3://bucket/

Setting server-side encryption (AWS KMS) for the file:

s5cmd cp -sse aws:kms -sse-kms-key-id <your-kms-key-id> object.gz s3://bucket/

Setting the Access Control List (ACL) policy of the object:

s5cmd cp -acl bucket-owner-full-control object.gz s3://bucket/

Upload multiple files to S3

s5cmd cp directory/ s3://bucket/

Will upload all files in the given directory to S3, keeping the folder hierarchy of the source.

Delete an S3 object

s5cmd rm s3://bucket/logs/2020/03/18/file1.gz

Delete multiple S3 objects

s5cmd rm s3://bucket/logs/2020/03/19/*

Will remove all matching objects:

s3://bucket/logs/2020/03/19/file2.gz
s3://bucket/logs/2020/03/19/originals/file3.gz

s5cmd utilizes the S3 delete batch API. Up to 1000 matching objects are deleted in a single request. However, note that commands such as

s5cmd rm s3://bucket-foo/object s3://bucket-bar/object

are not supported by s5cmd and result in an error (since there are 2 different buckets), as this is at odds with the benefit of performing batch delete requests. If needed, the s5cmd run mode can be used for this case, i.e.,

$ s5cmd run
rm s3://bucket-foo/object
rm s3://bucket-bar/object

More details and examples on s5cmd run are presented in a later section.

Copy objects from S3 to S3

s5cmd supports copying objects on the server side as well.

s5cmd cp 's3://bucket/logs/2020/*' s3://bucket/logs/backup/

Will copy all the matching objects to the given S3 prefix, respecting the source folder hierarchy.

⚠️ Copying objects (from S3 to S3) larger than 5GB is not supported yet. We have an open ticket to track the issue.

Count objects and determine total size

$ s5cmd du --humanize 's3://bucket/2020/*'

30.8M bytes in 3 objects: s3://bucket/2020/*

Run multiple commands in parallel

The most powerful feature of s5cmd is the commands file. Thousands of S3 and filesystem commands can be declared in a file (or simply piped in from another process) and executed by multiple parallel workers. Since only one program is launched, thousands of unnecessary fork-exec calls are avoided. This way, S3 execution times can reach a few thousand operations per second.

s5cmd run commands.txt

or

cat commands.txt | s5cmd run

commands.txt content could look like:

cp s3://bucket/2020/03/* logs/2020/03/

# line comments are supported
rm s3://bucket/2020/03/19/file2.gz

# empty lines are OK too like above

# rename an S3 object
mv s3://bucket/2020/03/18/file1.gz s3://bucket/2020/03/18/original/file.gz

# list all buckets
ls # inline comments are OK too

Dry run

The --dry-run flag outputs which operations would be performed, without actually carrying them out. Given the following objects:

s3://bucket/pre/file1.gz
...
s3://bucket/last.txt

running

s5cmd --dry-run cp s3://bucket/pre/* s3://another-bucket/

will output

cp s3://bucket/pre/file1.gz s3://another-bucket/file1.gz
...
cp s3://bucket/pre/last.txt s3://another-bucket/last.txt

however, those copy operations will not be performed; it only displays what s5cmd would do if run without --dry-run.

Note that --dry-run can be used with any operation that has a side effect, e.g., cp, mv, rm, mb, ...
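
For instance, a bulk removal can be previewed before deleting anything (the path below is illustrative):

s5cmd --dry-run rm 's3://bucket/logs/2020/*'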

Specifying credentials

s5cmd uses the official AWS SDK to access S3. The SDK requires credentials to sign requests to AWS. Credentials can be provided in a variety of ways:

  • Environment variables
  • AWS credentials file
  • If s5cmd runs on an Amazon EC2 instance, EC2 IAM role
  • If s5cmd runs on EKS, Kube IAM role

The SDK detects and uses the built-in providers automatically, without requiring manual configuration.
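
As a minimal sketch, credentials can be supplied through the standard AWS environment variables before invoking s5cmd (values below are placeholders):

$ export AWS_ACCESS_KEY_ID=<your-access-key-id>
$ export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
$ export AWS_REGION=us-east-1
$ s5cmd ls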

Shell auto-completion

Shell completion is supported for bash, zsh and fish.

To enable auto-completion, run:

s5cmd --install-completion

This will add a few lines to your shell configuration file. After installation, restart your shell to activate the changes.

Google Cloud Storage support

s5cmd supports S3 API compatible services, such as GCS, MinIO or your favorite object storage.

s5cmd --endpoint-url https://storage.googleapis.com ls

will return your GCS buckets.

s5cmd uses virtual-host-style bucket resolution for S3, S3 Transfer Acceleration and GCS. If a custom endpoint is provided, it falls back to path-style addressing.
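
The same flag works for other S3 API compatible services; for example, a hypothetical self-hosted MinIO endpoint (the address and bucket name are illustrative) would be accessed with path-style addressing:

s5cmd --endpoint-url http://localhost:9000 ls s3://my-bucket/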

Retry logic

s5cmd uses an exponential backoff retry mechanism for transient or potential server-side throttling errors. Non-retriable errors, such as invalid credentials or authorization errors, will not be retried. By default, s5cmd retries 10 times within about a minute. The number of retries is adjustable via the --retry-count flag.

ℹ️ Enable debug-level logging to display retryable errors.
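
For example, to raise the retry limit and surface retried errors in the output (the count is illustrative):

s5cmd --log debug --retry-count 20 cp 's3://bucket/prefix/*' .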

Using wildcards

Most shells attempt to expand wildcards before passing the arguments to s5cmd, which can result in surprising no matches found errors.

To avoid this problem, surround the wildcarded expression with single quotes.
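
For example, of the following pair of commands, only the quoted form is guaranteed to hand the wildcard to s5cmd unchanged:

# may be expanded (or rejected) by the shell before s5cmd ever sees it
s5cmd cp s3://bucket/logs/2020/03/* .

# passed to s5cmd verbatim
s5cmd cp 's3://bucket/logs/2020/03/*' .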

Output

s5cmd supports both structured and unstructured outputs.

  • Unstructured output:
$ s5cmd cp s3://bucket/testfile .

cp s3://bucket/testfile testfile
$ s5cmd cp --no-clobber s3://somebucket/file.txt file.txt

ERROR "cp s3://somebucket/file.txt file.txt": object already exists
  • If the --json flag is provided:
{
    "operation": "cp",
    "success": true,
    "source": "s3://bucket/testfile",
    "destination": "testfile",
    "object": "[object]"
}
{
    "operation": "cp",
    "job": "cp s3://somebucket/file.txt file.txt",
    "error": "'cp s3://somebucket/file.txt file.txt': object already exists"
}

Benchmarks

Some benchmarks regarding the performance of s5cmd are presented below. For more details, refer to this post, which is the source of these benchmarks.

Upload/download of single large file

get/put performance graph

Uploading large number of small-sized files

multi-object upload performance graph

Performance comparison on different hardware

s3 upload speed graph

So, where does all this speed come from?

There are mainly two reasons for this:

  • It is written in Go, a statically compiled language designed to make development of concurrent systems easy and to make full use of multi-core processors.
  • Parallelization. s5cmd starts out with concurrent worker pools and parallelizes workloads as much as possible while trying to achieve maximum throughput.
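
The degree of parallelism is tunable from the command line; a hypothetical tuning example is shown below (the values are illustrative, and the cp-level --concurrency and --part-size flags are assumed to be available in your s5cmd version):

s5cmd --numworkers 512 cp --concurrency 10 --part-size 50 directory/ s3://bucket/prefix/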

Advanced Usage

Some of the advanced usage patterns below are inspired by the following article (thank you, @joshuarobinson!).

Integrate s5cmd operations with Unix commands

Assume we have a set of objects on S3, and we would like to list them sorted by object name.

$ s5cmd ls s3://bucket/reports/ | sort -k 4
2020/08/17 09:34:33              1364 antalya.csv
2020/08/17 09:34:33                 0 batman.csv
2020/08/17 09:34:33             23114 istanbul.csv
2020/08/17 09:34:33             26154 izmir.csv
2020/08/17 09:34:33               112 samsun.csv
2020/08/17 09:34:33             12552 van.csv

For a more practical scenario, let's say we have an avocado prices dataset, and we would like to take a peek at the first few lines of the data by fetching only the necessary bytes.

$ s5cmd cat s3://bucket/avocado.csv.gz | gunzip | xsv slice --len 5 | xsv table
    Date        AveragePrice  Total Volume  4046     4225       4770   Total Bags  Small Bags  Large Bags  XLarge Bags  type          year  region
0   2015-12-27  1.33          64236.62      1036.74  54454.85   48.16  8696.87     8603.62     93.25       0.0          conventional  2015  Albany
1   2015-12-20  1.35          54876.98      674.28   44638.81   58.33  9505.56     9408.07     97.49       0.0          conventional  2015  Albany
2   2015-12-13  0.93          118220.22     794.7    109149.67  130.5  8145.35     8042.21     103.14      0.0          conventional  2015  Albany
3   2015-12-06  1.08          78992.15      1132.0   71976.41   72.58  5811.16     5677.4      133.76      0.0          conventional  2015  Albany
4   2015-11-29  1.28          51039.6       941.48   43838.39   75.78  6183.95     5986.26     197.69      0.0          conventional  2015  Albany

Beast Mode s5cmd

s5cmd allows passing a file containing a list of operations as an argument to the run command, as illustrated in the example above. Alternatively, commands can be piped into run:

BUCKET=s5cmd-test; s5cmd ls s3://$BUCKET/*test | grep -v DIR | awk '{print $NF}' \
| xargs -I {} echo "cp s3://$BUCKET/{} /local/directory/" | s5cmd run

The above command performs two s5cmd invocations: the first searches for files with the test suffix, then a copy-to-local-directory command is generated for each matching file, and finally those commands are piped into run.

Let's examine another usage example, where we migrate files older than 30 days to cloud object storage:

find /mnt/joshua/nachos/ -type f -mtime +30 | xargs -I{} echo "mv {} s3://joshuarobinson/backup/{}" \
| s5cmd run

It is worth mentioning that the run command should not be considered a silver bullet for all operations. For example, assume we want to remove the following objects:

s3://bucket/prefix/2020/03/object1.gz
s3://bucket/prefix/2020/04/object1.gz
...
s3://bucket/prefix/2020/09/object77.gz

Rather than executing

rm s3://bucket/prefix/2020/03/object1.gz
rm s3://bucket/prefix/2020/04/object1.gz
...
rm s3://bucket/prefix/2020/09/object77.gz

with the run command, it is better to just use

rm s3://bucket/prefix/2020/0*/object*.gz

The latter sends a single delete request per thousand objects, whereas the former sends a separate delete request for each subcommand provided to run. Thus, there can be a significant runtime difference between the two approaches.

LICENSE

MIT. See LICENSE.

Issues
  • Unexpected subdirectory structure when running cp

    I'm running:

    AWS_REGION=my-region /home/ubuntu/go/bin/s5cmd cp -u -s --parents s3://my-bucket/my-subdirectory/.local/* /home/ubuntu/.local/

    I'm expecting the contents of .local inside my subdirectory to be copied into /home/ubuntu/.local/; instead, they are getting copied to /home/ubuntu/.local/my-subdirectory/.local.

    Is this expected behaviour? As per the command option documentation, the directory structure is created from the first wildcard onwards, and I recall it working like that in previous versions of s5cmd.

    Please advise

    opened by agonzalezv 13
  • Add "sync" command

    Would be nice to be able to have a smart sync command like s3cmd sync or aws s3 sync, which will upload the files that changed and remove the deleted ones.

    feature-request 
    opened by Trane9991 10
  • sync command not found

    Is sync currently disabled? I see the commit to main but I'm not having any luck...

    ❯ s5cmd version
    v1.4.0-d7a0dda
    ❯ s5cmd sync
    ERROR "sync": command not found
    
    opened by brokosz 9
  • Backblaze b2 doesn't work

    I attempted to use the ls command and got this ERROR

    ERROR "ls ": InvalidAccessKeyId: The key '[redacted]' is not valid status code: 403, request id: [redacted], host id: [redacted]
    

    But I've used other CLI applications, including backblaze-b2, goofys, and s3fs, and didn't get an InvalidAccessKeyId error.

    opened by worldofpeace 8
  • Support AWS SSO profiles

    I set up my environment variables and check that the session is valid using the AWS CLI.

    $ export AWS_PROFILE=sandbox-logging
    $ export AWS_DEFAULT_REGION=eu-west-1
    $ aws sts get-caller-identity
    {
        "UserId": "AROAXXXXXXXXXXXXXXXXX:iain",
        "Account": "111111111111",
        "Arn": "arn:aws:sts::111111111111:assumed-role/AWSReservedSSO_AdministratorAccess_aaaaaaaaaaaaaaaa/iain"
    }
    

    s5cmd seems to ignore my environment variables and instead tries to query the EC2 metadata service.

    $ s5cmd --log debug ls
    DEBUG retryable error: RequestError: send request failed
    caused by: Put "http://169.254.169.254/latest/api/token": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    DEBUG retryable error: RequestError: send request failed
    caused by: Put "http://169.254.169.254/latest/api/token": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    DEBUG retryable error: RequestError: send request failed
    caused by: Put "http://169.254.169.254/latest/api/token": dial tcp 169.254.169.254:80: connect: no route to host
    

    I was expecting output like this:

    $ aws s3 ls
    2021-11-23 15:29:46 aws-sam-cli-managed-default-samclisourcebucket-xxxxxxxxxxxx
    2021-11-16 17:47:47 cf-templates-xxxxxxxxxxxxx-eu-west-1
    2021-11-30 20:11:58 org-trail-xxxxxxxx
    2021-12-01 12:36:42 org-trail-yyyyyyyy
    ...
    

    I'm using an SSO profile. Does that matter?

    [profile sandbox-logging]
    sso_start_url = https://d-1111111111.awsapps.com/start
    sso_region = eu-west-1
    sso_account_id = 111111111111
    sso_role_name = AdministratorAccess
    region = eu-west-1
    
    opened by iainelder 7
  • Improved URL Parsing

    • Use shellquote to split lines in commands.txt files so filenames with spaces can be used
    • Escape regex characters in prefixes when initializing ObjectURLs. These may be a part of valid S3 prefixes.
    opened by brendan-matroid 7
  • Add --force-path-style flag to use path-style addressing.

    Many/most non-AWS object stores only support path-style addressing, but recent changes switched to virtual-host addressing which broke access to those object stores. This adds a flag to switch back to path style addressing.

    For more info, a virtual-host address looks like "http://bucketname.endpoint", resulting in a DNS lookup. Path-style addressing looks like "http://endpoint/bucketname" instead. The benefit of path-style addressing is simplicity; you do not need to interact with DNS servers.

    opened by joshuarobinson 7
  • Workers gets stuck after processing a single command when processing a list of commands

    I have a file "commands.txt" with a list of commands like this:

    cp -n s3://bucket1/file1 s3://bucket2/file1
    cp -n s3://bucket1/file2 s3://bucket2/file2
    cp -n s3://bucket1/file3 s3://bucket2/file3
    ...

    When I call s5cmd like

    s5cmd -numworkers 2 -f commands.txt

    I see output like

    2020/02/17 23:24:47 # Using 2 workers
    2020/02/17 23:24:47 +OK "cp s3://bucket1/file1 s3://bucket2/file1"
    2020/02/17 23:24:47 +OK "cp s3://bucket1/file2 s3://bucket2/file2"

    and then it gets stuck, until I hit Ctrl-C and see the following output

    2020/02/17 23:23:31 # Got signal, cleaning up...
    2020/02/17 23:23:31 # Exiting with code 0
    2020/02/17 23:23:31 # Stats: S3 3 0 ops/sec
    2020/02/17 23:23:31 # Stats: Total 3 0 ops/sec 1m19.084946532s

    The first 2 files are actually copied correctly; the rest of the commands from the file are not executed. The same happens when taking the commands from standard input.

    I just installed the latest version of s5cmd - v0.7.0.

    Deleting all the objects from the second bucket works fine with a command like:

    s5cmd rm s3://bucket2/*

    Any suggestions for workaround?

    Thanks.

    bug 
    opened by Boreas7 7
  • Warning: Calling bottle :unneeded is deprecated! There is no replacement.

    Just got a warning from brew outdated:

    Warning: Calling bottle :unneeded is deprecated! There is no replacement.
    Please report this issue to the peak/s5cmd tap (not Homebrew/brew or Homebrew/core):
      /home/linuxbrew/.linuxbrew/Homebrew/Library/Taps/peak/homebrew-s5cmd/Formula/s5cmd.rb:9
    
    opened by yermulnik 6
  • Got "read: connection reset by peer" error when using version 1.0.0+

    Tested on OpenStack Swift using s5cmd version 1.0.0 and 1.2.1 with the following command:

    s5cmd --endpoint-url http://objstor.testserver.com --no-verify-ssl  cp -n -s -u  aaaatest2 s3://aaaatest01/aaaajfg006/
    

    And got errors like this:

    ERROR "cp aaaatest2/2ms/2m_file_1 s3://aaaatest01/aaaajfg006/aaaatest2/2ms/2m_file_1": RequestError: send request failed caused by: Head "http://objstor.testserver.com/aaaatest01/aaaajfg006/aaaatest2/2ms/2m_file_1": read tcp xxx.xxx.xxx.xxx:38924->xxx.xxx.xxx.xxx:80: read: connection reset by peer
    

    However, it's all good using s5cmd v0.7.0 with the same command. No error returned at all.

    I also found an issue on the aws-sdk-go repo (see below) complaining about the same error.

    https://github.com/aws/aws-sdk-go/issues/3027

    Is it possible that the new version of aws-sdk-go causes the problem?

    opened by kevin-wyx 6
  • add cache-control and expires options

    Hello, this PR adds support for setting the cache-control and expires headers on S3 objects, same as the aws cli:

    s5cmd cp --acl "public-read" --expires "2024-10-01T20:30:00Z" --cache-control "public, max-age=345600, s-maxage=345600" foobar s3://example-bucket/foobar
    
    opened by tombokombo 5
  • Local file is lost even if the download fails

    Issue & steps to reproduce

    $ echo "Example text." > tmp
    $ cat tmp
    Example text.
    $ s5cmd cp s3://nonexistentbucket/key tmp
    ERROR "cp s3://nonexistentbucket/key tmp": InvalidAccessKeyId: The AWS Access Key Id you provided does not exist in our records. status code: 403, request id: PAYWDH7HHVPAHC8G, host id: cnI/cBhYZKKJ159f8mMs4KHOn1+jYtvfpYnXgMuKMRd9pZ10FBdi/cGuVxyr+iwrKbk2kP7Opx4=
    $ cat tmp
    cat: tmp: No such file or directory
    

    We would expect the local file to remain untouched since the download operation did not even start.

    Planned Solution

    Instead of creating the file before the download request, we can use a custom WriterAt which will create the file once the first write request comes. The following snippets are going to change: https://github.com/peak/s5cmd/blob/123b1c7fc9c614aa214a6795468fa140a38ad05e/command/cp.go#L511-L517 https://github.com/peak/s5cmd/blob/0431f503d99953e1809bf0e86d73750c0c1f561e/storage/fs.go#L208-L214

    opened by Kucukaslan 0
  • Migration aws sdk v2

    Working on migrating from aws-sdk-for-go to aws-sdk-for-go-v2

    Important changes:

    1. There is no session in aws-sdk-v2. Because of this, s5cmd will not have sessionCache anymore. Everything related to sessionCache including tests needs to be changed. This might create performance issues. I am still not sure how new sdk handles reuse of sessions. After completion, a benchmark will be made between this PR and another stable version to measure the new performance.
    2. There is no s3iface.S3API in v2. Instead of s3iface.S3API, s5cmd will have s3Client interface.
    3. As session structure has been changed, unit tests also require changes. Instead of unit.session, mockgen will be used for s3client interface.

    Changes worth mentioning:

    1. There is no WithDisableRestProtocolURICleaning setting anymore as v2 doesn't do any cleaning or url joining.
    2. The new SDK supports backoff delay. It might be useful to add it as an additional optional value in the future.

    Files that will be changed:

    1. s3.go
    2. s3_test.go
    3. util_test.go
    4. mb_test.go
    5. rb_test.go
    opened by boraberke 0
  • command: fix target file is created despite the download failure

    Before downloading a file from S3, a local target file is created. If the download fails, the created file should be deleted; but since the file was not closed, the delete operation fails on Windows.

    Fixes #348

    opened by Kucukaslan 0
  • add versioning support

    This commit adds partial versioning support to s5cmd.

    • add all-versions flag to following subcommands:
      • ls
      • rm
      • du
    • add version-id flag to following sub commands:
      • cp
      • cat
      • rm
      • du

    Fixes: #386 Fixes: #218

    opened by Kucukaslan 2
  • Add benchmark script to compare two different builds of s5cmd

    This PR adds a bench.py file under a new benchmark folder. This Python script allows us to compare the performance of two different builds (from either a version tag, PR number, or commit tag) under various scenarios. These scenarios include:

    1. Upload a large file
    2. Upload many small-sized files
    3. Download a large file
    4. Download many small-sized files
    5. Remove a large file
    6. Remove many small-sized files

    To change the scenarios, edit them inside bench.py for now. In the future, this could be read from a file. For each scenario, the user should not forget to adjust the file size and file count, keeping in mind the restrictions of their system.

    To run, use the following syntax:

    usage: bench.py [-h] -s OLD NEW [-w WARMUP] [-r RUNS] -b BUCKET [-p PREFIX] [-hf HYPERFINE_EXTRA_FLAGS] [-sf S5CMD_EXTRA_FLAGS]
    
    Compare performance of two different builds of s5cmd.
    
    optional arguments:
      -h, --help            show this help message and exit
      -s OLD NEW, --s5cmd OLD NEW
                            Reference to old and new s5cmd. It can be a decimal indicating PR number, any of the version tags like v2.0.0 or commit tag.
      -w WARMUP, --warmup WARMUP
                            Number of program executions before the actual benchmark:
      -r RUNS, --runs RUNS  Number of runs to perform for each command
      -b BUCKET, --bucket BUCKET
                            Name of the bucket in remote
      -p PREFIX, --prefix PREFIX
                            Key prefix to be used while uploading to a specified bucket
      -hf HYPERFINE_EXTRA_FLAGS, --hyperfine-extra-flags HYPERFINE_EXTRA_FLAGS
                            hyperfine global extra flags.Write in between quotation marks and start with a space to avoid bugs.
      -sf S5CMD_EXTRA_FLAGS, --s5cmd-extra-flags S5CMD_EXTRA_FLAGS
                            s5cmd global extra flags. Write in between quotation marks and start with a space to avoid bugs.
    
    

    Examples

    ./bench.py --bucket tempbucket --s5cmd v2.0.0 456 --warmup 2 --runs 10 
    

    The above command will compare v2.0.0 to PR 456 with 2 warmup runs and 10 benchmark runs.

    ./bench.py --bucket tempbucket --s5cmd v2.0.0 456 --warmup 2 --runs 10 -sf " --log error" -hf " --show-output"
    

    When using the -hf and -sf flags, use quotes as above and start with an empty space. If not started with an empty space, it might give an error. This is a known issue with argparse, and this discussion can be useful for understanding the problem more deeply.

    opened by boraberke 3
  • Feature Request - download intervals/sleep

    Hello,

    A bit of context for the feature request. I've been using s5cmd to download a large data set for work and I've experienced an issue associated with my internet provider. According to my provider, there are no download restrictions on the 3Gbps line that I'm using; however, every time I use s5cmd to download a ~4.5TB package, their "distribution center" seems to apply a blocking rule, which essentially cuts off my entire internet for about 6 minutes before letting me go back online.

    At this point, I'm experiencing 5 min download, 6 min downtime consistently if I let it run.

    The feature request would be the ability to set a download interval of X minutes/seconds and a sleep time of X minutes/seconds.

    In my specific use case, this would amount to letting s5cmd run the download for 4 minutes, take a 20s break, then start over until it finishes (applicable to the sync or cp commands).

    opened by autronix 0