Experiment - Sync files to S3, fast. Go package and CLI.

Overview

gosync

I want to be the fastest way to concurrently sync files and directories to/from S3.

Gosync concurrently transfers your files to and from S3 (or across different S3 buckets). It validates checksums to ensure that only new or changed files are synced.

Installation

Ensure you have Go 1.2 or greater installed and your GOPATH is set.

Fetch the source with go get:

go get github.com/brettweavnet/gosync

Change into the gosync directory and run make:

cd $GOPATH/src/github.com/brettweavnet/gosync/
make

Setup

Set environment variables (Security Token is optional):

AWS_SECRET_ACCESS_KEY=yyy
AWS_ACCESS_KEY_ID=xxx
AWS_SECURITY_TOKEN=xxx
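
For readers embedding this setup in their own Go program, here is a minimal sketch of reading the same three variables and treating the security token as optional. The credentials type and the credentialsFromEnv function are hypothetical names chosen for illustration; they are not part of gosync.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// credentials mirrors the three environment variables listed above.
// (Hypothetical type, not taken from the gosync source.)
type credentials struct {
	AccessKeyID     string
	SecretAccessKey string
	SecurityToken   string // optional
}

// credentialsFromEnv reads the variables through getenv, which is
// os.Getenv in normal use and can be replaced by a stub in tests.
func credentialsFromEnv(getenv func(string) string) (credentials, error) {
	c := credentials{
		AccessKeyID:     getenv("AWS_ACCESS_KEY_ID"),
		SecretAccessKey: getenv("AWS_SECRET_ACCESS_KEY"),
		SecurityToken:   getenv("AWS_SECURITY_TOKEN"),
	}
	if c.AccessKeyID == "" || c.SecretAccessKey == "" {
		return credentials{}, errors.New("AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must both be set")
	}
	return c, nil
}

func main() {
	c, err := credentialsFromEnv(os.Getenv)
	if err != nil {
		fmt.Println("credentials:", err)
		return
	}
	// Never print the secret itself; report only what is safe to show.
	fmt.Printf("access key %s loaded (security token set: %t)\n",
		c.AccessKeyID, c.SecurityToken != "")
}
```

Passing the lookup function as a parameter makes the missing-variable case easy to check without modifying the real process environment.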

Usage

gosync OPTIONS SOURCE TARGET

Syncing from local directory to S3

gosync /files s3://bucket/files

Syncing from S3 to local directory

gosync s3://bucket/files /files

Syncing from S3 to S3

gosync s3://source_bucket s3://target_bucket

Syncing from S3 to another directory in S3

gosync s3://source_bucket/dir s3://target_bucket/another_dir
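
The s3:// arguments above combine a bucket name with an optional key prefix. The Go sketch below shows one way to split such an argument; parseS3URL is a hypothetical helper and is not claimed to match how gosync parses its arguments. It uses strings.TrimPrefix rather than strings.TrimLeft, because TrimLeft treats its second argument as a set of characters and would also strip a leading 's' from a bucket name such as samplebucket.

```go
package main

import (
	"fmt"
	"strings"
)

// parseS3URL splits "s3://bucket/prefix" into its bucket and prefix.
// ok is false when raw is not an s3:// URL or has no bucket name.
// (Hypothetical helper written for this example.)
func parseS3URL(raw string) (bucket, prefix string, ok bool) {
	const scheme = "s3://"
	if !strings.HasPrefix(raw, scheme) {
		return "", "", false
	}
	// TrimPrefix removes the exact string once; TrimLeft would remove
	// every leading 's', '3', ':' and '/' character instead.
	rest := strings.TrimPrefix(raw, scheme)
	parts := strings.SplitN(rest, "/", 2)
	bucket = parts[0]
	if len(parts) == 2 {
		prefix = parts[1]
	}
	return bucket, prefix, bucket != ""
}

func main() {
	for _, arg := range []string{
		"s3://bucket/files",
		"s3://source_bucket",
		"s3://samplebucket",
		"/files",
	} {
		bucket, prefix, ok := parseS3URL(arg)
		fmt.Printf("%-22s bucket=%q prefix=%q s3=%t\n", arg, bucket, prefix, ok)
	}
}
```

Arguments for which ok is false, such as /files, can then be treated as local paths.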

Help

For a full list of options and commands:

gosync -h

Contributing

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request
Comments
  • Setting env sometimes causes an error

    #cat backup-mongodb.sh

    #!/bin/bash
    # need root 
    export AWS_ACCESS_KEY_ID=xxx && \
            export AWS_SECRET_ACCESS_KEY=xxx && \
            /home/tx/workspace/go/bin/gosync -l "error" /mnt/mongodb s3://qianlima/mongodb
    

    sh backup-mongodb.sh #first time ok

    sh backup-mongodb.sh #again error

    1410594494377402743 [Error] Received error 'The AWS Access Key Id you provided does not exist in our records.'
    

    sh backup-mongodb.sh #again ok

    sh backup-mongodb.sh #again error

    1410594469147053830 [Error] Received error 'The AWS Access Key Id you provided does not exist in our records.'
    
    opened by txthinking 5
  • add flags to keys

    Besides log-level and concurrent, there should be flags to set the AWS key pair, so that gosync can be used from another program easily without setting the environment variables.

    opened by ttback 4
  • convert to slash

    To support Windows, convert backslashes in paths to slashes.

    Problem: gosync assumes that the directory separator is a slash, but Windows uses a backslash.

    Solution: convert backslashes (the directory separator) to slashes with the filepath.ToSlash() function.

    opened by koron 2
  • If bucket name begins with one or more 's', it is cut on lookup. Causes error on incorrect lookup.

    For example:

    [email protected]:~/sudandola-builder/output$ gosync -l 'debug' . s3://samplebucket
    1412960077220223640 [Info] Setting log level 'debug'.
    1412960077220322948 [Info] Setting source to '.'.
    1412960077220329709 [Info] Setting target to 's3://samplebucket'.
    1412960077220337792 [Info] Setting concurrent transfers to '20'.
    1412960077220351742 [Info] Syncing to S3.
    1412960077895114765 [Debug] Loaded '3266' files from '.'.
    1412960077895127120 [Info] Loading local files complete.
    1412960077895136487 [Info] Looking up region for bucket 'amplebucket'.
    1412960077895142600 [Debug] Looking for bucket 'amplebucket' in 'ap-southeast-1'.
    1412960083887147934 [Error] Received error 'The specified bucket does not exist'
    

    Multiple 's':

    [email protected]:~/sudandola-builder/output$ gosync -l 'debug' . s3://sssamplebucket
    1412960329501528504 [Info] Setting log level 'debug'.
    1412960329501623024 [Info] Setting source to '.'.
    1412960329501629625 [Info] Setting target to 's3://sssamplebucket'.
    1412960329501639645 [Info] Setting concurrent transfers to '20'.
    1412960329501655483 [Info] Syncing to S3.
    1412960330173879163 [Debug] Loaded '3266' files from '.'.
    1412960330173890884 [Info] Loading local files complete.
    1412960330173900326 [Info] Looking up region for bucket 'amplebucket'.
    1412960330173905764 [Debug] Looking for bucket 'amplebucket' in 'ap-southeast-2'.
    1412960335638316002 [Error] Received error 'The specified bucket does not exist'
    
    opened by willfixlater 2
  • Prefixed downloads fixed

    The prefixed download code was trimming the prefix off key paths. This caused downloads from paths formatted as s3://mybucket/mysubbucket to fail.

    Fixed by returning full file path rather than relative.

    opened by ericchiang 1
  • Add support for China & Gov Region

    Currently the region searching logic skips over them. This could be fixed by adding code to correctly look up a bucket's region.

    https://github.com/brettweavnet/gosync/blob/master/gosync/s3.go#L73-L78

    bug 
    opened by weavenet 0
  • 2GB file upload fail.

    Currently gosync fails if upload file size is more than 2GB. I think the cause is that ioutil.ReadFile is called in sync.go : line 173.

    Reference URL: https://code.google.com/p/go/issues/detail?id=2743

    Here are the results of my execution.

    $ gosync sync isos s3://s3sync-test/test1
    Syncing isos with s3://s3sync-test/test1
    panic: read isos/2Gdd.img: invalid argument
    
    goroutine 1 [running]:
    github.com/brettweavnet/gosync/gosync.func·001(0xc2000b2750, 0xd, 0xc2000af4b0, 0xc2000af5f0, 0x0, ...)
        /Users/ryuta/.go/src/github.com/brettweavnet/gosync/gosync/sync.go:175 +0x123
    path/filepath.walk(0xc2000b2750, 0xd, 0xc2000af4b0, 0xc2000af5f0, 0x7607f8, ...)
        /usr/local/go/src/pkg/path/filepath/path.go:341 +0x70
    path/filepath.walk(0x7fff5fbff8ba, 0x4, 0xc2000af4b0, 0xc2000af500, 0x7607f8, ...)
        /usr/local/go/src/pkg/path/filepath/path.go:359 +0x32a
    path/filepath.Walk(0x7fff5fbff8ba, 0x4, 0x7607f8, 0x300000050, 0x1ffbe, ...)
        /usr/local/go/src/pkg/path/filepath/path.go:380 +0xb5
    github.com/brettweavnet/gosync/gosync.loadLocalFiles(0x7fff5fbff8ba, 0x4, 0x720040)
        /Users/ryuta/.go/src/github.com/brettweavnet/gosync/gosync/sync.go:184 +0x8e
    github.com/brettweavnet/gosync/gosync.(*SyncPair).syncDirToS3(0xc2000af410, 0x4, 0xc2000af400)
        /Users/ryuta/.go/src/github.com/brettweavnet/gosync/gosync/sync.go:59 +0x3d
    github.com/brettweavnet/gosync/gosync.(*SyncPair).Sync(0xc2000af410, 0xc2000af410, 0x760ac8)
        /Users/ryuta/.go/src/github.com/brettweavnet/gosync/gosync/sync.go:30 +0x119
    main.func·001(0xc2000c5520)
        /Users/ryuta/Repos/gosync/gosync.go:39 +0x38f
    github.com/codegangsta/cli.Command.Run(0x29cf80, 0x4, 0x0, 0x0, 0x2d8390, ...)
        /Users/ryuta/.go/src/github.com/codegangsta/cli/command.go:25 +0x2a5
    github.com/codegangsta/cli.(*App).Run(0xc2000c7150, 0xc200090000, 0x4, 0x4)
        /Users/ryuta/.go/src/github.com/codegangsta/cli/app.go:57 +0x5f7
    main.main()
        /Users/ryuta/Repos/gosync/gosync.go:49 +0x15f
    
    goroutine 2 [syscall]:
    $
    

    Regards

    opened by takipone 1