A tera-scale file uploader

Overview


GoSƐ is a modern file uploader focusing on scalability and simplicity. It is a little hobby project I’ve been working on over the last few weekends.

The only requirement for GoSƐ is an S3 storage backend, which allows it to scale horizontally without the need for additional databases or caches. Uploaded files are divided into equally sized chunks, which are hashed with an MD5 digest in the browser before upload. This allows GoSƐ to skip chunks that already exist, yielding seamless resumption of interrupted uploads as well as storage savings.
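
The chunking scheme can be sketched in a few lines of Go. This is an illustrative stand-in, not GoSƐ's actual code (GoSƐ hashes in the browser, but the logic is the same); the `chunkHashes` helper and the chunk size are assumptions for the example:

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// chunkHashes splits data into fixed-size chunks and returns the
// hex-encoded MD5 digest of each chunk. A backend can compare these
// digests against already-stored chunks and skip re-uploading matches.
func chunkHashes(data []byte, chunkSize int) []string {
	var hashes []string
	for off := 0; off < len(data); off += chunkSize {
		end := off + chunkSize
		if end > len(data) {
			end = len(data)
		}
		sum := md5.Sum(data[off:end])
		hashes = append(hashes, hex.EncodeToString(sum[:]))
	}
	return hashes
}

func main() {
	// 11 bytes with a 4-byte chunk size -> 3 chunks (4 + 4 + 3 bytes).
	for i, h := range chunkHashes([]byte("hello world"), 4) {
		fmt.Printf("chunk %d: %s\n", i, h)
	}
}
```

Because identical chunks produce identical digests, re-uploading a file (or resuming an interrupted upload) only transfers the chunks the server has not seen yet.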

Both uploads and downloads are directed straight at the S3 server, so GoSƐ itself only sees a few small HTTP requests instead of the bulk of the data. Behind the scenes, GoSƐ uses several of the more advanced S3 features, such as multi-part uploads and pre-signed requests, to make this happen.

Users can select between multiple pre-configured S3 buckets/servers and enable browser & mail notifications about completed uploads. A customizable retention/expiration time for each upload is also selectable by the user and implemented via S3 life-cycle policies. Optionally, users can opt in to an external service to shorten the URL of the uploaded file.
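
Retention classes map naturally onto S3 lifecycle rules that match on object tags. As a hedged illustration (the tag key `expiration` and the rule ID are assumptions for this example, not necessarily GoSƐ's actual names), a rule that expires objects tagged `1week` after 7 days could look like this, applied for instance with `aws s3api put-bucket-lifecycle-configuration`:

```json
{
  "Rules": [
    {
      "ID": "1week",
      "Status": "Enabled",
      "Filter": { "Tag": { "Key": "expiration", "Value": "1week" } },
      "Expiration": { "Days": 7 }
    }
  ]
}
```

Each uploaded object gets tagged with its retention class, and the S3 server itself takes care of deleting it on time, so no cleanup job is needed.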

Currently, a single concurrent upload of a single file is supported. Users can observe the progress via a table of detailed statistics, a progress bar and a chart showing the current transfer speed.

GoSƐ aims at keeping its deployment simple by bundling both front- and backend components in a single binary or Docker image. GoSƐ has been tested with AWS S3, Ceph’s RadosGW and MinIO. Pre-built binaries and Docker images of GoSƐ are available for all major operating systems and architectures at the release page.

GoSƐ is open-source software licensed under the Apache 2.0 license.

Check out my blog article for more background info.

Features

  • De-duplication of uploaded files based on their content-hash
    • Uploads of existing files will complete in no-time without re-upload
  • S3 Multi-part uploads
    • Resumption of interrupted uploads
  • Drag & Drop of files
  • Browser notifications about failed & completed uploads
  • User-provided object expiration/retention time
  • Copy URL of uploaded file to clip-board
  • Detailed transfer statistics and progress-bar / chart
  • Installation via single binary or container
    • JS/HTML/CSS Frontend is bundled into binary
  • Scalable to multiple replicas
    • All state is kept in the S3 storage backend
    • No other database or cache is required
  • Direct up & download to Amazon S3 via presigned URLs
    • The GoSƐ deployment does not see any significant traffic
  • UTF-8 filenames
  • Multiple user-selectable buckets / servers
  • Optional link shortening via an external service
  • Optional notification about new uploads via shoutrrr
    • Mail notifications to user-provided recipient
  • Cross-platform support:
    • Operating systems: Windows, macOS, Linux, BSD
    • Architectures: arm64, amd64, armv7, i386

Roadmap

Check out the GitHub issue tracker.

Demo (click for Live-Demo)

Gose demo screencast

Installation

Pre-compiled binaries from GitHub releases

Take the download link for your OS/Arch from the Releases Page and run:

export RELEASE_URL=https://github.com/stv0g/gose/releases/download/v0.0.2/gose_0.0.2_linux_amd64
wget "${RELEASE_URL}" -O gose
chmod +x gose
mv gose /usr/local/bin

Kubernetes / Kustomize

  1. Copy default configuration file: cp config.yaml kustomize/config.yaml
  2. Adjust config: nano kustomize/config.yaml
  3. Apply configuration: kubectl apply -k kustomize

Docker

Via environment variables in .env file:

docker run --env-file=.env --publish=8080:8080 ghcr.io/stv0g/gose
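
A minimal `.env` might look like the following sketch; the values mirror the examples from the configuration table below, and the keys/secrets are placeholders you must fill in (note that Docker's `--env-file` takes values verbatim, so no quoting is needed):

```env
GOSE_LISTEN=:8080
GOSE_BASE_URL=http://localhost:8080
GOSE_BUCKET=gose-uploads
GOSE_ENDPOINT=s3.0l.de
GOSE_REGION=s3
GOSE_ACCESS_KEY=<your-access-key>
GOSE_SECRET_KEY=<your-secret-key>
```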

or via a configuration file:

docker run -v$(pwd)/config.yaml:/config.yaml --publish=8080:8080 ghcr.io/stv0g/gose -config /config.yaml

Configuration

GoSƐ can be configured via a configuration file and/or environment variables.

File

For reference have a look at the example configuration file.
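
As a rough orientation, a minimal configuration might look like the sketch below. It is assembled from example values appearing elsewhere in this document (the endpoint and credentials are placeholders), so treat the shipped example configuration file as authoritative:

```yaml
listen: ":8080"
base_url: "http://localhost:8080"

max_upload_size: 5TB
part_size: 5MB

servers:
- bucket: gose-uploads
  endpoint: s3.0l.de
  region: s3

  path_style: false
  no_ssl: false

  access_key: "<your-access-key>"
  secret_key: "<your-secret-key>"

  create_bucket: true

  expiration:
  - id: 1week
    title: 1 week
    days: 7
```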

Environment variables

All settings from the configuration file can also be set via environment variables:

| Variable | Example Value | Description |
|----------|---------------|-------------|
| `GOSE_LISTEN` | `":8080"` | Listen address and port of GoSƐ |
| `GOSE_BASE_URL` | `"http://localhost:8080"` | Base URL at which GoSƐ is accessible |
| `GOSE_STATIC` | `"./dist"` | Directory of frontend assets if not bundled into the binary |
| `GOSE_BUCKET` | `gose-uploads` | Name of S3 bucket |
| `GOSE_ENDPOINT` | `s3.0l.de` | Hostname of S3 server |
| `GOSE_REGION` | `s3` | Region of S3 server |
| `GOSE_PATH_STYLE` | `true` | Prepend bucket name to path |
| `GOSE_NO_SSL` | `false` | Disable SSL encryption for S3 |
| `GOSE_ACCESS_KEY` | | S3 access key |
| `GOSE_SECRET_KEY` | | S3 secret key |
| `AWS_ACCESS_KEY_ID` | | Alias for `GOSE_ACCESS_KEY` |
| `AWS_SECRET_ACCESS_KEY` | | Alias for `GOSE_SECRET_KEY` |
| `GOSE_S3_MAX_UPLOAD_SIZE` | `5TB` | Maximum upload size |
| `GOSE_S3_PART_SIZE` | `5MB` | Part-size for multi-part uploads |
| `GOSE_S3_EXPIRATION_DEFAULT_CLASS` | `1week` | Default expiration class (one of the configured expiration class IDs) |
| `GOSE_SHORTENER_ENDPOINT` | `"https://shlink-api/rest/v2/short-urls/shorten?apiKey=<your-api-token>&format=txt&longUrl={{.UrlEscaped}}"` | API endpoint of link shortener |
| `GOSE_SHORTENER_METHOD` | `GET` | HTTP method for link shortener |
| `GOSE_SHORTENER_RESPONSE` | `raw` | Response type of link shortener |
| `GOSE_NOTIFICATION_URLS` | `pushover://shoutrrr:<api-token>@<user-key>?devices=laptop1&title=Upload` | shoutrrr service URLs for notifications |
| `GOSE_NOTIFICATION_TEMPLATE` | `"New Upload: {{.URL}}"` | Notification message template |
| `GOSE_NOTIFICATION_MAIL_URL` | `smtp://user:[email protected]:port/[email protected]` | shoutrrr service URL for mail notifications |
| `GOSE_NOTIFICATION_MAIL_TEMPLATE` | `"New Upload: {{.URL}}"` | Mail notification message template |

Author

GoSƐ has been written by Steffen Vogel.

License

GoSƐ is licensed under the Apache 2.0 license.

Issues
  • Test supported S3 implementations

    This looks very well designed and handles many edge cases others ignore - great work

    I was wondering if we could also support MinIO and Google Cloud Storage. I have not tested it yet, but it would be useful to know if you have?

    opened by gedw99 13
  • no atomicity guarantees for object operations

    This is a new issue but continues on from our discussion in another issue I closed.

    The proposal was to add Redis as a global counter system.

    Here is my Use case.

    I have clients (browsers) that store files and perform mutations on those files.

    I have a server running in Google Cloud Run that talks to Google Cloud Storage / S3.

    The server is using S3 as its global store. Each web client has its own file store that persists offline (using IndexedDB).

    So we have an interesting situation with a file instance existing outside s3.

    The notification system is what I am really interested in, so that when a file on S3 changes we can notify all servers and clients. I think GoSƐ can do that now?

    When the client or server gets this event, it should then request the file so it can get up to date.

    When it asks for the file, however, it would be much more performant if it could get a binary byte array of the file difference. Yeah, this is rather hard.

    So regarding Redis: I think that the file change events and the file change diff could be stored on this global message bus. So essentially the notification system uses this distributed message bus and can store the binary diff of the file.

    Now I know that offline clients will not really work in this architecture, in the same way that git merges don’t always work because someone else changed the same part of the file, but I think it’s possible to tame that aspect by using vector clocks as well as a Chronos-style clock system. But that’s more of a layer thing.

    The main thing is to see if you think it’s possible to store the file diff in the message bus.

    As I think about it, it would mean that the clients and servers don’t even talk to S3 for sending file mutations. They would calculate the binary diff and send that into the message bus.

    Importantly, creation of new files would be sent directly to S3.

    I know this is probably out of scope for what GoSƐ was created for, but I figured it’s worth explaining to see if you or others are interested in this use case.

    https://flurryhead.medium.com/vector-clock-applications-965677624b94

    opened by gedw99 6
  • Minio uploading fails

    When trying to upload a file to a MinIO bucket I get Upload failed: [object Object] in GoSƐ, and this in the MinIO traces:

    docker-compose.yml
    version: '3.7'
    services:
      minio:
        image: minio/minio:RELEASE.2022-06-03T01-40-53Z.fips
        command: server /mnt/data --console-address ":9001"
        ports:
          - 8200:9000 # API
          - 8201:9001 # Webinterface
        environment:
          MINIO_ROOT_USER: "<redacted>"
          MINIO_ROOT_PASSWORD: "<redacted>"
          MINIO_SERVER_URL: "<redacted>"
          MINIO_SITE_REGION: "s3"
        volumes:
          - minio-data:/mnt/data
        healthcheck:
          test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
          interval: 30s
          timeout: 20s
          retries: 3
      gose:
        image: ghcr.io/stv0g/gose:v0.4.0
        ports:
          - 8203:8080
        command: -config /config.yml
        volumes:
          - ./config.yml:/config.yml
        #environment:
          #GOSE_LISTEN: ":8080"
          #GOSE_BASE_URL: "<redacted>"
          #GOSE_BUCKET: "gose-uploads"
          #GOSE_ENDPOINT: "minio:9000"
          #GOSE_REGION: "s3"
          #GOSE_PATH_STYLE: "true"
          #GOSE_NO_SSL: "true"
          #GOSE_ACCESS_KEY: "<redacted>" # MINIO_ROOT_USER
          #GOSE_SECRET_KEY: "<redacted>" # MINIO_ROOT_PASSWORD
          #GOSE_S3_MAX_UPLOAD_SIZE: "50GB"
          #GOSE_S3_PART_SIZE: "5MB"
          #GOSE_S3_EXPIRATION_DEFAULT_CLASS: "1week"
        depends_on: [ minio ]
    
    volumes:
      minio-data:
    
    config.yml
    listen: ":8080"
    
    base_url: <redacted>
    
    max_upload_size: 50GB
    part_size: 5MB
    
    servers:
    - bucket: gose-uploads
    
      endpoint: minio:9000
      region: s3
    
      path_style: true
      no_ssl: true
    
      access_key: "<redacted>" # MINIO_ROOT_USER
      secret_key: "<redacted>" # MINIO_ROOT_PASSWORD
    
      create_bucket: true
    
      expiration:
      - id: 1day
        title: 1 day
        days: 1
      - id: 1week
        title: 1 week
        days: 7
      - id: 1month
        title: 1 month
        days: 31
      - id: 1year
        title: 1 year
        days: 365
    

    Passing the config settings as environment variables doesn’t seem to work either; settings like the bucket and max_upload/part_size aren’t getting set for me, that’s why I’m using the config file.

    opened by LeagueRaINi 6
  • Using "Server" HTTP header is not a reliable mechanism for detecting S3 implementation

    So, I fixed the final issue with the checksum mismatch and tested it with the included docker-compose.yml file.

    Let me know if works for you :)

    Got yet another issue I’m afraid: since I have my domain routed through Cloudflare, GoSƐ now thinks I’m using Cloudflare S3 and not MinIO.

    Originally posted by @LeagueRaINi in https://github.com/stv0g/gose/issues/23#issuecomment-1152749492

    opened by stv0g 5
  • Not MinIO compatible?

    So either I’m doing something terribly wrong or GoSƐ is not compatible with the latest version of MinIO.

    GoSƐ throws this error on startup:

    gose_1   | 2022/06/02 22:31:33 Failed to setup servers: failed to set bucket gose-uploads's CORS rules: MalformedXML: The XML you provided was not well-formed or did not validate against our published schema.
    gose_1   |      status code: 400, request id: 16F4EE6811CC0219, host id:
    

    and MinIO complains about

    minio_1  | API: SYSTEM()
    minio_1  | Time: 22:31:33 UTC 06/02/2022
    minio_1  | DeploymentID: 6fd95624-9ba0-4420-9c8e-e97c46253cf9
    minio_1  | Error: expected element type <CreateBucketConfiguration> but have <CORSConfiguration> (xml.UnmarshalError)
    minio_1  |        6: internal/logger/logger.go:278:logger.LogIf()
    minio_1  |        5: cmd/handler-utils.go:55:cmd.parseLocationConstraint()
    minio_1  |        4: cmd/auth-handler.go:344:cmd.checkRequestAuthTypeCredential()
    minio_1  |        3: cmd/auth-handler.go:296:cmd.checkRequestAuthType()
    minio_1  |        2: cmd/bucket-handlers.go:731:cmd.objectAPIHandlers.PutBucketHandler()
    minio_1  |        1: net/http/server.go:2047:http.HandlerFunc.ServeHTTP()
    
    bug 
    opened by LeagueRaINi 3
  • Support server-side encryption via SSE-C

    We could add support for server-side encryption as supported by S3 SSE-C.

    However, this currently has the disadvantage that a user could not simply use the shortened URL anymore to download the file via curl/wget, as SSE-C requires custom HTTP headers for the download.

    This could possibly be worked around by a custom JavaScript landing page which passes the headers via fetch() and then streams the response.

    However, in this case we could already implement full end-to-end encryption which would make server-side encryption obsolete.

    enhancement 
    opened by stv0g 1
  • Allow users to specify a per-file retention time

    We currently use S3 bucket lifecycle rules to define a set of retention classes from which the user can choose. Each uploaded file gets a tag assigned which is matched by one of the life-cycle rules.

    enhancement 
    opened by stv0g 1
  • Url shortener bugged

    Despite not having set a URL shortener, and despite the setting not being shown in the GoSƐ UI, it defaults to enabled, leading to Upload failed: Failed API request: shortened URL requested but nut supported.

    opened by LeagueRaINi 0
  • Add project description

    GoSƐ is a modern file uploader focusing on scalability and simplicity. It only depends on an S3 storage backend and can scale horizontally without the need for an additional database or cache. GoSƐ aims at keeping its deployment simple by bundling both front- and backend components in a single binary and Docker image. GoSƐ has been tested with AWS S3, Ceph's RadosGW and MinIO. Pre-built binaries and Docker images of GoSƐ are available for all major operating systems and architectures at the release page.

    opened by stv0g 0
  • Announce project

    • [x] https://noteblok.net/2022/04/03/gos%c9%9b-a-terascale-file-uploader/
    • [x] https://twitter.com/stv0g/status/1510612743811964946
    • [x] https://selfhosted.libhunt.com/gose-alternatives
    • [x] https://alternativeto.net/software/gos-/about/
    • [x] https://www.linkedin.com/posts/stv0g_in-my-latest-weekend-project-ive-developed-activity-6916379303098085376
    • [x] https://www.reddit.com/r/selfhosted/comments/tv9sd9/gos%C9%9B_a_selfhosted_terascale_fileuploader/
    • [ ] https://github.com/awesome-selfhosted/awesome-selfhosted/pull/2949
    • [ ] ~~https://github.com/donnemartin/awesome-aws~~
    opened by stv0g 0
  • Resumable uploads

    We can check if there is an existing and incomplete multi-part upload (MPU) and instruct the frontend to continue with the missing parts.

    An open question is how we find a matching MPU. This could be based on the filename and client-ip or the checksum of the whole file.

    enhancement 
    opened by stv0g 0
  • Investigate to upload files to IPFS

    I have the idea of using S3 as a storage provider for IPFS. Given that the main IPFS client is implemented in Go, we could incorporate it into GoSƐ and announce uploaded files to the IPFS DHT.

    However, for this to work we would need to make sure that IPFS can serve files from S3. In addition, we would need to check whether we can avoid re-downloading a file from S3 just to hash it in GoSƐ.

    Related: https://github.com/ipfs/go-ipfs/blob/master/docs/datastores.md

    enhancement 
    opened by stv0g 0
  • Show more details after upload

    It might be worthwhile to create a separate page to display upload details:

    • [ ] File size
    • [ ] File name
    • [ ] File expiration time
    • [ ] Number of allowed downloads (if configured)
    • [ ] Checksums
      • MD5
      • SHA256
    • [ ] Instructions for checksum validation
    • [ ] Manual expiration / deletion link
    enhancement 
    opened by stv0g 0
  • Branding

    To allow customization of:

    • [ ] Logo
    • [ ] Logo dimensions
    • [ ] Title
    • [ ] Sub Title
    • [ ] Footer
    • [ ] Colors
    • [ ] Custom CSS

    Furthermore:

    • [ ] Translation
    • [x] Version Number
    • [ ] Available capacity
    enhancement 
    opened by stv0g 0