Got: Simple golang package and CLI tool to download large files faster 🏃 than cURL and Wget!

Overview

Got.

Simple and fast concurrent downloader.

Installation ❘ CLI Usage ❘ Module Usage ❘ License


Comparison

Comparison on a cloud server:

$ time got -o /tmp/test -c 20 http://www.ovh.net/files/1Gio.dat
URL: http://www.ovh.net/files/1Gio.dat done!

real    0m8.832s
user    0m0.203s
sys     0m3.176s


$ time curl http://www.ovh.net/files/1Gio.dat --output /tmp/test1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1024M  100 1024M    0     0  35.6M      0  0:00:28  0:00:28 --:--:-- 34.4M

real    0m28.781s
user    0m0.379s
sys     0m1.970s

Installation

Download and install the latest release:

# go to tmp dir.
cd /tmp

# Download latest version.
curl -sfL https://git.io/getgot | sh

# Make the binary executable.
chmod +x /tmp/bin/got

# Move the binary to your PATH
sudo mv /tmp/bin/got /usr/bin/got

Or go ahead and compile it yourself (Go 1.17 and later install binaries with go install rather than go get):

go install github.com/melbahja/got/cmd/got@latest

Or from the AUR

Install got for the latest release version or got-git for the latest development version.

Note: these packages are not maintained by melbahja

Command Line Tool Usage

Simple usage:

got https://example.com/file.mp4

You can specify a destination path:

got -o /path/to/save https://example.com/file.mp4

You can download multiple URLs and save them to a directory:

got --dir /path/to/dir https://example.com/file.mp4 https://example.com/file2.mp4

You can download multiple URLs from a file:

got --dir /path/to/dir -f urls.txt

You can pipe multiple URLs:

cat urls.txt | got --dir /path/to/dir

Docs for available flags:

got help

Module Usage

You can use Got to download large files in your Go code; the usage is as simple as the CLI tool:

package main

import "github.com/melbahja/got"

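func main() {

	// Create a downloader with default settings.
	g := got.New()

	// Download the URL and save it to the destination path.
	err := g.Download("http://localhost/file.ext", "/path/to/save")

	if err != nil {
		// handle the error
	}
}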

For more see PkgDocs.

How It Works

Got takes advantage of HTTP range requests (RFC 7233). If the server supports partial content, Got splits the file into chunks, then downloads the chunks concurrently and merges them into the destination file.
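
The chunking step can be sketched as follows. This is only an illustration of how a file size might be divided into inclusive byte ranges for Range headers; the names byteRange and splitRanges are hypothetical and are not taken from Got's source.

```go
package main

import "fmt"

// byteRange is an inclusive [Start, End] span of a file, matching the
// "Range: bytes=Start-End" header syntax. Hypothetical type for this sketch.
type byteRange struct {
	Start, End int64
}

// splitRanges divides a file of the given size into at most n contiguous,
// non-overlapping ranges. The last range absorbs any remainder.
func splitRanges(size int64, n int) []byteRange {
	if size <= 0 || n <= 0 {
		return nil
	}
	if int64(n) > size {
		n = int(size) // never create empty chunks
	}
	chunk := size / int64(n)
	ranges := make([]byteRange, 0, n)
	for i := 0; i < n; i++ {
		start := int64(i) * chunk
		end := start + chunk - 1
		if i == n-1 {
			end = size - 1
		}
		ranges = append(ranges, byteRange{Start: start, End: end})
	}
	return ranges
}

// header formats the range as an HTTP Range header value.
func (r byteRange) header() string {
	return fmt.Sprintf("bytes=%d-%d", r.Start, r.End)
}

func main() {
	for _, r := range splitRanges(1000, 3) {
		fmt.Println(r.header())
	}
}
```

Each range would be requested by its own goroutine; a server that honours the header answers with 206 Partial Content.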

License

Got is provided under the MIT License © Mohammed El Bahja.

Issues
  • v0.2

    Adding support for multiple file downloads and Context

    Related PRs: #15 and #16

    The New API:

    Multiple file download

    g := got.New()

    // Optional
    // g.Client = your custom http client

    err := g.Download("http://localhost/file.ext", "/path/to/save")
    

    Multiple files with context:

    g := got.NewWithContext(context.Background())

    // file 1
    err := g.Download("http://localhost/file.ext", "/path/to/save1")

    // file 2 (reuse err; a second := would not compile)
    err = g.Download("http://localhost/file2.ext", "/path/to/save2")

    // Download config.
    err = g.Do(&got.Download{
    	URL:         "http://localhost/file2.ext",
    	Dest:        "/path/to/save2",
    	Concurrency: 10,
    })
    

    Download Single file:

    
    ctx := context.Background()

    dl := got.NewDownload(ctx, "http://localhost/file.ext", "/path/to/save")

    // Init
    if err := dl.Init(); err != nil {
    	// handle err
    }

    // Start download
    if err := dl.Start(); err != nil {
    	// handle err
    }
    

    CLI

    Single file:

    got https://example.com/file.mp4
    

    Multiple files:

    got https://example.com/file.mp4 https://example.com/file2.mp4
    

    Multiple save to specific dir:

    got --dir /path/to/dir https://example.com/file.mp4 https://example.com/file2.mp4
    

    From stdin:

    cat file.txt | got --dir /dir
    

    From file:

    got --bf file.txt --dir /dir
    

    Waiting for your feedback: @suzaku @malusev998 @poldi1405 @xurwxj

    opened by melbahja 19
  • Adding context.Context as parameter to Download.Start and Cancellation to the main Binary

    1. Extracting context.Context from the internals of the Download.Start() method and passing it as a parameter allows fine-grained control over cancellation. This breaks backwards compatibility!

    2. Cancellation in the main binary: when the system interrupts the program in any way, the unfinished download should be removed from the system.

    Signed-off-by: Dusan Malusev

    opened by malusev998 12
  • added simple progress status

    I've added a simpler progress status report that should be easier on the eyes for everyday use.

    opened by mpldr 7
  • Set default outfile by path

    Hello, I wanted to ask whether setting a default download path is a wanted feature, so that the out flag becomes optional.

    For example, when downloading a list of files (another feature, perhaps?) it can be tedious to always provide the outfile. Sure, it's possible using xargs and some replacements, but is that how it should be?

    So I'd like to suggest the following:

    If the outpath is undefined the filepath is parsed from the URL, or if empty replaced by index. Meaning:

    http://example.com/some/path/video.mp4?hash=deadbeef&expires=123456789 -> video.mp4
    http://example.com/some/path/video.mp4 -> video.mp4
    http://example.com/ -> index
    http://example.com/index.html -> index.html
    http://example.com/?page=about -> index // to keep it simple
    http://example.com/about.php?session=asdf -> about.php
    

    I would implement this if it is a wanted feature.
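
The mapping proposed above could look roughly like this in Go. It is a minimal sketch using only the standard library; the function name defaultFilename is hypothetical and this is not the implementation that ended up in Got.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// defaultFilename derives an output name from the URL path, ignoring the
// query string, and falls back to "index" when the path has no file part.
func defaultFilename(rawURL string) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "index"
	}
	name := path.Base(u.Path)
	// path.Base returns "." for an empty path and "/" for the root path.
	if name == "." || name == "/" {
		return "index"
	}
	return name
}

func main() {
	for _, raw := range []string{
		"http://example.com/some/path/video.mp4?hash=deadbeef&expires=123456789",
		"http://example.com/",
		"http://example.com/?page=about",
		"http://example.com/about.php?session=asdf",
	} {
		fmt.Printf("%s -> %s\n", raw, defaultFilename(raw))
	}
}
```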

    opened by mpldr 6
  • Optimizations and refactoring

    Hi! Loved the idea of this tool, so I decided to contribute some improvements:

    • Some simple refactoring
    • Fixed a bug where the content-disposition name wouldn't actually be used for d.name (since d.name was set only once, even if Info returned a valid name, it wouldn't be used in the resulting filepath)
    • Removed the redundant HEAD request used to determine range support; now only GET is used, since it's more reliable and makes the code simpler
    • Optimized the concurrent file writing by a lot: all goroutines now write to a single file concurrently using WriteAt (which is documented to allow parallel calls on non-overlapping ranges), so no temporary files are used and no merge is needed, again simplifying the code
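
The single-file approach from the last bullet can be sketched like this. It is an illustration only, with in-memory byte slices standing in for downloaded chunks; writeChunks and assemble are hypothetical names, not functions from this PR.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// writeChunks writes every chunk at its own offset in f, one goroutine per
// chunk. Parallel WriteAt calls are allowed because the ranges never overlap.
func writeChunks(f *os.File, chunks [][]byte) error {
	var wg sync.WaitGroup
	errs := make([]error, len(chunks)) // one slot per goroutine, no sharing
	var offset int64
	for i, c := range chunks {
		wg.Add(1)
		go func(i int, c []byte, off int64) {
			defer wg.Done()
			_, errs[i] = f.WriteAt(c, off)
		}(i, c, offset)
		offset += int64(len(c))
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return err
		}
	}
	return nil
}

// assemble writes the chunks into a temporary file and reads the result back.
func assemble(chunks [][]byte) ([]byte, error) {
	f, err := os.CreateTemp("", "writeat-demo-*")
	if err != nil {
		return nil, err
	}
	defer os.Remove(f.Name())
	defer f.Close()
	if err := writeChunks(f, chunks); err != nil {
		return nil, err
	}
	return os.ReadFile(f.Name())
}

func main() {
	out, err := assemble([][]byte{[]byte("hello, "), []byte("concurrent "), []byte("world")})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Because each goroutine owns a fixed offset, the order in which chunks finish does not matter, and no merge pass is needed afterwards.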

    Not sure why, but the test workflows on go 1.14 windows are sometimes not passing (even though they are marked as ok, some unrelated errors are thrown by go vet, which makes the github CI workflow get marked as failed)

    Didn't measure the optimizations on a server with good internet, but my local testing seems to save around 3+ seconds each time since we avoid file copying

    opened by renbou 5
  • panic: slice bounds out of range

    Hi! I've tried to install got from the AUR and I received this error:

    panic: runtime error: slice bounds out of range [:7] with length 4
    
    goroutine 1 [running]:
    main.main()
    	/home/mohamed/work/dev/go/src/opensource/got/cmd/got/main.go:43 +0x583
    
    opened by issadarkthing 5
  • Issue while retrieving filesize

    Sometimes (not always, not safely reproducible) I get unrealistic filesizes (see screenshot)

    [Screenshot: 20201202_10h56m36s_grim]

    This happens from time to time on URLs where it works on other times (this one was with a github repo download about 10 MiB in size)

    opened by mpldr 5
  • Corrupt files

    Hi!

    The tool is really fast! :smiley: But it seems to produce corrupt files, e.g.:

    got --out job_jobse.mp4 https://arteconcert-a.akamaihd.net/am/concert/096000/096900/096905-054-A_SQ_0_VO_05149030_MP4-2200_AMM-CONCERT-NEXT_1NdTQsPyN0.mp4
    

    The download itself worked without issues. If I play the downloaded file with mpv I get this:

    mpv --no-video job_jobse.mp4 
         Video --vid=1 (*) (h264 1280x720 25.000fps)
     (+) Audio --aid=1 (*) (aac 2ch 48000Hz)
    AO: [pulse] 48000Hz stereo 2ch float
    A: 00:11:55 / 00:59:39 (19%)
    [ffmpeg/audio] aac: channel element 3.8 is not allocated
    Error decoding audio.
    [ffmpeg/audio] aac: Reserved bit set.
    [ffmpeg/audio] aac: Number of bands (32) exceeds limit (20).
    Error decoding audio.
    [ffmpeg/audio] aac: Multiple frames in a packet.
    [ffmpeg/audio] aac: Input buffer exhausted before END element found
    Error decoding audio.
    [ffmpeg/audio] aac: Sample rate index in program config element does not match the sample rate index configured by the container.
    [ffmpeg/audio] aac: Inconsistent channel configuration.
    [ffmpeg/audio] aac: get_buffer() failed
    Error decoding audio.
    [ffmpeg/audio] aac: Input buffer exhausted before END element found
    Error decoding audio.
    [ffmpeg/audio] aac: Prediction is not allowed in AAC-LC.
    Error decoding audio.
    [ffmpeg/audio] aac: Sample rate index in program config element does not match the sample rate index configured by the container.
    [ffmpeg/audio] aac: Too large remapped id is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
    [ffmpeg/audio] aac: If you want to help, upload a sample of this file to https://streams.videolan.org/upload/ and contact the ffmpeg-devel mailing list. (ffmpeg-devel@ffmpeg.org)
    ...
    

    So it plays without issues until 11:55 and then I get lots of errors. When I download the same file with wget, everything works as expected.

    Could it be that there are race conditions in the tool that cause the chunks to be assembled in the wrong order, or something like that? I suspect that if you do the same download now it might work for you because you didn't trigger the race condition. But that's just guessing, of course.

    opened by githubixx 4
  • Error 403 when downloading from GitHub Releases

    I tried to download the binary from GitHub using got: got --out got.tar.gz https://github.com/melbahja/got/releases/download/v0.1.1/got_0.1.1_Linux_amd64.tar.gz

    But instead I got Response status code is not ok: 403.

    After a little search I found this issue https://github.com/cavaliercoder/grab/issues/43.

    Basically AWS returns 403 to HEAD requests, changing the HEAD request to a GET at https://github.com/melbahja/got/blob/9098e5bee46ab083f6540eefeee23992fff91364/got.go#L275 worked but I am not sure if that's a good solution.
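
One way to avoid HEAD entirely is to probe with a GET that asks for a single byte. The sketch below is an assumption about how such a probe could look, not the fix that was merged; supportsRange, handlerTransport, rangeAware, plain and probe are hypothetical names, and the handlers are served in memory so the example needs no network.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// supportsRange requests only the first byte with GET and reports whether
// the server answered 206 Partial Content.
func supportsRange(client *http.Client, url string) (bool, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return false, err
	}
	req.Header.Set("Range", "bytes=0-0")
	res, err := client.Do(req)
	if err != nil {
		return false, err
	}
	defer res.Body.Close()
	return res.StatusCode == http.StatusPartialContent, nil
}

// handlerTransport routes requests to an in-memory handler (demo only).
type handlerTransport struct{ h http.Handler }

func (t handlerTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	rec := httptest.NewRecorder()
	t.h.ServeHTTP(rec, req)
	return rec.Result(), nil
}

// rangeAware rejects HEAD with 403, as described in the issue, but honours
// Range headers on GET.
func rangeAware(w http.ResponseWriter, r *http.Request) {
	if r.Method == http.MethodHead {
		w.WriteHeader(http.StatusForbidden)
		return
	}
	http.ServeContent(w, r, "file.bin", time.Time{}, strings.NewReader("hello world"))
}

// plain ignores the Range header and always sends the whole body.
func plain(w http.ResponseWriter, r *http.Request) {
	w.Write([]byte("hello world"))
}

// probe runs supportsRange against a handler; the URL is a placeholder.
func probe(h http.HandlerFunc) bool {
	client := &http.Client{Transport: handlerTransport{h}}
	ok, err := supportsRange(client, "http://example.test/file.bin")
	return err == nil && ok
}

func main() {
	fmt.Println("range-aware server:", probe(rangeAware))
	fmt.Println("plain server:", probe(plain))
}
```

A server that ignores the header answers 200 with the full body, so the caller can fall back to a single-connection download in that case.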

    opened by Pauloo27 3
  • Fix: when head method not supported

    fixes: #3

    opened by melbahja 3
  • Integrate Progress Bars from Schollz

    Schollz just released progressbar. Would make a great addition to got.

    opened by zQueal 0
  • stream error: stream ID 21; INTERNAL_ERROR; received from peer

    Hello,

    Launching the command against a server with basic authorization (Jenkins), I receive the following error (the file size is 3 GB):

    2021/12/13 08:27:18 stream error: stream ID 21; INTERNAL_ERROR; received from peer

    This happens within the first thirty seconds. In subsequent runs the ID keeps changing, but the error always occurs. I think the problem is the server's slow connection, but I don't know if it is possible to stabilize the connection so that it does not drop. Sadly, I cannot provide a repository because my project is not open source.

    Thanks for your work!

    opened by ralonsoj 3
  • Fall back to filename from URL when don't have it in the header

    When the header is valid but carries no filename, we try to get unsafeName from it and fail:

    content-disposition: attachment
    

    So we now default to the filename taken from the URL, and override it only if the content-disposition header actually contains a filename.
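
The fallback order described here can be sketched with the standard library. This is an illustration under assumptions, not the code from this PR; the function name filename is hypothetical.

```go
package main

import (
	"fmt"
	"mime"
	"net/url"
	"path"
)

// filename prefers the name from a Content-Disposition header and falls
// back to the last element of the URL path, then to "index".
func filename(contentDisposition, rawURL string) string {
	if _, params, err := mime.ParseMediaType(contentDisposition); err == nil {
		// path.Base drops directory components from an untrusted name.
		if name := path.Base(params["filename"]); name != "." && name != "/" {
			return name
		}
	}
	if u, err := url.Parse(rawURL); err == nil {
		if name := path.Base(u.Path); name != "." && name != "/" {
			return name
		}
	}
	return "index"
}

func main() {
	// Header without a filename parameter: the URL supplies the name.
	fmt.Println(filename("attachment", "http://example.com/dl/archive.zip?token=1"))
	// Header with a filename parameter: the header wins.
	fmt.Println(filename(`attachment; filename="report.pdf"`, "http://example.com/dl/archive.zip"))
}
```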

    opened by ant1k9 0
  • Can we resume the download?

    Team,

    Whenever we restart a download for the same file, it downloads from scratch. Does it support multipart download and resuming from where we paused earlier?

    opened by smartaquarius10 3
  • Project name

    Hello,

    Just an FYI - there's already a project named Got - unfortunate coincidence it seems.

    Not sure if there's anything that can be done but thought I'd put it out there.

    Regards

    opened by rjc 6
  • Would you consider extending this to be a port of wget?

    There is this, but it's not hugely well-known, is only a partial port, and hasn't been updated in a while. This tool's name (got) is really excellent, and I would love to see something like wget written in Go. There are a few ports of well-known UNIX commands in modern languages (such as fd, a fast alternative to find). Would really be interested to see where this could go! (No pun intended).

    opened by jakewilliami 1
  • How to disable SSL cert check?

    I wonder if disabling the SSL cert check is possible? In some cases we use self-signed certificates and got refuses to download from those servers.

    It shows an error like the following and doesn't start the download.

    Get "URL-REDACTED": x509: certificate signed by unknown authority
    
    opened by nsa 1
  • just 2 issue while using got

    Hi melbahja! I ran into 2 issues while using got; maybe they can be improved.

    1. Save disk space. I used got to download a file from an HTTP server that supports range requests. The file is about 8GB, so it takes a while to complete. While downloading, I found 40 chunk files, chunk-0 ~ chunk-39, in the C:\Users\{username}\AppData\Local\Temp\GotChunks{some digits} directory. They are temp files used to assemble the original file, which means my computer needs 16GB of free disk space to complete the download. Is there any way to optimize this? got removes the chunk-i files after merging them, but if I don't have 16GB of free disk space, the 8GB file can't be downloaded. What if the file is 200GB? I would first need 400GB of free disk space, which is unreasonable.

    2. Merging chunk files takes a long time. Once all chunk files are downloaded, got doesn't quit immediately; it takes a while to merge them. After all chunk files are merged, got quits and we get the original file, with all chunk files removed. If the 8GB file takes a moment to merge, how long will a 200GB file take? I don't know whether wget or curl also pause for a moment when the download progress bar reaches 100%.

    thanks anyway!

    opened by brownchow 3
Releases(v0.6.1)
Owner
Mohamed El Bahja
Developer, FOSS Enthusiast.
Go-file-downloader-ftctl - A file downloader cli built using golang. Makes use of cobra for building the cli and go concurrent feature to download files.

ftctl This is a file downloader cli written in Golang which uses the concurrent feature of go to download files. The cli is built using cobra. How to

Dipto Chakrabarty 2 Jan 2, 2022
lls is lightweight ls. Using lls, you can get a list of files in a directory that contains a large number of files.

lls lls is lightweight ls. Using lls, you can get a list of files in a directory that contains a large number of files. How? You allocate a buffer for

Tatsuya Kaneko 56 Nov 18, 2021
zipspy - a CLI tool to extract files from zip archives in S3 without needing to download the entire archive

Zipspy allows you interact with ZIP archives stored in remote locations without requiring a local copy. For example, you can list the filenames in an S3 ZIP archive, download a subset of files, search and retrieve files with regular expressions, and more!

Alec Rabold 0 Jan 20, 2022
Command-line tool to organize large directories of media files recursively by date, detecting duplicates.

go-media-organizer Command-line tool written in Go to organise all media files in a directory recursively by date, detecting duplicates.

Allan Avelar 8 Jan 6, 2022
convert curl commands to Python, JavaScript, Go, PHP, R, Dart, Java, MATLAB, Rust, Elixir and more

curlconverter curlconverter transpiles curl commands into programs in other programming languages. $ curlconverter --data "Hello, world!" example.com

null 5k Jan 16, 2022
The power of curl, the ease of use of httpie.

Curlie If you like the interface of HTTPie but miss the features of curl, curlie is what you are searching for. Curlie is a frontend to curl that adds

Olivier Poitrey 1.6k Jan 16, 2022
A golang CLI tool to download malware from a variety of sources.

mlget _____ _____ _____ _____ _____ /\ \

null 82 Jan 9, 2022
This is the tool to download files from qiniu cruster manually.

This is the tool to download files from qiniu cruster manually. toCheck = []string{ sealPath, filepath.Join(cachePath, "p_aux"), filepath.Join(cachePa

lyswifter 1 Nov 25, 2021
CLI tool to upload object to s3-compatible storage backend and set download policy for it.

typora-s3 CLI tool to upload object to s3-compatible storage backend and set download policy for it. Build $ git clone https://github.com/fengxsong/ty

fengxsong 0 Dec 29, 2021
Cleo CLI - do annoying stuff faster

Cleo CLI Installing Heroku CLI Most of Heroku functionality relies on Heroku CLI being present in your system. Go ahead and install it if you haven't

Cleo AI 0 Dec 20, 2021
Downloader written in golang to download the public data files from RUC Paraguay.

rucpy-downloader Downloader written in golang to download the public data files(RUC Paraguay) from set.gov.py. The downloader will download the public

bitebait 1 Dec 6, 2021
tmux-wormhole - download files and directories with tmux!

tmux-wormhole Use tmux and magic wormhole to get things from your remote computer to your tmux. If tmux has DISPLAY set, open the file locally! Demo U

Graham Clark 36 Dec 8, 2021
Softsuite - Start from gofiber boilerplate and plan to build large projects

Softsuite Thanks to Cozy (ItsCosmas) to build gofiber boilerplate. I start learn

Mai 0 Jan 19, 2022
📥 Command-line tool to download videos from hanime.tv

hanime Command-line tool to download videos from hanime.tv Requirements Installation Install via go get Install from source Install from release Usage

私はレオンです 18 Dec 28, 2021
Nebula Diagnosis CLI Tool is an information diagnosis cli tool for the nebula service and the node to which the service belongs.

Nebula Diagnosis CLI Tool is an information diagnosis cli tool for the nebula service and the node to which the service belongs.

Katz 1 Jan 12, 2022
Little golang app that allows you to download a youtube video as mp3, and optionally embed ID3 tags -Cover Art, Artist ...-

yt2mp3 Little golang app that allows you to download a youtube video as mp3, and optionally embed ID3 tags -Cover Art, Artist ...- Instructions At the

null 0 Dec 25, 2021
A small CLI tool to compress and decompress files using Golang

Goflate A simple & small CLI tool to compress and decompress files using Golang Usage Install the binary to your local machine with the below command

Pedre Viljoen 1 Dec 17, 2021
Watcher - A simple command line app to watch files in a directory for changes and run a command when files change!

Watcher - Develop your programs easily Watcher watches all the files present in the directory it is run from of the directory that is specified while

Geet Sethi 0 Jan 2, 2022