GeeseFS allows you to mount an S3 bucket as a file system.
FUSE file systems based on S3 typically have performance problems, especially with small files and metadata operations.
GeeseFS attempts to solve these problems by using aggressive parallelism and asynchrony.
POSIX Compatibility Matrix

| POSIX feature | GeeseFS | rclone | Goofys | S3FS | gcsfuse |
| --- | --- | --- | --- | --- | --- |
| Read after write | + | + | - | + | + |
| Directory renames | + | + | * | + | + |
| readdir & changes | + | + | - | + | + |

\* Directory renames are allowed in Goofys for directories with no more than 1000 entries, and this limit is hardcoded
List of non-POSIX behaviors/limitations of GeeseFS:
- does not store file mode/owner/group (use options like `--file-mode` and `--dir-mode` to set defaults)
- does not support hard links
- does not support special files (block/character devices, named pipes)
- does not support locking
- `atime` is always the same as `mtime`
- file modification time can't be set by the user (for example, with `cp --preserve` or `utimes(2)`)
In addition to the items above:
- the default file size limit is 1.03 TB, achieved by splitting the file into 1000x 5 MB parts, 1000x 25 MB parts and 8000x 125 MB parts. You can change the part sizes, but AWS's own limit is 5 TB anyway.
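The arithmetic behind the 1.03 TB figure can be checked directly. This is a standalone sketch, not GeeseFS code; decimal megabytes (10^6 bytes) are assumed, since that interpretation matches the quoted total exactly:

```go
package main

import "fmt"

func main() {
	// Part schedule from the text: 1000 parts of 5 MB,
	// then 1000 parts of 25 MB, then 8000 parts of 125 MB.
	const MB = 1000 * 1000 // decimal megabyte (assumption)
	total := 1000*5*MB + 1000*25*MB + 8000*125*MB
	fmt.Printf("max file size: %.2f TB\n", float64(total)/1e12) // prints 1.03 TB
}
```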
Owner & group, modification times and special files are in fact supportable with Yandex S3, because it provides listings with metadata. Feel free to post issues if you want them supported. :-)
GeeseFS is stable enough to pass most applicable xfstests, including the dirstress/fsstress stress tests (generic/007, generic/011, generic/013).
| Performance feature | GeeseFS | rclone | Goofys | S3FS | gcsfuse |
| --- | --- | --- | --- | --- | --- |
| Parallel multipart uploads | + | - | + | + | - |
| No readahead on random read | + | - | + | - | + |
| Server-side copy on append | + | - | - | * | + |
| Server-side copy on update | + | - | - | * | - |
| xattrs without extra RTT | +* | - | - | - | + |
| Fast recursive listings | + | - | * | - | + |
| Disk cache for reads | + | * | - | + | + |
| Disk cache for writes | + | * | - | + | - |
* Recursive listing optimisation in Goofys is buggy and may skip files under certain conditions
* S3FS uses server-side copy, but it still downloads the whole file to update it. And it's buggy too :-)
* rclone mount has a VFS cache, but it can only cache whole files. And it's also buggy - it often hangs on writes.
* xattrs without extra RTT only work with Yandex S3 (--list-type=ext-v1).
- Pre-built binaries:
- Or build from source with Go 1.13 or later:

```
$ go get github.com/yandex-cloud/geesefs
```

Configure credentials and mount:

```
$ cat ~/.aws/credentials
[default]
aws_access_key_id = AKID1234567890
aws_secret_access_key = MY-SECRET-KEY
$ $GOPATH/bin/geesefs <bucket> <mountpoint>
# if you only want to mount objects under a prefix:
$ $GOPATH/bin/geesefs [--endpoint https://...] <bucket:prefix> <mountpoint>
```
You can also supply credentials via the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables.
To mount an S3 bucket on startup, make sure the credentials are configured for root, and add this to `/etc/fstab`:

```
bucket /mnt/mountpoint fuse.geesefs _netdev,allow_other,--file-mode=0666,--dir-mode=0777 0 0
```
You can also point GeeseFS at a credentials file in a different location via a command-line option (see `geesefs -h`).
See also: Instruction for Azure Blob Storage.
There's a lot of tuning you can do. Consult `geesefs -h` to view the list of options.
Licensed under the Apache License, Version 2.0
Compatibility with S3
GeeseFS works with:
- Yandex Object Storage (default)
- Amazon S3
- Ceph (and also Ceph-based Digital Ocean Spaces, DreamObjects, gridscale etc)
- OpenStack Swift
- Azure Blob Storage (even though it's not S3)
It should also work with any other S3 implementation that supports multipart uploads and multipart server-side copy (UploadPartCopy).
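For reference, server-side copy at the wire level is a single UploadPartCopy call: a `PUT` of one multipart-upload part whose data is taken from an existing object named by the `x-amz-copy-source` header, optionally restricted to a byte range with `x-amz-copy-source-range`. The sketch below only builds such a request with the standard library and does not send it; the endpoint and object names are placeholders:

```go
package main

import (
	"fmt"
	"net/http"
)

// buildUploadPartCopy constructs (but does not send) an S3 UploadPartCopy
// request: PUT to the target key with partNumber/uploadId query parameters
// and an x-amz-copy-source header naming the source object.
func buildUploadPartCopy(endpoint, bucket, key, srcBucket, srcKey, uploadID string,
	partNumber int, byteRange string) (*http.Request, error) {
	url := fmt.Sprintf("%s/%s/%s?partNumber=%d&uploadId=%s",
		endpoint, bucket, key, partNumber, uploadID)
	req, err := http.NewRequest(http.MethodPut, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("x-amz-copy-source", "/"+srcBucket+"/"+srcKey)
	if byteRange != "" {
		// Copy only part of the source object, e.g. "bytes=0-5242879".
		req.Header.Set("x-amz-copy-source-range", byteRange)
	}
	return req, nil
}

func main() {
	// Placeholder endpoint and names; a real request also needs auth headers.
	req, _ := buildUploadPartCopy("https://storage.yandexcloud.net",
		"bucket", "file", "bucket", "file", "upload-id", 1, "bytes=0-5242879")
	fmt.Println(req.Method, req.URL.String())
	fmt.Println(req.Header.Get("x-amz-copy-source"))
}
```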
The following backends are inherited from Goofys code and still exist, but are broken:
- Google Cloud Storage
- Azure Data Lake Gen1
- Azure Data Lake Gen2
References

- Yandex Object Storage
- Amazon S3
- Amazon SDK for Go
- Other related fuse filesystems
- fuse binding, also used by