`kawipiko` -- blazingly fast static HTTP server -- focused on low latency and high concurrency, by leveraging Go, `fasthttp` and the CDB embedded database

Overview

kawipiko -- blazingly fast static HTTP server

kawipiko is a lightweight static HTTP server written in Go; focused on serving static content as fast and as efficiently as possible, with the lowest latency, and with the lowest resource consumption (CPU, RAM, or IO); supporting HTTP/1 (with or without TLS), HTTP/2, and HTTP/3 (over QUIC); available as a single statically linked executable without any other dependencies.

However, simple doesn't imply dumb or limited; instead it implies efficiency through the removal of superfluous features, thus being in line with UNIX's old philosophy of "do one thing and do it well". Therefore, it supports only GET requests, and does not provide features like dynamic content generation, authentication, reverse proxying, etc.; meanwhile it still provides compression (gzip, zopfli, or brotli), plus HTML-CSS-JS minifying (TODO), without affecting its performance (due to its unique architecture as described below).

What kawipiko does provide is something unique, that no other HTTP server offers: the static content is served from a CDB file with almost no latency (as compared to classical static servers that still have to pass through the OS via the open-read-close syscalls). Moreover, as noted earlier, the static content can still be compressed or minified ahead of time, thus reducing not only CPU but also bandwidth and latency.

CDB files are binary database files that provide efficient read-only key-value lookup tables, initially used in some DNS and SMTP servers, mainly for their low overhead lookup operations, zero locking in multi-threaded / multi-process scenarios, and "atomic" multi-record updates. This also makes them suitable for low-latency static content serving over HTTP, which is what this project provides.
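
To make the "low overhead lookup" concrete, here is a minimal sketch of the hashing step used by the classic CDB format: the key is hashed, the low 8 bits select one of 256 hash tables, and the remaining bits select the slot where probing starts. Using a URL path as the key and a table length of 16 are assumptions made for illustration only, and the names cdbHash and slotFor are hypothetical; this is not kawipiko's own code.

```go
package main

import "fmt"

// cdbHash computes the hash used by the classic CDB format:
// start at 5381, then for each byte h = (h * 33) XOR byte, in 32 bits.
func cdbHash(key []byte) uint32 {
	h := uint32(5381)
	for _, c := range key {
		h = ((h << 5) + h) ^ uint32(c)
	}
	return h
}

// slotFor returns which of the 256 hash tables a key falls into, and the
// slot at which probing starts inside a table of tableLen slots.
// tableLen must be non-zero.
func slotFor(key []byte, tableLen uint32) (table uint32, slot uint32) {
	h := cdbHash(key)
	return h & 0xff, (h >> 8) % tableLen
}

func main() {
	// Hypothetical key: a URL path; the table length is made up.
	key := []byte("/index.html")
	table, slot := slotFor(key, 16)
	fmt.Printf("hash=%d table=%d slot=%d\n", cdbHash(key), table, slot)
}
```

A lookup therefore costs one hash computation plus a few reads at computed offsets, with no locks and no directory traversal, which is why the format suits read-mostly workloads.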

For those familiar with Netlify (or competitors like CloudFlare Pages, GitHub Pages, etc.), kawipiko is a host-it-yourself alternative featuring:

  • self-contained deployment with simple configuration; (i.e. just fetch the executable and use the proper flags;)
  • low and constant resource consumption (both in terms of CPU and RAM); (i.e. you won't have surprises when under load;)
  • (hopefully) extremely secure; (i.e. it doesn't launch processes, it doesn't connect to other services or databases, it doesn't open any files, etc.; basically you can easily chroot it, or containerize it as is in fashion these days;)
  • highly portable, supporting at least Linux (the main development, testing and deployment platform), FreeBSD, OpenBSD, and OSX;

For a complete list of features please consult the features section. Unfortunately, there are also some tradeoffs as described in the limitations section (although none are critical).

With regard to performance, as described in the benchmarks section, kawipiko is at least on par with NGINX, sustaining over 100K requests / second with 0.25ms latency for 99% of the requests even on my 6-year-old laptop. However, the main advantage over NGINX is not raw performance, but deployment and configuration simplicity, plus efficient management and storage of large collections of many small files.





Manual

Workflow

The project provides the following executables (statically linked, without any other dependencies):

  • kawipiko-server -- which serves the static content from the CDB archive either via HTTP (with or without TLS), HTTP/2 or HTTP/3 (over QUIC);
  • kawipiko-archiver -- which creates the CDB archive from a source folder holding the static content, optionally compressing and minifying files;
  • kawipiko -- an all-in-one executable that bundles all functionality; (i.e. kawipiko server ... or kawipiko archiver ...);

Unlike most (if not all) other servers out there, in which you just point your web server to the folder holding the static website content root, kawipiko takes a radically different approach: in order to serve the static content, one has to first archive the content into the CDB archive through kawipiko-archiver, and then one can serve it from the CDB archive through kawipiko-server.

This two-step workflow also presents a few opportunities:

  • one can decouple the "building", "testing", and "publishing" phases of a static website, by using a similar CI/CD pipeline as done for other software projects;
  • one can instantaneously roll back to a previous version if the newly published one has issues;
  • one can apply extreme compression (e.g. zopfli or brotli), to trade CPU during deployment vs latency and bandwidth at runtime.

kawipiko-server

See dedicated manual.

kawipiko-archiver

See dedicated manual.


Examples

  • fetch and extract the Python 3.10 documentation HTML archive:

    curl \
            -s -S -f \
            -o ./python-3.10.1-docs-html.tar.bz2 \
            https://docs.python.org/3/archives/python-3.10.1-docs-html.tar.bz2 \
    #
    
    tar \
            -x -j -v \
            -f ./python-3.10.1-docs-html.tar.bz2 \
    #
    
  • create the CDB archive (without any compression):

    kawipiko-archiver \
            --archive ./python-3.10.1-docs-html-nocomp.cdb \
            --sources ./python-3.10.1-docs-html \
            --debug \
    #
    
  • create the CDB archive (with gzip compression):

    kawipiko-archiver \
            --archive ./python-3.10.1-docs-html-gzip.cdb \
            --sources ./python-3.10.1-docs-html \
            --compress gzip \
            --debug \
    #
    
  • create the CDB archive (with zopfli compression):

    kawipiko-archiver \
            --archive ./python-3.10.1-docs-html-zopfli.cdb \
            --sources ./python-3.10.1-docs-html \
            --compress zopfli \
            --debug \
    #
    
  • create the CDB archive (with brotli compression):

    kawipiko-archiver \
            --archive ./python-3.10.1-docs-html-brotli.cdb \
            --sources ./python-3.10.1-docs-html \
            --compress brotli \
            --debug \
    #
    
  • serve the CDB archive (with gzip compression):

    kawipiko-server \
            --bind 127.0.0.1:8080 \
            --archive ./python-3.10.1-docs-html-gzip.cdb \
            --archive-mmap \
            --archive-preload \
            --debug \
    #
    
  • compare sources and archive sizes:

    du \
            -h -s \
            \
            ./python-3.10.1-docs-html-nocomp.cdb \
            ./python-3.10.1-docs-html-gzip.cdb \
            ./python-3.10.1-docs-html-zopfli.cdb \
            ./python-3.10.1-docs-html-brotli.cdb \
            \
            ./python-3.10.1-docs-html \
            ./python-3.10.1-docs-html.tar.bz2 \
    #
    
    45M     ./python-3.10.1-docs-html-nocomp.cdb
    9.7M    ./python-3.10.1-docs-html-gzip.cdb
    ???     ./python-3.10.1-docs-html-zopfli.cdb
    7.9M    ./python-3.10.1-docs-html-brotli.cdb
    
    46M     ./python-3.10.1-docs-html
    6.0M    ./python-3.10.1-docs-html.tar.bz2
    

Installation

See dedicated installation document.


Features

Implemented

The following is a list of the most important features:

  • (optionally) the static content is compressed or minified when the CDB archive is created, thus no CPU cycles are used while serving requests;
  • (optionally) the static content can be compressed with either gzip, zopfli or brotli;
  • (optionally) in order to reduce the serving latency even further, one can preload the entire CDB archive in memory, or alternatively map it in memory (using mmap); this trades memory for CPU;
  • (optionally) caching the static content fingerprint and compression, thus significantly reducing the CDB archive rebuilding time, and significantly reducing the IO for the source file-system;
  • atomic static website content changes; because the entire content is held in a single CDB archive, and because the file replacement is atomically achieved via the rename syscall (or the mv tool), all served resources are observed to change at the same time;
  • _wildcard.* files (where .* are the regular extensions like .txt, .html, etc.) which will be used if an actual resource is not found under that folder; (these files respect the hierarchical tree structure, i.e. "deeper" ones override the ones closer to "root";)
  • support for HTTP/1 (with or without TLS), by leveraging github.com/valyala/fasthttp;
  • support for HTTP/2, by leveraging Go's net/http;
  • support for HTTP/3 (over QUIC), by leveraging github.com/lucas-clemente/quic-go;

Pending

The following is a list of the most important features that are currently missing and are planned to be implemented:

  • (TODO) support for custom HTTP response headers (for specific files, for specific folders, etc.); (currently only Content-Type, Content-Length, Content-Encoding are included; additionally Cache-Control: public, immutable, max-age=3600, optionally ETag, and a few TLS or security related headers can also be included;)
  • (TODO) support for mapping virtual hosts to key prefixes; (currently virtual hosts, i.e. the Host header, are ignored;)
  • (TODO) support for mapping virtual hosts to multiple CDB archives; (i.e. the ability to serve multiple domains, each with its own CDB archive;)
  • (TODO) automatic reloading of the CDB archives;
  • (TODO) minifying HTML, CSS and JavaScript, by leveraging https://github.com/tdewolff/minify;
  • (TODO) customized error pages (embedded in the CDB archive);

Limitations

As stated in the about section, nothing comes for free, and in order to provide all these features, some corners had to be cut:

  • (TODO) currently if the CDB archive changes, the server needs to be restarted in order to pick up the changed files;
  • (won't fix) the CDB archive maximum size is 4 GiB (after compression and minifying), and there can't be more than 16M resources; (however if you have a static website this large, you are probably doing something extremely wrong, as large files should be offloaded to something like AWS S3, and served through a CDN like CloudFlare or AWS CloudFront;)
  • (won't fix) the server does not support per-request decompression / recompression; this implies that if the content was saved in the CDB archive with compression (say brotli), the server will serve all resources compressed (i.e. Content-Encoding: br), regardless of what the browser accepts (i.e. Accept-Encoding: gzip); the same applies for uncompressed content; (however always using gzip compression is safe enough, as it is implemented in virtually all browsers and HTTP clients out there;)
  • (won't fix) regarding the "atomic" static website changes, there is a small time window in which inconsistencies can still occur: if a client has fetched an "old" version of a resource (say an HTML page) but has not yet fetched its required resources (say the CSS or JS files), and the CDB archive is changed in between, the client will fetch the new version of these required resources; however due to the low latency serving, this time window is extremely small; (this is not a limitation of this HTTP server, but a limitation of the way websites are built; always use fingerprints in your resource URLs, and perhaps always include the current and previous version on each deploy;)

Benchmarks

See dedicated benchmarks document.


FAQ

Is it production ready?

Yes, it is currently serving ~600K HTML pages.

Although, being open source, you are responsible for making sure it works within your requirements!

However, I am available for consulting on its deployment and usage. :)

Why CDB?

Until I expand upon why I have chosen to use CDB for serving static website content, you can read about sparkey from Spotify.

Why Go?

Because Go is highly portable, highly stable, and especially because it can easily support cross-compiling statically linked binaries to any platform it supports.

Why not Rust?

Because Rust fails to easily support cross-compiling (statically or dynamically linked) executables to any platform it supports.

Because Rust is less portable than Go; for example Rust doesn't consider OpenBSD as a "tier-1" platform.


Notice (copyright and licensing)

Authors

Ciprian Dorin Craciun

Notice -- short version

The code is licensed under AGPL 3 or later.

If you change the code within this repository and use it for non-personal purposes, you'll have to release it as per AGPL.

Notice -- long version

For details about the copyright and licensing, please consult the notice file in the documentation/licensing folder.

If someone requires the sources and/or documentation to be released under a different license, please send an email to the authors, stating the licensing requirements, accompanied with the reasons and other details; then, depending on the situation, the authors might release the sources and/or documentation under a different license.


References

See dedicated references document.
