Crane - 🐦 A full-text WebAssembly search engine for static websites

Overview

Crane 🐦

My blog post: WebAssembly Search Tools for Static Sites


Crane is a technical demo inspired by Stork, and it uses a near-identical configuration file setup. So it had to be named after a bird too.

I wrote it to help me understand how WebAssembly search tools work. Please use Stork instead.

Crane is two programs. The first program scans a group of documents and builds an efficient index. 1MB of text and metadata is turned into a 25KB index (14KB gzipped). The second program is a Wasm module that is sent to the browser along with a little bit of JavaScript glue code and the index. The result is an instant search engine that helps users find web pages as they type.

Visit the demo


Crane instant search in action


The full-text search engine is powered in part by code from Artem Krylysov's blog post Let's build a Full-Text Search engine.
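To give a feel for the general technique, here is a minimal sketch of an inverted index, the data structure that blog post builds. It is written in JavaScript for readability and is not Crane's actual code: the function names (`tokenize`, `buildIndex`, `search`) are made up for this example, and it skips refinements such as stop-word removal and stemming.

```javascript
// Illustrative sketch only: not Crane's code. All names here are hypothetical.

// Split text into lowercase alphanumeric tokens.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Build a map from each token to the list of document IDs that contain it.
// A document's ID is its position in the `docs` array.
function buildIndex(docs) {
  const index = new Map();
  docs.forEach((text, id) => {
    // A Set ensures each document is recorded at most once per token.
    for (const token of new Set(tokenize(text))) {
      if (!index.has(token)) index.set(token, []);
      index.get(token).push(id);
    }
  });
  return index;
}

// Return the IDs of documents that contain every token in the query.
function search(index, query) {
  const tokens = tokenize(query);
  if (tokens.length === 0) return [];
  return tokens
    .map((token) => index.get(token) || [])
    .reduce((acc, ids) => acc.filter((id) => ids.includes(id)));
}

const docs = [
  "The powers delegated to the federal government are few and defined.",
  "A firm Union will be of the utmost moment to the peace and liberty of the States.",
];
const index = buildIndex(docs);
console.log(search(index, "federal powers")); // [ 0 ]
console.log(search(index, "the"));            // [ 0, 1 ]
```

Looking up a word is a single map access, which is why this kind of index suits search-as-you-type.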

No effort has been made to shrink the Wasm binary. See Reducing the size of Wasm files.

Use it

Describe your document files and their metadata.

[input]
files = [
    {path = "docs/essays/essay01.txt", url = "essays/essay01.txt", title = "Introduction"},
    # etc.
]

[output]
filename = "dist/federalist.crane"

Pass the configuration file to the index build script. You'll want a fresh index whenever your documents change, but the Wasm module only needs to be built once.

./build-index.sh federalist.toml
./build-search.sh

Host the files from /dist on your website (e.g. wasm_exec.js, crane.js, crane.wasm, federalist.crane). And away you go!

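// Top-level await needs an ES module; otherwise wrap this in an async function.
const crane = new Crane("crane.wasm", "federalist.crane");
// Wait for the engine to finish loading before querying.
await crane.load();

const results = crane.query("some keywords");
console.log(results);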

See the demo inside /docs for a basic UI.
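If you would rather wire up your own input box, something along these lines should work. This is a rough sketch, not part of crane.js: `attachSearch`, `debounce`, and `waitMs` are names invented here, and the only thing assumed about Crane is the `query` method shown above. The debounce is optional; it just avoids running a query on every single keystroke.

```javascript
// Sketch only: `attachSearch`, `debounce`, and `waitMs` are hypothetical names,
// not part of crane.js.

// Delay calls to `fn` until `waitMs` milliseconds pass without a new call.
function debounce(fn, waitMs) {
  let timer;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// Run a query whenever the input's value changes and pass the results on.
// `engine` is any object with a query(string) method, such as a loaded Crane
// instance. An empty input yields an empty array (an assumption of this sketch).
function attachSearch(input, engine, onResults, waitMs = 100) {
  const run = debounce(() => {
    const text = input.value.trim();
    onResults(text === "" ? [] : engine.query(text));
  }, waitMs);
  input.addEventListener("input", run);
}

// In a browser, after `await crane.load()` (the "#search" selector is hypothetical):
// attachSearch(document.querySelector("#search"), crane, (results) => {
//   console.log(results);
// });
```

The callback receives whatever `query` returns, so the sketch makes no assumption about the shape of a result.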


Build demo page

./gh-pages.sh

Comments
  • Remove `.gitattributes` (for now)

    I decided that a very minor change was not worthy of having its own pull request, so the suggestion is here instead. I know the project is primarily written in Go, but as you know, there are some shell scripts that are vital to the functionality of the code. The suggestion? Simply remove the .gitattributes file. When I tested it in a fork, the dist folder was vendored; but if that is still a concern, then you can add dist/* linguist-vendored.

    opened by ghost 1
  • gh-pages

    Looking at the docs folder, which uses duplicates of files found elsewhere in crane-search, I wanted to see how we could organize everything to display the useful information without taking up space.

    The first question in doing this is: how are you hosting the docs page(s)? Is it through the docs folder, or through some other source? I ask this because I had two suggestions:

    • Move docs folder into a gh-pages branch
    • Disperse the docs files throughout the crane-search repository to show where the files are expected to be.

    Both of these ideas can be seen kinda like an "embedded template."

    P.S.: Let's add that users can link the .js files to <script> tags in HTML :)

    opened by ghost 1
  • Wasm size

    Cool idea. Did you try TinyGo by any chance? Some data indicate that it is much better at generating smaller Wasm binaries. I am sure that for sites with GBs of text it still will not work well in terms of Wasm size, but for some cases it could be a very interesting solution.

    opened by gitMitos 1
Owner
Andrew Healey
Software engineer / writer. Doing my best to learn in public 🌻