An R-tree implementation for Go

Overview

rtree

GoDoc

This package provides an in-memory R-Tree implementation for Go. It's designed for Tile38 and is optimized for fast rect inserts and replacements.

Cities

Usage

Installing

To start using rtree, install Go and run go get:

$ go get -u github.com/tidwall/rtree

Basic operations

// create a 2D RTree
var tr rtree.RTree

// insert a point
tr.Insert([2]float64{-112.0078, 33.4373}, [2]float64{-112.0078, 33.4373}, "PHX")

// insert a rect
tr.Insert([2]float64{10, 10}, [2]float64{20, 20}, "rect")

// search 
tr.Search([2]float64{-112.1, 33.4}, [2]float64{-112.0, 33.5}, 
 	func(min, max [2]float64, value interface{}) bool {
		println(value.(string)) // prints "PHX"
	},
)

// delete 
tr.Delete([2]float64{-112.0078, 33.4373}, [2]float64{-112.0078, 33.4373}, "PHX")

Algorithms

This implementation is a variant of the original paper:
R-TREES. A DYNAMIC INDEX STRUCTURE FOR SPATIAL SEARCHING

Inserting

Same as the original algorithm. From the root to the leaf, the rects which will incur the least enlargment are chosen. Ties go to rects with the smallest area.

Deleting

Same as the original algorithm. A target rect is deleted directly. When the number of children in a rect falls below it's minumum entries, it is removed from the tree and it's items are re-inserted.

Splitting

This is a custom algorithm. It attempts to minimize intensive operations such as pre-sorting the children and comparing overlaps & area sizes. The desire is to do simple single axis distance calculations each child only once, with a target 50/50 chance that the child might be moved in-memory.

When a rect has reached it's max number of entries it's largest axis is calculated and the rect is split into two smaller rects, named left and right. Each child rects is then evaluated to determine which smaller rect it should be placed into. Two values, min-dist and max-dist, are calcuated for each child.

  • min-dist is the distance from the parent's minumum value of it's largest axis to the child's minumum value of the parent largest axis.
  • max-dist is the distance from the parent's maximum value of it's largest axis to the child's maximum value of the parent largest axis.

When the min-dist is less than max-dist then the child is placed into the left rect. When the max-dist is less than min-dist then the child is placed into the right rect. When the min-dist is equal to max-dist then the child is placed into an equal bucket until all of the children are evaluated. Each equal rect is then one-by-one placed in either left or right, whichever has less children.

License

rtree source code is available under the MIT License.

Comments
  • Replacing the root node with a bigger size than before, re-inserts the root node and grows the Rtree.

    Replacing the root node with a bigger size than before, re-inserts the root node and grows the Rtree.

    If the size of the root node, which is the first node inserted, changes size in that the new size is bigger than the old one the delete won't delete because tr.root.contains(&item) returns false and hence is truein the code below.

    Because the Replace function is just a Delete and Insert, this leads to a bug where the root node is now inserted once again. So after the Replace we have RTree.Len() + 1 , which is wrong. RTree.Len() has to be the same before and after the Replace.

    https://github.com/tidwall/rbang/blob/ecaccd2e0367c5dd8e735fe40d6780dfa38f7806/rbang.go#L370-L375

    opened by Robert-M-Muench 8
  • Is rbang threadsafe, or planned to be threadsafe?

    Is rbang threadsafe, or planned to be threadsafe?

    I tried to use https://github.com/Workiva/go-datastructures/tree/master/rtree, but it seems buggy when you are changing some members' coordinates; deleting and inserting doesn't seem to work correctly.

    However, it claims that the RTree can be used threadsafe. From looking at the source, I think rbang is not threadsafe. Is that correct? Any plans to make it threadsafe?

    opened by Robert-M-Muench 3
  • a few performance notes

    a few performance notes

    Hi! I came here via a twitter post about performance. I was curious, so I poked around a bit for fun. A few minor comments, in the hopes that they are useful.

    First, BenchmarkRandomInsert can be a bit misleading. If you make it run faster, it gets to a higher b.N, but at higher b.N, each iteration has to do more work. Thus code improvements don't necessarily show up in improved benchmark results. Picking a fixed number of iterations to do each time would be better. Perhaps like:

    func BenchmarkRandomInsert(b *testing.B) {
    	const iters = 10000
    	boxes := randBoxes(iters)
    	b.ResetTimer()
    	for i := 0; i < b.N; i++ {
    		var tr RTree
    		for iter := 0; iter < iters; iter++ {
    			tr.Insert(boxes[iter].min[:], boxes[iter].max[:], iter)
    		}
    	}
    }
    

    Second, I can get a ~15% speedup on BenchmarkRandomInsert on my machine by unrolling the loops in chooseLeastEnlargement. Here's the 3d version:

    // min is a sloppy float min function.
    // It is faster than math.Min, at the cost
    // of sloppy handling of the usual
    // floating point delights: NaN and Inf.
    func min(x, y float64) float64 {
    	if x < y {
    		return x
    	}
    	return y
    }
    
    // max is like min, only max.
    func max(x, y float64) float64 {
    	if x < y {
    		return y
    	}
    	return x
    }
    
    func (r *box) chooseLeastEnlargement(b *box) int {
    	j, jenlargement, jarea := -1, 0.0, 0.0
    	n := r.data.(*node)
    	for i := 0; i < n.count; i++ {
    		area := (n.boxes[i].max[0] - n.boxes[i].min[0]) *
    			(n.boxes[i].max[1] - n.boxes[i].min[1]) *
    			(n.boxes[i].max[2] - n.boxes[i].min[2])
    
    		enlargedArea := (max(b.max[0], n.boxes[i].max[0]) - min(b.min[0], n.boxes[i].min[0])) *
    			(max(b.max[1], n.boxes[i].max[1]) - min(b.min[1], n.boxes[i].min[1])) *
    			(max(b.max[2], n.boxes[i].max[2]) - min(b.min[2], n.boxes[i].min[2]))
    
    		enlargement := enlargedArea - area
    
    		if j == -1 || enlargement < jenlargement || (enlargement == jenlargement && area < jarea) {
    			j, jenlargement, jarea = i, enlargement, area
    		}
    	}
    	return j
    }
    

    IMHO, it also improves readability a bit. Of course, it makes your code generation more complicated.

    opened by josharian 3
  • Change split function from largest axis to minimum overlap

    Change split function from largest axis to minimum overlap

    The current splitting mechanism seems quite simple and based on recent papers on node splitting for r-trees I think there is a lot of room for improving it.

    One way is instead of splitting the rectangle (when it reached maximum entries) based on its largest axis we can split it based on the axis which makes a smaller overlapping area after the split.

    This way it generates more efficient splitting because the smaller the overlapping area helps find the right rectangle faster. When there is no overlapping area (the overlapping area is zero) we can use the largest axis like before.

    I run this benchmark from the test file for 10 million records for 5 times go test -bench=. -run=TestGeoIndex -count=5 which gives these results.

    Largest axis:

    seed: 1649339938841388500 insert: 10,000,000 ops in 9713ms, 1,029,571/sec, 971 ns/op search: 10,000,000 ops in 11043ms, 905,545/sec, 1104 ns/op delete: 10,000,000 ops in 18059ms, 553,752/sec, 1805 ns/op insert: 10,000,000 ops in 9486ms, 1,054,226/sec, 948 ns/op search: 10,000,000 ops in 13304ms, 751,670/sec, 1330 ns/op delete: 10,000,000 ops in 14877ms, 672,192/sec, 1487 ns/op insert: 10,000,000 ops in 8545ms, 1,170,221/sec, 854 ns/op search: 10,000,000 ops in 8024ms, 1,246,227/sec, 802 ns/op delete: 10,000,000 ops in 13623ms, 734,071/sec, 1362 ns/op insert: 10,000,000 ops in 9244ms, 1,081,823/sec, 924 ns/op search: 10,000,000 ops in 8730ms, 1,145,484/sec, 872 ns/op delete: 10,000,000 ops in 14045ms, 711,980/sec, 1404 ns/op insert: 10,000,000 ops in 8570ms, 1,166,880/sec, 856 ns/op search: 10,000,000 ops in 9173ms, 1,090,108/sec, 917 ns/op delete: 10,000,000 ops in 14159ms, 706,288/sec, 1415 ns/op goos: linux goarch: amd64 pkg: github.com/tidwall/rtree cpu: Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz BenchmarkRandomInsert-8 2106502 662.1 ns/op BenchmarkRandomInsert-8 2076409 649.2 ns/op BenchmarkRandomInsert-8 1785523 671.1 ns/op BenchmarkRandomInsert-8 1993303 698.6 ns/op BenchmarkRandomInsert-8 1968975 668.6 ns/op PASS ok github.com/tidwall/rtree 187.575s

    Minimum overlapping:

    seed: 1649340172027585044 insert: 10,000,000 ops in 10305ms, 970,424/sec, 1030 ns/op search: 10,000,000 ops in 8469ms, 1,180,722/sec, 846 ns/op delete: 10,000,000 ops in 17717ms, 564,439/sec, 1771 ns/op insert: 10,000,000 ops in 9831ms, 1,017,192/sec, 983 ns/op search: 10,000,000 ops in 7829ms, 1,277,341/sec, 782 ns/op delete: 10,000,000 ops in 13622ms, 734,104/sec, 1362 ns/op insert: 10,000,000 ops in 8415ms, 1,188,372/sec, 841 ns/op search: 10,000,000 ops in 7520ms, 1,329,788/sec, 751 ns/op delete: 10,000,000 ops in 12974ms, 770,768/sec, 1297 ns/op insert: 10,000,000 ops in 9586ms, 1,043,225/sec, 958 ns/op search: 10,000,000 ops in 9120ms, 1,096,486/sec, 912 ns/op delete: 10,000,000 ops in 14197ms, 704,351/sec, 1419 ns/op insert: 10,000,000 ops in 9437ms, 1,059,663/sec, 943 ns/op search: 10,000,000 ops in 7609ms, 1,314,176/sec, 760 ns/op delete: 10,000,000 ops in 13316ms, 750,948/sec, 1331 ns/op goos: linux goarch: amd64 pkg: github.com/tidwall/rtree cpu: Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz BenchmarkRandomInsert-8 1834258 735.7 ns/op BenchmarkRandomInsert-8 1822066 719.2 ns/op BenchmarkRandomInsert-8 1780968 716.1 ns/op BenchmarkRandomInsert-8 1821146 765.8 ns/op BenchmarkRandomInsert-8 1800307 716.3 ns/op PASS ok github.com/tidwall/rtree 176.871s

    Searching took about 810 ns/op for minimum_overlap and 1005 ns/op for the largest axis which seems a fairly good imporvment.

    Insertion time increased a bit from about 669 ns/op to about 730 ns/op because we do more jobs in the splitting phase.

    I think in geo indexing people insert the data once and query it many times so this algorithm may speed up queries for most of the users.

    opened by cep-ter 1
  • Read only data structure

    Read only data structure

    This adds a ReadOnly RTree that is generated from a Generic type.

    It's intended to be serializable to allow saving large datasets that are never updated.

    opened by paulstuart 0
  • Best practice regarding different (0,0) origins

    Best practice regarding different (0,0) origins

    I have a graphics lib, where the origin (0,0) is at the top/left, and rects use a top/left and width/height notation.

    rbang origin (0,0) is bottom/left and uses a bottom/left, top/right notation.

    So, when inserting rects from the graphics lib, I use code like this to update the same rectangle:

    • r = rectangle
    • xll = x lower-left
    • yll = y lower-left
    • xur = x upper-right
    • your = y upper-right
    rbang.Replace(
    	[2]float64{
    		float64(r.xll),
    		float64(r.yll)},
    	[2]float64{
    		float64(r.xur),
    		float64(r.yur)},
    	r,
    	[2]float64{
    		float64(r.GetLeft()),
    		float64(r.GetTop())},
    	[2]float64{
    		float64(r.GetLeft() + r.GetWidth()),
    		float64(r.GetTop() + r.GetHeight())}, r)
    

    The problem is, that I have a very simple three rects layout (root, two childs with 50% width and 100% height of root), where a search with a point inside c2, returns the root node and c1... and not the root node and c2.

    opened by Robert-M-Muench 1
Owner
Josh Baker
Josh Baker
A prefix tree implementation in go

Trie (Prefix tree) This library is compatible with Go 1.11+ Please refer to CHANGELOG.md if you encounter breaking changes. Motivation Introduction Us

Viant, Inc 31 Nov 3, 2022
An yet-another red-black tree implementation, with a C++ STL-like API.

A red-black tree with an API similar to C++ STL's. INSTALLATION go get github.com/yasushi-saito/rbtree EXAMPLE More examples can be fou

Yasushi Saito 18 Apr 25, 2022
An R-tree implementation for Go

rtree This package provides an in-memory R-Tree implementation for Go. It's designed for Tile38 and is optimized for fast rect inserts and replacement

Josh Baker 236 Dec 29, 2022
An immutable radix tree implementation in Golang

go-immutable-radix Provides the iradix package that implements an immutable radix tree. The package only provides a single Tree implementation, optimi

HashiCorp 859 Dec 29, 2022
B-tree implementation for Go

btree btree is a Go implementation of a B-Tree. This project is intended for learning purposes so the code is relatively small (<500LOC) and highly do

Amit Davidson 214 Dec 31, 2022
Go implementation of the van Emde Boas tree data structure: Priority queue for positive whole numbers in O(log log u) time.

vEB Go implementation of the van Emde Boas tree data structure: Priority queue for positive whole numbers in O(log log u) time. Supports the following

null 4 Mar 7, 2022
Go-merkle - Merkle tree implementation in Golang

go-merkle go-merkle implements a simple merkle tree in Golang. It allows to obta

atomixwap 2 Aug 8, 2022
A Go implementation of a radix tree, that uses binary searches to speed up insert, retrieve and delete operations on dense trees

radixs A Go implementation of a radix tree, that uses binary searches to speed up insert, retrieve and delete operations on dense trees. This implemen

Bruno Moura 0 Feb 14, 2022
An app with Trie tree and Breve search Implementation CLI and HTTP both 🥳

Introduction LifeLongLearner project consists of two different parts. My English Vocabulary My Technical Book Notes All of them provided by me within

A.Samet İleri 15 Jul 1, 2022
A tree like tool help you to explore data structures in your redis server

Redis-view is a tree like tool help you explore data structures in your redis server

dreamersdw 19 Mar 17, 2022
AVL tree with some useful extensions written in Go

gocover An AVL tree (Adel'son-Vel'skii & Landis) is a binary search tree in which the heights of the left and right subtrees of the root differ by at

Michael Lore 28 Mar 23, 2022
an R-Tree library for Go

rtreego A library for efficiently storing and querying spatial data in the Go programming language. About The R-tree is a popular data structure for e

Daniel Connelly 557 Jan 3, 2023
Just an itsy bitsy b-tree in Go

tinybtree Just an itsy bitsy b-tree. Usage Keys are strings, values are interfaces. Functions Get(key string) (value interface{}, gotten bool) Set(key

Josh Baker 93 Dec 25, 2022
Tree algorithms in Golang

Tree Algorithms in Golang This is my humble attempt to share some of the tree algorithms in Golang. Pull requests are always welcome! :) Contents Tree

Arun Shaji 3 Oct 6, 2021
Simple code just to try out and Binary Tree on Golang.

Character counter | ▮▮▮▮▮▮▮▮ Simple code just to try out and Binary Tree on Golang. Count characters to train openning a file and reading it, as well

Arthur Abeilice 0 May 17, 2022
Golang channel example with equivalent binary tree

golang_channel_example_with_equivalent_binary_tree Exercise: Equivalent Binary Trees There can be many different binary trees with the same sequence o

web_developer 1 Oct 9, 2021
Segment tree for bytes in Go

bsegtree Segment tree for bytes in Go Based on Thomas Oberndörfer's int range segment tree with fixing/optimization/modification for bytes ranges. For

Temple3x 3 Jun 16, 2022
A project that deals with implementations of a binary tree

Binary Search Tree This is a project that deals with implementations of a binary tree and the following functions. Print Prints the entire tree. Argum

Victor Zeddys 1 Nov 1, 2021
A versioned, snapshottable (immutable) AVL+ tree for persistent data.

IAVL+ Tree Note: Requires Go 1.13+ A versioned, snapshottable (immutable) AVL+ tree for persistent data. The purpose of this data structure is to prov

null 0 Nov 24, 2021