An IPFS bytes exchange for caching and retrieving data from Filecoin

Overview


🐸

go-hop-exchange


An IPFS bytes exchange to allow any IPFS node to become a Filecoin retrieval provider and retrieve content from Filecoin

Highlights

  • IPFS exchange interface like Bitswap
  • Turn any IPFS node into a Filecoin retrieval provider (YES, that means you will earn FIL when we launch on mainnet!)
  • New content is dispatched via Gossipsub and stored if enough space is available
  • IPFS Plugin to wrap the default Bitswap implementation and fetch blocks from Filecoin if not available on the public IPFS network
  • Upload and retrieve directly from Filecoin if no secondary providers cache the content (Coming Soon)

Background

To speed up data retrieval from Filecoin, a secondary market allows clients to publish their content IDs to a network of providers so the content can be retrieved faster, more often and at a cheaper price. This does not guarantee data availability, so it should be used in addition to a regular storage deal. You can think of this as the CDN layer of Filecoin. This library is still very experimental and at the prototype stage, so feel free to open an issue if you have any suggestions or would like to contribute!

Install

As a library:

$ go get github.com/myelnet/go-hop-exchange

As an IPFS plugin:

Please follow the instructions in the plugin repo

Library Usage

  1. Import the package.
package main

import (
	hop "github.com/myelnet/go-hop-exchange"
)
  2. Initialize a blockstore, graphsync, libp2p host and gossipsub subscription.
var ctx context.Context
var bstore blockstore.Blockstore
var ps *pubsub.PubSub
var host libp2p.Host
var ds datastore.Batching
var gs graphsync.GraphExchange
var ks keystore.Keystore

exch, err := hop.NewExchange(
	ctx,
	hop.WithBlockstore(bstore),
	hop.WithPubSub(ps),
	hop.WithHost(host),
	hop.WithDatastore(ds),
	hop.WithGraphSync(gs),
	hop.WithRepoPath("ipfs-repo-path"),
	hop.WithKeystore(ks),
	hop.WithFilecoinAPI(
		"wss://filecoin.infura.io",
		http.Header{
			// The standard HTTP header name is "Authorization".
			"Authorization": []string{"Basic "},
		},
	),
)

// Wrap the blockstore and the exchange in a block service, then build the DAG service on top.
blocks := bserv.New(bstore, exch)
dag := dag.NewDAGService(blocks)

WithFilecoinAPI is optional. If it is not provided, the node only supports free transfers and charges a price of 0 per byte for the content it serves. This is mostly for testing purposes.

  3. When getting from the DAG, it will automatically query the network.
var dag ipld.DAGService
var ctx context.Context
var root cid.Cid

node, err := dag.Get(ctx, root)
  4. Clients can announce a new deal they made so the content is propagated to providers. This is also called when adding a block with dag.Add.
var ctx context.Context
var root cid.Cid

err := exch.Announce(ctx, root)
  5. We're also exposing convenience methods to transfer funds or import keys to the underlying wallet.
var ctx context.Context
var to address.Address

from, err := exch.Wallet().DefaultAddress() 

err = exch.Wallet().Transfer(ctx, from, to, "12.5")

Design principles

  • Composable: Hop is highly modular and can be combined with any IPFS, data transfer, Filecoin or other exchange systems.
  • Lightweight: we try to limit the size of the build as we are aiming to bring this exchange to mobile devices. We do not import core implementations such as go-ipfs or lotus directly but rely on shared packages.
  • Do one thing well: there are many problems to solve in the decentralized storage space. This package only focuses on routing and retrieving content from peers as efficiently as possible.
  • KISS: Keep it simple, stupid. We take a naive approach to everything and try not to reinvent the wheel. Filecoin is already complex enough as it is.
Comments
  • feat: enhance logging

    We can now run the program and specify log level via arg or env :

    • ENV LOG=debug go run . start
    • go run . -log=debug start

    3 levels available : trace, debug and info

    • trace & debug for dev purpose with slow but clean logs
    • info for prod with fast json encoded logs

    Also, this PR will try to replace every fmt.print based log with the zerolog lib and will log all the silenced errors, at the cost that too many lines might be displayed, during the tests for example.

    opened by gallexis 11
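
    The level selection described in this PR can be sketched with the standard library alone. This is not the PR's code (which uses zerolog); the function name resolveLogLevel, the flag-over-env precedence and the fallback to info are assumptions made for illustration.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// resolveLogLevel is a hypothetical helper: it returns the first recognized
// level among the flag value and the environment value, in that order.
// Anything unrecognized falls back to "info" (an assumption of this sketch).
func resolveLogLevel(flagVal, envVal string) string {
	for _, v := range []string{flagVal, envVal} {
		switch v {
		case "trace", "debug", "info":
			return v
		}
	}
	return "info"
}

func main() {
	// Mirrors the two invocations from the PR description:
	//   LOG=debug go run . start
	//   go run . -log=debug start
	logFlag := flag.String("log", "", "log level: trace, debug or info")
	flag.Parse()
	fmt.Println("log level:", resolveLogLevel(*logFlag, os.Getenv("LOG")))
}
```

    Keeping the resolution in a pure function makes the precedence rule easy to test without touching process flags or the environment.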
  • Use a Myel node for bootstrap and gate connections from IPFS nodes

    • Confirm that a Myel node can be used as a bootstrap node. (I think the DHT server mode is enabled by default, but I might be wrong, so you can double check by connecting 2 nodes to a 3rd Myel node and seeing if they discover each other.)
    • Use a connection gater to prevent Myel nodes from connecting to IPFS nodes. Two ways to go about it would be either checking the user agent header or the protocols a peer supports.
    opened by tchardin 6
  • Generate SSL certificates for Myel providers

    How could Myel providers who run a Pop on their home devices easily generate an SSL certificate so clients can retrieve over WSS?

    Benefits: Each swarm could use its own SSL certificates to ensure:

    1. A possible communication of peers using web browsers with the Providers
    2. A safe communication channel

    Problem:

    1. Deal with ACME DNS challenge
    2. Might be too much of a single point of failure (SPOF)

    Hints :

    opened by gallexis 6
  • Implement logging strategy

    Currently there are very few logs, and most are implemented with fmt.Print*. We need a performant logging solution that can also save logs to disk when running on remote machines and that supports different levels of logging. https://github.com/ipfs/go-log might be convenient; otherwise https://github.com/go-kit/kit/tree/master/log is nice as well. We can discuss.

    enhancement good first issue 
    opened by tchardin 5
  • Smart chunking

    When it is possible to detect the file type from the extension, we should select an appropriate chunking strategy. This improves deduplication and data transfer speed, and makes the network more efficient overall. Here are some general guidelines:

    • Audio and video content: trickle layout with chunk sizes of 1MB.
    • Images and compressed archives (.zip etc.): size splitter with 1MB chunks and balanced layout.
    • Text, JSON etc.: Buzhash chunker with balanced layout and 16kb chunks for best deduplication. We can probably experiment with different params, but this seems like a reasonable starting point.
    opened by tchardin 3
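
    The guidelines above could be expressed as a small lookup on the file extension. This is a sketch only: the ChunkPlan type, the planForFile function and the extension lists are hypothetical, the audio/video case assumes a size splitter (the issue only names the layout and chunk size), "16kb" is read as 16 KiB, and unknown extensions are assumed to fall back to the image/archive settings.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// ChunkPlan is a hypothetical description of how a file would be
// split into chunks and how the resulting DAG would be laid out.
type ChunkPlan struct {
	Chunker string // "size" or "buzhash"
	Size    int    // target chunk size in bytes
	Layout  string // "trickle" or "balanced"
}

const (
	kib = 1024
	mib = 1024 * kib
)

// planForFile picks a plan from the file extension (case-insensitive).
// The extension lists are illustrative, not exhaustive.
func planForFile(name string) ChunkPlan {
	switch strings.ToLower(filepath.Ext(name)) {
	case ".mp3", ".wav", ".mp4", ".mkv", ".webm":
		// Audio and video: trickle layout, 1MB chunks (size splitter assumed).
		return ChunkPlan{Chunker: "size", Size: mib, Layout: "trickle"}
	case ".txt", ".json", ".md", ".csv":
		// Text-like content: Buzhash chunker, 16kb chunks, balanced layout.
		return ChunkPlan{Chunker: "buzhash", Size: 16 * kib, Layout: "balanced"}
	default:
		// Images, archives and (by assumption) anything unrecognized:
		// size splitter, 1MB chunks, balanced layout.
		return ChunkPlan{Chunker: "size", Size: mib, Layout: "balanced"}
	}
}

func main() {
	for _, name := range []string{"talk.mp4", "notes.json", "photos.zip"} {
		fmt.Printf("%s -> %+v\n", name, planForFile(name))
	}
}
```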
  • Create nodes with “testing” roles

    We need nodes with "testing" roles in that they are capable of autonomously:

    • Sending files to be cached by other peers.
    • Retrieving files from peers at random time intervals.
    • Logging and displaying stats from these actions.

    The nodes should be capable of running multiple test scenarios.

    e.g. nodes send File A 99% of the time to emulate the coverage of a "popular", often requested file. They send File B 1% of the time and then gather stats that compare coverage and retrieval performance for A and B.

    eg. nodes have a library of files in order of increasing size. They collect stats on pushing / retrieval times relative to file size.

    opened by alexander-camuto 2
  • refactor: different way to wait for data transfer events

    This seems like a better way to wait for a transfer without having to do gymnastics to check it is the correct transfer. @gallexis maybe try this way on the replication dispatch (replication.go:141)

    opened by tchardin 2
  • feat: benchmark JS client

    This adds a new cli tool for running benchmarks against a JS client running in a headless chrome browser. The cli offers 2 different test modes:

    • e2e: starts a pop node as provider, adds all the content in the given directory and retrieves it all in parallel
    • daemon: starts a service worker client in a headless browser and a cli command server listening for get commands. Each get command navigates to the given URL loading the content from the service worker client during a shared browser session.
    opened by tchardin 1
  • feat: import car file and replicate to the network

    @alexander-camuto you will need to set the replication factor to the exact number of nodes you'd like to dispatch to. Make sure they're connected, otherwise it might hang for a while.

    opened by tchardin 1
  • feat: add upgrade handler to the pop server

    Now if a GitHub secret is provided, the server will listen for webhook requests and trigger the upgrade handler. This means you can send requests to <name.myel.zone>/upgrade and, if activated, the node should auto upgrade. TODO:

    • [x] We're still missing a function to call the process to restart itself. @alexander-camuto I've seen some libraries do that if you wanna look into it.
    opened by tchardin 1
  • feat: k8s capabilities for pop

    • allows for the deployment of a global CDN
    • a core number of nodes are run on regular EC2 on-demand instances -- these form the backbone of the CDN
    • scaling the CDN is then done using volatile AWS spot / excess compute
    • master node autoscales CDN as requests increase, i.e. deploys more or fewer worker nodes
    • master node provides a monitoring dashboard to track usage / performance
    • worker nodes all share same FIL private key
    opened by alexander-camuto 1
  • Make requests to IPFS gateways using bcli

    • Add the ability to make requests for CIDs from arbitrary IPFS gateways eg. bcli get ipfs.com cid
    • Log TTFB and transfer time, as we do when fetching from pop nodes
    enhancement 
    opened by alexander-camuto 0
  • Decentralized hole punching

    • We currently use ngrok as a NAT hole punching solution. Although easy to use, this introduces a myriad of setbacks:
      • ngrok servers are not distributed geographically, so we really take a hit in performance
      • ngrok code is proprietary, so it's hard to figure out what exactly is going on behind the scenes
      • ngrok relays / servers can't act as providers on our network -- which is a missed opportunity

    In light of the PL project flare developments, we should consider rolling out their relay-circuit v2 implementation for the public nodes on our networks (i.e. those not behind a NAT).

    Some brief notes on how to implement this:

    • Use autonat to determine if a node is behind a NAT or not -- if not, automatically promote a node to a relay.
    • Multi-address can contain information as to whether a peer requires a connection via a relay or not -- and through which relay.
    • Relays can see requests for content and would determine if they themselves should cache the content to boost performance and avoid the messaging roundtrips hole-punching requires -- this would be a first step in introducing a performance boosting hierarchy to the network.
    • Baking in support for WebRTC would remove the need for the DNS records we have to maintain atm -- which would make it easier to onboard new providers (as WebRTC is currently the only protocol that can perform hole-punching browser side)
    enhancement 
    opened by alexander-camuto 0
  • Separate hashing function when responding to caching requests

    • Currently providers can claim they already have a CID locally when receiving a caching request. This provides an attack vector whereby malicious providers could lie about the content they have locally to intentionally reduce the replication factor / redundancy of specific pieces of content.

    • A simple solution to this is for the provider to back up this claim using a new hash of the content DAG (eg. keccak / sha-3) that the CID alone wouldn't provide -- serving as a simple proof that the provider does have the content.

      • the problem is that the provider can hold content, hash it, and subsequently delete it, but store the hash to respond to subsequent requests.
      • a potential solution is to use a keyed hash function, whereby a node sending caching requests also includes a randomly generated key as a payload. The nodes responding then have to hash the DAG using that key, providing a proof that, at least at that given point in time, they did actually hold the content.
    enhancement 
    opened by alexander-camuto 1
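
    The keyed-hash challenge proposed above can be illustrated with HMAC-SHA256 from the Go standard library. Note that this swaps in HMAC-SHA256 for the keccak / sha-3 suggestion in the issue, and that proveHolding, verifyHolding and the flat list of blocks standing in for a content DAG are all hypothetical. The sketch assumes the challenger still holds the content so it can recompute the proof.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// proveHolding computes a keyed hash over all blocks using the key sent by
// the challenger. Each block is length-prefixed so that different splits of
// the same bytes do not produce the same proof.
func proveHolding(key []byte, blocks [][]byte) []byte {
	mac := hmac.New(sha256.New, key)
	var lenBuf [8]byte
	for _, b := range blocks {
		binary.BigEndian.PutUint64(lenBuf[:], uint64(len(b)))
		mac.Write(lenBuf[:])
		mac.Write(b)
	}
	return mac.Sum(nil)
}

// verifyHolding recomputes the proof on the challenger's side and compares
// it to the provider's answer in constant time.
func verifyHolding(key []byte, blocks [][]byte, proof []byte) bool {
	return hmac.Equal(proveHolding(key, blocks), proof)
}

func main() {
	// The challenger generates a fresh random key for every caching request,
	// so a provider cannot answer with a hash it stored earlier.
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	blocks := [][]byte{[]byte("block-1"), []byte("block-2")}

	proof := proveHolding(key, blocks) // computed by the provider
	fmt.Println("proof accepted:", verifyHolding(key, blocks, proof))
}
```

    Because the key changes with every request, a provider that hashed the content once and then deleted it cannot produce a valid answer for a later challenge.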
  • TestMultiTx is racy

    Failed in CI with output:

    {"level":"error","error":"No state for /1626167599029: datastore: key not found","time":"2021-07-13T09:13:20Z","message":"attempting to configure data store"}
    2021-07-13T09:13:20.035Z	ERROR	fsm	fsm/fsm.go:92	Executing event planner failed: Invalid transition in queue, state `0`, event `4`:
        github.com/filecoin-project/go-statemachine/fsm.eventProcessor.Apply
            /home/runner/go/pkg/mod/github.com/filecoin-project/[email protected]/fsm/eventprocessor.go:137
    2021-07-13T09:13:20.035Z	ERROR	fsm	fsm/fsm.go:92	Executing event planner failed: Invalid transition in queue, state `0`, event `4`:
        github.com/filecoin-project/go-statemachine/fsm.eventProcessor.Apply
            /home/runner/go/pkg/mod/github.com/filecoin-project/[email protected]/fsm/eventprocessor.go:137
    {"level":"error","error":"No state for /1626167599030: datastore: key not found","time":"2021-07-13T09:13:20Z","message":"attempting to configure data store"}
    2021-07-13T09:13:20.051Z	ERROR	fsm	fsm/fsm.go:92	Executing event planner failed: Invalid transition in queue, state `0`, event `4`:
        github.com/filecoin-project/go-statemachine/fsm.eventProcessor.Apply
            /home/runner/go/pkg/mod/github.com/filecoin-project/[email protected]/fsm/eventprocessor.go:137
    2021-07-13T09:13:20.051Z	ERROR	fsm	fsm/fsm.go:92	Executing event planner failed: Invalid transition in queue, state `0`, event `4`:
        github.com/filecoin-project/go-statemachine/fsm.eventProcessor.Apply
            /home/runner/go/pkg/mod/github.com/filecoin-project/[email protected]/fsm/eventprocessor.go:137
    2021-07-13T09:13:20.053Z	ERROR	fsm	fsm/fsm.go:92	Executing event planner failed: Invalid transition in queue, state `6`, event `27`:
        github.com/filecoin-project/go-statemachine/fsm.eventProcessor.Apply
            /home/runner/go/pkg/mod/github.com/filecoin-project/[email protected]/fsm/eventprocessor.go:137
        tx_test.go:415: could not finish gtx2
    
    opened by tchardin 1
  • Improve payment channel settlement on the provider side

    • Context: In the case of a simple retrieval, the provider can redeem the received vouchers and settle the channel once the transfer is finished.
    • Problem: if the client wishes to reuse a payment channel a while longer, say for progressively retrieving parts of a DAG, it becomes more complex for the provider to know when it is a good time to settle a payment channel.
    • Naive solution: wait for the client to call settle so the provider knows the client no longer needs the channel and can redeem all the vouchers as one. This is nice because it means the provider needn't pay for the settle gas costs.
    • Caveat: If the client disappears for whatever reason without calling settle, the provider must still have a way to collect their earnings. It also means the provider must subscribe to chain events.
    • Enhanced solution: The client must set a MinSettleHeight param on the vouchers which guarantees no one can call settle before then. The provider reads the value and can decide to update the payment channel using the voucher or just wait knowing more transfers might be coming. If the client doesn't call settle by the chain height, the provider can just redeem and settle the channel.
    • Security consideration: Providers shouldn't accept vouchers with a TimelockMin value 12h over the MinSettleHeight as it would mean the client can call settle and collect back their funds before the provider can redeem the vouchers.
    • Additional improvements: Subscribing to chain epochs from a lotus node RPC puts too much strain and dependency on 3rd party infrastructure. Nodes should connect to a few lotus peers directly and subscribe to the gossip topic announcing new blocks. This could be a good start for enabling pushing blocks directly in the future.
    opened by tchardin 0
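
    The enhanced solution and the security consideration above can be read as two small checks on each incoming voucher. The sketch below uses a hypothetical Voucher struct carrying only the two fields named in the issue, assumes heights are expressed in 30-second Filecoin epochs (so 12h is 1440 epochs), and the names acceptable and nextAction are invented for illustration.

```go
package main

import "fmt"

// epochsPer12h converts the 12h window from the issue into chain epochs,
// assuming 30-second epochs.
const epochsPer12h = 12 * 60 * 60 / 30

// Voucher is a hypothetical, trimmed-down voucher with only the fields
// discussed in the issue. Both values are chain heights (epochs).
type Voucher struct {
	MinSettleHeight int64 // nobody can call settle before this height
	TimelockMin     int64 // the voucher cannot be redeemed before this height
}

// acceptable rejects vouchers whose TimelockMin is more than 12h past
// MinSettleHeight, since the client could then settle and collect its
// funds before the provider is able to redeem.
func acceptable(v Voucher) bool {
	return v.TimelockMin <= v.MinSettleHeight+epochsPer12h
}

// Action is what the provider decides to do at a given chain height.
type Action int

const (
	Wait            Action = iota // more transfers may still be coming
	RedeemAndSettle               // the client did not settle in time
)

// nextAction waits until MinSettleHeight is reached, then redeems the
// vouchers and settles the channel.
func nextAction(v Voucher, currentHeight int64) Action {
	if currentHeight >= v.MinSettleHeight {
		return RedeemAndSettle
	}
	return Wait
}

func main() {
	v := Voucher{MinSettleHeight: 1000, TimelockMin: 1200}
	fmt.Println("acceptable:", acceptable(v))
	fmt.Println("redeem at height 999:", nextAction(v, 999) == RedeemAndSettle)
	fmt.Println("redeem at height 1000:", nextAction(v, 1000) == RedeemAndSettle)
}
```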
Releases(v0.1.1)
  • v0.1(Oct 28, 2021)

    This is an alpha release featuring end to end retrieval market capabilities including content routing, replication and transfer.

    Base features:

    • Free and paid data transfers.
    • Gossipsub based content routing using recursive message forwarding to send responses.
    • Publish records to a remote index if an optional url is provided.
    • Light wallet interface to interact with the local keystore and send messages to a remote lotus node RPC in order to transfer funds in and out of the node's wallet.
    • Access a key in the local keystore to sign a new payment channel message.
    • Send chain messages to a remote lotus node RPC.
    • Manage the lifecycle of payment channels, store vouchers locally.
    • Retrieve content from a Filecoin storage provider.
    • Maintain a log of all requests to rank content by usage frequency.
    • Ability to sync provided blocks from a nearby peer when joining the network for the first time.
    • Dispatch new content to cache providers given a replication factor.
    • Decision mechanism for deciding whether to store an incoming block or not. (accept all by default)
    • Priority queue to garbage collect least frequently used content.
    • Set different strategy levels for selecting providers including:
      • First response with given price ceiling.
      • Cheapest price in given time period.
    • Import content from CAR file.
    • HTTP gateway for posting and getting content.
    • TLS certificate generation via Certmagic to enable secure websocket transport with browser clients.
    • Observability and benchmarks for retrieval operations.
    • Testground plans to test the network at different scales.
    pop-amd64-darwin(34.07 MB)
    pop-amd64-linux(30.82 MB)
    pop-arm64-darwin(32.19 MB)
    pop-arm64-linux(28.64 MB)
  • v0.1-rc2(Jun 22, 2021)

  • v0.1-rc1(May 3, 2021)

    This is a first release candidate which provides end to end retrieval market features including content routing, replication and transfer.

    Base features:

    • Free and paid data transfers.
    • Gossipsub based content routing using recursive message forwarding to send responses.
    • Light wallet interface to interact with the local keystore and send messages to a remote lotus node RPC in order to transfer funds in and out of the node's wallet.
    • Access a key in the local keystore to sign a new payment channel message.
    • Send chain messages to a remote lotus node RPC.
    • Manage the lifecycle of payment channels, store vouchers locally.
    • Create a storage deal with a Filecoin storage provider.
    • Retrieve content from a Filecoin storage provider.
    • Settle payment channel automatically.
    • Automatically collect payment channel after the wait period.
    • Maintain a log of all requests to rank content by usage frequency.
    • Ability to sync provided blocks with a nearby peer when joining the network for the first time.
    • Dispatch new content to cache providers given a replication factor.
    • Decision mechanism for deciding whether to store an incoming block or not. (accept all by default)
    • Priority queue to garbage collect least frequently used content.
    • Set different strategy levels for selecting providers including:
      • First response with given price ceiling.
      • Cheapest price in given time period.
    • Keep track of previous successful providers to increase speed of provider selection.
    • Observability and benchmarks for retrieval operations.
    • Testground plans to test the network at different scales. (Due to some Testground issues it is not possible to run beyond 47 instances currently and requires debugging from the maintenance team)
Owner
Myel
Community powered content delivery network