An implementation of Neural Turing Machines


Neural Turing Machines

Package ntm implements the Neural Turing Machine architecture as described in A. Graves, G. Wayne, and I. Danihelka. Neural Turing Machines. arXiv preprint arXiv:1410.5401, 2014.

Using this package and its subpackages, the "copy", "repeatcopy", and "ngram" tasks mentioned in the paper were verified. For each of these tasks, the successfully trained models are saved under filenames of the form "seedA_B", where A is the seed provided to rand.Seed during training, and B is the iteration at which the trained weights converged.
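As an aside, the seed and iteration can be recovered from such a filename with a small helper (parseWeightsName is a hypothetical name, not part of the package):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseWeightsName extracts the seed and iteration from a "seedA_B"
// weights filename, e.g. "seed2_19000" -> (2, 19000).
func parseWeightsName(name string) (seed, iteration int, err error) {
	trimmed := strings.TrimPrefix(name, "seed")
	parts := strings.SplitN(trimmed, "_", 2)
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("unexpected weights filename: %q", name)
	}
	if seed, err = strconv.Atoi(parts[0]); err != nil {
		return 0, 0, err
	}
	if iteration, err = strconv.Atoi(parts[1]); err != nil {
		return 0, 0, err
	}
	return seed, iteration, nil
}

func main() {
	seed, iter, err := parseWeightsName("seed2_19000")
	fmt.Println(seed, iter, err) // 2 19000 <nil>
}
```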

Reproducing results in the paper

The following sections detail the steps for reproducing the results in the paper. All commands are assumed to be run in the $GOPATH/src/github.com/fumin/ntm folder.

Copy

Train

To start training, run go run copytask/train/main.go, which commences training and also starts a web server that is convenient for tracking progress. To print debug information about the training process, run curl http://localhost:8082/PrintDebug; run the same command again to turn debug output off. To track the cross-entropy loss during training, run curl http://localhost:8082/Loss. To save the trained weights to disk, run curl http://localhost:8082/Weights > weights.

Testing

To test the weights saved in the previous training step, run go run copytask/test/main.go -weightsFile=weights. Alternatively, you can specify one of the successfully trained weights in the copytask/test folder, such as copytask/test/seed2_19000. The above command starts a web server which can be accessed at http://localhost:9000/. Below are screenshots of the web page showing the testing results for a test case of length 20. The first figure shows the input, output, and predictions of the NTM, and the second figure shows the addressing weights of the memory head.

The figure below shows the results for the test case of length 120. As mentioned in the paper, the NTM is able to perform pretty well in this case even though it is only trained on sequences whose length is at most 20.

Repeat copy

To experiment on the repeat copy task, follow the steps of the copy task, replacing the package copytask with repeatcopy.

In this task, I deviated from the paper a bit in an attempt to see whether NTMs could generalize to unseen repeat numbers. In particular, the paper's way of representing the repeat number as a scalar normalized to [0, 1] seems a bit artificial, so I took a different approach and encoded the repeat number as a sequence of binary inputs. The reasoning behind this approach is that by distributing the encoding through time, there is no upper limit on the repeat number, and given an NTM's relatively strong memorization abilities, distributing the encoding through time should not pose too big a problem. In addition, for these repeat copy tasks I gave the NTMs two memory heads instead of the one used in the paper. In the end, however, the NTM was still not able to generalize well on the repeat number.
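The exact bit layout used in the code is not spelled out here; a minimal sketch of the idea, assuming one bit of the repeat count is presented per time step (repeatBits is a hypothetical name), might look like:

```go
package main

import "fmt"

// repeatBits encodes a repeat count as a sequence of binary inputs,
// least significant bit first, one bit per time step. Because the
// encoding is spread over time, there is no fixed upper bound on the
// repeat number, unlike a scalar normalized to [0, 1].
func repeatBits(n int) []float64 {
	if n == 0 {
		return []float64{0}
	}
	var bits []float64
	for ; n > 0; n /= 2 {
		bits = append(bits, float64(n%2))
	}
	return bits
}

func main() {
	fmt.Println(repeatBits(7))  // [1 1 1]
	fmt.Println(repeatBits(15)) // [1 1 1 1]
}
```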

Below, I first show the results on the test case with repeat number 7 and length 7. For this test case, we see that the NTM solves it perfectly, emitting the end signal unambiguously at the last time instant. Moreover, we see that the NTM solves it by assigning the first memory head the responsibility of keeping count of the repeats, and the second memory head the responsibility of replaying the input sequence.

Next, we generalize the NTM to configurations unseen during training. The figure below shows the results of generalizing the repeat number to 15. We see that the NTM fails on this generalization.

The next figure shows the results of generalizing the sequence length to 15. We see that the NTM does a fairly good job, as mentioned in the paper.

Dynamic N-grams

To experiment on the dynamic n-grams task, follow the steps of the copy task, replacing the package copytask with ngram.

The figure below shows the results of this task. We see that the bits-per-sequence loss is 133, which is close to the theoretical optimum given by the Bayesian analysis in the paper. Moreover, by observing that the memory weights for the same 5-bit prefix remain the same throughout the entire testing sequence, we verified the paper's claim that the NTM solves this task by emulating the optimal Bayesian approach of keeping track of transition counts.
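For reference, the optimal Bayesian estimator the paper compares against keeps, for each 5-bit context, counts of the zeros and ones that have followed it, and predicts a one with probability (N1 + 0.5) / (N0 + N1 + 1). A sketch:

```go
package main

import "fmt"

// optimalProb is the paper's optimal Bayesian estimator for the
// dynamic n-gram task: given that a 5-bit context has so far been
// followed by n0 zeros and n1 ones, the probability that the next
// bit is a one is (n1 + 0.5) / (n0 + n1 + 1).
func optimalProb(n0, n1 int) float64 {
	return (float64(n1) + 0.5) / float64(n0+n1+1)
}

func main() {
	// With no observations, the estimate is the uniform prior 0.5.
	fmt.Println(optimalProb(0, 0)) // 0.5
	// After seeing 1 zero and 3 ones for this context:
	fmt.Println(optimalProb(1, 3)) // 0.7
}
```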

Acrostic generation

I applied NTMs to automatically generate acrostics. An acrostic is a poem in which the first character of each line spells out a message. Acrostics have a rich history in ancient China, where literary inquisitions were severe and common, and they continue to enjoy much popularity in today's Chinese societies such as Taiwan. The example below shows an acrostic carrying the message "vote to remove Senator 蔡正元 on the 14th", referring to the Senator's recall election on 2015/02/14.

The poem on the left was generated by a combination of a 2-gram model and a set of hand-crafted rules and features, whereas the poem on the right was generated by an NTM whose only learning material was the training corpus. Those who read classical Chinese should notice that, compared to the 2-gram poem on the left, the poem generated by the NTM is grammatically more correct and reads more like the work of a real person.

The NTMs in this experiment were trained on the Tang poetry collection 全唐詩. The vocabulary is limited to the 3000 most popular characters in the collection, with the rest designated as unknown. During training, the network first receives instructions on what the keywords are and where they should appear, and is then asked to produce the full poem with no further input. In the example below, the top row shows the inputs and the bottom row the outputs. Moreover, one of the instructions in the top input row is for the character 鄉 to appear at the fourth position of the first line.

After training, the NTM achieves a bits-per-character of 7.6600, which is comparable to the 2-gram entropy estimate of 7.4451 on the same corpus over the same 3000-character vocabulary. Moreover, the NTM is able to almost perfectly emit the "linefeed" character at the specified positions, suggesting that it has learned long-range dependencies that exceed the capabilities of a 2-gram model.
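The bits-per-character metric quoted above is simply the average negative log2 probability the model assigns to each character of the test sequence; a sketch (bitsPerChar is a hypothetical helper, not part of the package):

```go
package main

import (
	"fmt"
	"math"
)

// bitsPerChar computes the average negative log2 probability a model
// assigns to each character of a sequence. Lower is better; a model
// that always predicted the right character with probability 1 would
// score 0 bits per character.
func bitsPerChar(probs []float64) float64 {
	sum := 0.0
	for _, p := range probs {
		sum -= math.Log2(p)
	}
	return sum / float64(len(probs))
}

func main() {
	// A uniform model over a 3000-character vocabulary scores
	// log2(3000) ≈ 11.55 bits per character, so both 7.6600 and
	// 7.4451 are well below the uniform baseline.
	uniform := []float64{1.0 / 3000, 1.0 / 3000}
	fmt.Printf("%.2f\n", bitsPerChar(uniform)) // 11.55
}
```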

More details about this experiment can be found in the slides of this talk.

Below are instructions on using this code to generate acrostics with NTMs; they assume we are already in the "poem" folder (cd poem). To train an NTM to do acrostics, run go run train/main.go as in the steps above for the copy and repeat tasks. To generate acrostics using your trained model, or one that comes with this package, run go run test/main.go -weightsFile=test/h1Size512_numHeads8_n128_m32/seed9_78100_5p6573, optionally substituting the -weightsFile option with a different file.

Testing

To run the tests of this package, run go test -test.v.

Issues
  • memOp weight update

    I have read the code, and I can see that the controller weights are updated by RMSProp, but where are the memOp weights updated? I can see that the gradients of these parameters are computed, but they are never used to update their values. Would you be so kind as to clarify this for me?

    opened by qingyuanxingsi 4
  • Fix function comments based on best practices from Effective Go

    Every exported function in a program should have a doc comment. The first sentence should be a summary that starts with the name being declared. (From Effective Go.)

    I generated this with CodeLingo and I'm keen to get some feedback, but this is automated so feel free to close it and just say opt out to opt out of future CodeLingo outreach PRs.

    opened by BlakeMScurr 2
  • Duplicate Detection

    In the acrostic generation sub-task (https://github.com/fumin/ntm/blob/master/poem/test/main.go#L106), you force the characters in a generated poem (test phase) to be different from each other. Is this assumption reasonable? As far as I know, there are many poems with repeated characters. Or are there other considerations?

    opened by qingyuanxingsi 1
  • I got a lot of errors concerning the gonum libraries.

    github.com/gonum/internal/asm

    ../../gonum/blas/native/internal/math32/sqrt_amd64.s:17: Error: no such instruction: `text ·Sqrt(SB),NOSPLIT,$0'
    ../../gonum/internal/asm/daxpy_amd64.s:47: Error: no such instruction: `text ·DaxpyUnitary(SB),NOSPLIT,$0'

    (The full log continues with many similar "no such instruction", "junk after expression", "too many memory references", and "invalid character '='" errors covering nearly every line of sqrt_amd64.s and daxpy_amd64.s.)

    opened by ypxie 1
  • Use CodeLingo to Address Further Issues

    Hi @fumin!

    Thanks for merging the fixes from our earlier pull request. They were generated by CodeLingo which we've used to find a further 70 issues in the repo. This PR adds a set of CodeLingo Tenets which catch any new cases of the found issues in PRs to your repo.

    CodeLingo will also send follow-up PRs to fix the existing issues in the codebase. Install the CodeLingo GitHub app after merging this PR. It will always be free for open source.

    We're most interested to see if we can help with project specific bugs. Tell us about more interesting issues and we'll see if our tech can help - free of charge.

    Thanks, Leila and the CodeLingo Team

    opened by CodeLingoTeam 0
  • NaN panics

    During my training, I find that *sVal (https://github.com/fumin/ntm/blob/master/addressing.go#L194) can easily reach NaN. Can this be handled properly, instead of panicking?

    opened by qingyuanxingsi 3
Owner
Fumin