Diff, match and patch text in Go

Overview

go-diff

go-diff offers algorithms to perform operations required for synchronizing plain text:

  • Compare two texts and return their differences.
  • Perform fuzzy matching of text.
  • Apply patches onto text.

Installation

go get -u github.com/sergi/go-diff/...

Usage

The following example compares two texts and writes out the differences to standard output.

package main

import (
	"fmt"

	"github.com/sergi/go-diff/diffmatchpatch"
)

const (
	text1 = "Lorem ipsum dolor."
	text2 = "Lorem dolor sit amet."
)

func main() {
	dmp := diffmatchpatch.New()

	diffs := dmp.DiffMain(text1, text2, false)

	fmt.Println(dmp.DiffPrettyText(diffs))
}

Found a bug or missing a feature in go-diff?

Please make sure you are using the latest version of go-diff. If the problem persists, check the open issues in the tracker first. If you cannot find a matching report, open a new issue.

How to contribute?

Want to contribute to go-diff? Great! If you are here to fix a bug or add a feature, read on. Otherwise, the tracker has a list of open issues: pick one you think you can work on and discuss your plans by commenting on it.

Please make sure that every behavioral change is accompanied by test cases. Additionally, every contribution must pass the lint and test Makefile targets, which can be run with the following commands in the repository root directory.

make lint
make test

After your contribution passes these commands, create a PR and we will review your contribution.

Origins

go-diff is a Go language port of Neil Fraser's google-diff-match-patch code. His original code is available at http://code.google.com/p/google-diff-match-patch/.

Copyright and License

The original Google Diff, Match and Patch Library is licensed under the Apache License 2.0. The full terms of that license are included here in the APACHE-LICENSE-2.0 file.

Diff, Match and Patch Library

Written by Neil Fraser Copyright (c) 2006 Google Inc. http://code.google.com/p/google-diff-match-patch/

This Go version of Diff, Match and Patch Library is licensed under the MIT License (a.k.a. the Expat License) which is included here in the LICENSE file.

Go version of Diff, Match and Patch Library

Copyright (c) 2012-2016 The go-diff authors. All rights reserved. https://github.com/sergi/go-diff

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Comments
  • Use common lineHash to share indices between text1 and text2 for correct line diffs

    What?

    Use a common cache of line contents, shared between the two texts, in DiffLinesToChars so that line diffs are computed correctly.

    Why?

    In some cases, the standard approach does not produce correct line diffs. The following code is one example.

    package main
    
    import (
    	"fmt"
    
    	"github.com/sergi/go-diff/diffmatchpatch"
    )
    
    const (
    	text1 = `hoge:
      step11:
      - arrayitem1
      - arrayitem2
      step12:
        step21: hoge
        step22: -93
    fuga: flatitem
    `
    	text2 = `hoge:
      step11:
      - arrayitem4
      - arrayitem2
      - arrayitem3
      step12:
        step21: hoge
        step22: -92
    fuga: flatitem
    `
    )
    
    func main() {
    	dmp := diffmatchpatch.New()
    	a, b, c := dmp.DiffLinesToChars(text1, text2)
    	diffs := dmp.DiffMain(a, b, false)
    	diffs = dmp.DiffCharsToLines(diffs, c)
    	// DiffCleanupSemantic improves a little but not enough
    	// diffs = dmp.DiffCleanupSemantic(diffs)
    	fmt.Println(diffs)
    }
    
    [{Insert hoge:
      step11:
    hoge:
    } {Equal hoge:
    } {Insert hoge:
    } {Equal   step11:
    } {Insert hoge:
    } {Equal   - arrayitem1
    } {Insert hoge:
    } {Equal   - arrayitem2
    } {Insert hoge:
    } {Equal   step12:
    } {Insert hoge:
    } {Equal     step21: hoge
    } {Insert hoge:
    } {Equal     step22: -93
    } {Delete fuga: flatitem
    }]
    

    This fix matches the JavaScript implementation.
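
The idea of a shared line table can be sketched without the library. The following is a simplified illustration, not the code from this pull request or from go-diff: linesToIndices and encodeLines are hypothetical helpers, and integer slices stand in for the encoded strings that DiffLinesToChars returns. Both texts are encoded against one lineHash, so an identical line gets the same index in either text.

```go
package main

import (
	"fmt"
	"strings"
)

// linesToIndices (hypothetical) maps each line of text to an index. Lines that
// are already in lineHash reuse their index; unseen lines are appended to
// lineArray and registered in lineHash.
func linesToIndices(text string, lineArray *[]string, lineHash map[string]int) []int {
	var indices []int
	for _, line := range strings.SplitAfter(text, "\n") {
		if line == "" {
			// SplitAfter yields a trailing empty string when text ends in "\n".
			continue
		}
		idx, seen := lineHash[line]
		if !seen {
			*lineArray = append(*lineArray, line)
			idx = len(*lineArray) - 1
			lineHash[line] = idx
		}
		indices = append(indices, idx)
	}
	return indices
}

// encodeLines (hypothetical) encodes both texts against ONE shared table.
func encodeLines(text1, text2 string) ([]int, []int, []string) {
	var lineArray []string
	lineHash := map[string]int{}
	a := linesToIndices(text1, &lineArray, lineHash)
	b := linesToIndices(text2, &lineArray, lineHash)
	return a, b, lineArray
}

func main() {
	a, b, lines := encodeLines("hoge:\n  step11:\n", "hoge:\n  step12:\n")
	// The shared first line maps to index 0 in both slices.
	fmt.Println(a, b, len(lines)) // prints [0 1] [0 2] 3
}
```

If each text had its own table, the same index could refer to different lines in the two texts, and a diff over the encoded form would pair unrelated lines.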

    Testing?

    Adds a unit test case.

    Anything Else?

    This is my first contribution to this repository, therefore I would really appreciate any feedback, suggestions and change requests.

    opened by nrnrk 0
  • Is this repo still maintained? Friendly fork?

    Thank you for this code. I'm a very happy user of sergi/go-diff, although I have to pin my use of the module to the latest working version, i.e. v1.1.0.

    Is this repo still maintained? There are many open issues with no response, the most recent tagged version (v1.2.0) has been broken (#123) for a year and a half, and there have been no commits to master for a similar period. Many pull requests have had no response either.

    Under the terms of sergi/go-diff's MIT license, I propose to friendly fork this repo, unless there is still an intent to work on this. Go needs a good diff library like this, and a friendly fork is one that is carefully designed to be easy to merge back into the original.

    Let me know if there is still an intent to work on this valuable software, or if I should fork.

    opened by twpayne 0
  • PatchApply panics with slice bounds out of range

    The following code panics when using PatchApply:

    package main
    
    import (
    	"fmt"
    
    	"github.com/sergi/go-diff/diffmatchpatch"
    )
    
    func main() {
    	dmp := diffmatchpatch.New()
    	patches, _ := dmp.PatchFromText("@@ -1,2 +1,3 @@\n %E2%98%9E \n+r\n")
    	fmt.Println(dmp.PatchToText(patches))
    
    	s, _ := dmp.PatchApply(patches, "☞ 𝗢𝗥𝗗𝗘𝗥 ")
    	fmt.Printf("%q\n", s)
    }
    
    panic: runtime error: slice bounds out of range [:35] with length 33
    
    goroutine 1 [running]:
    github.com/sergi/go-diff/diffmatchpatch.(*DiffMatchPatch).PatchApply(0x14000109ef8, {0x14000078200?, 0x1, 0x1?}, {0x102d6650a, 0x19})
            /Users/michael/go/pkg/mod/github.com/sergi/[email protected]/diffmatchpatch/patch.go:306 +0x998
    main.main()
            /Users/michael/code/playground/scratch/main.go:14 +0xf0
    exit status 2
    

    It looks like it's finding the wrong start location at https://github.com/sergi/go-diff/blob/master/diffmatchpatch/patch.go#L265. For example, the following prints 25:

    func main() {
    	dmp := diffmatchpatch.New()
    	fmt.Println(dmp.MatchMain("\x01\x02\x03\x04☞ 𝗢𝗥𝗗𝗘𝗥 \x01\x02\x03\x04", "☞ \x01\x02\x03\x04", 4))
    }
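
The sizes in the report can be checked by hand with the standard library alone. This sketch assumes the input is one 3-byte character, five 4-byte characters and two spaces, with four padding bytes on each side as in the MatchMain snippet above. The sizes helper is a hypothetical name, and the sketch only verifies the arithmetic; it does not reproduce or explain the panic itself.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

const (
	// Assumed input, written with escapes so the byte layout is unambiguous.
	input = "\u261e \U0001d5e2\U0001d5e5\U0001d5d7\U0001d5d8\U0001d5e5 "
	// Padding as it appears in the MatchMain call from the report.
	padding = "\x01\x02\x03\x04"
)

// sizes (hypothetical) returns the byte length and rune count of the input,
// and the byte length of the input with padding on both sides.
func sizes() (inputBytes, inputRunes, paddedBytes int) {
	padded := padding + input + padding
	return len(input), utf8.RuneCountInString(input), len(padded)
}

func main() {
	b, r, p := sizes()
	// prints: input: 25 bytes, 8 runes; padded: 33 bytes
	fmt.Printf("input: %d bytes, %d runes; padded: %d bytes\n", b, r, p)
}
```

The 25-byte input agrees with the 0x19 string length in the stack trace, and the 33-byte padded text agrees with the "length 33" in the panic message, so a slice ending at byte 35 falls two bytes past the end.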
    
    opened by mwain 0
  • Restructure the pretty functions. Add markdown pretty function. Diff result with old and new pretty color function

    1. Restructure the pretty-print functions so they are more general.
    2. Add a pretty-print function for Markdown.
    3. Show the diff result with both the old and the new pretty color functions.
    opened by whitefirer 0
  • fix panic: runtime error: slice bounds out of range

    This PR should fix #127

    The new code uses the same implementation as the Java version: https://github.com/google/diff-match-patch/blob/62f2e689f498f9c92dbc588c58750addec9b1654/java/src/name/fraser/neil/plaintext/diff_match_patch.java#L545 https://github.com/google/diff-match-patch/blob/62f2e689f498f9c92dbc588c58750addec9b1654/java/src/name/fraser/neil/plaintext/diff_match_patch.java#L568

    opened by iambus 4
Owner
Sergi Mansilla
Developer, author of "Reactive Programming with RxJS", speaker, Node, Go, distributed systems, curious being.