signature-based file format identification

Overview

Siegfried

Siegfried is a signature-based file format identification tool, implementing:

  • the National Archives UK's PRONOM file format signatures
  • freedesktop.org's MIME-info file format signatures
  • the Library of Congress's FDD file format signatures (beta)
  • file format signatures harvested from Wikidata (beta).

Version

1.9.1

Usage

Command line

sf file.ext
sf DIR

Options

sf -csv file.ext | DIR                     // Output CSV rather than YAML
sf -json file.ext | DIR                    // Output JSON rather than YAML
sf -droid file.ext | DIR                   // Output DROID CSV rather than YAML
sf -nr DIR                                 // Don't scan subdirectories
sf -z file.zip | DIR                       // Decompress and scan zip, tar, gzip, warc, arc
sf -zs gzip,tar file.tar.gz | DIR          // Selectively decompress and scan 
sf -hash md5 file.ext | DIR                // Calculate md5, sha1, sha256, sha512, or crc hash
sf -sig custom.sig file.ext                // Use a custom signature file
sf -                                       // Scan stream piped to stdin
sf -name file.ext -                        // Provide filename when scanning stream 
sf -f myfiles.txt                          // Scan list of files and directories
sf -v | -version                           // Display version information
sf -home c:\junk -sig custom.sig file.ext  // Use a custom home directory
sf -serve hostname:port                    // Server mode
sf -throttle 10ms DIR                      // Pause for duration (e.g. 1s) between file scans
sf -multi 256 DIR                          // Scan multiple (e.g. 256) files in parallel 
sf -log [comma-sep opts] file.ext | DIR    // Log errors etc. to stderr (default) or stdout
sf -log e,w file.ext | DIR                 // Log errors and warnings to stderr
sf -log u,o file.ext | DIR                 // Log unknowns to stdout
sf -log d,s file.ext | DIR                 // Log debugging and slow messages to stderr
sf -log p,t DIR > results.yaml             // Log progress and time while redirecting results
sf -log fmt/1,c DIR > results.yaml         // Log instances of fmt/1 and chart results
sf -replay -log u -csv results.yaml        // Replay results file, convert to csv, log unknowns
sf -setconf -multi 32 -hash sha1           // Save flag defaults in a config file
sf -setconf -serve :5138 -conf srv.conf    // Save/load named config file with '-conf filename' 
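
Results produced with the -json flag can be post-processed in any language. The Go sketch below tallies format IDs from a results document. The field names (`files`, `filename`, `errors`, `matches`, `ns`, `id`) are assumptions about the JSON layout and should be checked against the output of your own sf version; the sample document is hand-written for illustration and is not captured sf output.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Match mirrors one entry in a file's "matches" array (assumed field names).
type Match struct {
	NS string `json:"ns"`
	ID string `json:"id"`
}

// File mirrors one entry in the top-level "files" array (assumed field names).
type File struct {
	Filename string  `json:"filename"`
	Errors   string  `json:"errors"`
	Matches  []Match `json:"matches"`
}

// Results holds only the parts of a results document used in this sketch.
type Results struct {
	Siegfried string `json:"siegfried"`
	Files     []File `json:"files"`
}

// parseResults decodes one complete JSON results document.
func parseResults(doc string) (Results, error) {
	var r Results
	err := json.Unmarshal([]byte(doc), &r)
	return r, err
}

// countByID tallies how many matches were reported for each format ID.
func countByID(r Results) map[string]int {
	counts := make(map[string]int)
	for _, f := range r.Files {
		for _, m := range f.Matches {
			counts[m.ID]++
		}
	}
	return counts
}

// sample is hand-written for illustration; file names are hypothetical.
const sample = `{
  "siegfried": "1.9.1",
  "files": [
    {"filename": "a.jpg", "errors": "", "matches": [{"ns": "pronom", "id": "fmt/43"}]},
    {"filename": "b.jpg", "errors": "", "matches": [{"ns": "pronom", "id": "fmt/43"}]},
    {"filename": "c.txt", "errors": "", "matches": [{"ns": "pronom", "id": "x-fmt/111"}]}
  ]
}`

func main() {
	r, err := parseResults(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	counts := countByID(r)
	ids := make([]string, 0, len(counts))
	for id := range counts {
		ids = append(ids, id)
	}
	sort.Strings(ids) // map iteration order is random, so sort for stable output
	for _, id := range ids {
		fmt.Printf("%s\t%d\n", id, counts[id])
	}
}
```

In practice the document would come from a file or from a pipe such as `sf -json DIR`, rather than from an embedded constant.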

Signature files

By default, siegfried uses the latest PRONOM signatures without buffer limits (i.e. it may do full file scans). To use MIME-info or LOC signatures, or to add buffer limits or other customisations, use the roy tool to build your own signature file.

Install

With Go installed:

go get github.com/richardlehane/siegfried/cmd/sf

sf -update

Or, without Go installed:

Win:

Download a pre-built binary from the releases page. Unzip to a location in your system path. Then run:

sf -update

Mac Homebrew (or Linuxbrew):

brew install mistydemeo/digipres/siegfried

Or, for the most recent updates, you can install from this fork:

brew install richardlehane/digipres/siegfried

Ubuntu/Debian (64 bit):

wget -qO - https://bintray.com/user/downloadSubjectPublicKey?username=bintray | sudo apt-key add -
echo "deb http://dl.bintray.com/siegfried/debian wheezy main" | sudo tee -a /etc/apt/sources.list
sudo apt-get update && sudo apt-get install siegfried

FreeBSD:

pkg install siegfried

Arch Linux:

git clone https://aur.archlinux.org/siegfried.git
cd siegfried
makepkg -si

Changes

v1.9.1 (2020-10-11)

Changed

  • update PRONOM to v97
  • zs flag now activates -z flag

Fixed

  • details text in PRONOM identifier
  • roy panic when building signatures with empty sequences. Reported by Greg Lepore

v1.9.0 (2020-09-22)

Added

  • a new Wikidata identifier, harvesting information from the Wikidata Query Service. Implemented by Ross Spencer.
  • select which archive types (zip, tar, gzip, warc, or arc) are unpacked using the -zs flag (sf -zs tar,zip). Implemented by Ross Spencer.

Changed

  • update LOC signatures to 2020-09-21
  • update tika-mimetypes signatures to v1.24
  • update freedesktop.org signatures to v2.0

Fixed

  • incorrect basis for some signatures with multiple patterns. Reported and fixed by Ross Spencer.

v1.8.0 (2020-01-22)

Added

  • utc flag returns file modified dates in UTC e.g. sf -utc FILE | DIR. Requested by Dragan Espenschied
  • new cost and repetition flags to control segmentation when building signatures

Changed

  • update PRONOM to v96
  • update LOC signatures to 2019-12-18
  • update tika-mimetypes signatures to v1.23
  • update freedesktop.org signatures to v1.15

Fixed

  • XML namespaces detected by prefix on root tag, as well as default namespace (for mime-info spec)
  • panic when scanning certain MS-CFB files. Reported separately by Mike Shallcross and Euan Cochrane
  • file with many FF xx sequences grinds to a halt. Reported by Andy Foster

See the CHANGELOG for the full history.

Rights

Copyright 2020 Richard Lehane, Ross Spencer

Licensed under the Apache License, Version 2.0

Announcements

Join the Google Group for updates, signature releases, and help.

Contributing

Like siegfried and want to get involved in its development? That'd be wonderful! There are some notes on the wiki to get you started, and please get in touch.

Thanks

Thanks TNA for http://www.nationalarchives.gov.uk/pronom/ and http://www.nationalarchives.gov.uk/information-management/projects-and-work/droid.htm

Thanks Ross for https://github.com/exponential-decay/skeleton-test-suite-generator and http://exponentialdecay.co.uk/sd/index.htm, both are very handy!

Thanks Misty for the brew and ubuntu packaging

Thanks Steffen for the FreeBSD and Arch Linux packaging

Comments
  • WinApi error #4 (x Many) followed by Cannot Allocate Memory error...

    Hi Richard,

    Logging this here so I can collect the data more analytically and see if there are any existing clues I should be looking at or tips. Again, running through our legacy collection.

    Can I confirm which SF log setting I need to see the file it was processing at the time?

      fatal error: runtime: cannot allocate memory
    
      runtime stack:
      runtime.throw(0x7e7c20, 0x1f)
              c:/go/src/runtime/panic.go:527 +0x7f
      runtime.persistentalloc1(0x4000, 0x8, 0x9a0e58, 0x0)
              c:/go/src/runtime/malloc.go:878 +0x253
      runtime.persistentalloc.func1()
              c:/go/src/runtime/malloc.go:831 +0x35
      runtime.systemstack(0xcfd50)
              c:/go/src/runtime/asm_386.s:283 +0x81
      runtime.persistentalloc(0x4000, 0x0, 0x9a0e58, 0x1)
              c:/go/src/runtime/malloc.go:832 +0x4e
      runtime.fixAlloc_Alloc(0x996ab0, 0x1)
              c:/go/src/runtime/mfixalloc.go:67 +0xcd
      runtime.mHeap_AllocSpanLocked(0x98ed40, 0x1, 0xc000011f)
              c:/go/src/runtime/mheap.go:561 +0x184
      runtime.mHeap_Alloc_m(0x98ed40, 0x1, 0x19, 0xcfe00, 0x40)
              c:/go/src/runtime/mheap.go:425 +0x281
      runtime.mHeap_Alloc.func1()
              c:/go/src/runtime/mheap.go:484 +0x3d
      runtime.systemstack(0xcfe30)
              c:/go/src/runtime/asm_386.s:283 +0x81
      runtime.mHeap_Alloc(0x98ed40, 0x1, 0x19, 0x27f0100, 0x15a)
              c:/go/src/runtime/mheap.go:485 +0x5a
      runtime.mCentral_Grow(0x994890, 0x0)
              c:/go/src/runtime/mcentral.go:190 +0x8e
      runtime.mCentral_CacheSpan(0x994890, 0x16cb0ee8)
              c:/go/src/runtime/mcentral.go:86 +0x439
      runtime.mCache_Refill(0x3c04a8, 0x19, 0x15084000)
              c:/go/src/runtime/mcache.go:118 +0xae
      runtime.mallocgc.func2()
              c:/go/src/runtime/malloc.go:611 +0x2b
      runtime.systemstack(0x988f80)
              c:/go/src/runtime/asm_386.s:267 +0x57
      runtime.mstart()
              c:/go/src/runtime/proc1.go:674
    
      goroutine 1 [running]:
      runtime.systemstack_switch()
              c:/go/src/runtime/asm_386.s:222 fp=0x16cb0f78 sp=0x16cb0f74
      runtime.mallocgc(0x200, 0x0, 0x3, 0x1)
              c:/go/src/runtime/malloc.go:612 +0x65a fp=0x16cb0fe0 sp=0x16cb0f78
      runtime.rawruneslice(0x7d, 0x0, 0x0, 0x0)
              c:/go/src/runtime/string.go:297 +0xc4 fp=0x16cb1008 sp=0x16cb0fe0
      runtime.stringtoslicerune(0x16cb1074, 0x161c7dfc, 0x0, 0x0, 0x0, 0x0)
              c:/go/src/runtime/string.go:169 +0x18a fp=0x16cb1034 sp=0x16cb1008
      syscall.UTF16FromString(0x14d84c80, 0x7c, 0x0, 0x0, 0x0, 0x0, 0x0)
              c:/go/src/syscall/syscall_windows.go:44 +0x129 fp=0x16cb10f8 sp=0x16cb1034
      syscall.UTF16PtrFromString(0x14d84c80, 0x7c, 0x14d84ce4, 0x0, 0x0)
              c:/go/src/syscall/syscall_windows.go:71 +0x33 fp=0x16cb1118 sp=0x16cb10f8
      os.Lstat(0x14d84c80, 0x7c, 0x0, 0x0, 0x0, 0x0)
              c:/go/src/os/stat_windows.go:81 +0x24d fp=0x16cb1160 sp=0x16cb1118
      os.Stat(0x14d84c80, 0x7c, 0x0, 0x0, 0x0, 0x0)
              c:/go/src/os/stat_windows.go:55 +0x5d fp=0x16cb118c sp=0x16cb1160
      main.retryStat(0x14d84c80, 0x7c, 0x32f73778, 0x1619d680, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/longpath_windows.go:56 +0x5d fp=0x16cb11b0 sp=0x16cb118c
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165e50, 0x32f73778, 0x1619d680, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:165 +0xe3 fp=0x16cb1264 sp=0x16cb11b0
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16165e50, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb1300 sp=0x16cb1264
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb1334 sp=0x16cb1300
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165e00, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb13e8 sp=0x16cb1334
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16165db0, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb1484 sp=0x16cb13e8
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb14b8 sp=0x16cb1484
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165d60, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb156c sp=0x16cb14b8
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16165d10, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb1608 sp=0x16cb156c
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb163c sp=0x16cb1608
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165cc0, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb16f0 sp=0x16cb163c
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16165c70, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb178c sp=0x16cb16f0
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb17c0 sp=0x16cb178c
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165c20, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb1874 sp=0x16cb17c0
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16165bd0, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb1910 sp=0x16cb1874
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb1944 sp=0x16cb1910
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165b80, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb19f8 sp=0x16cb1944
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16165b30, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb1a94 sp=0x16cb19f8
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb1ac8 sp=0x16cb1a94
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165ae0, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb1b7c sp=0x16cb1ac8
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16165a90, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb1c18 sp=0x16cb1b7c
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb1c4c sp=0x16cb1c18
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165a40, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb1d00 sp=0x16cb1c4c
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x161659f0, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb1d9c sp=0x16cb1d00
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb1dd0 sp=0x16cb1d9c
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x161659a0, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb1e84 sp=0x16cb1dd0
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16165950, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb1f20 sp=0x16cb1e84
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb1f54 sp=0x16cb1f20
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165900, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb2008 sp=0x16cb1f54
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x161658b0, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb20a4 sp=0x16cb2008
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb20d8 sp=0x16cb20a4
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165860, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb218c sp=0x16cb20d8
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16165810, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb2228 sp=0x16cb218c
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb225c sp=0x16cb2228
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x161657c0, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb2310 sp=0x16cb225c
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16165770, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb23ac sp=0x16cb2310
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb23e0 sp=0x16cb23ac
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165720, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb2494 sp=0x16cb23e0
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x161656d0, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb2530 sp=0x16cb2494
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb2564 sp=0x16cb2530
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165680, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb2618 sp=0x16cb2564
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16165630, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb26b4 sp=0x16cb2618
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb26e8 sp=0x16cb26b4
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x161655e0, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb279c sp=0x16cb26e8
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16165590, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb2838 sp=0x16cb279c
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb286c sp=0x16cb2838
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165540, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb2920 sp=0x16cb286c
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x161654f0, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb29bc sp=0x16cb2920
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb29f0 sp=0x16cb29bc
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x161654a0, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb2aa4 sp=0x16cb29f0
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16165450, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb2b40 sp=0x16cb2aa4
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb2b74 sp=0x16cb2b40
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165400, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb2c28 sp=0x16cb2b74
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x161653b0, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb2cc4 sp=0x16cb2c28
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb2cf8 sp=0x16cb2cc4
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165360, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb2dac sp=0x16cb2cf8
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16165310, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb2e48 sp=0x16cb2dac
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb2e7c sp=0x16cb2e48
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x161652c0, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb2f30 sp=0x16cb2e7c
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16165270, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb2fcc sp=0x16cb2f30
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb3000 sp=0x16cb2fcc
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165220, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb30b4 sp=0x16cb3000
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x161651d0, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb3150 sp=0x16cb30b4
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb3184 sp=0x16cb3150
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165180, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb3238 sp=0x16cb3184
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16165130, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb32d4 sp=0x16cb3238
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb3308 sp=0x16cb32d4
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x161650e0, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb33bc sp=0x16cb3308
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16165090, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb3458 sp=0x16cb33bc
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb348c sp=0x16cb3458
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16165040, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb3540 sp=0x16cb348c
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16164ff0, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb35dc sp=0x16cb3540
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb3610 sp=0x16cb35dc
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16164fa0, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb36c4 sp=0x16cb3610
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16164f50, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb3760 sp=0x16cb36c4
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb3794 sp=0x16cb3760
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16164f00, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb3848 sp=0x16cb3794
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16164eb0, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb38e4 sp=0x16cb3848
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb3918 sp=0x16cb38e4
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16164e60, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb39cc sp=0x16cb3918
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16164e10, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb3a68 sp=0x16cb39cc
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb3a9c sp=0x16cb3a68
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16164dc0, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb3b50 sp=0x16cb3a9c
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16164d70, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb3bec sp=0x16cb3b50
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb3c20 sp=0x16cb3bec
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16164d20, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb3cd4 sp=0x16cb3c20
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16164cd0, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb3d70 sp=0x16cb3cd4
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb3da4 sp=0x16cb3d70
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16164c80, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb3e58 sp=0x16cb3da4
      path/filepath.walk(0x14d84c80, 0x7c, 0xd94590, 0x16164c30, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:363 +0x1c7 fp=0x16cb3ef4 sp=0x16cb3e58
      path/filepath.Walk(0x14d84c80, 0x7c, 0x170bfbc0, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7 fp=0x16cb3f28 sp=0x16cb3ef4
      main.multiIdentifyS.func1(0x14d84c80, 0x7c, 0xd94590, 0x16164be0, 0x0, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0x457 fp=0x16cb3fdc sp=0x16cb3f28
      ...additional frames elided...
    
    bug 
    opened by ross-spencer 35
  • Feature Request: URIS for file paths

    Hi Richard,

    Having worked with SF pretty intensely for the reporting tool, I found it difficult to identify files inside archive formats. I first have to identify an archive file format using the PUID and log its complete path. I then look for occurrences of its path inside other paths, flagging them as content inside that archive.

    Not the most elegant solution!
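
The path-matching workaround described above could be sketched in Go roughly as follows. This is an illustration, not code from the reporting tool: `groupByArchive` is a hypothetical helper, the paths are made up, and the "#" separator between an archive path and its inner entry is an assumption that should be checked against real results.

```go
package main

import (
	"fmt"
	"strings"
)

// groupByArchive maps each archive path to the paths that start with that
// archive path followed by sep. sep is an assumption about how nested
// entries are named; pass whatever your results actually use.
func groupByArchive(archives, paths []string, sep string) map[string][]string {
	out := make(map[string][]string)
	for _, p := range paths {
		for _, a := range archives {
			// Requiring the separator avoids false hits such as
			// "a.zip.bak" being treated as content of "a.zip".
			if strings.HasPrefix(p, a+sep) {
				out[a] = append(out[a], p)
			}
		}
	}
	return out
}

func main() {
	// Hypothetical paths; "#" as the separator is an assumption.
	archives := []string{"docs/a.zip"}
	paths := []string{
		"docs/a.zip",
		"docs/a.zip#inner/x.txt",
		"docs/a.zip.bak",
		"docs/b.txt",
	}
	for a, inner := range groupByArchive(archives, paths, "#") {
		fmt.Println(a, "contains", inner)
	}
}
```

The nested loop is quadratic in the worst case, which is part of why the approach feels inelegant on large result sets.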

    It's something I can achieve in DROID by just looking at the URI_SCHEME which I extract from the URI...

    Do you think it's scope creep to add to SF, or could it be a 'goer' in the YAML output?

    Cheers,

    Ross

    enhancement 
    opened by ross-spencer 23
  • SF fails ungracefully meeting Windows path length limit 260+

    Given a structure; /e/DC_Spencer/folder-test/ab/cdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopq

    With a file in it: abcd.txt

    SF will:

      panic: runtime error: invalid memory address or nil pointer dereference
      [signal 0xc0000005 code=0x0 addr=0x14 pc=0x41087e]

      goroutine 1 [running]:
      main.multiIdentifyS.func1(0x12938c30, 0xeb, 0x0, 0x0, 0x330602e0, 0x1292c0a0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:123 +0x1be
      path/filepath.walk(0x129384b0, 0xe2, 0xca4558, 0x12d541e0, 0x12921c38, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:370 +0x33d
      path/filepath.walk(0x128c21a0, 0x2, 0xca4558, 0x12d540f0, 0x12921c38, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:374 +0x3f9
      path/filepath.Walk(0x128c21a0, 0x2, 0x12921c38, 0x0, 0x0)
              c:/go/src/path/filepath/path.go:396 +0xb7
      main.multiIdentifyS(0x330601c8, 0x12d76008, 0x12d54050, 0x128c21a0, 0x2, 0x0, 0x0, 0x0)
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:135 +0x87
      main.main()
              C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:344 +0x1e1e

    If you shorten that path by removing the 'ab' directory then it works. Length 258 characters vs. 260.

    Max path length specified here: https://msdn.microsoft.com/en-nz/library/windows/desktop/aa365247(v=vs.85).aspx

    We have a real example I can't provide as its structure describes the content of the un-accessioned files we're working with.

    Directory lengths like this can be created by transferring from Linux using Rsync or in Windows by using Cygwin to mkdir.

    bug 
    opened by ross-spencer 15
  • Certificate verification failed

    Hey hey. I'm trying to install Siegfried on a new PC (Windows with a linux subsystem) and I'm hitting the errors below when I do sudo add-apt-repository "deb [arch=amd64] https://www.itforarchivists.com/ buster main". Any thoughts?

      Get:1 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]
      Hit:2 http://archive.ubuntu.com/ubuntu focal InRelease
      Get:3 http://archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB]
      Get:4 http://archive.ubuntu.com/ubuntu focal-backports InRelease [108 kB]
      Ign:5 https://www.itforarchivists.com buster InRelease
      Err:6 https://www.itforarchivists.com buster Release
        Certificate verification failed: The certificate is NOT trusted. The certificate chain uses expired certificate. Could not handshake: Error in the certificate verification. [IP: 172.67.166.43 443]
      Reading package lists... Done
      E: The repository 'https://www.itforarchivists.com buster Release' does not have a Release file.
      N: Updating from such a repository can't be done securely, and is therefore disabled by default.
      N: See apt-secure(8) manpage for repository creation and user configuration details.

    opened by drjwbaker 12
  • Runtime error on recursive dir scanning

    Hardware: iMac (27-inch, late 2013), 3.5 GHz Intel Core i7, 32 GB 1600 MHz DDR3 memory

    OS: OS X Yosemite 10.10.2

    Siegfried process: recursive directory scanning

    Issue: after 1365 files/directories had been identified, siegfried stops with a runtime error.

    Screen output:

      sf -csv=tr/Volumes/SGRA_ARCHIVES/NEW-Server > /Users/pantz/Documents/APPLICATION-SGRA/analyse-siegfried.csv
      panic: runtime error: index out of range

      goroutine 1 [running]:
      github.com/richardlehane/mscfb.func·001(0x162, 0x623ea0, 0x0, 0x0)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/mscfb/directory.go:247 +0x3a1
      github.com/richardlehane/mscfb.func·001(0x4d, 0x623ea0, 0x0, 0x0)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/mscfb/directory.go:258 +0x33f
      github.com/richardlehane/mscfb.func·001(0x50f, 0x623ea0, 0x0, 0x0)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/mscfb/directory.go:258 +0x33f
      github.com/richardlehane/mscfb.func·001(0x166, 0x623ea0, 0x0, 0x0)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/mscfb/directory.go:245 +0xf2
      github.com/richardlehane/mscfb.func·001(0x2c, 0x623ea0, 0x0, 0x0)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/mscfb/directory.go:258 +0x33f
      github.com/richardlehane/mscfb.func·001(0xa, 0x623ea0, 0x0, 0x0)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/mscfb/directory.go:258 +0x33f
      github.com/richardlehane/mscfb.func·001(0x2, 0x623ea0, 0x0, 0x0)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/mscfb/directory.go:258 +0x33f
      github.com/richardlehane/mscfb.func·001(0x6f, 0x623ea0, 0x0, 0x0)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/mscfb/directory.go:258 +0x33f
      github.com/richardlehane/mscfb.func·001(0x1ca, 0x623ea0, 0x0, 0x0)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/mscfb/directory.go:258 +0x33f
      github.com/richardlehane/mscfb.func·001(0x4, 0x623ea0, 0x0, 0x0)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/mscfb/directory.go:258 +0x33f
      github.com/richardlehane/mscfb.func·001(0x0, 0x623ea0, 0x0, 0x0)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/mscfb/directory.go:254 +0x386
      github.com/richardlehane/mscfb.(*Reader).traverse(0xc20b903a40, 0x0, 0x0)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/mscfb/directory.go:262 +0x16e
      github.com/richardlehane/mscfb.New(0x131c4f0, 0xc20af71900, 0xc2085ac100, 0x0, 0x0)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/mscfb/mscfb.go:163 +0x2c8
      github.com/richardlehane/siegfried/pkg/core/containermatcher.mscfbRdr(0x101d5a0, 0xc2085ac100, 0x0, 0x0, 0x0, 0x0)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/siegfried/pkg/core/containermatcher/mscfb.go:30 +0x112
      github.com/richardlehane/siegfried/pkg/core/containermatcher.Matcher.Identify(0xc2085007d0, 0x2, 0x2, 0xc20b924480, 0x58, 0x101d5a0, 0xc2085ac100, 0xc2087bf4c8, 0x0, 0x0)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/siegfried/pkg/core/containermatcher/identify.go:43 +0x314
      github.com/richardlehane/siegfried/pkg/core/containermatcher.(*Matcher).Identify(0xc208528880, 0xc20b924480, 0x58, 0x101d5a0, 0xc2085ac100, 0xc20b924600, 0x0, 0x0)
              :10 +0xe1
      github.com/richardlehane/siegfried.(*Siegfried).Identify(0xc208010af0, 0xc20b924480, 0x58, 0x1012100, 0xc20b8b6e48, 0x0, 0x0, 0x0)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/siegfried/siegfried.go:370 +0x54f
      main.identifyFile(0x101d530, 0xc208540440, 0xc208010af0, 0xc20b924480, 0x58, 0x1733000)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:132 +0x2c2
      main.func·004(0xc20b924480, 0x58, 0x10120b0, 0xc20828c370, 0x0, 0x0, 0x0, 0x0)
              /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:119 +0x158
      path/filepath.walk(0xc20b924480, 0x58, 0x10120b0, 0xc20828c370, 0xc2087bfd60,
0x0, 0x0) /usr/local/Cellar/go/1.4.2/libexec/src/path/filepath/path.go:347 +0x91 path/filepath.walk(0xc209aa8190, 0x4e, 0x10120b0, 0xc209aa8280, 0xc2087bfd60, 0x0, 0x0) /usr/local/Cellar/go/1.4.2/libexec/src/path/filepath/path.go:372 +0x51d path/filepath.walk(0xc20afd4f80, 0x3c, 0x10120b0, 0xc209aa80a0, 0xc2087bfd60, 0x0, 0x0) /usr/local/Cellar/go/1.4.2/libexec/src/path/filepath/path.go:372 +0x51d path/filepath.walk(0xc20a650fc0, 0x31, 0x10120b0, 0xc20a67ac30, 0xc2087bfd60, 0x0, 0x0) /usr/local/Cellar/go/1.4.2/libexec/src/path/filepath/path.go:372 +0x51d path/filepath.walk(0x7fff5fbffc9d, 0x21, 0x10120b0, 0xc20851d270, 0xc2087bfd60, 0x0, 0x0) /usr/local/Cellar/go/1.4.2/libexec/src/path/filepath/path.go:372 +0x51d path/filepath.Walk(0x7fff5fbffc9d, 0x21, 0xc2087bfd60, 0x0, 0x0) /usr/local/Cellar/go/1.4.2/libexec/src/path/filepath/path.go:394 +0xf2 main.multiIdentifyS(0x101d530, 0xc208540440, 0xc208010af0, 0x7fff5fbffc9d, 0x21, 0x0, 0x0, 0x0) /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:122 +0x94 main.main() /private/tmp/siegfried20150415-4862-btufud/siegfried-1.0.0/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:224 +0xd83

    bug 
    opened by paantz 11
  • container matching doesn't work with directory-only paths: fmt/1196 (SIARD)

    Hi Richard,

    As discussed I've attached what I think to be an accurate skeleton container file for fmt/1196. What do you think? (Also noted that SF might not have this capability to identify either as of yet!)

    (Unzipping should provide you a SIARD zip)

    fmt-1196-container-signature-id-31020.zip

    cc. @Dclipsham. I can't get it to pass in DROID either, but I am reluctant to say the problem is there given the recent proximity of the DROID release. PS. David - If there were public samples that could be shared, that would be a great help!

    bug 
    opened by ross-spencer 10
  • Siegfried seems to skip certain files without error or warning

    Hi,

    I'm currently comparing the results from DROID and Siegfried (through Brunnhilde). In a dataset containing 216420 files, there are only 2537 discrepancies between the two (roughly 1%), which imho is not bad. However, in my test at least 50% of these discrepancies are due to Siegfried apparently skipping a file. A comparison of the outputs by roy reports these files as "missing" from the Siegfried CSV (confirmed by manually checking the Siegfried CSV: they aren't there, so no mistake by roy). I redid the Brunnhilde analysis several times and each time the same files were skipped. I analysed a few of these files (TIFFs in this case) with other programs (JHOVE, DPF Manager) and there seemed to be nothing wrong with them. I also checked whether it might be due to long paths/filenames, non-standard characters in the filename, too many files in a directory or extremely large files, but none of these things seemed to be a problem. This was confirmed by an individual analysis of each file with Siegfried: the files were correctly analysed. But when I tried to analyse the directory directly with Siegfried, the same files were skipped again. I have no idea why, but I can provide you with the files and the different analyses if you need them.
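The manual check described above (looking for files that exist on disk but never appear in the CSV) can be scripted. The following is a minimal Go sketch, not part of Siegfried or roy. It assumes the CSV has a header row containing a `filename` column; the function names, sample paths, and sample CSV are hypothetical and only illustrate the comparison.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// missingFromCSV returns the entries of expected that never appear in the
// "filename" column of the CSV read from r. The column name is an assumption
// about the report layout; adjust it if your header differs.
func missingFromCSV(r io.Reader, expected []string) ([]string, error) {
	rows, err := csv.NewReader(r).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(rows) == 0 {
		// An empty report contains none of the expected files.
		return expected, nil
	}
	col := -1
	for i, name := range rows[0] {
		if name == "filename" {
			col = i
			break
		}
	}
	if col < 0 {
		return nil, fmt.Errorf("no filename column in header %v", rows[0])
	}
	seen := make(map[string]bool, len(rows))
	for _, row := range rows[1:] {
		seen[row[col]] = true
	}
	var missing []string
	for _, p := range expected {
		if !seen[p] {
			missing = append(missing, p)
		}
	}
	return missing, nil
}

// sampleCSV is made-up data standing in for a real report.
const sampleCSV = "filename,filesize\n/data/a.tif,10\n/data/c.tif,30\n"

// demo compares the made-up report against three hypothetical paths.
func demo() ([]string, error) {
	expected := []string{"/data/a.tif", "/data/b.tif", "/data/c.tif"}
	return missingFromCSV(strings.NewReader(sampleCSV), expected)
}

func main() {
	missing, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("missing from CSV:", missing)
}
```

In practice the expected list would come from walking the scanned directory, and the paths would need to be normalised the same way the report writes them before comparing.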

    Kind regards,

    Maarten

    bug 
    opened by MSavels 10
  • Panic error when running siegfried

    Siegfried version: 1.9.1 in server mode

    siegfried-stderr.log:

    2021/12/03 22:12:13 Starting server at localhost:5138. Use CTRL-C to quit.
    panic: sync: negative WaitGroup counter
    
    goroutine 401476 [running]:
    sync.(*WaitGroup).Add(0xc0069c8690, 0xffffffffffffffff)
            /home/travis/.gimme/versions/go1.15.2.linux.amd64/src/sync/waitgroup.go:74 +0x147
    sync.(*WaitGroup).Done(...)
            /home/travis/.gimme/versions/go1.15.2.linux.amd64/src/sync/waitgroup.go:99
    main.identifyFile.func1(0xc005ec6500, 0xc00033f4a0, 0xc005d42740)
            /home/travis/gopath/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:205 +0x75
    created by main.identifyFile
            /home/travis/gopath/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:202 +0xe5
    
    bug 
    opened by hmiguim 9
  • sf -update fails to download signatures file

    Dear @richardlehane ,

    I get the following error message when trying to update the signatures file:

    C:\Users\myuser\Downloads\siegfried_1-9-1_win64\win64>sf.exe -update
    2021/11/22 10:14:47 [FATAL] failed to update signature file, Get https://www.itforarchivists.com/siegfried/update: dial tcp 104.21.89.244:443: connectex: Aucune connexion n’a pu être établie car l’ordinateur cible l’a expressément refusée.

    The part in French roughly translates to: no connection could be established because the target computer expressly refused it.

    What am I doing wrong?

    Does this have anything to do with https://github.com/digital-preservation/droid/issues/657 and https://github.com/digital-preservation/droid/issues/658?

    The workaround (https://github.com/richardlehane/siegfried/wiki/Getting-started#installing-the-latest-signature-file) worked fine though.

    Best regards,

    Samuel, for Conseil Départemental de l'Hérault (France) herault.fr

    question 
    opened by sviscapi 9
  • Panic on Roy Sig Creation

    Pretty sure this is on me for trying to create an incorrect signature, but I can't figure it out. I think I'm using the EOF PRONOM attribute wrong.

    I get the following panic on the attached XML signature file:

    panic: runtime error: index out of range [-1]

    goroutine 1 [running]:
    github.com/richardlehane/siegfried/pkg/pronom.appendFragments(0xc0002562e0, 0x5, 0xbca900, 0x0, 0x0, 0xc000224620, 0x2, 0x2, 0x100, 0x0, ...)
            /home/travis/gopath/src/github.com/richardlehane/siegfried/pkg/pronom/parse.go:309 +0x1d9a
    github.com/richardlehane/siegfried/pkg/pronom.processSubSequence(0xc0002562e0, 0x5, 0x1, 0xb4cdf0, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
            /home/travis/gopath/src/github.com/richardlehane/siegfried/pkg/pronom/parse.go:170 +0x683
    github.com/richardlehane/siegfried/pkg/pronom.processDROID(0xc0002562e0, 0x5, 0xc000020140, 0x2, 0x2, 0x7c, 0x2, 0x8e8620, 0xc000b79fe0, 0x1)
            /home/travis/gopath/src/github.com/richardlehane/siegfried/pkg/pronom/parse.go:148 +0x285
    github.com/richardlehane/siegfried/pkg/pronom.(*droid).Signatures(0xc000a00ba0, 0xc000a48000, 0x618, 0xe1c, 0xc000a1a000, 0x618, 0xe1c, 0x0, 0x0)
            /home/travis/gopath/src/github.com/richardlehane/siegfried/pkg/pronom/parseable.go:314 +0x340
    github.com/richardlehane/siegfried/internal/identifier.joint.Signatures(0x8e9600, 0xc000300100, 0x8e9500, 0xc000a00ba0, 0xc000273818, 0x30, 0x82ae80, 0x0, 0x0, 0x30, ...)
            /home/travis/gopath/src/github.com/richardlehane/siegfried/internal/identifier/parseable.go:258 +0xbb
    github.com/richardlehane/siegfried/internal/identifier.filtered.Signatures(0xc000d00000, 0x671, 0x70f, 0x8e9880, 0xc0008dd580, 0x1, 0xc0002562e0, 0x5, 0xc001519aa0, 0x17, ...)
            /home/travis/gopath/src/github.com/richardlehane/siegfried/internal/identifier/parseable.go:398 +0x49
    github.com/richardlehane/siegfried/pkg/pronom.doublesFilter.Signatures(0xc000d00000, 0x671, 0x70f, 0x8e9880, 0xc0008dd580, 0xc000256270, 0xa, 0x1, 0xc0002562e0, 0x5, ...)
            /home/travis/gopath/src/github.com/richardlehane/siegfried/pkg/pronom/parseable.go:54 +0xb4
    github.com/richardlehane/siegfried/internal/identifier.sorted.Signatures(0x8e9580, 0xc00000e580, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
            /home/travis/gopath/src/github.com/richardlehane/siegfried/internal/identifier/parseable.go:547 +0x49
    github.com/richardlehane/siegfried/internal/identifier.(*Base).Add(0xc000f32780, 0x0, 0x0, 0x3, 0x0, 0x0, 0x0, 0x0)
            /home/travis/gopath/src/github.com/richardlehane/siegfried/internal/identifier/base.go:379 +0x9aa
    github.com/richardlehane/siegfried.(*Siegfried).Add(0xc0001282c0, 0x8e8e80, 0xc00025f2f0, 0x8e8e80, 0xc00025f2f0)
            /home/travis/gopath/src/github.com/richardlehane/siegfried/siegfried.go:136 +0x328
    main.makegob(0xc0001282c0, 0xc0000100a0, 0x1, 0x1, 0x0, 0x0)
            /home/travis/gopath/src/github.com/richardlehane/siegfried/cmd/roy/roy.go:236 +0xa3
    main.main()
            /home/travis/gopath/src/github.com/richardlehane/siegfried/cmd/roy/roy.go:528 +0x65a

    Packed-Font-File-Format-1.0-signature-file.txt

    bug 
    opened by gleporeNARA 9
  • Multiple results in DROID not visible in Siegfried

    Hello,

    First of all, your tool is very efficient and useful. We are using it in our archiving solution for the French government.

    But we've recently discovered a point that could be a big problem for us. When we use DROID to identify compliant PUIDs in the PRONOM list, we find that some files can have multiple formats. Siegfried gives only one, and in some cases it can be the wrong one.

    For example, this ODF presentation is identified as both spreadsheet and presentation by DROID (which can be explained by the fact that there's a spreadsheet object in it) and only as spreadsheet by Siegfried: FalseODS.zip

    Would it be possible to have an option to get all the possible formats, as in DROID?

    Regards

    enhancement PRONOM 
    opened by JSLair 9
  • Add PRONOM types to PRONOM identifier

    An exploration of adding PRONOM types classification to the Siegfried PRONOM identifier.

    Connected to: https://github.com/richardlehane/siegfried/discussions/207

    This results in new output from Siegfried that looks something like the following:

    filename : 'testdata/skeleton-suite/x-fmt/x-fmt-95-signature-id-858.pwi'
    filesize : 5
    modified : 2020-07-05T19:53:49+02:00
    errors   : 
    matches  :
      - ns      : 'pronom'
        id      : 'x-fmt/95'
        format  : 'Inkwriter/Notetaker Document'
        version : 
        mime    : 
        type    : 'Word Processor'
        basis   : 'extension match pwi; byte match at 0, 5'
        warning : 
    

    Note the addition of type : 'Word Processor'

    NB. This will only show a value if the PRONOM identifier is configured with PRONOM reports, i.e. the PRONOM XML export from PRONOM itself. The DROID signature file still needs this information to be added; we believe this is on the way. I can attend the next PRONOM meeting at the beginning of the year to ask more.

    Tests have been included as part of this feature. Additionally, source files have had linting changes made to them to pass linting. These are in the third commit associated with the PR and may warrant special attention for accuracy, especially around the correctness of the documentation.

    opened by ross-spencer 0
  • Auto-update or update in server mode

    When running Siegfried as a long-running service, for example as a docker service, it is important to keep Siegfried up to date with the signature database. Currently, however, there is no way to call the update action via the service, nor to schedule an auto-update.

    This could be done in one of the following ways:

    • Add a new action to the Siegfried server API, e.g. GET /update, which would call the update action. An external service could be set up to call this endpoint periodically
    • Include an auto-update feature in the tool, specifically when it runs in server mode, e.g. sf -serve localhost:5138 -autoupdate 24h
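The second option could be built around a ticker loop. The following is a minimal Go sketch under stated assumptions: it is not Siegfried code, the `-autoupdate` flag is only the proposal above, and `runPeriodically`, `updateN`, and the `update` callback are hypothetical names. The callback stands in for whatever the real update action would be.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// runPeriodically calls update once per interval until ctx is cancelled and
// returns the number of attempts made. A failed update is only logged, so a
// long-running server would keep serving with its current signatures.
// The interval must be positive, because time.NewTicker panics otherwise.
func runPeriodically(ctx context.Context, every time.Duration, update func() error) int {
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	attempts := 0
	for {
		// Check cancellation first so a pending tick cannot win the select
		// after the context has already been cancelled.
		if ctx.Err() != nil {
			return attempts
		}
		select {
		case <-ctx.Done():
			return attempts
		case <-ticker.C:
			attempts++
			if err := update(); err != nil {
				fmt.Println("update failed, keeping current signatures:", err)
			}
		}
	}
}

// updateN is a demo helper: it runs the loop with a fake update function that
// fails on its first call and cancels the loop after n calls.
func updateN(n int) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	calls := 0
	return runPeriodically(ctx, time.Millisecond, func() error {
		calls++
		if calls >= n {
			cancel()
		}
		if calls == 1 {
			return errors.New("simulated network failure")
		}
		return nil
	})
}

func main() {
	fmt.Println("update attempts:", updateN(3))
}
```

A real implementation would also have to swap the loaded signatures safely while requests are in flight, which this sketch does not attempt.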
    opened by luis100 1
  • FreeBSD 1.9.5 port: modules: go.mod build failure

    Hello,

    I'm updating FreeBSD port from 1.9.4 to 1.9.5.

    This port currently sets the GO_MODULE variable to the value of the module directive in go.mod (GO_MODULE= github.com/richardlehane/siegfried) so that modules are fetched automatically, and the build fails at:

    (...)
    net/url
    vendor/golang.org/x/crypto/chacha20poly1305
    # golang.org/x/sys/unix
    vendor/golang.org/x/sys/unix/syscall.go:83:7: unsafe.Slice requires go1.17 or later (-lang was set to go1.16; check go.mod)
    vendor/golang.org/x/sys/unix/syscall_unix.go:118:7: unsafe.Slice requires go1.17 or later (-lang was set to go1.16; check go.mod)
    vendor/golang.org/x/crypto/curve25519
    (...)
    github.com/richardlehane/siegfried/pkg/sets
    github.com/richardlehane/siegfried/pkg/wikidata/internal/converter
    *** Error code 2
    

    To fix the problem I switched back to the "traditional" gomod-vendor method of getting modules:

    USE_GITHUB=     yes
    GH_ACCOUNT=     richardlehane
    #GO_MODULE=     github.com/richardlehane/siegfried # fails at github.com/richardlehane/siegfried/pkg/wikidata/internal/converter
    GH_TUPLE=       golang:image:e7cb96979f69:golang_image/vendor/golang.org/x/image \
                    golang:sys:aba9fc2a8ff2:golang_sys/vendor/golang.org/x/sys \
                    golang:text:v0.3.7:golang_text/vendor/golang.org/x/text \
                    richardlehane:characterize:v1.0.0:richardlehane_characterize/vendor/github.com/richardlehane/characterize \
                    richardlehane:match:v1.0.2:richardlehane_match/vendor/github.com/richardlehane/match \
                    richardlehane:mscfb:v1.0.4:richardlehane_mscfb/vendor/github.com/richardlehane/mscfb \
                    richardlehane:msoleps:v1.0.3:richardlehane_msoleps/vendor/github.com/richardlehane/msoleps \
                    richardlehane:webarchive:v1.0.0:richardlehane_webarchive/vendor/github.com/richardlehane/webarchive \
                    richardlehane:xmldetect:v1.0.2:richardlehane_xmldetect/vendor/github.com/richardlehane/xmldetect \
                    ross-spencer:spargo:v0.4.1:ross_spencer_spargo/vendor/github.com/ross-spencer/spargo \
                    ross-spencer:wikiprov:v0.2.0:ross_spencer_wikiprov/vendor/github.com/ross-spencer/wikiprov
    

    Any clues as to whether this is related to the recent go.mod changes?

    Thanks, Nuno Teixeira

    opened by nunotexbsd 5
  • Roy issue

    When attempting to use a PRONOM signature file that contains (58464952|52494658) as the signature (created with Ross' tool) I get a roy crash:

    sudo roy build -extend Generic-RIFX-Container-1.0-signature-file.xml rifxgen.sig
    2022/05/10 12:00:15 parse error dev/1: empty sequence

    The above signature would match either RIFX or XFIR at the beginning of a file.

    Not sure if this is an issue with roy or with ffdev.info, but having the ability to match multiple start sequences would be useful, especially for formats with both big and little endianness.
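To make the intended semantics concrete: 58464952 is the ASCII for XFIR and 52494658 is the ASCII for RIFX, so the alternation should match a file starting with either four-byte sequence. The Go sketch below only illustrates that behaviour. It is not how roy or Siegfried parse or match signatures, and `matchesAny` is a hypothetical helper.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
)

// matchesAny reports whether data begins with any of the hex-encoded
// alternatives, mirroring an (A|B) choice anchored at the start of a file.
func matchesAny(data []byte, hexAlternatives ...string) (bool, error) {
	for _, h := range hexAlternatives {
		prefix, err := hex.DecodeString(h)
		if err != nil {
			return false, err
		}
		if bytes.HasPrefix(data, prefix) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// The samples are made-up headers, not real files.
	for _, sample := range []string{"RIFX....", "XFIR....", "RIFF...."} {
		ok, err := matchesAny([]byte(sample), "58464952", "52494658")
		if err != nil {
			fmt.Println("bad signature:", err)
			return
		}
		fmt.Printf("%q matches: %v\n", sample, ok)
	}
}
```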

    Signature attached.

    RIFX-big-and-little-1.0-signature-file.zip

    opened by gleporeNARA 3
  • error running `roy harvest -wikidata`

    I'm trying out the instructions here and am getting the following error/output when trying to run $ roy harvest -wikidata to start off:

    2022/04/27 09:23:21 Roy (Wikidata): Harvesting Wikidata definitions: lang 'en'
    2022/04/27 09:23:21 Roy (Wikidata): Harvesting definitions from: 'https://query.wikidata.org/sparql'
    2022/04/27 09:23:21 Roy (Wikidata): Harvesting revision history from: 'https://www.wikidata.org/'
    2022/04/27 09:24:55 Error trying to retrieve SPARQL with revision history: warning: there were errors retrieving provenance from Wikibase API: wikiprov: unexpected response from server: 429

    I'm on Ubuntu 20.04 with the latest siegfried release (1.9.2). Is there something obvious I'm doing wrong? (@ross-spencer?)

    wikidata 
    opened by EG-tech 5
  • Wikidata TrID results do not have the same provenance metadata as other results resulting in large numbers of linting messages

    While working out the fix for https://github.com/richardlehane/siegfried/issues/153, I am seeing a lot of Wikidata linting messages appear for the new TrID patterns. This is because the SPARQL for provenance expected more uniformity.

    We need:

            optional { ?object prov:wasDerivedFrom ?provenance;
               optional { ?provenance pr:P248 ?reference. }
               optional { ?provenance pr:P813 ?date. }
            }
    

    but were using:

            optional { ?object prov:wasDerivedFrom ?provenance;
               optional { ?provenance pr:P248 ?reference;
                                      pr:P813 ?date.
                        }
            }
    

    This has the unfortunate result of being an incomplete graph if ?date isn't available for the provenance for a record. E.g. the record for Gherkin files.

    We'll change the SPARQL to the first form shown here (separate optional blocks for ?reference and ?date). That should be okay, but it needs to be verified.

    wikidata 
    opened by ross-spencer 0
Releases(v1.9.6)