Powerful and versatile MIME sniffing package using pre-compiled glob patterns, magic number signatures, XML document namespaces, and tree magic for mounted volumes, generated from the XDG shared-mime-info database.

Overview

mimemagic

Features

  • Pure Go, with no outside dependencies or C library bindings
  • 1003 MIME types, each with a description, an acronym (where available), common aliases, extensions, icons, and subclasses
  • 493 magic signature tests (comprising 1147 individual patterns), featuring range searches and bit masks, as per the XDG specification
  • 1099 glob patterns for filename-based matching
  • 11 tree magic signatures and 28 XML namespace/local name pairs, offered for completeness' sake
  • The XML file parser is included, so you can generate your own MIME definitions
  • A fully featured CLI based on this library is also included; it outperforms the native 'file' and KDE's 'kmimetypefinder'
  • Cross-platform support
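
To make "range searches and bit masks" concrete, here is a minimal sketch of a single magic pattern test. The pattern type and its fields are hypothetical names chosen for illustration; this is not mimemagic's internal representation, only one way to express the idea of a value that may appear at any offset within a range, compared under an optional mask.

```go
package main

import "fmt"

// pattern is a hypothetical, simplified magic rule: the (optionally masked)
// bytes of value must appear at some offset in [start, start+rangeLen).
type pattern struct {
	start    int    // first offset to try
	rangeLen int    // number of consecutive offsets to try (at least 1)
	value    []byte // expected bytes
	mask     []byte // optional, same length as value; nil compares every bit
}

// matchAt reports whether value matches data at off, honouring the mask.
func (p pattern) matchAt(data []byte, off int) bool {
	if off < 0 || off+len(p.value) > len(data) {
		return false
	}
	for i, want := range p.value {
		got := data[off+i]
		if p.mask != nil {
			got &= p.mask[i]
			want &= p.mask[i]
		}
		if got != want {
			return false
		}
	}
	return true
}

// match performs the range search over every permitted offset.
func (p pattern) match(data []byte) bool {
	for off := p.start; off < p.start+p.rangeLen; off++ {
		if p.matchAt(data, off) {
			return true
		}
	}
	return false
}

func main() {
	// gzip streams begin with the bytes 0x1f 0x8b at offset 0.
	gzip := pattern{start: 0, rangeLen: 1, value: []byte{0x1f, 0x8b}}
	// "<svg" within the first 256 bytes; the 0xdf mask clears the ASCII
	// case bit of each letter, making the comparison case-insensitive.
	svg := pattern{start: 0, rangeLen: 256, value: []byte("<svg"),
		mask: []byte{0xff, 0xdf, 0xdf, 0xdf}}

	fmt.Println(gzip.match([]byte{0x1f, 0x8b, 0x08}))                   // true
	fmt.Println(svg.match([]byte("<?xml version=\"1.0\"?>\n<SVG w=\"1\">"))) // true
	fmt.Println(svg.match([]byte("plain text")))                        // false
}
```

A real signature test typically combines several such patterns, some of them nested, which is why the number of individual patterns exceeds the number of tests.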

Installation

The library:

go get github.com/zRedShift/mimemagic

The CLI:

go get github.com/zRedShift/mimemagic/cmd/mimemagic

API

See the Godoc reference, and cmd/mimemagic for an example implementation.

Usage

The library:

package main

import (
	"fmt"
	"github.com/zRedShift/mimemagic"
	"strings"
)

func main() {
	// Ignoring Read errors that might arise
	mimeType, _ := mimemagic.MatchFilePath("sample.svgz", -1)

	// image/svg+xml-compressed
	fmt.Println(mimeType.MediaType())

	// compressed SVG image
	fmt.Println(mimeType.Comment)

	// SVG (Scalable Vector Graphics)
	fmt.Printf("%s (%s)\n", mimeType.Acronym, mimeType.ExpandedAcronym)

	// application/gzip
	fmt.Println(strings.Join(mimeType.SubClassOf, ", "))

	// .svgz
	fmt.Println(strings.Join(mimeType.Extensions, ", "))

	// This is an image.
	switch mimeType.Media {
	case "image":
		fmt.Println("This is an image.")
	case "video":
		fmt.Println("This is a video file.")
	case "audio":
		fmt.Println("This is an audio file.")
	case "application":
		fmt.Println("This is an application.")
	default:
		fmt.Printf("This is a(n) %s.", mimeType.Media)
	}

	// true
	fmt.Println(mimeType.IsExtension(".svgz"))
}
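
The example above reports application/gzip under SubClassOf, which hints at how a filename match and a content match can be reconciled. The sketch below follows the general XDG recommendation of preferring a glob result when it is equal to, or a subclass of, the sniffed type. The tables and the function names (byGlob, isSubclass, resolve) are hypothetical and hand-written for illustration; they are not mimemagic's API or data.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Hypothetical, tiny tables; the real database is far larger.
var globs = []struct{ pattern, mime string }{
	{"*.svgz", "image/svg+xml-compressed"},
	{"*.svg", "image/svg+xml"},
	{"*.png", "image/png"},
}

var subClassOf = map[string][]string{
	"image/svg+xml-compressed": {"application/gzip"},
	"image/svg+xml":            {"application/xml"},
}

// byGlob returns the type of the first glob matching the base name, or "".
func byGlob(name string) string {
	base := filepath.Base(name)
	for _, g := range globs {
		if ok, _ := filepath.Match(g.pattern, base); ok {
			return g.mime
		}
	}
	return ""
}

// isSubclass reports whether child equals parent or descends from it.
func isSubclass(child, parent string) bool {
	if child == parent {
		return true
	}
	for _, p := range subClassOf[child] {
		if isSubclass(p, parent) {
			return true
		}
	}
	return false
}

// resolve prefers the glob result when it refines the sniffed type,
// and otherwise trusts the content.
func resolve(name, sniffed string) string {
	g := byGlob(name)
	if g != "" && (sniffed == "" || isSubclass(g, sniffed)) {
		return g
	}
	if sniffed != "" {
		return sniffed
	}
	return "application/octet-stream"
}

func main() {
	fmt.Println(resolve("sample.svgz", "application/gzip")) // image/svg+xml-compressed
	fmt.Println(resolve("Piano.svg.png", "image/png"))      // image/png
	fmt.Println(resolve("notes.svg", "image/png"))          // image/png
}
```

In the last call the name suggests SVG but the content says PNG; since image/svg+xml is not a subclass of image/png, the sketch sides with the content.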

The CLI:

Usage: mimemagic [options] <file> ...
Determines the MIME type of the given file(s).

Options:
  -c    Determine the MIME type of the file(s) using only its content.
  -f    Determine the MIME type of the file(s) using only the file name. Does
        not check for the file's existence. The -c flag takes precedence.
  -i    Output the MIME type in a human readable format.
  -l int
        The number of bytes from the beginning of the file mimemagic will
        examine. Reads the entire file if set to a negative value. By default
        mimemagic will only read the first 512 bytes from stdin; setting this
        flag to a non-default negative value overrides this. (default -1)
  -t    Determine the MIME type of the directory/mounted volume using tree
        magic. Can't be used in conjunction with -c, -f or -x.
  -x    Determine the MIME type of the XML file(s) using the local names and
        namespaces within. Can't be used in conjunction with -c, -f or -t.

Arguments:
  file
        The file(s) to test. '-' to read from stdin. If '-' is set, all other
        inputs will be ignored.

Examples:
  $ mimemagic -c sample.svgz
    	application/gzip
  $ mimemagic *.svg*
    	Olympic_rings_with_transparent_rims.svg: image/svg+xml
    	Piano.svg.png: image/png
    	RAID_5.svg: image/svg+xml
    	sample.svgz: image/svg+xml-compressed
  $ cat /dev/urandom | mimemagic -
    	application/octet-stream
  $ ls software; mimemagic -i -t software/
    	autorun
    	UNIX software

Benchmarks

See Benchmarks. For Match(), the average across more than 400 completely different files (each representing a unique MIME type) is 13 ± 7 μs/op. For MatchGlob() it is 900 ± 200 ns/op, and for MatchMagic() it is 12 ± 7 μs/op.
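
Figures such as "13 ± 7 μs/op" read as a mean and a spread over per-file results. The sketch below shows one way to compute such a summary, assuming the spread is a sample standard deviation (the text above does not say which measure was used). The input values are made up for illustration and are not measurements.

```go
package main

import (
	"fmt"
	"math"
)

// meanStd returns the arithmetic mean and the sample standard deviation
// (n-1 in the denominator) of the given per-file timings.
func meanStd(samples []float64) (mean, std float64) {
	n := float64(len(samples))
	if n == 0 {
		return 0, 0
	}
	for _, s := range samples {
		mean += s
	}
	mean /= n
	if n < 2 {
		return mean, 0
	}
	var sumSq float64
	for _, s := range samples {
		d := s - mean
		sumSq += d * d
	}
	return mean, math.Sqrt(sumSq / (n - 1))
}

func main() {
	// Hypothetical per-file timings in μs/op.
	timings := []float64{10, 20, 30}
	mean, std := meanStd(timings)
	fmt.Printf("%.0f ± %.0f μs/op\n", mean, std) // 20 ± 10 μs/op
}
```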

Comments
  • freedesktop.org.xml file license

    I've historically been the maintainer of shared-mime-info for around 15 years, and cmd/parser/freedesktop.org.xml looks like it's a copy of the database shipped with shared-mime-info, which is released under the GPL, with shared-mime-info's translators' work merged in, and the GPL header removed.

    The license that you're shipping mimemagic under (MIT) isn't compatible with shared-mime-info's.

    There are a number of possibilities to fix this problem:

    • change the mimemagic license to be GPL compatible
    • parse the XML file that shared-mime-info ships at runtime, and don't ship it in a codebase with an incompatible license

    Using a GPL file as a source makes your whole codebase a derived work, making it all GPL, so I think it's pretty important that this problem gets corrected before somebody uses it in a pure MIT codebase, or a closed-source application.

    You will also need to re-add the GPL header to the shared-mime-info XML file as a matter of urgency.

    opened by hadess 63
  • Fix the licensing for shared-mime-info/freedesktop.org.xml to avoid a DMCA takedown

    So I missed #4 due to some shenanigans with GitHub email notifications, sorry about that. Yesterday I received a DMCA takedown notice, exactly the same as this gist, so I'm going to adopt the solution since I have very limited time to respond to the notice. I'm not a license lawyer, though.

    opened by zRedShift 3
  • modules: fix: version 2.0.0 is not installable via go-tooling

    A git tag and GitHub release were created for version 2.0.0, but the version was not installable via the Go tools and could not be added as a dependency to a go.mod file.

    For major versions >1, the version must be appended to the import path. This was missing. (See https://github.com/golang/go/wiki/Modules#releasing-modules-v2-or-higher for more information.)

    Append /v2 to the import path and change all references accordingly. The retract directives for minor versions of v1 are removed; they can only be defined in the go.mod file for the same major version.

    opened by fho 2
  • prevent that big fixtures.tar.gz files is included in vendor copies

    When the package is vendored as a Go module dependency in another project, fixtures.tar.gz becomes part of the copy.

    The file unnecessarily increases the size of vendor directories. It is 15 MB and is only used in test cases, which are not included in vendor copies.

    Move the file to a testdata/ directory so that it is no longer included in vendor copies.

    opened by fho 2
  • Adding dwg magic bytes.

    I'm wondering if you're open to taking contributions for various other file types? We're starting to use this quite a bit, and foresee adding a number.

    opened by bnekolny 2
Releases: v2.0.0
Owner: Ronen Ulanovsky