Fast directory traversal for Golang

Overview

godirwalk

godirwalk is a library for traversing a directory tree on a file system.

In short, why do I use this library?

  1. It's faster than filepath.Walk.
  2. It's more correct on Windows than filepath.Walk.
  3. It's easier to use than filepath.Walk.
  4. It's more flexible than filepath.Walk.

Usage Example

Additional examples are provided in the examples/ subdirectory.

This library will normalize the provided top level directory name based on the os-specific path separator by calling filepath.Clean on its first argument. However it always provides the pathname created by using the correct os-specific path separator when invoking the provided callback function.

    dirname := "some/directory/root"
    err := godirwalk.Walk(dirname, &godirwalk.Options{
        Callback: func(osPathname string, de *godirwalk.Dirent) error {
            // Following string operation is not most performant way
            // of doing this, but common enough to warrant a simple
            // example here:
            if strings.Contains(osPathname, ".git") {
                return godirwalk.SkipThis
            }
            fmt.Printf("%s %s\n", de.ModeType(), osPathname)
            return nil
        },
        Unsorted: true, // (optional) set true for faster yet non-deterministic enumeration (see godoc)
    })

This library not only provides functions for traversing a file system directory tree, but also for obtaining a list of immediate descendants of a particular directory, typically much more quickly than using os.ReadDir or (*os.File).Readdirnames.

Description

Here's why I use godirwalk in preference to filepath.Walk, os.ReadDir, and (*os.File).Readdirnames.

It's faster than filepath.Walk

When compared against filepath.Walk in benchmarks, it has been observed to run between five and ten times the speed on darwin, at speeds comparable to that of the unix find utility; about twice the speed on linux; and about four times the speed on Windows.

How does it obtain this performance boost? It does less work to give you nearly the same output. This library calls the same syscall functions to do the work, but it makes fewer calls, does not throw away information that it might need, and creates less memory churn along the way by reusing the same scratch buffer for reading from a directory rather than reallocating a new buffer every time it reads file system entry data from the operating system.

While traversing a file system directory tree, filepath.Walk obtains the list of immediate descendants of a directory, and throws away the node type information that the operating system provides along with each node's name. Then, immediately prior to invoking the callback function, filepath.Walk invokes os.Stat for each node, and passes the returned os.FileInfo information to the callback.

While the os.FileInfo information provided by os.Stat is extremely helpful--and even includes the os.FileMode data--providing it requires an additional system call for each node.

Because most callbacks only care about what the node type is, this library does not throw the type information away, but rather provides that information to the callback function in the form of an os.FileMode value. Note that the os.FileMode value this library provides only has the node type information, and does not have the permission bits, sticky bits, or other information from the file's mode. If the callback does care about a particular node's entire os.FileInfo data structure, the callback can easily invoke os.Stat when needed, and only when needed.
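The same trade-off can be seen with the standard library alone. The following is a minimal sketch, not this library's implementation: it assumes Go 1.16 or later, where os.ReadDir returns entries whose Type method carries only the type bits (where the operating system supplies them), and it asks for full metadata only for regular files. The describe and demo helpers are hypothetical names used for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// describe lists dir and reports each entry's type bits, which normally arrive
// with the directory listing. Full metadata is fetched only for regular files,
// and only because this example wants their size.
func describe(dir string) ([]string, error) {
	entries, err := os.ReadDir(dir) // sorted by filename
	if err != nil {
		return nil, err
	}
	var lines []string
	for _, e := range entries {
		line := fmt.Sprintf("%s %s", e.Type(), e.Name()) // type bits only, no permissions
		if e.Type().IsRegular() {
			info, err := e.Info() // the extra lookup happens here, on demand
			if err != nil {
				return nil, err
			}
			line += fmt.Sprintf(" (%d bytes)", info.Size())
		}
		lines = append(lines, line)
	}
	return lines, nil
}

// demo builds a throwaway directory holding one file and one subdirectory.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "dirent-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "a.txt"), []byte("hello"), 0o644); err != nil {
		return "", err
	}
	if err := os.Mkdir(filepath.Join(dir, "sub"), 0o755); err != nil {
		return "", err
	}
	lines, err := describe(dir)
	if err != nil {
		return "", err
	}
	return strings.Join(lines, "\n"), nil
}

func main() {
	out, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(out)
}
```

The directory entry reports its type with every permission bit cleared, which is the shape of information the callback receives from this library.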

Benchmarks

macOS
$ go test -bench=. -benchmem
goos: darwin
goarch: amd64
pkg: github.com/karrick/godirwalk
BenchmarkReadDirnamesStandardLibrary-12   50000       26250  ns/op       10360  B/op       16  allocs/op
BenchmarkReadDirnamesThisLibrary-12       50000       24372  ns/op        5064  B/op       20  allocs/op
BenchmarkFilepathWalk-12                      1  1099524875  ns/op   228415912  B/op   416952  allocs/op
BenchmarkGodirwalk-12                         2   526754589  ns/op   103110464  B/op   451442  allocs/op
BenchmarkGodirwalkUnsorted-12                 3   509219296  ns/op   100751400  B/op   378800  allocs/op
BenchmarkFlameGraphFilepathWalk-12            1  7478618820  ns/op  2284138176  B/op  4169453  allocs/op
BenchmarkFlameGraphGodirwalk-12               1  4977264058  ns/op  1031105328  B/op  4514423  allocs/op
PASS
ok  	github.com/karrick/godirwalk	21.219s

Linux
$ go test -bench=. -benchmem
goos: linux
goarch: amd64
pkg: github.com/karrick/godirwalk
BenchmarkReadDirnamesStandardLibrary-12  100000       15458  ns/op       10360  B/op       16  allocs/op
BenchmarkReadDirnamesThisLibrary-12      100000       14646  ns/op        5064  B/op       20  allocs/op
BenchmarkFilepathWalk-12                      2   631034745  ns/op   228210216  B/op   416939  allocs/op
BenchmarkGodirwalk-12                         3   358714883  ns/op   102988664  B/op   451437  allocs/op
BenchmarkGodirwalkUnsorted-12                 3   355363915  ns/op   100629234  B/op   378796  allocs/op
BenchmarkFlameGraphFilepathWalk-12            1  6086913991  ns/op  2282104720  B/op  4169417  allocs/op
BenchmarkFlameGraphGodirwalk-12               1  3456398824  ns/op  1029886400  B/op  4514373  allocs/op
PASS
ok      github.com/karrick/godirwalk    19.179s

It's more correct on Windows than filepath.Walk

I did not previously care about this either, but humor me. We all love how we can write once and run everywhere. It is essential for the language's adoption, growth, and success, that the software we create can run unmodified on all architectures and operating systems supported by Go.

When the traversed file system has a logical loop caused by symbolic links to directories, on unix filepath.Walk ignores symbolic links and traverses the entire directory tree without error. On Windows however, filepath.Walk will continue following directory symbolic links, even though it is not supposed to, eventually causing filepath.Walk to terminate early and return an error when the pathname gets too long from concatenating endless loops of symbolic links onto the pathname. This error comes from Windows, passes through filepath.Walk, and to the upstream client running filepath.Walk.

The takeaway is that behavior is different based on which platform filepath.Walk is running. While this is clearly not intentional, until it is fixed in the standard library, it presents a compatibility problem.

This library fixes the above problem such that it will never follow logical file system loops on either unix or Windows. Furthermore, it will only follow symbolic links when FollowSymbolicLinks is set to true. Behavior on Windows and other operating systems is identical.

It's easier to use than filepath.Walk

While this library strives to mimic the behavior of the incredibly well-written filepath.Walk standard library, there are places where it deviates a bit in order to provide an easier or more intuitive caller interface.

Callback interface does not send you an error to check

Since this library does not invoke os.Stat on every file system node it encounters, there is no possible error event for the callback function to filter on. The third argument in the filepath.WalkFunc function signature to pass the error from os.Stat to the callback function is no longer necessary, and thus eliminated from the signature of this library's callback function.

Furthermore, this slight interface difference between filepath.WalkFunc and this library's WalkFunc eliminates the boilerplate code that callback handlers must write when they use filepath.Walk. Rather than every callback function needing to check the error value passed into it and branch accordingly, users of this library do not even have an error value to check immediately upon entry into the callback function. This is an improvement both in runtime performance and code clarity.

Callback function is invoked with OS specific file system path separator

On every OS platform filepath.Walk invokes the callback function with a solidus (/) delimited pathname. By contrast this library invokes the callback with the os-specific pathname separator, obviating a call to filepath.Clean in the callback function for each node prior to actually using the provided pathname.

In other words, even on Windows, filepath.Walk will invoke the callback with some/path/to/foo.txt, requiring well written clients to perform pathname normalization for every file prior to working with the specified file. This is a hidden boilerplate requirement to create truly os agnostic callback functions. In truth, many clients developed on unix and not tested on Windows neglect this subtlety, resulting in software bugs when someone tries to run that software on Windows.

This library invokes the callback function with some\path\to\foo.txt for the same file when running on Windows, eliminating the need for the client to normalize the pathname, and lessening the likelihood that a client will work on unix but not on Windows.

This enhancement eliminates the need for some more boilerplate code in callback functions while improving the runtime performance of this library.
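For reference, the normalization step that a callback would otherwise perform can be sketched with the standard library's filepath.FromSlash. This is only an illustration of the boilerplate being discussed; toOSPath and roundTrips are hypothetical helper names.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// toOSPath rewrites a slash-delimited pathname with the host's separator.
// It is a no-op on platforms whose separator is already '/'.
func toOSPath(slashPath string) string {
	return filepath.FromSlash(slashPath)
}

// roundTrips reports whether converting to the host form and back is lossless.
func roundTrips(slashPath string) bool {
	return filepath.ToSlash(toOSPath(slashPath)) == slashPath
}

func main() {
	fmt.Println(toOSPath("some/path/to/foo.txt"))
	fmt.Println(roundTrips("some/path/to/foo.txt"))
}
```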

godirwalk.SkipThis is more intuitive to use than filepath.SkipDir

One arguably confusing aspect of the filepath.WalkFunc interface that this library must emulate is how a caller tells the Walk function to skip file system entries. With both filepath.Walk and this library's Walk, when a callback function wants to skip a directory and not descend into its children, it returns filepath.SkipDir. If the callback function returns filepath.SkipDir for a non-directory, filepath.Walk and this library will stop processing any more entries in the current directory. This is not necessarily what most developers want or expect. If you want to simply skip a particular non-directory entry but continue processing entries in the directory, the callback function must return nil.

The implication of this interface design is that when you want to walk a file system hierarchy and skip an entry, you have to return a different value based on what type of file system entry that node is. To skip an entry, if the entry is a directory, you must return filepath.SkipDir, and if the entry is not a directory, you must return nil. This is an unfortunate hurdle I have observed many developers struggling with, simply because it is not an intuitive interface.

Here is an example callback function that adheres to filepath.WalkFunc interface to have it skip any file system entry whose full pathname includes a particular substring, optSkip. Note that this library still supports identical behavior of filepath.Walk when the callback function returns filepath.SkipDir.

    func callback1(osPathname string, de *godirwalk.Dirent) error {
        if optSkip != "" && strings.Contains(osPathname, optSkip) {
            if isDir, err := de.IsDirOrSymlinkToDir(); err == nil && isDir {
                return filepath.SkipDir
            }
            return nil
        }
        // Process file like normal...
        return nil
    }

This library attempts to eliminate some of that logic boilerplate required in callback functions by providing a new token error value, SkipThis, which a callback function may return to skip the current file system entry regardless of what type of entry it is. If the current entry is a directory, its children will not be enumerated, exactly as if the callback had returned filepath.SkipDir. If the current entry is a non-directory, the next file system entry in the current directory will be enumerated, exactly as if the callback returned nil. The following example callback function has identical behavior to the previous one, but has less boilerplate, and logic that I find simpler to follow.

    func callback2(osPathname string, de *godirwalk.Dirent) error {
        if optSkip != "" && strings.Contains(osPathname, optSkip) {
            return godirwalk.SkipThis
        }
        // Process file like normal...
        return nil
    }

It's more flexible than filepath.Walk

Configurable Handling of Symbolic Links

The default behavior of this library is to ignore symbolic links to directories when walking a directory tree, just like filepath.Walk does. However, it does invoke the callback function with each node it finds, including symbolic links. If a particular use case exists to follow symbolic links when traversing a directory tree, this library can be invoked in a manner to do so, by setting the FollowSymbolicLinks config parameter to true.

Configurable Sorting of Directory Children

The default behavior of this library is to always sort the immediate descendants of a directory prior to visiting each node, just like filepath.Walk does. This is usually the desired behavior. However, this does come at a slight performance and memory penalty required to sort the names when a directory node has many entries. Additionally, if the caller specifies Unsorted enumeration in the configuration parameter, reading directories is lazily performed as the caller consumes entries. If a particular use case exists that does not require sorting the directory's immediate descendants prior to visiting its nodes, this library will skip the sorting step when the Unsorted parameter is set to true.
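The difference between the two modes can be sketched with the standard library, without claiming anything about how this library implements them. In this sketch, os.ReadDir stands in for sorted enumeration (it reads the whole directory, then sorts by filename), and (*os.File).ReadDir with a positive batch size stands in for lazy, unsorted enumeration. The names sortedNames, lazyNames, and demo are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"sort"
	"strings"
)

// sortedNames reads every entry first, then returns the names sorted.
func sortedNames(dir string) ([]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		names = append(names, e.Name())
	}
	return names, nil
}

// lazyNames reads the directory in batches of at most batch entries (batch
// must be positive) and hands each name to visit in directory order, before
// the rest of the directory has been read.
func lazyNames(dir string, batch int, visit func(name string)) error {
	f, err := os.Open(dir)
	if err != nil {
		return err
	}
	defer f.Close()
	for {
		entries, err := f.ReadDir(batch)
		for _, e := range entries {
			visit(e.Name())
		}
		if err == io.EOF {
			return nil // end of directory
		}
		if err != nil {
			return err
		}
	}
}

func demo() (string, error) {
	dir, err := os.MkdirTemp("", "order-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	for _, name := range []string{"c", "a", "b"} {
		if err := os.WriteFile(filepath.Join(dir, name), nil, 0o644); err != nil {
			return "", err
		}
	}
	sorted, err := sortedNames(dir)
	if err != nil {
		return "", err
	}
	var lazy []string
	if err := lazyNames(dir, 2, func(n string) { lazy = append(lazy, n) }); err != nil {
		return "", err
	}
	sort.Strings(lazy) // sorted here only so the two results can be compared
	return strings.Join(sorted, ",") + " | " + strings.Join(lazy, ","), nil
}

func main() {
	out, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(out)
}
```

Both approaches see the same set of names; only the order and the point at which the first name becomes available differ.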

Here's an interesting read on the potential hazards of traversing a file system hierarchy in a non-deterministic order. If you know the problem you are solving is not affected by the order files are visited, then I encourage you to use Unsorted. Otherwise skip setting this option.

Researchers find bug in Python script may have affected hundreds of studies

Configurable Post Children Callback

This library provides upstream code with the ability to specify a callback function to be invoked for each directory after its children are processed. This has been used to recursively delete empty directories after traversing the file system in a more efficient manner. See the examples/clean-empties directory for an example of this usage.
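The idea behind a post-children hook can be sketched with a plain recursive function from the standard library. This is not the code in examples/clean-empties; pruneEmpty, exists, and demo are hypothetical names, and the sketch assumes symbolic links should not be followed.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// pruneEmpty processes dir's children first, then removes dir itself if
// nothing is left inside it. It reports whether dir was removed.
func pruneEmpty(dir string) (bool, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return false, err
	}
	remaining := len(entries)
	for _, e := range entries {
		if !e.IsDir() { // symbolic links are not followed
			continue
		}
		removed, err := pruneEmpty(filepath.Join(dir, e.Name()))
		if err != nil {
			return false, err
		}
		if removed {
			remaining--
		}
	}
	if remaining > 0 {
		return false, nil
	}
	// This is the "post children" step: every child has been handled.
	if err := os.Remove(dir); err != nil {
		return false, err
	}
	return true, nil
}

func exists(path string) bool {
	_, err := os.Lstat(path)
	return err == nil
}

func demo() (string, error) {
	root, err := os.MkdirTemp("", "prune-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)
	if err := os.MkdirAll(filepath.Join(root, "empty1", "empty2"), 0o755); err != nil {
		return "", err
	}
	if err := os.MkdirAll(filepath.Join(root, "keep"), 0o755); err != nil {
		return "", err
	}
	if err := os.WriteFile(filepath.Join(root, "keep", "file.txt"), []byte("x"), 0o644); err != nil {
		return "", err
	}
	if _, err := pruneEmpty(root); err != nil {
		return "", err
	}
	return fmt.Sprintf("keep=%t empty1=%t root=%t",
		exists(filepath.Join(root, "keep")),
		exists(filepath.Join(root, "empty1")),
		exists(root)), nil
}

func main() {
	out, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(out)
}
```

Because a directory is only examined after its children, a chain of nested empty directories collapses in a single pass.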

Configurable Error Callback

This library provides upstream code with the ability to specify a callback to be invoked for errors that the operating system returns, allowing the upstream code to determine the next course of action to take, whether to halt walking the hierarchy, as it would do were no error callback provided, or skip the node that caused the error. See the examples/walk-fast directory for an example of this usage.
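The halt-or-skip decision can be illustrated with the standard library's filepath.WalkDir, which also reports errors to the caller's function. This sketch does not use this library's API; the action type, its halt and skipNode values, and the onError policy are hypothetical stand-ins for an error callback that skips permission errors and halts on anything else.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// action is a hypothetical stand-in for the decision an error callback makes.
type action int

const (
	halt     action = iota // stop walking and return the error
	skipNode               // ignore the failing node and keep walking
)

// onError is an example policy: skip what we may not read, halt otherwise.
func onError(path string, err error) action {
	if errors.Is(err, fs.ErrPermission) {
		return skipNode
	}
	return halt
}

// walk visits every path under root, consulting onError whenever the
// operating system reports a failure.
func walk(root string, visit func(path string)) error {
	return filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			if onError(path, err) == skipNode {
				fmt.Fprintf(os.Stderr, "skipping %s: %v\n", path, err)
				return nil
			}
			return err
		}
		visit(path)
		return nil
	})
}

// demo exercises the policy with synthetic errors.
func demo() string {
	denied := &fs.PathError{Op: "open", Path: "locked", Err: fs.ErrPermission}
	missing := &fs.PathError{Op: "open", Path: "gone", Err: fs.ErrNotExist}
	return fmt.Sprintf("skip denied: %t, skip missing: %t",
		onError(denied.Path, denied) == skipNode,
		onError(missing.Path, missing) == skipNode)
}

func main() {
	fmt.Println(demo())
	if err := walk(".", func(p string) { fmt.Println(p) }); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```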

Issues
  • Should work with Solaris

    Building on Solaris fails.

    karrick/godirwalk/readdir.go:20:9: undefined: readdirents
    karrick/godirwalk/readdir.go:46:9: undefined: readdirnames
    

    Adding solaris to the build tags in readdir_unix.go results in:

    karrick/godirwalk/readdir_unix.go:41:7: undefined: inoFromDirent
    karrick/godirwalk/readdir_unix.go:55:13: de.Type undefined (type *syscall.Dirent has no field or method Type)
    karrick/godirwalk/readdir_unix.go:56:9: undefined: syscall.DT_REG
    
    enhancement help wanted 
    opened by rprikulis 24
  • Build error - s.sde.Reclen undefined

    There is no d_reclen field in struct dirent in POSIX and some OSes (like DragonFly BSD) might not implement it.

    # github.com/karrick/godirwalk
    ./scandir_unix.go:146:37: s.sde.Reclen undefined (type *syscall.Dirent has no field or method Reclen)
    
    

    I'm wondering whether godirwalk could use the ReadDirent/ParseDirent combination instead of accessing Reclen directly. Since speed is one of the key points of godirwalk, maybe ParseDirent is just slower and thus not used?

    opened by tuxillo 15
  • Symlink loops cause infinite recursion

    When symlinks are followed godirwalk may enter an infinite loop. A standard Linux installation can have these types of symlinks that create a directory loop, my Linux Mint machine certainly contains them out of the box.

    As a result godirwalk will never end on some systems if FollowSymbolicLinks is true.

    bug 
    opened by rasteric 11
  • [WiP] Allow to apply recursive function to directory

    Hello,

    I'm trying to use your library for recursive treatments, where Callback also takes as an argument the return value of Callback(child elements). This would allow to code things like 'du' (print directory size - including all descendants -, for every directory encountered). That is, to have callback functions which are of type like : func(osPathname string, directoryEntry *Dirent, sibling, child RecurseResult) (RecurseResult, error) where RecurseResult is any type (but same for result and params of the callback), 'sibling' is the result of the callback on previously treated elements in the same directory and 'child' is the result of the callback on children (more exactly, on the last children treated).

    I'm a beginner go programmer, so I may have made errors or made misconceptions. How i did :

    • i copied code from walk.go into recurse.go
    • i switched PostChildrenCallback to PreChildrenCallback (as 'Callback' in 'recurse' will always be called after children).
    • changed some code order in the main 'walk' (now 'recurse') function
    • i added RecurseResult and RecurseFunc types.

    The problem is that i don't know how to make go accept any type as 'RecurseResult'. When I try to define RecurseResult as 'interface {}' :

    ./main.go:16:11: cannot use func literal (type func(string, *godirwalk.Dirent, int64, int64) (int64, error)) as type godirwalk.RecurseFunc in field value

    If I manually switch from RecurseResult to int64 type in the code, it works, but ... well ... that's not the goal.

    Do you have any idea ? Are you interested in this 'recursive thing' ?

    Note: if it works, the classic 'walk' function could be considered as a particular case of recursive function (a one which doesn't use the result from children and doesn't return any particular value), except the difference in order of treatment (children before or after parent).
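With type parameters (Go 1.18 and later), the "any result type" problem described above can be sketched without interface{}. The following is a standard-library illustration, not code from this library or from the proposed change: Fold and diskUsage are hypothetical names, and instead of separate sibling and child arguments the callback receives a slice holding the results of all its children.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// Fold walks the tree bottom-up. visit is called for each entry after all of
// its children, and receives the results those children produced.
func Fold[R any](path string, d fs.DirEntry, visit func(string, fs.DirEntry, []R) (R, error)) (R, error) {
	var zero R
	var children []R
	if d.IsDir() {
		entries, err := os.ReadDir(path)
		if err != nil {
			return zero, err
		}
		for _, e := range entries {
			r, err := Fold(filepath.Join(path, e.Name()), e, visit)
			if err != nil {
				return zero, err
			}
			children = append(children, r)
		}
	}
	return visit(path, d, children)
}

// diskUsage sums the sizes of all regular files below root, in the style of du.
func diskUsage(root string) (int64, error) {
	info, err := os.Lstat(root)
	if err != nil {
		return 0, err
	}
	return Fold(root, fs.FileInfoToDirEntry(info),
		func(path string, d fs.DirEntry, children []int64) (int64, error) {
			var total int64
			for _, c := range children {
				total += c
			}
			if d.Type().IsRegular() {
				fi, err := d.Info()
				if err != nil {
					return 0, err
				}
				total += fi.Size()
			}
			return total, nil
		})
}

func demo() (int64, error) {
	root, err := os.MkdirTemp("", "fold-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(root)
	if err := os.Mkdir(filepath.Join(root, "sub"), 0o755); err != nil {
		return 0, err
	}
	if err := os.WriteFile(filepath.Join(root, "a"), []byte("abc"), 0o644); err != nil {
		return 0, err
	}
	if err := os.WriteFile(filepath.Join(root, "sub", "b"), []byte("abcd"), 0o644); err != nil {
		return 0, err
	}
	return diskUsage(root)
}

func main() {
	n, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(n, "bytes")
}
```

The result type is inferred from the callback, so the same Fold can return an int64 for sizes, a count, or any other value.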

    enhancement 
    opened by Samuel-BF 9
  • go get fails: checksum mismatch

    On Windows 10 with Go 1.11 I ran go get github.com/UnnoTed/fileb0x, which has this as a dependency, and I get this error:

    go: verifying github.com/karrick/[email protected]3: checksum mismatch
            downloaded: h1:e5iv87oxunQtG7S9MB10jrINLmF7HecFSjiTYKO7P2c=
            go.sum:     h1:UP4CfXf1LfNwXrX6vqWf1DOhuiFRn2hXsqtRAQlQOUQ=
    

    This also happens on CircleCI, but it doesn't happen on OS X.

    I'm wondering if there isn't some CRLF or filename case sensitivity issues. Any thoughts?

    bug 
    opened by coolaj86 8
  • Missing directories

    TL;DR Disabling sorting fixes my issue, if it's sorted, directories go missing from the results.

    Summary: I cannot, for the life of me, figure out what's going on, but there is an issue where directories are not appearing properly when sort is enabled.

    I have a folder /code in my top level directory that when sort is enabled does not get returned in the results, but when sort is commented out or disabled via options it magically appears again.

    Looping over pre/post sort shows that the folder exists, but as soon as the next range takes place, it's just .... gone ....

    So here is what the code looks like(I just added some simple prints to walk.go)

    walk.go:176

    
    for _, deChild := range deChildren {
    		if deChild.name == "code" {
    			fmt.Println("debug:presort")
    		}
    	}
    	if !options.Unsorted {
    		sort.Sort(deChildren) // sort children entries unless upstream says to leave unsorted
    	}
    
    	for _, deChild := range deChildren {
    		if deChild.name == "code" {
    			fmt.Println("debug:postsort")
    		}
    	}
    
    	for _, deChild := range deChildren {
    		if deChild.name == "code" {
    			fmt.Println("debug:innerrange")
    		}
    		osChildname := filepath.Join(osPathname, deChild.name)
    
    09:02 PM ✘ kcmerrill  Desktop ] cat block.txt | grep debug
    debug:presort
    debug:postsort
    09:02 PM ✔ kcmerrill  Desktop ]
    

    I was expecting to see debug:innerrange as I wouldn't think simply sorting would make a difference.

    Ok, so now lets try again, but this time lets comment out sort.Sort(deChildren).

    for _, deChild := range deChildren {
    		if deChild.name == "code" {
    			fmt.Println("debug:presort")
    		}
    	}
    	if !options.Unsorted {
    		//sort.Sort(deChildren) // sort children entries unless upstream says to leave unsorted
    	}
    
    	for _, deChild := range deChildren {
    		if deChild.name == "code" {
    			fmt.Println("debug:postsort")
    		}
    	}
    
    	for _, deChild := range deChildren {
    		if deChild.name == "code" {
    			fmt.Println("debug:innerrange")
    		}
    		osChildname := filepath.Join(osPathname, deChild.name)
    
    

    And here are the results:

    09:03 PM ✔ kcmerrill  Desktop ] cat block.txt | grep debug
    debug:presort
    debug:postsort
    debug:innerrange
    09:03 PM ✔ kcmerrill  Desktop ]
    

    This is the strangest thing .... when doing a compare on the slices in regards to the lengths of them both, they are equal(as you can tell because both are visible in post sort).

    I'm on Mac(latest and greatest) and this is my go version:

    09:07 PM ✔ kcmerrill  Desktop ] go version
    go version go1.9.2 darwin/amd64
    09:07 PM ✔ kcmerrill  Desktop ]
    
    question 
    opened by kcmerrill 7
  • Skip Callback functions

    This PR adds optional SkipPathCallback and SkipSymbolicLinkCallback functions that Walk will invoke before it stats a path or follows a symbolic link. SkipPathCallback can be used to exclude a directory from walking, such as /proc or /sys. SkipSymbolicLinkCallback can be used to avoid recursive loops. Some file systems have symbolic link targets deep in a file structure, so you can't easily exclude a directory from the outside. Example:

    SkipPathCallback: func(pathname string) bool {
      return strings.HasPrefix(pathname, "/proc")
    },
    SkipSymbolicLinkCallback: func(pathname string, target string) bool {
      return target == "." || strings.HasPrefix(target, "..")
    },
    
    enhancement 
    opened by pjdufour 6
  • Possible memory leak?

    Hi, I am using this library to traverse a bunch of directories containing 70TB worth of files of about 2MB each.

    I noticed that my software is using more memory than it should and I profiled it to try and understand what's going on.

    The library is only used on startup. I made sure it returns and waited 1 hour before using pprof to analyze the heap and this is what I noticed:

     2670.24MB 33.01% 66.04%  2670.24MB 33.01%  0000000000b5786b github.com/karrick/godirwalk.(*Scanner).Scan /home/travis/gopath/pkg/mod/github.com/karrick/[email protected]/scandir_unix.go:163
             0     0% 66.04%  2670.24MB 33.01%  0000000000b57f45 github.com/karrick/godirwalk.Walk /home/travis/gopath/pkg/mod/github.com/karrick/[email protected]/walk.go:258
             0     0% 66.04%  2670.24MB 33.01%  0000000000b585ba github.com/karrick/godirwalk.walk /home/travis/gopath/pkg/mod/github.com/karrick/[email protected]/walk.go:329
             0     0% 66.04%  2670.24MB 33.01%  0000000000b59e2e github.com/lbryio/reflector.go/store/speedwalk.AllFiles.func2 /home/travis/gopath/src/github.com/lbryio/reflector.go/store/speedwalk/speedwalk.go:64
    

    this is unusual because the function returned and the files were further processed later, there is no reason as to why godirwalk would still have to hold memory.

    I tried looking into the code and I noticed that scandir_unix.go calls s.done() before returning false in Scan() which subsequently calls s.dh.Close() , however as the done() doc says:

    // done is called when directory scanner unable to continue, with either the
    // triggering error, or nil when there are simply no more entries to read from
    // the directory.
    

    the function is only called when the scanner is unable to continue which leads me to think that if something were to interrupt the walk before it finishes, memory could not be freed up.

    this could happen when we return in the for loop here:

    	for ds.Scan() {
    		deChild, err := ds.Dirent()
    		osChildname := filepath.Join(osPathname, deChild.name)
    		if err != nil {
    			if action := options.ErrorCallback(osChildname, err); action == SkipNode {
    				return nil
    			}
    			return err
    		}
    

    which would not close the file that the scanner has opened leaving the scanner noncollectable by the GC.

    If this happens enough times, this could potentially add up to the amount of memory I noticed while profiling (?)

    The directory i'm walking has 256 subdirectories and about 115k files in each subdirectory.

    Let me know what you think.

    Regards,

    Niko

    opened by nikooo777 5
  • Continuation: inquiry on the skipNode functionality in regards to file name filtering

    Hello again,

    This is a continuation of a previous issue since I couldn't re-open the old one again as instructed.

    I took a look at the find-fast example and it is indeed what I was looking for. However I was apparently not detailed enough when creating the issue.

    I had already tried using the filepath.SkipDir but it seems to break the walk if I return that error on a file instead of a directory.

    I did some debugging on the walk function and you can view the video here: https://www.twitch.tv/videos/708174351

    I find the (possible) issue @ 56:30 in the video and then I implement a solution that works for me.

    However, I obviously don't know all the logic behind this package and my solution might break something somewhere else so I would love it if you could check it out.

    Regards,

    enhancement 
    opened by zveinn 5
  • Access is denied: maybe need a option to skip failed

    	err := godirwalk.Walk("D:/", &godirwalk.Options{
    		Callback: func(osPathname string, de *godirwalk.Dirent) error {
    			fmt.Printf("%s %s\n", de.ModeType(), osPathname)
    			return nil
    		},
    	})
    	fmt.Println(err)
    
    d--------- D:\$RECYCLE.BIN\S-1-5-21-2045673143-4246043559-2053418641-500
    cannot ReadDirents: cannot Open: open D:\$RECYCLE.BIN\S-1-5-21-2045673143-4246043559-2053418641-500: Access is denied.
    
    enhancement 
    opened by fy0 5
  • Dirent.IsDirOrSymlinkToDir

    Closes #44

    // IsDirOrSymlinkToDir returns true if and only if the Dirent represents a file
    // system directory, or a symbolic link to a directory. Note that if the Dirent
    // is not a directory but is a symbolic link, this method will resolve by
    // sending a request to the operating system to follow the symbolic link.

    enhancement 
    opened by karrick 4
  • Walk files of folder before recursing into sub-folders

    Is there a way with this library to first invoke the callback on all direct files under the current folder before recursing into the sub-folders?

    Example:

    package main
    
    import (
    	"fmt"
    	"github.com/karrick/godirwalk"
    )
    
    func main() {
    	godirwalk.Walk("/tmp/test", &godirwalk.Options{
    		Callback: func(osPathname string, directoryEntry *godirwalk.Dirent) error {
    			fmt.Printf("%s\n", osPathname)
    			return nil
    		}})
    }
    

    currently results in:

    /tmp/test
    /tmp/test/afile
    /tmp/test/dir1
    /tmp/test/dir1/file2
    /tmp/test/file
    

    and the ask is to return:

    /tmp/test
    /tmp/test/afile
    /tmp/test/file
    /tmp/test/dir1
    /tmp/test/dir1/file2
    

    This way the results are still deterministic as all files of current folder are sorted, and the sub-folders are recursed into also in a sorted order.
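The requested ordering can be sketched with the standard library alone; whether this library offers it is not stated here. In this illustration, walkFilesFirst is a hypothetical helper that visits the non-directory entries of a folder in sorted order and then recurses into the sub-folders, also in sorted order. Unlike the listing above, it does not report the root itself.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// walkFilesFirst visits all non-directory entries of dir before descending
// into any of its subdirectories. os.ReadDir returns entries sorted by name,
// so both groups are visited in a deterministic order.
func walkFilesFirst(dir string, visit func(path string, d fs.DirEntry) error) error {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return err
	}
	var subdirs []fs.DirEntry
	for _, e := range entries {
		if e.IsDir() {
			subdirs = append(subdirs, e) // postpone until the files are done
			continue
		}
		if err := visit(filepath.Join(dir, e.Name()), e); err != nil {
			return err
		}
	}
	for _, e := range subdirs {
		p := filepath.Join(dir, e.Name())
		if err := visit(p, e); err != nil {
			return err
		}
		if err := walkFilesFirst(p, visit); err != nil {
			return err
		}
	}
	return nil
}

// demo recreates the tree from the example above in a temporary directory.
func demo() (string, error) {
	root, err := os.MkdirTemp("", "files-first")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)
	if err := os.Mkdir(filepath.Join(root, "dir1"), 0o755); err != nil {
		return "", err
	}
	for _, name := range []string{"afile", "file", filepath.Join("dir1", "file2")} {
		if err := os.WriteFile(filepath.Join(root, name), nil, 0o644); err != nil {
			return "", err
		}
	}
	var seen []string
	err = walkFilesFirst(root, func(path string, d fs.DirEntry) error {
		rel, err := filepath.Rel(root, path)
		if err != nil {
			return err
		}
		seen = append(seen, filepath.ToSlash(rel))
		return nil
	})
	if err != nil {
		return "", err
	}
	return strings.Join(seen, ","), nil
}

func main() {
	out, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(out)
}
```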

    Thanks!

    opened by shlomi-dr 1
  • Feature Request: Walk Parents

    Would be great to see an option to walk up the parent directory until you find a file or other exit condition. This would simulate, for instance, running git status in a directory below the .git dir. In other words, the purpose would be to decide what a project root is for a cli.
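A minimal sketch of such an upward search, using only the standard library: findUp is a hypothetical helper, not part of this library, and it assumes the exit condition is simply "a given name exists in this directory".

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// findUp starts at start and climbs towards the file system root until it
// finds a directory that contains marker. It returns that directory.
func findUp(start, marker string) (string, error) {
	dir, err := filepath.Abs(start)
	if err != nil {
		return "", err
	}
	for {
		_, err := os.Lstat(filepath.Join(dir, marker))
		if err == nil {
			return dir, nil
		}
		if !errors.Is(err, fs.ErrNotExist) {
			return "", err
		}
		parent := filepath.Dir(dir)
		if parent == dir { // reached the root without finding marker
			return "", fs.ErrNotExist
		}
		dir = parent
	}
}

// demo creates root/.git and root/a/b/c, then searches upward from c.
func demo() (bool, error) {
	root, err := os.MkdirTemp("", "findup-demo")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(root)
	deep := filepath.Join(root, "a", "b", "c")
	if err := os.MkdirAll(deep, 0o755); err != nil {
		return false, err
	}
	if err := os.Mkdir(filepath.Join(root, ".git"), 0o755); err != nil {
		return false, err
	}
	found, err := findUp(deep, ".git")
	if err != nil {
		return false, err
	}
	return found == root, nil
}

func main() {
	ok, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("found project root:", ok)
}
```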

    opened by maurerbot 0
  • Infinite recursion on symbolic link loops, README incorrect?

    The README says the following:

    This library fixes the above problem such that it will never follow logical file sytem [sic] loops on either unix or Windows.

    I think this is false. I have a directory structure like the following:

    walk-links/
    └── loop
        └── loop-link-upwards -> ../../walk-links/
    

    When calling godirwalk.Walk on walk-links, it descends infinitely when FollowSymbolicLinks is set to true. I know that this is an adversarial example, but unless godirwalk is somehow keeping track of all paths that have been visited, I don't think the README is accurate.
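For illustration, here is what "keeping track of all paths that have been visited" might look like with the standard library. This is a sketch under stated assumptions, not this library's behavior: walkFollowing is a hypothetical helper that follows symbolic links to directories and remembers each directory by its resolved path, at the cost of one map entry per directory. The demo recreates the walk-links layout above and requires permission to create symbolic links.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// walkFollowing visits root and everything below it, following symbolic links
// to directories, but enters each real directory at most once.
func walkFollowing(root string, visit func(path string)) error {
	seen := make(map[string]bool) // resolved directory path -> already entered
	var walk func(dir string) error
	walk = func(dir string) error {
		real, err := filepath.EvalSymlinks(dir)
		if err != nil {
			return err
		}
		if seen[real] {
			return nil // reached again through a link: a loop or a duplicate
		}
		seen[real] = true
		visit(dir)
		entries, err := os.ReadDir(dir)
		if err != nil {
			return err
		}
		for _, e := range entries {
			child := filepath.Join(dir, e.Name())
			info, err := os.Stat(child) // Stat follows symbolic links
			if errors.Is(err, fs.ErrNotExist) {
				visit(child) // dangling symbolic link
				continue
			}
			if err != nil {
				return err
			}
			if info.IsDir() {
				if err := walk(child); err != nil {
					return err
				}
				continue
			}
			visit(child)
		}
		return nil
	}
	return walk(root)
}

// demo builds walk-links/loop/loop-link-upwards -> ../../walk-links and
// returns how many paths were visited before the walk terminated.
func demo() (int, error) {
	tmp, err := os.MkdirTemp("", "loop-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(tmp)
	root := filepath.Join(tmp, "walk-links")
	if err := os.MkdirAll(filepath.Join(root, "loop"), 0o755); err != nil {
		return 0, err
	}
	link := filepath.Join(root, "loop", "loop-link-upwards")
	if err := os.Symlink("../../walk-links", link); err != nil {
		return 0, err
	}
	count := 0
	err = walkFollowing(root, func(string) { count++ })
	return count, err
}

func main() {
	n, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("visited", n, "paths")
}
```

In the demo, the walk enters walk-links and walk-links/loop, recognizes that the link resolves to a directory it has already entered, and stops there.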

    opened by enricozb 0
  • Broken on illumos?

    Trying to track down some caching problems with navidrome https://github.com/navidrome/navidrome/issues/1048 and found that pulling down master and running "go test" fails:

    $ go version
    go version go1.16.3 illumos/amd64
    $ uname -a
    SunOS pergamum 5.11 omnios-r151030-5bd7739fe4 i86pc i386 i86pc illumos
    $ git clone https://github.com/karrick/godirwalk.git
    Cloning into 'godirwalk'...
    remote: Enumerating objects: 1141, done.
    remote: Counting objects: 100% (30/30), done.
    remote: Compressing objects: 100% (21/21), done.
    Receiving objects:  99% (1130/1141)remote: Total 1141 (delta 13), reused 21 (delta 9), pack-reused 1111
    Receiving objects: 100% (1141/1141), 255.23 KiB | 4.12 MiB/s, done.
    Resolving deltas: 100% (617/617), done.
    $ pushd godirwalk/
    ~/navidrome/godirwalk ~/navidrome
    $ go test
    --- FAIL: TestReadDirents (0.00s)
        --- FAIL: TestReadDirents/without_symlinks (0.00s)
            readdir_test.go:14: GOT: lstat /tmp/godirwalk-378882479/d0/aaaaaa: no such file or directory; WANT: []
        --- FAIL: TestReadDirents/with_symlinks (0.00s)
            readdir_test.go:51: GOT: lstat /tmp/godirwalk-378882479/d0/symlinks/nothin: no such file or directory; WANT: []
    --- FAIL: TestScanner (0.00s)
        --- FAIL: TestScanner/collect_names (0.00s)
            scandir_test.go:22: GOT: "aaaaaa\x03" (extra)
            scandir_test.go:22: WANT: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" (missing)
            scandir_test.go:22: GOT: "symlin\x03" (extra)
            scandir_test.go:22: WANT: "symlinks" (missing)
        --- FAIL: TestScanner/collect_dirents (0.00s)
            scandir_test.go:35: GOT: lstat /tmp/godirwalk-378882479/d0/aaaaaa: no such file or directory; WANT: []
    --- FAIL: TestWalkCompatibleWithFilepathWalk (0.00s)
        --- FAIL: TestWalkCompatibleWithFilepathWalk/test_root (0.00s)
            walk_test.go:79: GOT: lstat /tmp/godirwalk-378882479/d0/aaaaaa: no such file or directory; WANT: []
    --- FAIL: TestWalkSkipThis (0.00s)
        --- FAIL: TestWalkSkipThis/SkipThis (0.00s)
            walk_test.go:154: GOT: lstat /tmp/godirwalk-378882479/d0/aaaaaa: no such file or directory; WANT: []
    --- FAIL: TestWalkFollowSymbolicLinks (0.00s)
        walk_test.go:196: GOT: lstat /tmp/godirwalk-378882479/d0/symlinks/nothin: no such file or directory; WANT: []
    --- FAIL: TestErrorCallback (0.00s)
        --- FAIL: TestErrorCallback/halt (0.00s)
            walk_test.go:239: unexpected error callback for /tmp/godirwalk-378882479/d0/symlinks: lstat /tmp/godirwalk-378882479/d0/symlinks/nothin: no such file or directory
        --- FAIL: TestErrorCallback/skipnode (0.00s)
            walk_test.go:271: unexpected error callback for /tmp/godirwalk-378882479/d0/symlinks: lstat /tmp/godirwalk-378882479/d0/symlinks/nothin: no such file or directory
    --- FAIL: TestPostChildrenCallback (0.00s)
        walk_test.go:299: GOT: lstat /tmp/godirwalk-378882479/d0/aaaaaa: no such file or directory; WANT: []
    FAIL
    drwx------
    drwxrwxr-x /d0
    -rwxrwxr-x /d0/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
    drwxrwxr-x /d0/d1
    -rwxrwxr-x /d0/d1/f2
    -rwxrwxr-x /d0/f1
    drwxrwxr-x /d0/skips
    drwxrwxr-x /d0/skips/d2
    -rwxrwxr-x /d0/skips/d2/f3
    -rwxrwxr-x /d0/skips/d2/skip
    -rwxrwxr-x /d0/skips/d2/z1
    drwxrwxr-x /d0/skips/d3
    -rwxrwxr-x /d0/skips/d3/f4
    drwxrwxr-x /d0/skips/d3/skip
    -rwxrwxr-x /d0/skips/d3/skip/f5
    -rwxrwxr-x /d0/skips/d3/z2
    drwxrwxr-x /d0/symlinks
    drwxrwxr-x /d0/symlinks/d4
    Lrwxrwxrwx /d0/symlinks/d4/toSD1 -> ../toD1
    Lrwxrwxrwx /d0/symlinks/d4/toSF1 -> ../toF1
    Lrwxrwxrwx /d0/symlinks/nothing -> ../f0
    Lrwxrwxrwx /d0/symlinks/toAbs -> /tmp/godirwalk-378882479/d0/f1
    Lrwxrwxrwx /d0/symlinks/toD1 -> ../d1
    Lrwxrwxrwx /d0/symlinks/toF1 -> ../f1
    exit status 1
    FAIL    github.com/karrick/godirwalk    0.023s
    

    That weird truncation of "aaaaaa" is exactly what we're seeing with navidrome.

    opened by whorfin 4
  • ModeType() doesn't show permissions and modes on macOS

    go version: go version go1.16.3 darwin/amd64

    godirwalk version: github.com/karrick/godirwalk v1.16.1

    code used:

    godirwalk.Walk(startingDir, &godirwalk.Options{
    	Callback: func(osPathname string, de *godirwalk.Dirent) error {
    		fmt.Printf("%s %s\n", de.ModeType(), osPathname)
    		return nil
    	},
    	Unsorted:            true,
    	FollowSymbolicLinks: false,
    })
    

    Observed Output:

    d--------- /Library/Caches
    ---------- /Library/Caches/.DS_Store
    d--------- /Library/Caches/ColorSync
    ---------- /Library/Caches/ColorSync/com.apple.colorsync.devices
    d--------- /Library/Caches/com.apple.cloudkit
    ---------- /Library/Caches/com.apple.cloudkit/com.apple.cloudkit.launchservices.hostnames.plist
    d--------- /Library/Caches/com.apple.iconservices.store
    

    Expected Output: based on filewalker and os.FileInfo.Mode()

    dtrwxrwxrwx /Library/Caches
    -rw-r--r-- /Library/Caches/.DS_Store
    drwxr-xr-x /Library/Caches/ColorSync
    -rw-r--r-- /Library/Caches/ColorSync/com.apple.colorsync.devices
    drwxr-xr-x /Library/Caches/com.apple.cloudkit
    -rw-r--r-- /Library/Caches/com.apple.cloudkit/com.apple.cloudkit.launchservices.hostnames.plist
    drwx--x--x /Library/Caches/com.apple.iconservices.store
    

    Possible cause: Walk.go line 244, where the mode is ANDed (&) with os.ModeType. That mask keeps only the file-type bits and discards the permission bits.

    Is this intentional?

    opened by vireshwali 0
Owner
Karrick McDermott