Extract endpoints marked as disallow in robots files to generate wordlists.

Overview

roboXtractor

This tool has been developed to extract endpoints marked as disallow in robots.txt file. It crawls the file directly on the web and has a wayback machine query mode (1 query for each of the previous 5 years).

Possible uses of roboXtractor:

  • Generate a customized wordlist of endpoints for later use in a fuzzing tool (-m 1).
  • Generate a list of URLs to visit (-m 0).


🛠️ Installation

If you want to make modifications locally and compile it, follow the instructions below:

> git clone https://github.com/Josue87/roboxtractor.git
> cd roboxtractor
> go build

If you are only interested in using the program:

> go get -u github.com/Josue87/roboxtractor

Note If you are using version 1.16 or higher and you have any errors, run the following command:

> go env -w GO111MODULE="auto"

🗒 Options

The flags that can be used to launch the tool:

Flag Type Description Example
u string URL to extract endpoints marked as disallow in robots.txt file. -u https://example.com
m uint Extract URLs (0) // Extract endpoints to generate a wordlist (>1 default) -m 1
wb bool Check Wayback Machine. Check 5 years (Slow mode) -wb
v bool Verbose mode. Displays additional information at each step -v
s bool Silen mode doesn't show banner -s

You can ignore the -u flag and pass a file directly as follows:

cat urls.txt | roboxtractor -m 1 -v

Only the results are written to the standard output. The banner and information messages with the -v flag are redirected to the error output,

👾 Usage

The following are some examples of use:

roboxtractor --help
cat urls.txt | roboxtractor -m 0 -v
roboxtractor -u https://www.example.com -m 1 -wb
cat urls.txt | roboxtractor -m 1 -s > ./customwordlist.txt
cat urls.txt | roboxtractor -s -v | uniq > ./uniquewordlist.txt
echo http://example.com | roboxtractor -v
echo http://example.com | roboxtractor -v -wb

🚀 Examples

Let's take a look at some examples. We have the following file:

image

Extracting endpoints:

image

Extracting URLs:

image

Checking Wayback Machine:

image

Github had many entries in the file, which were not useful, a cleaning process is done to avoid duplicates or entries with *. Check the following image:

image

For example:

  • /gist/\*/\*/\* is transformed as gist.
  • /\*/tarball is trasformed as tarball.
  • /, /* or similar entries are removed.

🤗 Thanks to

The idea came from a tweet from @remonsec that did something similar in a bash script. Check the tweet.

Issues
  • Wayback robots

    Wayback robots

    Does it pull robots.txt from wayback machine or commoncrawl??

    opened by mr0xE 2
Owner
Josué Encinar
Offensive Security Engineer
Josué Encinar
Password generator written in Go

go-generate-password Password generator written in Go. Use as a library or as a CLI. Usage CLI go-generate-password can be used on the cli, just insta

Miles Croxford 27 Aug 25, 2021
A rest application to update firewalld rules on a linux server

Firewalld-rest A REST application to dynamically update firewalld rules on a linux server. Firewalld is a firewall management tool for Linux operating

Prashant Gupta 307 Aug 22, 2021
DockerSlim (docker-slim): Don't change anything in your Docker container image and minify it by up to 30x (and for compiled languages even more) making it secure too! (free and open source)

Minify and Secure Docker containers (free and open source!) Don't change anything in your Docker container image and minify it by up to 30x making it

docker-slim 10.6k Sep 13, 2021
gosec - Golang Security Checker

Inspects source code for security problems by scanning the Go AST.

Secure Go 5.4k Sep 15, 2021
An easy-to-use SHA-1 hash-cracker written in Golang.

wrench - An easy-to-use SHA-1 hash-cracker. Wrench is an SHA-1 hash-cracker that relies on wordlists for comparing hashes, and cracking them. Before W

null 4 Aug 29, 2021
Container Signing

cosign Container Signing, Verification and Storage in an OCI registry. Cosign aims to make signatures invisible infrastructure. Info Cosign is develop

sigstore 955 Sep 13, 2021
An opinionated helper for generating tls certificates

Certificates helper This is an opinionated helper for generating tls certificates. It outputs only in PEM format but this enables you easily generate

Martijn van Maasakkers 21 Jul 9, 2021
androidqf (Android Quick Forensics) helps quickly gathering forensic evidence from Android devices, in order to identify potential traces of compromise.

androidqf androidqf (Android Quick Forensics) is a portable tool to simplify the acquisition of relevant forensic data from Android devices. It is the

Nex 8 Aug 31, 2021
A collection of cool tools used by Mobile hackers. Happy hacking , Happy bug-hunting

A collection of cool tools used by Mobile hackers. Happy hacking , Happy bug-hunting Family project Table of Contents Weapons Contribute Thanks to con

HAHWUL 254 Sep 1, 2021
A simple, modern and secure encryption tool (and Go library) with small explicit keys, no config options, and UNIX-style composability.

age age is a simple, modern and secure file encryption tool, format, and library. It features small explicit keys, no config options, and UNIX-style c

Filippo Valsorda 8.3k Sep 12, 2021
A scalable overlay networking tool with a focus on performance, simplicity and security

What is Nebula? Nebula is a scalable overlay networking tool with a focus on performance, simplicity and security. It lets you seamlessly connect comp

Slack 7.8k Sep 15, 2021
Declarative penetration testing orchestration framework

Decker - Penetration Testing Orchestration Framework Purpose Decker is a penetration testing orchestration framework. It leverages HashiCorp Configura

Steven Aldinger 264 Aug 24, 2021
A convenience library for generating, comparing and inspecting password hashes using the scrypt KDF in Go 🔑

simple-scrypt simple-scrypt provides a convenience wrapper around Go's existing scrypt package that makes it easier to securely derive strong keys ("h

Matt Silverlock 172 Jul 12, 2021
✒ A self-hosted, cross-platform service to sign iOS apps using any CI as a builder

iOS Signer Service A self-hosted, cross-platform service to sign iOS apps using any CI as a builder Introduction There are many reasons to install app

null 459 Sep 16, 2021
:lock: acmetool, an automatic certificate acquisition tool for ACME (Let's Encrypt)

acmetool is an easy-to-use command line tool for automatically acquiring certificates from ACME servers (such as Let's Encrypt). Designed to flexibly

Hugo Landau 1.9k Sep 5, 2021
Cossack Labs 793 Sep 8, 2021
Fast web fuzzer written in Go

/'___\ /'___\ /'___\ /\ \__/ /\ \__/ __ __ /\ \__/ \ \ ,__\\ \ ,__\/\ \/\ \ \ \ ,__\ \ \ \_/ \ \ \_/\ \ \_\ \ \ \

null 5.2k Sep 15, 2021
Encrypt your files or notes by your GPG key and save to MinIO or Amazon S3 easily!

Super Dollop Super Dollop can encrypt your files and notes by your own GPG key and save them in S3 or minIO to keep them safe and portability, also yo

Nedim AKAR 41 Sep 8, 2021
Convenience of containers, security of virtual machines

Convenience of containers, security of virtual machines With firebuild, you can build and deploy secure VMs directly from Dockerfiles and Docker image

null 23 Aug 21, 2021