When I run muffet against a local site, the server logs show that some pages are requested many times in a single run. This seems unnecessary and puts extra load on the server being tested.
Here is a simple example. Create "test.html" with this content:
<html><body>
<a href="/foo.html">foo</a>
<a href="/test2.html">test2</a>
</body></html>
and "test2.html" with this:
<html><body>
<a href="/foo.html">foo</a>
</body></html>
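To set up both files in one step, a small Python script along these lines could be used. This is only a convenience sketch: the helper name write_fixture and the use of a temporary directory are my own choices and have nothing to do with muffet itself.

```python
import tempfile
from pathlib import Path

# Contents of the two pages exactly as described in this report.
TEST_HTML = """<html><body>
<a href="/foo.html">foo</a>
<a href="/test2.html">test2</a>
</body></html>
"""

TEST2_HTML = """<html><body>
<a href="/foo.html">foo</a>
</body></html>
"""


def write_fixture(directory):
    """Write test.html and test2.html into `directory` (hypothetical helper)."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    (directory / "test.html").write_text(TEST_HTML)
    (directory / "test2.html").write_text(TEST2_HTML)
    return directory


if __name__ == "__main__":
    # Prints the directory that holds the two files.
    print(write_fixture(tempfile.mkdtemp()))
```

Note that foo.html is deliberately not created, so the link to it is broken.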
Then serve this content with:
python3 -m http.server
and run:
muffet http://localhost:8000/test.html
The python http.server output I get is this:
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
127.0.0.1 - - [24/Aug/2018 16:14:01] "GET /test.html HTTP/1.1" 200 -
127.0.0.1 - - [24/Aug/2018 16:14:01] "GET /test2.html HTTP/1.1" 200 -
127.0.0.1 - - [24/Aug/2018 16:14:01] code 404, message File not found
127.0.0.1 - - [24/Aug/2018 16:14:01] "GET /foo.html HTTP/1.1" 404 -
127.0.0.1 - - [24/Aug/2018 16:14:01] code 404, message File not found
127.0.0.1 - - [24/Aug/2018 16:14:01] "GET /foo.html HTTP/1.1" 404 -
This shows that "/foo.html" was requested twice in the same run.
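For larger sites it is tedious to spot repeats by eye. A rough sketch for counting requests per path in the http.server output could look like this. The regular expression and the function names count_requests and duplicates are my own assumptions about the log format shown above, not anything provided by muffet or http.server.

```python
import re
from collections import Counter

# The request lines copied from the http.server output in this report.
LOG = '''\
127.0.0.1 - - [24/Aug/2018 16:14:01] "GET /test.html HTTP/1.1" 200 -
127.0.0.1 - - [24/Aug/2018 16:14:01] "GET /test2.html HTTP/1.1" 200 -
127.0.0.1 - - [24/Aug/2018 16:14:01] code 404, message File not found
127.0.0.1 - - [24/Aug/2018 16:14:01] "GET /foo.html HTTP/1.1" 404 -
127.0.0.1 - - [24/Aug/2018 16:14:01] code 404, message File not found
127.0.0.1 - - [24/Aug/2018 16:14:01] "GET /foo.html HTTP/1.1" 404 -
'''

# Matches only the quoted request lines; "code 404, message ..." lines are skipped.
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+" \d{3}')


def count_requests(log_text):
    """Return a Counter mapping each requested path to its number of GETs."""
    counts = Counter()
    for line in log_text.splitlines():
        match = REQUEST_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def duplicates(log_text):
    """Return only the paths that were requested more than once."""
    return {path: n for path, n in count_requests(log_text).items() if n > 1}


if __name__ == "__main__":
    print(duplicates(LOG))  # prints {'/foo.html': 2}
```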
Strangely, small changes to those HTML files produce different results. If I add a link to test.html, muffet requests foo.html only once in the run.