When enumerating a web application, you will want to determine what pages exist. A common technique for this is spidering, which works by visiting a website and then following every link within that page and any subsequent pages on the same site. However, on certain sites, such as wikis, this method may result in the deletion of data if following a link triggers an edit or delete action. This recipe instead takes a list of commonly found web page filenames and checks whether each one exists.
For this recipe, you will need a list of commonly found page names. Penetration testing distributions, such as Kali Linux, ship with word lists for various brute-forcing tools, and these can be used instead of generating your own.
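If you want to produce a small starting list yourself, a few lines of Python will do it. The page names below are illustrative guesses only; the lists shipped with distributions such as Kali Linux are far more comprehensive:

```python
# Write a minimal wordlist of common page names, one per line.
# These entries are examples, not a curated brute-forcing list.
common_pages = ["index", "login", "logout", "admin", "setup", "about"]

with open("filelist.txt", "w") as wordlist:
    for name in common_pages:
        wordlist.write(name + "\n")
```

The resulting filelist.txt can be passed straight to the script in this recipe as its wordlist argument.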
The following script takes a list of possible filenames and tests whether the pages exist within a website:
#bruteforce file names
import sys
import urllib2

if len(sys.argv) != 4:
    print "usage: %s url wordlist fileextension" % (sys.argv[0])
    sys.exit(0)

base_url = str(sys.argv[1])
wordlist = str(sys.argv[2])
extension = str(sys.argv[3])

filelist = open(wordlist, 'r')
foundfiles = []

for file in filelist:
    # Strip surrounding whitespace (including the trailing newline)
    # so the filename slots cleanly into the URL.
    file = file.strip()
    extension = extension.rstrip()
    url = base_url + file + "." + extension.strip(".")
    try:
        request = urllib2.urlopen(url)
        if request.getcode() == 200:
            foundfiles.append(file + "." + extension.strip("."))
        request.close()
    except urllib2.HTTPError:
        pass

if len(foundfiles) > 0:
    print "The following files exist:"
    for filename in foundfiles:
        print filename
else:
    print "No files found"
The following output shows what could be returned when run against Damn Vulnerable Web App (DVWA) using a list of commonly found web pages:
python filebrute.py http://192.168.68.137/dvwa/ filelist.txt .php
The following files exist:
index.php
about.php
login.php
security.php
logout.php
setup.php
instructions.php
phpinfo.php
After importing the necessary modules and validating the number of arguments, the list of filenames to check is opened in read-only mode, as indicated by the 'r' parameter in the file's open operation:
filelist = open(wordlist, 'r')
When the script enters the loop over the list of filenames, any surrounding whitespace, including the newline character, is stripped from each filename, as stray characters would corrupt the URLs built when checking for the file's existence. If a leading . exists in the provided extension, that is stripped too. This allows the extension to be supplied either with or without the leading ., for example, .php or php:
file = file.strip()
extension = extension.rstrip()
url = base_url + file + "." + extension.strip(".")
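The effect of this normalization can be seen in isolation. The build_url helper below is written purely for illustration (it is not part of the recipe's script, and uses Python 3 print syntax); both forms of the extension produce the same URL:

```python
def build_url(base_url, name, extension):
    # Strip surrounding whitespace (including newlines) from the
    # filename, and any dots from around the extension, so that
    # ".php" and "php" behave identically.
    return base_url + name.strip() + "." + extension.strip().strip(".")

# Both calls produce http://192.168.68.137/dvwa/index.php
print(build_url("http://192.168.68.137/dvwa/", "index\n", ".php"))
print(build_url("http://192.168.68.137/dvwa/", "index", "php"))
```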
The main action of the script then checks whether a web page with the given filename exists by looking for an HTTP 200 status code, and catches the errors raised for nonexistent pages:
try:
    request = urllib2.urlopen(url)
    if request.getcode() == 200:
        foundfiles.append(file + "." + extension.strip("."))
    request.close()
except urllib2.HTTPError:
    pass
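On Python 3, where urllib2 has been split into urllib.request and urllib.error, the same check can be sketched as follows. The page_exists function name is chosen here for illustration; note that urlopen raises an HTTPError for responses such as 404, so the 200 test only runs when the request succeeds:

```python
import urllib.request
import urllib.error

def page_exists(url):
    # Return True if the server answers with HTTP 200, and False
    # when it responds with an HTTP error such as 404.
    try:
        request = urllib.request.urlopen(url)
        exists = request.getcode() == 200
        request.close()
        return exists
    except urllib.error.HTTPError:
        return False
```

The surrounding loop and result collection would otherwise be unchanged from the Python 2 version shown above.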