
Tuesday, June 3, 2014

HTTP Security Headers Nmap Parser

Click here to download source code

Several recent reports have looked at how often major sites, such as the Alexa top sites, use security-related HTTP headers. Surprisingly (or maybe not), most are NOT taking full advantage of these headers. Many system owners seem to lack awareness of HTTP headers, especially those related to security. In many architectures these headers can be configured without changing the application, so why not take a look at them for your own sites? Implementing (or removing) some of these headers can be extremely beneficial. Note that some headers are only supported by specific browsers and only offer a certain level of protection, so they should not be relied on as the sole security control.

What’s one of the first things we do when we start testing the security posture of a network? Discovery with Nmap. Nmap has a built-in NSE script, ‘http-headers’, which returns the headers of a web server via a HEAD request. Manually looking through a large Nmap output file to see which headers are being used can be really difficult, so I wrote a small parser in Python which takes in the Nmap .xml output file and generates an .html report with only security-related header information.


  1. Run Nmap with http-headers script and xml output:
    nmap --script=http-headers <target> -oX output_file.xml
  2. Run with the .xml Nmap output file:
    python -f output_file.xml

Usage: { -f file } [-o output_filename]
There is one required argument: the Nmap .xml output file. The user can also specify the output filename (default: Security-Headers-Report.html).

After running the script we have a nicely formatted table which contains every asset (ip:port) from the Nmap scan. Each asset displays information about nine different security-related headers: Access Control Allow Origin, Content Security Policy, Server, Strict Transport Security, Content Type Options, Frame Options, Cross Domain Policies, Powered By, and XSS Protection. This table can be copied into software such as Microsoft Excel and modified or sorted as necessary.

The reason behind creating this table is to get a clear view of the headers used in a large environment. With this report we can search for individual IPs and report on them or get a general feeling for the security posture of many servers.
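To make the parsing step concrete, here is a minimal sketch of how the http-headers output can be pulled out of an Nmap XML file. It is not the parser from the download link, only an illustration under some assumptions: the function name parse_security_headers is hypothetical, the header names in SECURITY_HEADERS are my guess at the HTTP headers behind the nine report columns, and only the first address element of each host is used.

```python
import xml.etree.ElementTree as ET

# Assumption: these HTTP header names correspond to the nine report columns.
SECURITY_HEADERS = [
    "Access-Control-Allow-Origin",
    "Content-Security-Policy",
    "Server",
    "Strict-Transport-Security",
    "X-Content-Type-Options",
    "X-Frame-Options",
    "X-Permitted-Cross-Domain-Policies",
    "X-Powered-By",
    "X-XSS-Protection",
]


def parse_security_headers(xml_text):
    """Map each 'ip:port' asset to the security headers it returned."""
    wanted = {name.lower(): name for name in SECURITY_HEADERS}
    report = {}
    root = ET.fromstring(xml_text)
    for host in root.iter("host"):
        address = host.find("address")
        if address is None:
            continue
        ip = address.get("addr")
        for port in host.iter("port"):
            for script in port.iter("script"):
                if script.get("id") != "http-headers":
                    continue
                found = {}
                # The script output holds one "Name: value" header per line.
                for line in script.get("output", "").splitlines():
                    name, sep, value = line.strip().partition(":")
                    key = name.strip().lower()
                    if sep and key in wanted:
                        found[wanted[key]] = value.strip()
                report["%s:%s" % (ip, port.get("portid"))] = found
    return report
```

Passing the contents of output_file.xml to this function gives a dictionary keyed by asset, which could then be rendered as an HTML table or written out as CSV.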


Monday, January 27, 2014

SmeegeScrape: Text Scraper and Custom Word List Generator

Click Here to Download Source Code

Customize your security testing with SmeegeScrape! It's a simple Python script that scrapes text from various sources, including local files and web pages, and turns the text into a custom word list. A customized word list has many uses, from web application testing to password cracking. Having a specific set of words to use against a target can increase efficiency and effectiveness during a penetration test. I realize there are other text scrapers publicly available; however, I feel this script is simple, efficient, and specific enough to warrant its own release. The script can read almost any file containing cleartext that Python can open. I have also included support for file formats such as pdf, html, docx, and pptx.

Usage: {-f file | -d directory | -u web_url | -l url_list_file} [-o output_filename] [-s] [-i] [-min #] [-max #]

One of the following input types is required:(-f filename), (-d directory), (-u web_url), (-l url_list_file)

-h, --help show this help message and exit
-f LOCALFILE, --localFile LOCALFILE Specify a local file to scrape
-d DIRECTORY, --fileDirectory DIRECTORY Specify a directory whose files will be scraped
-u URL, --webUrl URL Specify a url to scrape page content (correct format: http(s)://
-l URL_LIST_FILE, --webList URL_LIST_FILE Specify a text file with a list of URLs to scrape (separated by newline)
-o FILENAME, --outputFile FILENAME Specify output filename (default: smeegescrape_out.txt)
-i, --integers Remove integers [0-9] from all output
-s, --specials Remove special characters from all output
-min # Specify the minimum length for all words in output
-max # Specify the maximum length for all words in output

Scraping a local file: -f Test-File.txt

This is a sample text file with different text.

This file could be many different filetypes including html, pdf, powerpoint, docx, etc.  Anything which can be read in as cleartext can be scraped.

I hope you enjoy SmeegeScrape, feel free to comment if you like it!

Each word is separated by a newline. The options -i and -s can be used to remove any integers or special characters found. Also, the -min and -max arguments can be used to specify desired word length.
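The filtering described above can be sketched roughly as follows. This is an illustration, not SmeegeScrape's actual code: the function name build_word_list and its parameters are hypothetical, and I am assuming that "special characters" means anything other than ASCII letters and digits.

```python
import re


def build_word_list(text, remove_integers=False, remove_specials=False,
                    min_len=None, max_len=None):
    """Return the sorted unique words in text after applying the filters."""
    words = set()
    for token in text.split():
        if remove_integers:
            # Mirrors the idea of -i: drop the digits 0-9.
            token = re.sub(r"[0-9]", "", token)
        if remove_specials:
            # Mirrors the idea of -s. Assumption: "special" means anything
            # that is not an ASCII letter or digit.
            token = re.sub(r"[^A-Za-z0-9]", "", token)
        if not token:
            continue
        if min_len is not None and len(token) < min_len:
            continue
        if max_len is not None and len(token) > max_len:
            continue
        words.add(token)
    return sorted(words)
```

Joining the returned list with newlines gives the one-word-per-line layout of the output file.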

Scraping a web page: -u -si

To scrape web pages we use the python urllib2 module. The format of the url is checked via regex and it must be in the correct format (e.g. http(s)://

Scraping multiple files from a directory: -d test\ -si -min 5 -max 12

The screen output shows each file which was scraped, the total number of unique words found based on the user’s desired options, and the output filename.

Scraping multiple URLs: -l weblist.txt -si -min 6 -max 10

The -l option takes in a list of web urls from a text file and scrapes each url. Each scraped URL is displayed on the screen as well as a total number of words scraped.

This weblist option is excellent to use with Burp Suite to scrape an entire site. To do this, proxy your web traffic through Burp and discover as much content on the target site as you can (spidering, manual discovery, dictionary attack on directories/files, etc.). After the discovery phase, right click on the target in the site map and select the option “Copy URLs in this host” from the drop-down list. In this instance, even for a small blog like mine, over 300 URLs were copied. Depending on the size of the site the scraping could take a little while, so be patient!

Now just paste the URLs into a text file and run that as input with the -l option. -l SmeegeScrape-Burp-URLs.txt -si -min 6 -max 12:

So very easily we just scraped an entire site for words with specific attributes (length and character set) that we want.

As you can see, there are many different possibilities with this script. I tried to make it as accurate as possible; however, the script depends on modules such as nltk, docx, etc., which may not always work correctly. If the script is unable to read a certain file format, I would suggest converting the file to a more readable file type or copying and pasting the text into a text file, which can always be scraped.

The custom word lists you create are limited only by your imagination, so have fun with it! This script could also be easily modified to extract phrases or sentences for use in cracking passphrases. Here are a couple of examples I made:

Holy Bible King James Version of 1611: -f HolyBibleDocx.docx -si -min 6 -max 12 -o HolyBible_scraped.txt
HolyBible_scraped.txt sample:
Shakespeare’s Romeo and Juliet: -u -si -min 6 -max 12 -o romeo_juliet_scraped.txt
romeo_juliet_scraped.txt sample:
Feel free to share your scraped lists or ideas on useful content to scrape. Comments and suggestions welcome, enjoy!

Wednesday, November 6, 2013

HashTag: Password Hash Identification

Click here to download source code or access it online at OnlineHashCrack

Interested in password cracking or cryptography? Check this out. HashTag is a tool written in Python which parses and identifies various password hashes based on their type. HashTag was inspired by attending PasswordsCon 13 in Las Vegas, KoreLogic’s ‘Crack Me If You Can’ competition at Defcon, and the research of iphelix and his toolkit PACK (Password Analysis and Cracking Kit). HashTag supports the identification of over 250 hash types and matches them to over 110 hashcat modes. HashTag is able to identify a single hash, parse a single file and identify the hashes within it, or traverse a root directory and all subdirectories for potential hash files and identify any hashes found.

One of the biggest aspects of this tool is the identification of password hashes. The main attributes I used to distinguish between hash types are character set (hexadecimal, alphanumeric, etc.), hash length, hash format (e.g. a 32-character hash followed by a colon and a salt), and any specific substrings (e.g. ‘$1$’). Many password hash strings can’t be pinned to one specific hash type based on these attributes. For example, MD5 and NTLM hashes are both 32-character hexadecimal strings. In these cases I build an exhaustive list of possible types and have the tool output reflect that. During development I created an Excel spreadsheet which contains much of the hash information which can be found here or here.
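As a rough illustration of attribute-based identification, the sketch below checks a distinctive substring first, then the hash:salt format, then character set and length. It covers only a handful of types and is not HashTag's actual logic: the function name identify_hash and the label "salted MD5 variant" are hypothetical. The hashcat modes shown (0 for MD5, 1000 for NTLM, 100 for SHA-1, 500 for md5crypt) are the standard ones.

```python
import re

HEX = re.compile(r"^[0-9a-fA-F]+$")


def identify_hash(candidate):
    """Return a list of (hash type, hashcat mode) guesses for one string."""
    # A distinctive substring settles the type on its own.
    if candidate.startswith("$1$"):
        return [("md5crypt", 500)]
    # Format check: 32 hexadecimal characters, a colon, then a salt.
    digest, sep, salt = candidate.partition(":")
    if sep and salt and len(digest) == 32 and HEX.match(digest):
        # Hypothetical label; no single hashcat mode is assumed here.
        return [("salted MD5 variant", None)]
    # Character set plus length; several types can share both.
    if HEX.match(candidate):
        if len(candidate) == 32:
            return [("MD5", 0), ("NTLM", 1000)]
        if len(candidate) == 40:
            return [("SHA-1", 100)]
    # Anything else is left unidentified in this sketch.
    return []
```

Returning a list rather than a single answer keeps ambiguous cases such as MD5 versus NTLM visible to the user.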

Usage: {-sh hash |-f file |-d directory} [-o output_filename] [-hc] [-n]

Note: When identifying a single hash on *nix operating systems remember to use single quotes to prevent interpolation. (e.g. python -sh '$1$abc$12345')

-h, --help show this help message and exit
-sh SINGLEHASH, --singleHash SINGLEHASH Identify a single hash
-f FILE, --file FILE Parse a single file for hashes and identify them
-d DIRECTORY, --directory DIRECTORY Parse, identify, and categorize hashes within a directory and all subdirectories
-o OUTPUT, --output OUTPUT Filename to output full list of all identified hashes
    --file default filename: HashTag/HashTag_Output_File.txt
    --directory default filename: HashTag/HashTag_Hash_File.txt
-hc, --hashcatOutput
    --file: Output a file per different hash type found, if corresponding hashcat mode exists
    --directory: Appends hashcat mode to end of separate files
-n, --notFound
    --file: Include unidentifiable hashes in the output file. Good for tool debugging (Is it identifying properly?)

Identify a single hash (-sh): -sh $1$MtCReiOj$zvOdxVzPtrQ.PXNW3hTHI0 -sh 7026360f1826f8bc -sh 3b1015ccf38fc2a32c18674c166fa447

Parsing and identifying hashes from a file (-f): -f testdir\street-hashes.10.txt -hc

Here is the output file. For each identified hash it lists the hash, character length, hashcat modes (if found), and possible hash types:
Using the -hc/--hashcat argument we get a file for each hash type for which a corresponding hashcat mode is found. This makes cracking hashes with hashcat much easier, as you immediately have the mode and an input file of hashes:
Output from a file with many different hash types (the filenames are hashcat modes and each file contains all hashes of that type):

Traversing Directories and Identifying Hashes (-d): -d ./testdir -hc

The output consists of three main things:

  • Folders containing copies of potentially password protected files. This makes it easy to group files based on extension and attempt to crack them.
  • HashTag default files - A listing of all hashes, password protected files the tool doesn’t recognize, and hashes the tool can’t identify (good for tool debugging).
  • Files for each identified hash type - each file contains a list of hashes. The -hc/--hashcat argument will append the hashcat mode (if found) to the filename.
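The traversal behind the -d option can be sketched with os.walk. This is a simplified illustration rather than HashTag's implementation: the function name collect_candidate_lines is hypothetical, and files that cannot be read as text are simply skipped here instead of being copied into folders by extension.

```python
import os


def collect_candidate_lines(root_dir):
    """Walk root_dir and return {file path: [non-empty, stripped lines]}."""
    results = {}
    for folder, _subdirs, filenames in os.walk(root_dir):
        for filename in filenames:
            path = os.path.join(folder, filename)
            try:
                with open(path, "r") as handle:
                    lines = [line.strip() for line in handle if line.strip()]
            except (IOError, OSError, UnicodeDecodeError):
                # Unreadable or binary file: skipped in this sketch.
                continue
            if lines:
                results[path] = lines
    return results
```

Each collected line could then be passed to an identification routine and grouped into one output file per hash type.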

Resources: Quite a bit of research went into the difference between password hash types. During this research I found a script called Hash Identifier which was actually included in one of the Backtrack versions. After looking it over I feel my tool has a lot more functionality, efficiency, and accuracy. My other research ranged from finding different hash examples to generating my own hashes via the passlib module. I would like to give credit to the following resources which all had some impact in the creation of this tool.

As always, if you see any coding errors, false assumptions (hash identification), or have constructive criticism please contact me. Hope you like it!

Saturday, August 3, 2013

Defcon 21: Password Cracking KoreLogic's Shirt

For the past couple of years KoreLogic Security has had a wonderful presence at Defcon with their 'Crack Me If You Can' contest, where some of the best password crackers in the world join up in teams or go solo to compete against each other. Although I haven't competed in this competition (mainly due to lack of hardware and wanting to spend most of my time at briefings), I always make it a point to stop by the KoreLogic booth and grab a shirt. No, I'm not doing it because it's 'free swag'; I always stop by for great conversation and to get one of their shirts, which have relatively simple hash(es) on them. It's rather fun to practice even the simple things with password cracking. This year for Defcon 21 the shirt they gave out looked like this:

Clearly there is a hash in there somewhere.. looking a little closer it's clear there is a 32 character pattern which wraps around the logo. Hmm.. the most common 32 character hash.. md5! Let's give it a try.

Wait.. how do we know where the hash starts and ends? We don't. The pattern wraps around the logo, so any of the 32 characters could be the first one, which means 32 different possibilities for the correct hash. To generate these rotations I wrote a quick Python script which writes all of the possible hashes to the file hashlist.hash.

hashArray = ['b','2','c','b','b','e','c','9','1','6','d','c','8','2','b','2','f','b','2','0','d','1','2','b','e','1','d','7','e','3','d','b']
hashList = list()

# Build every rotation of the 32 characters; each one is a candidate hash.
for charIndex in range(len(hashArray)):
	rotated = hashArray[charIndex:] + hashArray[:charIndex]
	hashList.append(''.join(rotated))

# Write one candidate per line and echo it to the screen.
with open('hashlist.hash', 'w') as f:
	for item in hashList:
		f.write(item + '\n')
		print(item)

We now have the following possibilities:

Since I am using a netbook instead of a crazy GPU rig, I decided to use hashcat, which does CPU cracking (the plus and lite variants are for GPU). I then downloaded the rockyou.txt wordlist from skullsecurity. Everything is now ready: I have my cracking tool, list of hashes, and wordlist. Since I am assuming it's MD5, I use the following hashcat command:

./hashcat-cli32.bin -m 0 -r rules/best64.rule hashlist.hash rockyou.txt

After about 30 seconds of running we get a hit!


A little anticlimactic but fun nonetheless. This was rather simple, some would say trivial, but the script to generate every rotation of a string of characters may be helpful to someone. Big thanks to KoreLogic for putting on the CMIYC contest and giving out shirts with challenges.
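For anyone curious what the attack amounts to, here is a minimal pure-Python sketch of a straight dictionary attack against all the rotations at once. It is far slower than hashcat and does not apply any rules such as best64.rule. The function names rotations and dictionary_attack are my own, chosen for illustration.

```python
import hashlib


def rotations(hash_string):
    """Return every rotation of hash_string, one per starting position."""
    return [hash_string[i:] + hash_string[:i] for i in range(len(hash_string))]


def dictionary_attack(candidate_hashes, words):
    """Return (word, hash) for the first word whose MD5 is a candidate."""
    targets = set(candidate_hashes)
    for word in words:
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        if digest in targets:
            return word, digest
    return None
```

Calling dictionary_attack(rotations(shirt_string), wordlist), where shirt_string is the 32 characters read off the shirt and wordlist is an iterable of candidate passwords, returns the matching word along with the rotation that is the real hash, or None if nothing in the list matches.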