Microsoft Hackcon CTF - Challenge 8 - Email Harvesting

The question was to find all the emails on the given site. The initial page of the site showed just one link, leading to another page. That page contained 2 links leading to 2 different pages. From that point onwards, each page showed 2 links to 2 other pages, and so on. Each page at the 11th level showed one email id.

From this structure, we could deduce that the arrangement is similar to a binary tree. Since each page at the 11th level showed an email address, there would be 2^10, i.e., 1024 email ids. So we needed to design a version of a binary tree traversal algorithm to get all the email ids.
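The count is easy to check with a little arithmetic. This is only a sketch, assuming the single link on the start page leads to the root of the tree (level 1) and that every page above level 11 has exactly two links; `pages_at_level` is a made-up helper name.

```python
def pages_at_level(level, branching=2):
    """Number of pages at a level, assuming level 1 holds a single page."""
    return branching ** (level - 1)

# Email pages sit at level 11 under the assumption above.
print(pages_at_level(11))

# Pages in the tree itself (levels 1 to 11), not counting the start page.
print(sum(pages_at_level(level) for level in range(1, 12)))
```

Under those assumptions, a crawler has to request every page in the tree plus the start page to collect all the emails.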

So our objective is as follows:

Send a request
Analyse the response
If it is an email, append it to a file
If it contains links, push them onto a stack
Pop the stack, and repeat these tasks until the stack is empty
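
The steps above can be tried offline on a toy model before pointing anything at the real site. In this sketch the site is just a dictionary, a page is either a list of links or an email string, and the names `harvest` and `toy_site` as well as the addresses are invented for illustration.

```python
def harvest(site, start="/"):
    """Depth-first walk over `site`, a dict mapping a page to its contents.

    A page's contents is either a list of linked pages or an email string.
    """
    emails = []
    stack = [start]
    while stack:
        page = stack.pop()
        content = site[page]          # stands in for "send a request"
        if isinstance(content, str):  # the page shows an email
            emails.append(content)
        else:                         # the page shows links
            stack.extend(content)
    return emails

# Tiny made-up site: one entry link, then a small binary tree.
toy_site = {
    "/": ["/a"],
    "/a": ["/b", "/c"],
    "/b": "one@example.com",
    "/c": "two@example.com",
}
print(harvest(toy_site))
```

Because a stack is last-in, first-out, the walk goes depth-first, and the emails come out in a different order than the links appear on the pages. For this challenge the order does not matter, only that every page gets visited.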

I chose Python to write the code for this. The Requests module was used for sending requests and receiving responses. To parse the HTML response, I used a module called Beautiful Soup. Python just rocks, you know.

 
import requests
import BeautifulSoup
link = "http://hackcon14.cloudapp.net:8080"
linkstack = ['/']

def checkIfEmail(data):
    # Every email on the site ends with the contest domain.
    return data.endswith('@hackcon.com')

def emailharvest(link):
    # Fetch the page and look at every anchor tag on it.
    r = requests.get(link)
    bs = BeautifulSoup.BeautifulSoup(r.text)
    for a in bs.findAll('a'):
        # The anchor text is either an email address or a page name.
        text = a.contents[0].encode('ascii', 'ignore')
        if checkIfEmail(text):
            print "Found an email : " + text
            # 'with' closes the file after each append.
            with open('emails.txt', 'a') as f:
                f.write(text + '\n')
        else:
            print "Found a link : /Pages/" + text
            linkstack.append('/Pages/' + text)

def doit():
    # Keep visiting pages until there is nothing left on the stack.
    while linkstack:
        linkpart = linkstack.pop()
        print 'Going to visit ' + linkpart
        emailharvest(link + linkpart)

doit()
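
The link-versus-email decision in emailharvest can be tried without touching the network. Here is a rough Python 3 sketch that uses the standard library's html.parser in place of Beautiful Soup, so it runs without extra packages. The sample HTML, the address in it, and the names `AnchorText` and `classify` are invented for illustration; the real pages may be laid out differently.

```python
from html.parser import HTMLParser

class AnchorText(HTMLParser):
    """Collects the text inside every <a> tag."""

    def __init__(self):
        super().__init__()
        self.in_anchor = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_anchor = True
            self.texts.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_anchor = False

    def handle_data(self, data):
        # Text may arrive in pieces, so append to the current anchor.
        if self.in_anchor:
            self.texts[-1] += data

def classify(html):
    """Split anchor texts into (emails, links), mirroring checkIfEmail."""
    parser = AnchorText()
    parser.feed(html)
    parser.close()
    emails = [t for t in parser.texts if t.endswith("@hackcon.com")]
    links = ["/Pages/" + t for t in parser.texts
             if not t.endswith("@hackcon.com")]
    return emails, links

# Made-up page with one link and one email.
sample = '<a href="x">page1</a><a href="y">someone@hackcon.com</a>'
print(classify(sample))
```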

Well, that did the trick: it gave me 1024 emails in a text file. I uploaded it and voila, 80 points to my team, xbios.
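
One thing worth checking before uploading: the script opens emails.txt in append mode, so running it twice would leave duplicate lines in the file. A small Python 3 sketch of a sanity check follows; `count_emails` is a made-up helper name.

```python
def count_emails(path):
    """Return (total, unique) counts of the non-empty lines in a file."""
    with open(path) as f:
        lines = [line.strip() for line in f if line.strip()]
    return len(lines), len(set(lines))
```

After a single complete run, I would expect count_emails('emails.txt') to report 1024 for both numbers, assuming no address appears on more than one page.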

Crave To Code

Saturday, January 25, 2014
