Please note that this article is only a proof of concept. It is meant solely for studying how easily spammers can retrieve e-mail addresses using Google and a simple Python script.
For anyone who might be tempted to go further, remember that Christopher William Smith, often referred to as the king of spam, has recently been sentenced to 30 years in jail…
Well, enough discussion, let’s look at what we are really interested in, the code itself:
#!/usr/bin/python
# Python 2 script: it relies on urllib2 and the print statement.
import sys
import re
import urllib2

def StripTags(text):
    # Remove every <...> span, one tag per pass, until none is left.
    finished = 0
    while not finished:
        finished = 1
        start = text.find("<")
        if start >= 0:
            stop = text[start:].find(">")
            if stop >= 0:
                text = text[:start] + text[start+stop+1:]
                finished = 0
    return text

if len(sys.argv) != 2:
    print "\nrsx.py : Find hundreds of e-mail addresses on Google.\n"
    print "\nUsage : ./rsx.py <domain_name>\n"
    print "\nexample: ./rsx.py gmail.com \n"
    sys.exit(1)

domain_name = sys.argv[1]
# Escape the domain so that "." only matches a literal dot.
email_pattern = r'([\w\.\-]+@' + re.escape(domain_name) + ')'
d = {}  # dictionary used as a set: each address is stored only once

# First pass: Google Groups, 10 results per page.
page_counter = 0
try:
    while page_counter < 400:
        results = 'http://groups.google.com/groups?q='+str(domain_name)+'&hl=en&lr=&ie=UTF-8&start=' + repr(page_counter) + '&sa=N'
        request = urllib2.Request(results)
        request.add_header('User-Agent','Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)')
        opener = urllib2.build_opener()
        text = opener.open(request).read()
        emails = re.findall(email_pattern, StripTags(text))
        for email in emails:
            d[email] = 1
        page_counter = page_counter + 10
except IOError:
    print "No result found!"

# Second pass: Google web search for "@domain" (%40 is the URL-encoded "@").
page_counter_web = 0
try:
    print "\n\n+++++++++++++++++++++++++++++++++++++++++++++++++++++"
    print "+ Results:"
    print "+++++++++++++++++++++++++++++++++++++++++++++++++++++\n\n"
    while page_counter_web < 400:
        results_web = 'http://www.google.com/search?q=%40'+str(domain_name)+'&hl=en&lr=&ie=UTF-8&start=' + repr(page_counter_web) + '&sa=N'
        request_web = urllib2.Request(results_web)
        request_web.add_header('User-Agent','Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)')
        opener_web = urllib2.build_opener()
        text = opener_web.open(request_web).read()
        emails_web = re.findall(email_pattern, StripTags(text))
        for email_web in emails_web:
            d[email_web] = 1
        page_counter_web = page_counter_web + 10
except IOError:
    print "No results found!"

# Print every unique address collected by both passes.
for uniq_emails_web in d.keys():
    print uniq_emails_web
This command line program should be launched this way (assuming you named the file rsx.py):
python rsx.py gmail.com
It’s impressive to see how many e-mail addresses you can find with only 67 lines of code!
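To see the extraction step on its own, without sending any request to Google, here is a minimal sketch that applies the same idea (strip the tags, then match addresses for one domain) to a hard-coded HTML snippet. Unlike the script above, it is written for Python 3. The names `strip_tags` and `extract_emails`, the sample text and the addresses in it are assumptions made up for illustration. `re.escape` is used so that the dot in the domain only matches a literal dot.

```python
import re


def strip_tags(text):
    """Remove anything that looks like an HTML tag (<...>)."""
    return re.sub(r"<[^>]*>", "", text)


def extract_emails(html, domain_name):
    """Return the sorted, de-duplicated addresses at domain_name found in html."""
    # re.escape keeps "example.com" from also matching "exampleXcom".
    pattern = r"[\w.\-]+@" + re.escape(domain_name)
    return sorted(set(re.findall(pattern, strip_tags(html))))


# Hypothetical sample page; the addresses are invented for this example.
sample = (
    "<p>Write to <b>john.doe</b>@example.com or jane@example.com.</p>\n"
    "<p>jane@example.com again, but not bob@exampleXcom here.</p>"
)

print(extract_emails(sample, "example.com"))
# ['jane@example.com', 'john.doe@example.com']
```

Note that removing the tags first is what lets the pattern see `john.doe@example.com`, which the markup splits in two, and that the set removes the duplicate address just as the dictionary does in the script.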






My name is Jean-Baptiste Jung and I'm the man behind Cats Who Code. I started using the Internet back in 1998 and began creating websites three years later, in 2001.
10 Comments
I do this with awk in 26 lines…
@KINGSPAMMER: Interesting, any example to share with us?
Impressive, but I hope this will not be used for spam.
not working
File “rsx.py”, line 16
stop = text[start:].find(.>.)
@internet: Which Python version do you use?
Let's see if it works
Don’t you have something I can just type into Google?
Google doesn’t seem to take the “@”-sign. So I cannot search for e-mail-addresses! (Is this for spam-prevention?)
Why I want this:
I found a site on the net where people can publish texts anonymously, but some type their e-mail addresses in the text so that you can answer them. What I wanted to do is search the site (using Google’s “site:” operator) for texts where this is the case, and therefore search for any e-mail addresses.
Is there a way or would that already be illegal?
Sorry for being stupid. You seem to be really clever and know about things!
Pretty amazing how such simple code can do so much damage! jk jk
You should look at installing commentluv or disqus on your blog. This will make it dofollow and hence you will gain more traffic.
This sounds interesting. And only with 67 lines! Quick question: will this script be applicable even to older versions of Python?
@KINGSPAMMER: Would you mind sharing how you did it with only 26 lines?