Hi. I have a list of the domain portion of URLs which looks a bit like this:
fycnds.digitalpoimt.com
wvgpzdea.digitalpoimt.com
zhnsht.digitalpoimt.com
frigo25.php5.cz
handrovina.php5.cz
blabota.php5.cz
pctuzing.php5.cz
viagraviagra.php5.cz
poiu.php5.cz
flasa.php5.cz
yoy4.digitalpoimt.com
hskly.digitalpoimt.com
2i0wjwbc.digitalpoimt.com
harnhjc.digitalpoimt.com
gqru.digitalpoimt.com
I need some code that determines which portion of each of these hostnames is a whois-able domain name. This doesn't seem all that simple to do: some countries register names under a second level of suffixes (.co.uk, for example), and some do not.
Does anyone know of a Python library, or failing that, a simple algorithm, which will do this for me?
(For those left wondering, I am trying to do some analysis of the spam I get on this blog, and for that I want to know if the whois information for a domain that left a suspect comment indicates anything suspicious.)
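One possible approach, as a minimal sketch rather than a finished answer: keep a set of known registry suffixes, find the longest suffix that matches the end of the hostname, and keep that suffix plus one more label to its left. The suffix set below (`SAMPLE_SUFFIXES`) and the function name `registrable_domain` are my own illustrative assumptions. A real run would need a complete suffix list, such as the Public Suffix List published at publicsuffix.org, and this sketch ignores the wildcard and exception rules that list contains.

```python
# Hypothetical, hand-picked sample of registry suffixes for illustration only.
# A real run would load a complete list (e.g. the Public Suffix List).
SAMPLE_SUFFIXES = {"com", "cz", "uk", "co.uk", "org.uk", "au", "com.au"}


def registrable_domain(hostname, suffixes=SAMPLE_SUFFIXES):
    """Return the matching suffix plus one label, or None if nothing fits."""
    labels = hostname.lower().rstrip(".").split(".")
    # Walk from the longest candidate suffix to the shortest, so that
    # "co.uk" is preferred over "uk" when both are in the set.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            if i == 0:
                return None  # the hostname is itself a bare suffix
            return ".".join(labels[i - 1:])
    return None  # no known suffix matched


if __name__ == "__main__":
    for host in ["fycnds.digitalpoimt.com", "frigo25.php5.cz", "www.example.co.uk"]:
        print(host, "->", registrable_domain(host))
```

With the sample set above, every hostname in my list collapses to either digitalpoimt.com or php5.cz, which are the names I would then pass to whois.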