A Lot Of The Web Is Not There

Over the last few weeks, WhoisIreland.com has been checking all websites in .com/net/org/info/biz/ie. The initial results are quite surprising. A lot of the web just is not there. Or, to be more precise, it is “coming soon”. These “coming soon” websites tend to share the same IP address. In some cases, the IP of one of these “coming soon” websites can have millions of associated websites. The smaller ones have thousands.

Another interesting aspect is that the number of distinct IPs is far smaller than first expected. With approximately 54 million domains in com/net/org/biz/info/ie, the number of distinct IPs of the associated websites is probably less than a tenth of that number.
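
As a rough illustration of the kind of tally involved, the sketch below counts how many domains resolve to each IP and compares the number of distinct IPs with the number of domains. It is not the code used for the survey: the function name summarise_hosting, the sample domains and the IP addresses (taken from the reserved documentation ranges) are all assumptions made for the example.

```python
from collections import Counter


def summarise_hosting(domain_to_ip):
    """Summarise how many domains share each IP (hypothetical helper)."""
    # Count how many domains point at each IP address.
    domains_per_ip = Counter(domain_to_ip.values())
    total_domains = len(domain_to_ip)
    distinct_ips = len(domains_per_ip)
    return {
        "total_domains": total_domains,
        "distinct_ips": distinct_ips,
        # Fraction of distinct IPs relative to domains; low means heavy sharing.
        "ip_to_domain_ratio": distinct_ips / total_domains if total_domains else 0.0,
        # The IPs hosting the most domains, e.g. large "coming soon" parking pages.
        "busiest_ips": domains_per_ip.most_common(3),
    }


# Made-up sample data: four domains parked on one IP, one hosted elsewhere.
sample = {
    "example-one.com": "192.0.2.10",
    "example-two.net": "192.0.2.10",
    "example-three.org": "192.0.2.10",
    "example-four.ie": "198.51.100.7",
    "example-five.biz": "192.0.2.10",
}

summary = summarise_hosting(sample)
print(summary)
```

On real data the mapping would come from resolving each domain, and the busiest IPs would be the natural place to look for shared “coming soon” pages.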

The hard part of the work begins now: crunching this data to provide usable results. The Ghosthunter Algorithm should show up a lot of the hidden Irish websites. These are websites on servers outside of Irish IP space and identified Irish hosters. A simple test detected two Irish hosters (IEDR resellers) who use UK and US nameservers and IPs. The algorithm should provide a superior Irish search engine index, and it can also be applied to other countries. It uses a number of elements to go beyond the simplistic IP/ccTLD based categorisation that Google/Yahoo/Microsoft use to generate their country level search indices.
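
The Ghosthunter Algorithm itself is not described here, so the sketch below only illustrates the simplistic IP/ccTLD baseline it is meant to improve on, and why that baseline misses hidden websites. The function name baseline_is_irish, the sample domains and the network list are assumptions; the "Irish" network is a reserved documentation range standing in for real Irish IP space.

```python
import ipaddress

# Hypothetical stand-in for Irish IP space (a reserved documentation range).
IRISH_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def baseline_is_irish(domain, ip):
    """Simplistic rule: a .ie domain, or an IP inside a listed Irish network."""
    if domain.lower().rstrip(".").endswith(".ie"):
        return True
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in IRISH_NETWORKS)


# A .ie domain is classed as Irish wherever it is hosted.
ie_domain = baseline_is_irish("example.ie", "203.0.113.5")

# A .com domain inside the listed network is classed as Irish by its IP.
irish_ip = baseline_is_irish("example.com", "192.0.2.9")

# A .com domain hosted outside the listed network is missed by this rule,
# even if it is in fact an Irish website: the "hidden" case.
hidden_site = baseline_is_irish("irish-shop.example.com", "203.0.113.5")

print(ie_domain, irish_ip, hidden_site)
```

The third case is the one the baseline gets wrong; catching it needs extra signals beyond the IP address and the TLD.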


Written by John McCormac on August 22nd, 2005 with comments disabled.
