August 2005

You are currently browsing the articles from WhoisIreland Review written in the month of August 2005.

Finding The Missing Irish Websites

Over the last few years, the Irish web has changed drastically. Most of it had been hosted outside of Ireland due to the extortionate fees charged by Irish ISPs to host websites. It was cheaper to host Irish websites in the UK and the US. But with the growth of the Irish hosting industry, it is now easier and cheaper than it was to host a website in Ireland. The Irish ISPs now account for only 25% of the Irish website business and that share continues to fall. Irish Hosting Service Providers (HSPs) account for the rest. But there is still a section of the Irish web that is hosted on US and UK servers.

The reasons for this are many. The traditionally high cost of hosting in Ireland is a factor, but one of the more important reasons is that these websites are hosted abroad because of the web developers. The Irish hosting business is a curious mix of dedicated HSPs like Hosting365, Novara and Blacknightsolutions.com and web development companies with a lot of clients.

The HSPs are easy to categorise, but many of the web development companies still use legacy solutions and host their clients in the US or UK on shared or dedicated hosting. WhoisIreland.com tracks the hosting patterns of approximately 970 Irish hosters. These are the easily identified Irish hosters, and the statistics on these hosters form part of the Irish hosting industry reports that WhoisIreland.com provides each month. But there is a part of the Irish web that has been, up until now, difficult to track - the missing Irish web.

For the Irish web, that missing section hosted in the UK, US and elsewhere could be as much as 15% of the overall Irish web. That’s potentially thousands of Irish websites that are missing from the “pages from Ireland” searches of the large search engines.

To date, the large search engines like Google, Yahoo and MSN have not solved this problem of the missing web. WhoisIreland.com has come a lot closer to solving it. The Ghosthunter algorithm developed to detect these missing hosters has identified 2979 potential Irish micro hosters on US IP space and 1814 potential Irish hosters on UK IP space. In real terms, these figures will drop as the different levels of the algorithm are applied.

Three significant Irish web development companies on UK IP space were identified with 100% accuracy on the first run along with the websites hosted by one of IEDR’s resellers. The size of these Irish micro-hosters varies from a couple of sites hosted to over a hundred sites hosted.


Written by John McCormac on August 25th, 2005 with comments disabled.
Read more articles on Search Engines.

Irish Times Blog Listings?

The Irish Times has had a bumpy relationship with the web. Its coverage of technology during the dot.bomb era ran the gamut from utterly clueless to downright wrong, with the odd detour to accurate reporting. The web really does not know what goes on in that walled garden of a site that is the Irish Times. Apparently it now has an embryonic blog listing, though it refers to them as “weblogs” rather than the more common term “blogs”. The range of blogs is limited and is more like a short list of the blogs that the Irish Times people read rather than a comprehensive list of blogs - Irish or otherwise. The Irish Times blog directory was mentioned here by Dave O’Neill on his blog.

The Irish Times’ technology section is Pay Per View. Most people on the Irish web don’t pay to read the IT’s technology section’s OpEds, blatant product pimping and, unfortunately, few news items. Like the technology sections in most newspapers since the dot.bomb era, the IT’s technology section has shrunk too. Since web events and business are now day-to-day news topics, the idea of a dedicated technology section is perhaps harder to defend.

Though if the Irish Times did add a weekly blog listing to its Friday “Technology In Business” [1] section, it would, in a strange way, be returning to its roots. The IT’s Computimes section, a computer/net/comms section that preceded the “Technology In Business” section, used to have a listing of websites, bulletin board systems (BBSes) and interesting links. The Computimes section was respected and considered clueful.

So could the Irish Times be starting its own blog listings? Stranger things have happened - the Sunday Tribune has its own little column on the Irish blogosphere. Though Irishblogs.ie and planetoftheblogs.com are far more useful and comprehensive listings of Irish blogs.

[1]: Apparently I confused the title of the Irish Times “Technology In Business” section with the Sunday Business Post’s “Computers In Business” supplement. Sometimes it is hard to tell the difference between them, but ironically the error was pointed out by John Collins, former editor of consumer PC publication PCLive, who is now an Ireland.com employee and freelance contributor to the Irish Times technology section.


Written by John McCormac on August 23rd, 2005 with 1 comment.
Read more articles on Tech Commentary.

A Lot Of The Web Is Not There

Over the last few weeks, WhoisIreland.com has been checking all websites in .com/net/org/info/biz/ie. The initial results are quite surprising. A lot of the web just is not there. Or to be more precise, it is coming soon. These “coming soon” websites tend to share the same IP. In some cases the IP of one of these “coming soon” websites can have millions of associated websites. The smaller ones have thousands.

Another interesting aspect is that the number of distinct IPs is far smaller than first expected. With approximately 54 million domains in com/net/org/biz/info/ie, the number of distinct IPs of the associated websites is probably less than a tenth of that number.
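The grouping described above can be sketched in a few lines of Python. This is only an illustration and not the code WhoisIreland.com runs: the function names, the threshold and the sample data (which uses documentation-only IP ranges) are all assumptions made for the example.

```python
from collections import defaultdict


def group_by_ip(domain_ips):
    """Map each IP address to the sorted list of domains whose website uses it."""
    groups = defaultdict(list)
    for domain, ip in domain_ips.items():
        groups[ip].append(domain)
    return {ip: sorted(domains) for ip, domains in groups.items()}


def crowded_ips(groups, threshold):
    """Return (ip, domain_count) pairs for IPs with at least `threshold` domains,
    largest first. Heavily shared IPs are candidates for "coming soon" pages."""
    crowded = [
        (ip, len(domains))
        for ip, domains in groups.items()
        if len(domains) >= threshold
    ]
    return sorted(crowded, key=lambda item: (-item[1], item[0]))


# Made-up sample data: hypothetical domains on documentation-only IP ranges.
sample = {
    "example-one.com": "192.0.2.10",
    "example-two.net": "192.0.2.10",
    "example-three.org": "192.0.2.10",
    "example-four.ie": "198.51.100.7",
    "example-five.biz": "203.0.113.5",
}

groups = group_by_ip(sample)
# Ratio of distinct IPs to domains; the post estimates under a tenth for the real data.
distinct_ratio = len(groups) / len(sample)
# The threshold of 3 is arbitrary and only suits this tiny sample.
parking_candidates = crowded_ips(groups, threshold=3)
```

On real data the threshold would have to be far higher, since shared hosting also puts many live websites on one IP. A shared IP alone does not prove a site is "coming soon".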

The hard part of the work begins now - crunching this data to provide usable results. The Ghosthunter Algorithm should show up a lot of the hidden Irish websites. These are websites on servers outside of Irish IP space and outside the identified Irish hosters. A simple test detected two Irish hosters (IEDR resellers) who use UK and US nameservers and IPs. The algorithm should provide a superior Irish search engine index, and it can be applied to other countries. It uses a number of elements to go beyond the simplistic IP/ccTLD based categorisation that Google/Yahoo/Microsoft use to generate their country level search indices.
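To see why IP/ccTLD categorisation misses part of the Irish web, here is a rough Python sketch of that simplistic approach. It is not the Ghosthunter Algorithm, whose elements are not described here, and it is not any search engine's actual code. The network list, the function name and the sample sites are hypothetical, and the IP ranges are documentation-only ranges rather than real Irish allocations.

```python
import ipaddress

# Hypothetical stand-in for "Irish IP space" (a documentation range, not a real allocation).
IRISH_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def naive_is_irish(domain, ip):
    """Classify a site as Irish if it uses the .ie ccTLD or sits on an 'Irish' IP."""
    if domain.lower().endswith(".ie"):
        return True
    address = ipaddress.ip_address(ip)
    return any(address in network for network in IRISH_NETWORKS)


# Made-up sites: (domain, IP address of the website).
sites = [
    ("example-shop.ie", "198.51.100.20"),   # .ie hosted abroad: caught by the ccTLD test
    ("example-hotel.com", "192.0.2.44"),    # .com on "Irish" IP space: caught by the IP test
    ("example-pub.com", "198.51.100.21"),   # .com hosted abroad: missed by both tests
]

# Sites that an IP/ccTLD-only index would leave out.
missed = [domain for domain, ip in sites if not naive_is_irish(domain, ip)]
```

The third site is the kind that ends up in the missing Irish web: an Irish business on a gTLD, hosted on foreign IP space, with nothing in its domain or IP to mark it as Irish.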


Written by John McCormac on August 22nd, 2005 with comments disabled.
Read more articles on Domains And Statistics.

Yahoo And Google Argue Over Search Index Size

It seems that Yahoo and Google have different views as to which has the bigger search index. A post on the Yahoo Search blog announced that Yahoo’s index had grown to 19.2 billion web documents. The New York Times quoted Sergey Brin of Google as saying: “The comprehensiveness of any search engine should be measured by real Web pages that can be returned in response to real search queries and verified to be unique.” In the same article, he was quoted as saying: “We [Google] report the total index size of Google based on this approach.” Google’s web page count currently stands at 8,168,684,336.

A brief study at the US National Center for Supercomputing Applications put the claims to the test. The study was based on approximating the sizes of the indices, but it does express doubt over Yahoo’s claims.

With hundreds of thousands of domains and websites being deleted globally each day and hundreds of thousands of new domains and websites being created, it would only be possible to give an approximation as to the size of the web.


Written by John McCormac on August 16th, 2005 with comments disabled.
Read more articles on Search Engines.

Deadwood In The DNS

The differences between the zonefiles and the active domains on a nameserver can show more than just badly set up domains. They can show domains that have not been paid for by their owners. While these domains still appear in the zone file, the nameservers referred to in the zonefiles do not provide any data for these domains. The effect is that these domains do not exist on the net. But the reasons behind this can be interesting.

One of the tactics that some hosters use when a domain or hosting fee has not been paid is to pull the domain data from the nameservers. It is the old world model of suspending service until the bill has been paid. It also breaks the model of how the Domain Name System should work.

The August statistics for the Irish hosting business showed that one ISP had a major problem with dead domains. Approximately 30% of the domains it hosted are dead for one reason or another. The dead domains percentage can vary considerably. Sometimes domains will be registered but never set up. They will be in the zonefiles but the nameservers will not provide any data. This happens a lot with speculative domain registration. But given the way that the ISP section of the Irish hosting business is continuing to lose market share, the chances are that many of these dead domains are, to use a rather bad cliche, the roadkill of the information superhighway.
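A dead domains percentage of this kind comes down to a set difference between the domains listed in the zonefile for a hoster's nameservers and the domains those nameservers actually answer for. The Python sketch below assumes both lists have already been collected. The function name and the sample domains are hypothetical, and this is not necessarily how WhoisIreland.com computes its statistics.

```python
def dead_domain_report(zonefile_domains, answering_domains):
    """Return (sorted dead domains, dead percentage) for one hoster.

    A domain counts as dead when it is in the zonefile but the hoster's
    nameservers return no data for it.
    """
    in_zone = set(zonefile_domains)
    dead = in_zone - set(answering_domains)
    percentage = 100.0 * len(dead) / len(in_zone) if in_zone else 0.0
    return sorted(dead), percentage


# Made-up data: ten domains delegated to one hoster, seven of which resolve.
zone = [f"site{n:02d}.example" for n in range(1, 11)]
answering = zone[:7]

dead, dead_pct = dead_domain_report(zone, answering)
```

The percentage says nothing about why a domain is dead. Unpaid bills, domains registered but never set up, and plain misconfiguration all look the same from the outside.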


Written by John McCormac on August 15th, 2005 with comments disabled.
Read more articles on Domains And Statistics.