July 16th, 2005
You are currently browsing the articles from WhoisIreland Review written on July 16th, 2005.
Local search is more than just matching websites to a location. A few years ago, I did a lot of research on local search, both theorising and building experimental local search engines. One of them was a mobile-phone-based search engine. It was a bit more advanced than a simple SMS-based query interface in that it took the user’s location into account when generating the results.
The operator of the searchtheowl website quoted an entire Google Labs newsgroup post of mine from around that time in a post on his blog, which shows how badly understood “local search” is, even today.
Localised and local search is more than just stuffing a pile of URLs into a database and claiming that they are local because they are in the same country or even in the same county. The problem with local search is that the user wants to know which websites or resources are “near” them. The definition of the term “near” is at the heart of local search.
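One possible reading of “near” is straight-line distance from the user. The sketch below is only an illustration of that reading, not the engine described above: it assumes each resource already has a latitude and longitude, and the function names (`haversine_km`, `nearby`) and the radius cut-off are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(user, resources, radius_km):
    """Return names of resources within radius_km of the user, closest first.

    user is a (lat, lon) pair; resources is a list of (name, lat, lon) tuples.
    Treating "near" as a fixed radius is an assumption made for this sketch.
    """
    hits = []
    for name, lat, lon in resources:
        distance = haversine_km(user[0], user[1], lat, lon)
        if distance <= radius_km:
            hits.append((distance, name))
    return [name for _, name in sorted(hits)]
```

A fixed radius is the crudest choice. The same structure would work with travel time or a per-category radius in place of `radius_km`.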
Tags: Irishblogs - Search - Local Search - Irish Search Engines
Written by John McCormac on July 16th, 2005 with comments disabled.
Read more articles on Search Engines.
Steve Rubel’s Micro Persuasion blog came across Yahoo’s test RSS search and posted images of it in this post. Yahoo has since pulled the test site, but it is interesting that Yahoo is taking RSS search so seriously.
The large search engines (Google, Yahoo, MSN) have been slowly evaluating whether to add blog search to their indices. Some, like Google, have incorporated a lot of blogs in their live index. Yahoo and MSN have also been busy. This Business Week article outlines some of the background.
Tags: Irishblogs - Search - Search Engines
Written by John McCormac on July 16th, 2005 with comments disabled.
One of the current projects at WhoisIreland.com is building a country-based map of the web. The main datasets are the com/net/org/biz/info gTLDs. Essentially, it means building an IP-based map of the websites associated with approximately 54 million domains.
From a purely computational viewpoint, it cannot be thought about in numerical terms. It has to be thought about in abstract mathematical terms. The small gTLDs such as .info and .biz only take a few days to map. But the larger gTLDs take longer.
This is the approximate size of the problem:
.com 39.5 Million domain names.
.net 6.1 Million domain names.
.org 3.7 Million domain names.
.info 3.64 Million domain names.
.biz 1.2 Million domain names.
On any day, there can be upwards of 450K new domains and 450K deleted domains. The utilisation for the gTLDs is around 70%. That means that approximately 70% of the domains are active. But that figure drops once the on-hold and parked domains are removed from the dataset. The .info gTLD was artificially inflated by the addition of “free” .info domains to owners of existing .com domains. In real terms, it is only slightly bigger than the .biz gTLD.
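To put those figures side by side, here is a small back-of-the-envelope calculation. It uses only the approximate numbers quoted above. The variable names are illustrative, and the 70% utilisation is applied uniformly across all five gTLDs, which is a simplification.

```python
# Approximate zone sizes quoted above, in domain names.
DOMAINS = {
    ".com": 39_500_000,
    ".net": 6_100_000,
    ".org": 3_700_000,
    ".info": 3_640_000,
    ".biz": 1_200_000,
}
UTILISATION_PERCENT = 70   # "around 70%", applied uniformly (assumption)
DAILY_NEW = 450_000        # "upwards of 450K" new domains on a busy day

total = sum(DOMAINS.values())
active = total * UTILISATION_PERCENT // 100
churn_percent = 100 * DAILY_NEW / total

print(f"total domains:  {total:,}")
print(f"active at 70%:  {active:,}")
print(f"new per day:    {churn_percent:.2f}% of the dataset")
for tld, count in sorted(DOMAINS.items(), key=lambda item: -item[1]):
    print(f"{tld:6} {100 * count / total:5.1f}% of the total")
```

The total comes to just over 54 million, in line with the figure given earlier, and a busy day of new registrations is a little under 1% of the whole dataset.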
Many of these domains, perhaps as many as 30%, are speculative and are either parked on a hosting company’s servers or are not properly set up. There is no integrity checking in these gTLDs, so it is not unusual to see a nameserver on an IP that does not exist. Microscopic country code TLDs like .ie have integrity checks built into the system: they check to see if the nameservers are answering for the domain prior to including the domain in the ccTLD zonefile. However, the size of the gTLDs makes such prior checking impractical.
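The kind of pre-inclusion check described above could be sketched roughly as follows. This is not the .ie registry's actual procedure. The function name `filter_delegations` is hypothetical, the DNS query itself is left to a caller-supplied `answers_for(nameserver, domain)` function, and requiring every listed nameserver to answer is an assumption made for the example.

```python
def filter_delegations(delegations, answers_for):
    """Split delegations into those that pass the check and those that fail.

    delegations maps a domain name to its list of nameserver hostnames.
    answers_for(nameserver, domain) must return True when that nameserver
    responds for the domain; how it performs the query is up to the caller.
    """
    accepted, rejected = {}, {}
    for domain, nameservers in delegations.items():
        # Assumption: a domain passes only if it has nameservers and all answer.
        if nameservers and all(answers_for(ns, domain) for ns in nameservers):
            accepted[domain] = nameservers
        else:
            rejected[domain] = nameservers
    return accepted, rejected
```

Each check costs at least one DNS query per nameserver, which is manageable for a small ccTLD but adds up quickly across tens of millions of gTLD domains.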
So what can be done with all this data? The obvious answer is that it can provide a good starting point for the “Ghosthunter Algorithm” mentioned previously. It also provides raw search engine indices for every country with a presence on the net.
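As a rough illustration of what an IP-based country map involves, the sketch below groups domains by the country of the address they resolve to. It is not the WhoisIreland code. The allocation table uses reserved documentation address ranges with made-up country assignments, and `resolve` is a caller-supplied lookup function, so every name here is hypothetical.

```python
import ipaddress
from collections import defaultdict

# Hypothetical network-to-country table. The ranges are reserved for
# documentation and the country codes are invented for this example;
# a real map would load registry allocation data instead.
ALLOCATIONS = [
    (ipaddress.ip_network("192.0.2.0/24"), "IE"),
    (ipaddress.ip_network("198.51.100.0/24"), "US"),
    (ipaddress.ip_network("203.0.113.0/24"), "DE"),
]


def country_for_ip(ip_text):
    """Return the country code for an IP address, or None if unallocated."""
    ip = ipaddress.ip_address(ip_text)
    for network, country in ALLOCATIONS:
        if ip in network:
            return country
    return None


def build_country_map(domains, resolve):
    """Group domains by the country of the IP their website resolves to.

    resolve(domain) returns an IP address string, or None when the domain
    does not resolve (parked, on hold or badly configured domains).
    """
    by_country = defaultdict(list)
    for domain in domains:
        ip_text = resolve(domain)
        if ip_text is None:
            by_country["unresolved"].append(domain)
            continue
        by_country[country_for_ip(ip_text) or "unknown"].append(domain)
    return dict(by_country)
```

The linear scan over `ALLOCATIONS` is fine for three entries. With a full allocation table and tens of millions of domains, a sorted list with binary search or a prefix tree would be the more realistic choice.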
Tags: Irishblogs - Search - Irish Search Engines
Written by John McCormac on July 16th, 2005 with 2 comments.
Read more articles on Domains And Statistics.