WHAT'S NEW?
SITE OF THE WEEK
A funny thing happened on the way to the Internet last week. I picked up a nasty
bug. Some crazed geek created a virus that blocks the ability of a personal computer
to access the search engine Google.
Some geek designed it and I caught it. On a slow day I use Google twenty or
thirty times. On a busy day of research I may go there a hundred times. I use
Google like a phone book, like a dictionary, a research library and a photo archive.
For one horrific day I had to use those "other" search engines.
"You should use Dmoz instead," Norman O’Neil told me.
Norm is a Seacoast businessman who has been working with computers and software
for two decades. He and his partner Louise currently develop web applications
for companies under the name E-norm. Norm is one of those freethinking computer
types who are not afraid to do yoga and smoke cigarettes. He eats health food and
hamburgers. Like many of us, he has a healthy paranoia about anything that has
grown too big – like Microsoft, or AOL, or even Google. He’s always looking for
alternative approaches and he’s a big fan of "open source" software. Those are
the free programs co-developed by volunteer experts around the globe, not for
profit, but to get the most practical computer applications to the most people
for the least cost.
"I’m a heavy user and contributor to and believer in open source software," Norm
says. "Because I don’t want to be held hostage to repetitive annual licensing
fees. With open source there’s no cost for major upgrade and no danger of products
being discontinued."
Who needs the added cost of the latest Microsoft bloatware? he asks. It just
gets bigger, not better.
Norm likes to use PHP, the open source scripting language for web developers, an
alternative to static HTML pages and ColdFusion. He favors the MySQL database
software, the free Linux operating system and the open source Mozilla browser.
So when it comes to search engines, Norm likes the Open Directory Project (ODP).
This is a giant search engine created by real people. Google is so big that it
relies on computer algorithms to decide which web sites go to the top of the list,
and which go to the bottom. It is a complex formula based on link patterns
and crawling robots. ODP, by contrast, does not list the most popular sites, but
instead lists every site that its thousands of editors deem worthy.
Instead of searching on a single word, the user drills down through layers of
categories. Currently there are 16 main categories like Sports, Society, Shopping,
Science, Business. To find out what newspapers are in New Hampshire, for example,
I click on NEWS and then "Newspapers" and then "Regional" and then "United States"
and then "New Hampshire". In that process I cull the list from 3,587 choices down
to 37 New Hampshire papers. Searching on Google for "newspapers, New Hampshire"
turns up – surprisingly – not a single NH paper in the top 50 listings. Instead,
I get long lists of secondary-source web sites offering their own lists. Some
are good. Some are awful. Okay Norm, you made a point.
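For readers who like to see the idea in code, here is a minimal Python sketch of that drill-down. It assumes the directory is stored as nested categories with lists of sites at the bottom. The category tree, the paper names and the helper functions drill_down and count_sites are all made up for illustration; this is not how ODP actually stores its data.

```python
# Hypothetical miniature directory. Names and entries are illustrative only.
directory = {
    "News": {
        "Newspapers": {
            "Regional": {
                "United States": {
                    "New Hampshire": ["Example Daily", "Sample Weekly"],
                    "Maine": ["Placeholder Press"],
                },
            },
        },
    },
    "Sports": {},
}


def drill_down(tree, path):
    """Follow a list of category names and return whatever sits at the end."""
    node = tree
    for name in path:
        node = node[name]  # raises KeyError if the category does not exist
    return node


def count_sites(node):
    """Count every site listed at or below a node."""
    if isinstance(node, list):
        return len(node)
    return sum(count_sites(child) for child in node.values())


path = ["News", "Newspapers", "Regional", "United States", "New Hampshire"]
papers = drill_down(directory, path)
print(count_sites(directory["News"]), "choices narrowed to", len(papers))
```

Each click is one step along the path, and every step throws away the branches you did not choose. That is also why a wrong turn hides things: a site filed under a different branch never shows up.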
THE WEB SITE MAKERS
Dmoz is more about people using machinery than about machinery using people.
The grassroots search engine project first appeared in 1998 when Yahoo was winning
the search engine wars. Dmoz editors wanted to catalog every single site on the
Internet. That was the plan anyway, but the originators didn’t want to hire people
or spend money, or even make money. This was to be a utopian search engine of,
by and for the people. If the Internet is the new collective brain of all humanity,
Dmoz is the new Dewey Decimal System. At least that’s the plan.
Originally it was called GnuHoo. That name was changed to NewHoo, but Yahoo was
not happy with that. When Netscape offered to help, the still-fledgling directory
was hosted on Mozilla (we talked about that site here recently), the Netscape
open source project. The directory was located at "directory.mozilla.org",
so the site was renamed Dmoz.
Currently the Open Directory Project (ODP) has classified 3.8 million sites in
400,000 categories thanks to the volunteer effort of 58,832 editors. Norm points
out that while search engines are, by nature, arbitrary lists of web sites, the
ODP tries harder. Google ranks sites in large part based on the links that point
to them. If lots of other web sites link to a source site, Google’s robots assume
it must be doing something right, and move that site up in the rankings.
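The link-counting idea can be sketched in a few lines of Python. This is a deliberately crude illustration that simply counts how many other sites link to each site. It is not Google's actual formula, which is far more complex, and the site names and the function rank_by_inbound_links are hypothetical.

```python
from collections import Counter

# Hypothetical link graph: each site maps to the sites it links to.
links = {
    "a.example": ["c.example"],
    "b.example": ["c.example", "a.example"],
    "c.example": ["a.example"],
    "new.example": ["c.example"],
}


def rank_by_inbound_links(graph):
    """Order sites by how many other sites link to them, most-linked first."""
    inbound = Counter({site: 0 for site in graph})
    for source, targets in graph.items():
        for target in set(targets):  # count each linking site only once
            if target != source:     # ignore a site linking to itself
                inbound[target] += 1
    # Sort by link count (descending), then by name to break ties.
    return sorted(inbound.items(), key=lambda item: (-item[1], item[0]))


for site, count in rank_by_inbound_links(links):
    print(site, count)
```

Notice where "new.example" lands. It links out to others, but nobody links back, so a scheme like this leaves it at the bottom of the list.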
Those of us with high traffic sites love that rating system. But owners of new
sites with little traffic and few links find it is next to impossible to get into
the ranking game. On Dmoz, all they have to do is register. If the editors who
visit the site determine that it is what it claims to be, that new site is included
in the open directory. It then has a fair chance at showing up in a Dmoz search,
since the deeper the user searches, the more web sites bubble to the surface.
On Google, most of us look at the Top Ten list, pick something, and quit searching.
We find what we "think" we are looking for, Norm says – but do we really find
what we need? Those who get on Dmoz, at least, can find their way into the Google
ranking.
THE UPSHOT
I plan to stick with Google until they pry my cold dead fingers off my laser
mouse, but Norm’s point about using alternative sources is well made. Dmoz is
a great alternative. And Google knows this. In fact, Google has incorporated the
ODP site into its own listing. Just go to Google.com and click on "Directory"
for the Google-ized version of the open directory search engine.
"Google’s whole purpose in life is to categorize and prioritize information,"
Norm O’Neil says. "If they view something as a shopping experience, for example,
if they feel there is no value there, they dump it. Shopping is not their mantra."
"Google is good if you know what you want. It’s awesome for research and information,"
Norm explains. "Dmoz is good if you don’t know. Dmoz is counting on the fact that
people can intelligently navigate down and are willing to go down alphabetically.
Google says -- we are providing you what we think is relevant. Dmoz says we are
going to provide you with everything."
That’s the dream anyway. In order for Dmoz to truly fulfill its promise, it needs
a lot more editors to add a lot more sites. Norm says he signed up to be an editor
there, but never heard back. The ODP data is spotty at best. If you are shopping
for antiques, for example, there is a world of choices. If you are searching for
regional poets in New Hampshire, there is no such category. There are not even
any writers in the Granite State, nor any theaters, since no one has yet entered
that data into Dmoz. What you find depends on where you look. Drill down through
the wrong category, and you may miss important links.
I’m no stranger to the process. One of my web sites is a search engine that uses
a category system similar to ODP. I had a programmer build the database in Perl,
a now old-fashioned but powerful programming language. I created all 400 categories myself
and entered nearly 3,000 web sites. I decide what sites get in and under what
categories, so I have to anticipate what my users are looking for. It ain’t easy.
But like ODP, I put in every web site I can find – or every site that contacts
me. It sometimes takes months for me to catch up. Google has to do this for millions
and millions of web pages, so they use a robot. Robots are fast, but robots aren’t
people.
"Google is a haphazard search," Norm explains. "I’m not saying that’s bad. That’s
just the way it is."
Inevitably we find what we’re looking for, thanks to Google’s crawling robots,
but we can only find what the robots deem worthy. We’re letting them make decisions
for us.
The problem for the open source search engine is that we seem to like having
our decisions made for us. The more Google grows in popularity, the harder it
is to attract people to Dmoz. It is refreshingly noncommercial, but it takes more
than a single click and sometimes comes up empty.
"People like to be in the safety zone," Norm says, but he does not believe the
grassroots revolution is lost.
Millions of web sites are currently built in open source code. While most consumers
blindly accept the commercial software bundled onto their new computers, some
resist. Some sheep simply follow the Google shepherd. Others seek a more daring
path. For the moment, anyway, I’m with the flock.