Allan Cho over at Allan’s Library links to the announcement that semantic search engine site Hakia is seeking submissions of credible websites and sees it both as an opportunity and a cause for concern. Hakia is asking -

Librarians and information professionals can suggest URLs leading to the most credible Websites on a given topic. hakia will process the sites with its proprietary QDEX (Query Detection and Extraction) technology and make them available to Web searchers in credibility-stamped search results.

The press release has been written up over at Mashable and ReadWrite Web. Quite rightly, there are concerns that a company expects us to volunteer for this type of project (though the press release notes, “Each month hakia will give away thank-you prizes, ranging from a book donation to two conference grants, to participants”). Of greater concern, which ReadWrite Web hints at, are the criteria for what constitutes a credible website -

Hakia asks submitters for their professional credentials, but it is not clear if the company will actually check these.

Hakia uses a very strict definition for what makes a site credible. To be included in the index, a site should have gone through a peer review process, not have any commercial bias, and the information should be current. The fact that Hakia insists on only adding peer reviewed sites should greatly enhance the signal-to-noise ratio of the search results.

The Hakia blog outlines in greater detail what they consider a credible website. If the Semantic Web is going to be a trustworthy web (i.e., one where meaningful data, links and information can be trusted), the criteria we use to create that trust must be robust. Web credibility is an area I’ve begun researching very recently. As I posted in my query to AIR-L, there is a need to be able to inform people about criteria that go beyond the surface of ownership, date last updated, and presence of peer review. We need criteria that will help individuals critically evaluate reputation, depth, and quality of peer review.

One of the criteria Hakia is using is -

Source authenticity. The publisher (preferably) should be the owner/producer of the content

What becomes of repositories, data archives, statistical websites, etc.? The material I archive at DLIST isn’t owned by DLIST, but I implicitly give DLIST a license to host and reproduce it. Another criterion is currency, but not everything old is irrelevant, especially when considering research, statements, archives, and government information.

Credibility means different things to different groups, interests, and people. But that doesn’t mean we shouldn’t work towards providing tools and metrics for people to evaluate websites for themselves. I remember a plugin project of a few years ago where you could install a toolbar in your browser and, when you visited a site that you thought was credible, you clicked a button. That project no longer exists, not only because it was time consuming, but because everyone approaches information in different ways. We can agree that there are certain sites that are obviously not credible (e.g., the oft-used teaching example, DHMO.org), but the challenge begins when sites are less obviously biased.

This is an area I will be continuing to research. There is a great deal of information available already on approaches to web credibility; here’s a tiny sample of the work being done -

Cooke, A. (2001). A guide to finding quality information on the Internet: selection and evaluation strategies. London: Library Association.

Eastin, M. S. (2007). Toward a Cognitive Developmental Approach to Youth Perceptions of Credibility. The John D. and Catherine T. MacArthur Foundation Series on Digital Media and Learning, 29-47.

IFLA/FAIFE. (2008). Internet Manifesto Workshop Manual.

Lankes, R. D. (2007). Trusting the Internet: New Approaches to Credibility Tools. The John D. and Catherine T. MacArthur Foundation Series on Digital Media and Learning, 101-121.

Credibility projects:

Stanford University Web Credibility research

Credibility and Digital Media @UCSB

Originally published on the semanticlibrary.net blog