Web sites and applications burst on the scene out of nowhere, attract massive usage and undergo continual improvement. We wonder how we ever got along without them, until they get bought out, put up access restrictions or paywalls, or just disappear.

Libraries have long been concerned with preserving information for the future, and increasingly that includes digital information and websites (for example, Pandora at the National Library of Australia, which archives everything from blogs to the 2000 Games site).

So where do they intersect? And how can we take a more proactive approach to design for sustainability rather than saving retrospectively? The Semantic Web is all about linking, openness, and relationships between data. In some ways the Semantic Web is, in my view, how we will move towards a more Sustainable Web.

What might the Sustainable Web be?

Adapting the Triple Bottom Line approach to sustainability, web developers and those who create data could take a lifecycle approach to how they create, manage and produce sites and information. When planning a new website, dataset or service, in addition to deciding on purpose, standards and features, you could also include a statement about how you would -

  • Distribute the data if you were no longer maintaining the site (using a LOCKSS principle, perhaps?)
  • Migrate to future standards
  • Ensure that your site is indexed in the Internet Archive (all pages and data, not just the index)
  • Give people ownership of their data (if you’re running an online service where people store or save information) so they can get it out when they want, or own it if the site closes or the terms of service change significantly (e.g., in the instance of a buyout).
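As a rough illustration of the third point in the list above, here is a minimal Python sketch that asks the Internet Archive's Wayback Machine availability endpoint whether each page or data file on a site has a snapshot. The function names and the example.org addresses are my own for illustration, and the sketch assumes the endpoint returns JSON with an "archived_snapshots" object, as it does at the time of writing.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url):
    """Build the query URL for a single page or data file."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url})


def fetch_availability(page_url):
    """Ask the Wayback Machine about one URL (needs network access)."""
    with urlopen(availability_url(page_url), timeout=30) as response:
        return json.load(response)


def has_snapshot(result):
    """True if the parsed response reports an available snapshot."""
    closest = result.get("archived_snapshots", {}).get("closest", {})
    return bool(closest.get("available"))


def missing_from_archive(page_urls, lookup=fetch_availability):
    """Return the URLs for which no snapshot is reported.

    `lookup` is injectable so the check can be run offline
    against saved responses.
    """
    return [url for url in page_urls if not has_snapshot(lookup(url))]
```

Running missing_from_archive over a full list of a site's pages and data files, rather than just the home page, gives a simple list of what still needs to be submitted for archiving.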

Depending on what type of site it is, there may be governance and political impacts now or in the future. If you’re running a scientific research portal, how might changes in government policy affect the site? What obligations might be imposed on sharing or accessing the data you provide?

Using open standards as the backbone

A starting point is to use open standards. In addition to W3C standards most of us already know (like HTML and CSS), we can extend this to Semantic Web standards like OWL and RDF. Adherence to standards allows information to be interpreted correctly, exchanged, and migrated to newer standards in the future. Standards may also make it easier to hand datasets over to someone else or distribute copies to keep them accessible. It’s a key part of understanding the potential of the Semantic Web, according to this summary of a talk by Nova Spivack at last week’s The Next Web -

“The semantic web is not so much about “semantics” as it is set of open standards defined at W3C. The semantic web approach builds on open standard meta data which is in line with previous presentations that supported the open data approach. The idea is that everyone profits from everyone’s metadata. The semantic web is a compromise in making the data smarter and the software smarter. It is the best of both worlds.”
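To make the idea of open standard metadata a little more concrete, here is a small Python sketch that describes a dataset as RDF and writes it out in the N-Triples syntax, using Dublin Core terms for the properties. The dataset address, title and licence are hypothetical, the helper names are my own, and the escaping covers only the common cases, so treat it as a sketch rather than a complete serialiser.

```python
# Dublin Core terms namespace, a widely used metadata vocabulary.
DCTERMS = "http://purl.org/dc/terms/"


class Iri(str):
    """Marks a value as an IRI (a link) rather than a plain text literal."""


def escape_literal(text):
    """Escape the characters N-Triples requires inside a quoted literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(subject, properties):
    """Serialise (predicate, value) pairs about one subject as N-Triples."""
    lines = []
    for predicate, value in properties:
        if isinstance(value, Iri):
            obj = "<" + value + ">"
        else:
            obj = '"' + escape_literal(value) + '"'
        lines.append("<%s> <%s> %s ." % (subject, predicate, obj))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical dataset description, for illustration only.
    print(
        to_ntriples(
            "http://example.org/dataset/1",
            [
                (DCTERMS + "title", "Library loans by branch"),
                (DCTERMS + "license",
                 Iri("https://creativecommons.org/publicdomain/zero/1.0/")),
            ],
        )
    )
```

Because each line is a complete statement in a documented, open format, a description like this can be handed over, copied or migrated without depending on the software that produced it.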

Keeping data usable

Over the past two years, libraries, museums, companies and other organisations have set up pages on Facebook, MySpace and other social networking sites. In some libraries, this is the work of an emerging technologies specialist; in others it’s an added role for an individual, one that may or may not be sustained if that person leaves or changes job focus.

Whatever the situation, it’s not the best use of time to have to create a new profile and rebuild networks in every service. This is where a move towards data standards and portability is a plus. Being able to move data between, into and out of these services saves time and sustains online networks and communities. Data Portability is one of the major projects looking at these issues. According to Chris Saad from the project, “The new innovation platform is data”, and this is certainly true if looking at things from a Semantic Web point of view.

Libraries and the sustainable web

A recent article in Interactions stresses the importance of designing for sustainability of content on the web - the authors note that libraries and other cultural institutions will be at the heart of these efforts:

“Digital technology makes it possible to extend the walls of the archive beyond a single space or person, as well as ensure preservation and access in locations around the world […] Libraries, museums, and archives will need to collaborate with business interests to build lasting social structures that are sustainable over time.” (Churchill E, Ubois J, 2008)

Libraries have played a significant role in a variety of digital and web preservation projects over the years, but what’s the next step? How do we get more involved in conversations that take place in business?

Originally published on the semanticlibrary.net blog