Web@25: Exploring the Web-related research topics in space and time


With people around the world celebrating the 25th birthday of the Web, the STKO lab is no exception. To contribute to this celebration, we designed a Linked Data-powered Web portal using state-of-the-art Semantic Web technologies. Because our lab focuses on spatiotemporal knowledge and information, the portal features an exploration of the evolution of Web-related research topics.

The Web portal can be accessed at http://stko-exp.geog.ucsb.edu/web25portal/

The data used for this spatiotemporal exploration come from the World Wide Web Conference (WWW), which is widely recognized as the top conference in the field of the Web and which has the highest impact in that field (h5-index: 87) according to Google Scholar. We extracted the WWW publication data through the Scopus API and converted them into the Resource Description Framework (RDF), the standard data model of the Semantic Web. The WWW publication data contain a rich amount of information, including paper titles, abstracts, authors, affiliations, keywords, and so forth (unfortunately, the data for 1995 and 2001 are missing in Scopus). We further enriched this dataset by designing a Wikification algorithm to extract important concepts and terms from the paper titles. All these data have been published using a Semantic Web triple store (Jena Fuseki).
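To make the conversion step more concrete, here is a minimal sketch of how one publication record could be written out as RDF in the N-Triples syntax. It is not the code behind the portal: the `http://example.org/www/` namespace, the record layout, and the choice of Dublin Core properties are all assumptions made for illustration.

```python
# Sketch only: the namespace, record fields, and property choices are
# illustrative assumptions, not the vocabulary used by the Web@25 portal.

DCTERMS = "http://purl.org/dc/terms/"   # Dublin Core terms namespace
BASE = "http://example.org/www/"        # hypothetical namespace for papers


def escape_literal(text):
    """Escape a string so it can be used inside an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def triple(subject, predicate, literal):
    """Format one triple whose object is a plain string literal."""
    return '<%s> <%s> "%s" .' % (subject, predicate, escape_literal(literal))


def paper_to_ntriples(paper):
    """Turn one publication record (a dict) into a list of N-Triples lines."""
    subject = BASE + "paper/" + paper["id"]
    lines = [
        triple(subject, DCTERMS + "title", paper["title"]),
        triple(subject, DCTERMS + "date", str(paper["year"])),
    ]
    for author in paper["authors"]:
        lines.append(triple(subject, DCTERMS + "creator", author))
    return lines


if __name__ == "__main__":
    # Made-up record, used only to show the output shape.
    example = {"id": "p1", "title": "An example paper title",
               "year": 2004, "authors": ["A. Author", "B. Author"]}
    print("\n".join(paper_to_ntriples(example)))
```

The resulting lines could then be loaded into a triple store such as Jena Fuseki; a real pipeline would more likely use an RDF library and typed literals rather than hand-built strings.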

We provide two perspectives for users to explore the evolution of the Web: a timeline perspective and a geospatial perspective. In the timeline perspective, users are presented with a series of word clouds, one for each year. In each word cloud, the size of a word is based on the percentage of papers at that year's WWW conference in which it appears. To compare popularity trends, users can hover the mouse over a term and see how its popularity changes across years. While a larger font often indicates a higher popularity of the research topic in that year, it is important to be aware that different terms may be used to express similar concepts as the Web evolves over time. For example, the term "World Wide Web" was widely used before 2000, but less frequently after that (Figure 1). This does not mean that "World Wide Web" is no longer an interesting research topic; people have simply begun to use other terms such as "Web". Similarly, the concept "Semantic Web" may appear to decline at first glance (Figure 2). However, a closer examination reveals an increase in more specific topics within the general field of the Semantic Web, such as "Linked Data" and "SPARQL".
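The word sizing described above can be sketched as follows. This is only an illustration under stated assumptions: each paper is represented by the set of terms extracted from its title, and the linear mapping from percentage to font size (including the `min_size` and `max_size` values) is a hypothetical choice, not necessarily what the portal does.

```python
from collections import Counter


def term_percentages(papers_terms):
    """Percentage of papers in one year that mention each term.

    papers_terms: list of term collections, one per paper of that year.
    """
    total = len(papers_terms)
    if total == 0:
        return {}
    counts = Counter()
    for terms in papers_terms:
        counts.update(set(terms))  # count a term at most once per paper
    return {term: 100.0 * n / total for term, n in counts.items()}


def font_size(percentage, min_size=10.0, max_size=48.0):
    """Map a percentage (0-100) linearly to a font size (assumed mapping)."""
    share = max(0.0, min(percentage, 100.0)) / 100.0
    return min_size + share * (max_size - min_size)


if __name__ == "__main__":
    # Made-up terms for four papers of a single year.
    year_papers = [
        {"Semantic Web", "Ontology"},
        {"Semantic Web"},
        {"Search"},
        {"Search", "Semantic Web"},
    ]
    for term, pct in sorted(term_percentages(year_papers).items()):
        print(term, pct, font_size(pct))
```

Counting each term once per paper keeps the measure a share of papers rather than a raw word frequency, which matches the description of the word clouds.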


Figure 1

Figure 2

Moving away from a purely temporal perspective, the geospatial view provides spatiotemporal analysis and visualization of the WWW publication data. We use a method called Kernel Density Estimation (KDE) with a Gaussian kernel to compute a statistical surface. The density in a given region is determined by the number of publications in that area, and the locations of publications are derived from the authors' affiliations recorded in the Scopus data. Users can select multiple research topics and specify a year range (e.g., 1994-2013), and then click the "Map Range" button to visualize the distribution of these research topics on the world map. Users can also click the "Animation" button to animate the topic distribution year by year. As a concrete example of the geospatial view, Figure 3 shows the densities in the US and Europe for the term Semantic Web (without any further keywords or variations) in the time range 2002-2004 as well as five years later, in 2007-2009. Figure 4 shows another example in which multiple topics ("Semantic Web", "Linked Data", and "Ontology") have been selected and mapped.
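For readers unfamiliar with KDE, the following minimal sketch shows how a Gaussian kernel turns a set of publication locations into a density value at any point on the map. It is not the portal's implementation: the `bandwidth` value, the function names, and the simplification of treating longitude and latitude as planar coordinates are assumptions made for illustration.

```python
import math


def gaussian_kde_2d(points, query, bandwidth, normalize=True):
    """Gaussian kernel density at `query` from a list of (x, y) points.

    With normalize=False the result is a sum of kernels, so it grows with
    the number of nearby publications instead of integrating to one.
    """
    if not points:
        return 0.0
    qx, qy = query
    kernel_norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for x, y in points:
        sq_dist = (x - qx) ** 2 + (y - qy) ** 2
        total += kernel_norm * math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / len(points) if normalize else total


def density_surface(points, xs, ys, bandwidth):
    """Evaluate the density on a grid; returns one row of values per y."""
    return [[gaussian_kde_2d(points, (x, y), bandwidth) for x in xs]
            for y in ys]


if __name__ == "__main__":
    # Made-up (longitude, latitude) pairs, treated as planar for simplicity.
    locations = [(-119.8, 34.4), (-119.7, 34.5), (8.5, 47.4)]
    print(gaussian_kde_2d(locations, (-119.75, 34.45), bandwidth=1.0))
    print(gaussian_kde_2d(locations, (0.0, 0.0), bandwidth=1.0))
```

A production system would typically project coordinates or use great-circle distances, and would choose the bandwidth to suit the map scale.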


Figure 3

Figure 4

We hope that you will enjoy the Web@25 system, and we would like to congratulate the W3C and everybody else once again. Let us hope the Web will stay the Web we want for the next 25 years.