Sunday, June 24, 2007

Digital Libraries and User Attention

This past week, I attended JCDL'2007 (the Joint Conference on Digital Libraries) and CAMA'2007 (the International ACM/IEEE Workshop on Contextualized Attention Metadata).

Since I cannot comment on every interesting paper that I saw and discussed (there were so many), I will point out two:

The first, which was also the very first presentation of the conference, was World Explorer: Visualizing Aggregate Data from Unstructured Text in Geo-Referenced Collections (presented by Rahul Nair). This is a cool tool built on top of Flickr's geotagging features. It is really nice to see how many applications become possible with online communities like Flickr that incorporate tagging features.

The main opportunity explored by the authors is to use geo-references to cluster content. Beyond a simple photo-sharing mechanism, I think geotagging also opens up several research challenges and opportunities in designing applications for urban sensing.
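To make the idea of clustering content by geo-reference a bit more concrete, here is a minimal sketch. It is not the World Explorer method; it simply assumes photos arrive as (latitude, longitude, tags) tuples, buckets them into a fixed grid, and reports the most frequent tags per cell. The function names and the 0.01-degree cell size are my own illustrative choices.

```python
import math
from collections import Counter, defaultdict


def grid_cell(lat, lon, cell_deg=0.01):
    """Map a coordinate to the integer index of the grid cell that contains it."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def top_tags_by_cell(photos, cell_deg=0.01, k=2):
    """Group photos by grid cell and return the k most frequent tags in each cell.

    photos: iterable of (lat, lon, tags) tuples (a hypothetical input format).
    """
    counts = defaultdict(Counter)
    for lat, lon, tags in photos:
        counts[grid_cell(lat, lon, cell_deg)].update(tags)
    return {cell: [tag for tag, _ in c.most_common(k)] for cell, c in counts.items()}


# Made-up sample data: two nearby photos and one far away.
photos = [
    (37.8199, -122.4783, ["bridge", "goldengate"]),
    (37.8195, -122.4786, ["bridge"]),
    (48.8584, 2.2945, ["eiffel"]),
]
clusters = top_tags_by_cell(photos)
```

A fixed grid is the crudest possible choice: it splits landmarks that straddle a cell boundary, so a real system would more likely use a proper clustering algorithm over the coordinates.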

The second paper was Can Social Bookmarking Enhance Search in the Web? (presented by Y. Yanbe).

The authors propose the introduction of what I would call a "Ranking Aggregator" mechanism that sits between the user and two sources of ranking: Google PageRank and social bookmarking popularity. Their observation is that extremely fresh web pages tend to get low PageRank, but they may already have a fair number of bookmarks. Therefore, they propose combining both ranking schemes to improve the ranking of fresh pages and give the user a good mix of 'reputable' pages via PageRank and popular pages via bookmarks. Actually, I wondered whether Google search history and user interest sharing could be combined to provide better personalized search results.
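As a rough illustration of what such a combination could look like, here is a minimal sketch. It is not the formula from the paper: the linear blend, the log scaling of bookmark counts, and the parameters `alpha` and `max_bookmarks` are all assumptions of mine, and the PageRank values are assumed to be already normalized to the range 0 to 1.

```python
import math


def combined_score(pagerank, bookmark_count, alpha=0.5, max_bookmarks=1000):
    """Blend a link-based score with a bookmark-popularity score.

    pagerank: assumed to be normalized to [0, 1].
    bookmark_count: raw number of bookmarks, log-scaled and capped at 1.0.
    alpha: weight of PageRank; (1 - alpha) goes to bookmark popularity.
    """
    popularity = min(1.0, math.log1p(bookmark_count) / math.log1p(max_bookmarks))
    return alpha * pagerank + (1 - alpha) * popularity


def rerank(pages, alpha=0.5):
    """Return pages sorted by the blended score, best first."""
    return sorted(
        pages,
        key=lambda p: combined_score(p["pagerank"], p["bookmarks"], alpha),
        reverse=True,
    )


# Made-up pages: an established page, a fresh but heavily bookmarked page,
# and a page with neither links nor bookmarks.
pages = [
    {"url": "old-reputable.example", "pagerank": 0.9, "bookmarks": 5},
    {"url": "fresh-popular.example", "pagerank": 0.1, "bookmarks": 800},
    {"url": "obscure.example", "pagerank": 0.1, "bookmarks": 0},
]
ranking = [p["url"] for p in rerank(pages, alpha=0.5)]
```

With `alpha=1.0` this degenerates to pure PageRank ordering, where the fresh page is indistinguishable from the obscure one. Lowering `alpha` lets the bookmark signal pull the fresh page up, which is the effect the authors are after.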

I also participated in an interesting workshop, CAMA'2007, organized by Erik Duval, Martin Wolpers and Jehad Najjar.

The first talk, by Seth Goldstein, was exciting, possibly because it showed the incredibly large number of business opportunities orbiting online social networks and how much value there is in online users' attention.

Joe Pagano from the Library of Congress presented some results on measuring the audience of a newly launched web site. His main finding was that more visitors came from blogs than from search engines, which reinforces the intuition about the influence of blogs on information consumption on the Web.

Personally, a positive aspect of this workshop was identifying possible applications that may validate our preliminary studies on interest sharing in collaborative tagging communities. For example, an extension of Joe Pagano's work would be to apply our interest sharing graph to understand how these visitors relate to each other and whether they form sub-communities of interest.
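To sketch how that extension might work, the snippet below builds a simple interest sharing graph and extracts its connected components as candidate sub-communities. This is a simplification for illustration only: I assume each visitor is described by the set of pages they came from or looked at, and that two visitors are linked when the Jaccard overlap of their sets exceeds a threshold. The function names, the threshold and the sample data are hypothetical.

```python
from itertools import combinations


def interest_sharing_graph(user_items, threshold=0.0):
    """Link two users when the Jaccard overlap of their item sets exceeds threshold."""
    graph = {user: set() for user in user_items}
    for a, b in combinations(user_items, 2):
        union = user_items[a] | user_items[b]
        overlap = len(user_items[a] & user_items[b]) / len(union) if union else 0.0
        if overlap > threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def islands(graph):
    """Return the connected components of the graph as a list of user sets."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


# Made-up visitors: three with overlapping interests and one on their own.
visitors = {
    "ana": {"p1", "p2"},
    "bo": {"p2", "p3"},
    "cy": {"p3"},
    "di": {"p9"},
}
communities = islands(interest_sharing_graph(visitors))
```

Visitors who share nothing with anyone else show up as single-member components, so the same output also tells us how many users sit outside every island of interest.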

Also, Erik Duval made a nice comment: the large number of unique users we found in our investigation of CiteULike and Bibsonomy may still offer rich information to recommendation systems, even though these users are not connected to any island of interest we depicted in the interest sharing graph (more details here).

This is a brief summary of what I saw this past week. Now, a lot of ideas to refine and put into practice...

