Friday, 5 February 2010

Stop thinking, start tagging - Tag Semantics emerge from Collaborative Verbosity

Maybe you've asked yourself from time to time, "What are these BibSonomy developers doing all day?" Of course the first answer is simple - we develop BibSonomy :) - but apart from that, most of us are researchers: running experiments, discussing results, writing papers. The latter is sometimes rewarded: our paper "Stop thinking, start tagging - Tag Semantics emerge from Collaborative Verbosity" has been accepted at this year's WWW conference in Raleigh, USA!

As you can guess from the title, the paper is basically concerned with emergent semantics. This term is often used to describe semantic structures that "grow" in a bottom-up and uncontrolled manner within collaborative tagging systems. For emergent tag semantics, this means that although people are free to choose arbitrary tags (which leads to typical language-related phenomena like homonymy, polysemy, ...), one can successfully extract meaningful tag relations from the aggregated mass of tagged content. As an example, different people might use different tags to describe the web2.0 paradigm, possibly "web2.0", "web-2.0", "webtwo", "web20", "web.2.0", and many others. By using appropriate tag relatedness measures, one can identify those cases and extract a semantic "concept" web2.0 which all these users are talking about.
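To make the idea of a tag relatedness measure a bit more concrete, here is a minimal sketch. It assumes one common choice, cosine similarity between tag co-occurrence vectors, which is not necessarily the exact measure used in the paper. The toy posts and all function names are made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations
from math import sqrt


def cooccurrence_vectors(posts):
    """Map each tag to a Counter of the tags it co-occurs with in a post."""
    vectors = defaultdict(Counter)
    for tags in posts:
        # Count every ordered pair of distinct tags within one post.
        for a, b in permutations(set(tags), 2):
            vectors[a][b] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as Counters."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Hypothetical toy data: each post is just its list of tags.
posts = [
    ["web2.0", "ajax", "social"],
    ["web20", "ajax", "social"],
    ["web-2.0", "social", "blog"],
    ["cooking", "recipe", "pasta"],
]
vectors = cooccurrence_vectors(posts)

# Spelling variants share their context tags, unrelated tags do not.
sim_spelling = cosine(vectors["web2.0"], vectors["web20"])
sim_unrelated = cosine(vectors["web2.0"], vectors["cooking"])
```

In this toy data "web2.0" and "web20" never appear in the same post, yet they are used alongside the same other tags, so their context vectors point in the same direction. This is the kind of signal that lets spelling variants be grouped into one concept.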

Up to this point, there's nothing too new - the question we then asked ourselves was how the characteristics of individual users influence the quality of the learned semantic structures. One possibility is to distinguish users according to their tagging motivation into "Categorizers" and "Describers" - the first group uses a small and systematic vocabulary, whereas the latter uses a wealth of different keywords for annotation. Simply put, describers can be seen as the "verbose" users tagging with many keywords. So we split the whole folksonomy dataset into several partitions containing different mixtures of categorizers and describers. And here is an interesting thing we found:

On the x-axis, you see the percentage of included users. The y-axis depicts the quality of the inferred semantic tag relations (measured by grounding against a thesaurus; as we used the JCN (Jiang-Conrath) distance, smaller values indicate better quality). The green line depicts the semantic quality obtained from the full dataset. The interesting thing is that with only 40% of the "talkative" describers, one can already reach the semantic precision of the full dataset! The best quality is found for 70% of describers. So the claim that "mass matters" holds only partially - a crucial aspect seems to be which kind of users the mass is composed of. The collaborative verbosity of describers seems to have a positive effect on the emergent semantics. On a more general level, this exhibits a causal link between tagging pragmatics (how people tag) and tag semantics (what tags mean). If you're interested in further details, we'd be happy to discuss them with you at WWW2010!


Friday, 29 January 2010

Feature of the week: New Firefox-plugin released!


The first version of our BibSonomy Firefox add-on has been available for download for some weeks now.
This new feature integrates your BibSonomy bookmarks into your Firefox browser, so you can comfortably use and store your bookmarks without visiting your BibSonomy account. You can also synchronize your list of local bookmarks with the ones in BibSonomy.
All you need to get started is the new plugin, your user name, and your API key as the password.

  • The blue star button indicates whether or not a page is already bookmarked. Clicking it opens the dialogue for storing or changing the bookmark of the current web page.









  • To the left in your navigation bar you'll find the quick link to your BibSonomy page and the hide/show button for the sidebar.
    The sidebar displays the cloud or list of your bookmarks' tags and your tag relations. Much as in BibSonomy, bookmarks are retrieved by clicking on one of their tags or by using the full-text search.




  • Our add-on is fully customizable and also allows you to remove the standard yellow Firefox star button from the browser. Moreover, the settings include the option to import your Firefox bookmarks into BibSonomy and vice versa.


Wednesday, 20 January 2010

New Release

Yesterday we deployed a new release which mainly contains bugfixes and a big cleanup in the underlying database module. It already contains the backend methods for some new features, e.g., the Inbox. During this week we will also try to switch the full text search to Lucene.

A small new feature is the layoutinfo JSON, which provides metadata about the available JabRef layouts (you can currently see this on the export page). It will be used by our Typo3 plugin to offer a selection of available layouts.


Wednesday, 16 December 2009

Feature of the week: 2009 in review

2009 brought many improvements and new features for BibSonomy but also interesting research activities. We briefly review this year before next week's post gives an outlook on 2010.
Tag Recommendations
As part of the ECML PKDD 2009 conference we organized the Discovery Challenge, where the participants could test their tag recommendation methods on a BibSonomy dataset. A particularly interesting part of the challenge was the online evaluation which allowed the researchers to evaluate their approaches in the running system and actually show their recommendations to our users. The underlying infrastructure was provided by our new tag recommendation framework which proved to be very useful. It allowed us to distribute the tag recommendation work over several machines located all over the world. E.g., the winner's recommender was running in Canada.
Research Projects
Two new projects centered around BibSonomy started this year: PUMA, which will improve academic publication management in cooperation with the University Library Kassel, and Info 2.0 (in German), which investigates chances and risks of the Web 2.0 with respect to informational self-determination in cooperation with the Institute for Public Law.
Plugins
We released three new plugins which better integrate BibSonomy with other tools. The JabRef plugin allows you to synchronize your publication references with the bibliography manager JabRef and the Typo3 extension integrates publication lists from BibSonomy into the content management system Typo3. Just released two weeks ago and ready for testing is the new Firefox add-on which better integrates BibSonomy into the Firefox web browser. We will introduce this add-on in one of the next FOTWs.
Personalization
Your sidebar now shows similar users; click on one of them to browse their posts in a personalized ranking. Furthermore, you can follow users you find interesting to stay up to date on what they post.
Dumps of the Dataset
For quite some time we have offered interested people a dataset of the BibSonomy database in the form of an SQL dump for research purposes. A web page now describes the available dumps and how to get one. The dumps now also contain the users' tag relations.
Development
In an ongoing effort to open the BibSonomy source code to the public, we released some of the core modules in a public Maven repository. For example, you now have access to our screen scrapers, which allow you to extract publication metadata from more than 60 digital libraries. Most modules have a GPL or LGPL license.

Next week we will present our current activities and discuss the plans for 2010.


Thursday, 12 November 2009

Feature of the Week: New version of JabRef-plugin released!

As you may have noticed, we maintain a plugin for the open-source bibliography manager JabRef, which allows you to easily download entries from and upload entries to BibSonomy. We believe that this approach nicely combines the advantages of maintaining a local BibTeX file with the comfort and usefulness of a centralized publication sharing platform like BibSonomy.

We have just released a new version of this plugin, which offers some nice features to ease the maintenance of both collections (local + within BibSonomy)! Check it out:

  • Added document management: In JabRef and within BibSonomy, it is possible to attach a private copy (PDF, PS, ...) to a publication entry. The new version of our plugin allows you to download all your private documents present in BibSonomy with a single click (first image). Furthermore, you can specify in the settings menu that local documents are automatically uploaded to BibSonomy when you store the publication (second image).












  • Automatic Synchronization: A typical problem is keeping both collections (your local .bib file and your BibSonomy account) synchronized. We are proud to offer a comfortable feature that performs this task automatically (third image on the right). For entries present in both collections, it checks whether they are equal; if there is a difference, you can decide which version to keep. A 'diff-like' view helps you to see what has changed (4th image on the right).




  • Full-text search: In prior versions, it was only possible to retrieve posts from BibSonomy by tag. Now you can also perform a full-text search in your personal or in the global collection.
  • Further small additions & bugfixes: Apart from the above-mentioned new features, we improved the interface, fixed some bugs, and generally made the plugin more stable and better :)
You can download the latest version of the plugin here. The updated documentation can be accessed via http://www.bibsonomy.org/help/doc/jabref-plugin/index.html. We hope this new release helps you to be more efficient in your personal and shared publication management - as always, we are happy to receive feedback, comments, and suggestions!

Best,
Dominik


Tuesday, 6 October 2009

Main Server Crashed

Today our main machine crashed. It took us one hour to restart everything, as it was late and no one was in the office. This is why BibSonomy was not available. Unfortunately, this is the third time within four weeks that the machine has crashed. We are now searching for the cause, but currently we have no clue, as we did not observe anything unusual. It seems to be some strange hardware defect. Let's keep our fingers crossed that we can figure out the problem soon.


Monday, 28 September 2009

New Release

Those of you who have recently tried to delete a post have probably noticed a small but helpful change: a dialog box now asks you for confirmation. If you accidentally clicked on the "delete" link, you now have the chance to stop the process. If you don't like this feature, just disable it on the settings page and you get the old behaviour back.

This is just one of the changes in the new release, but obviously the most noticeable one. Furthermore, we updated the code for importing bookmarks from Delicious and Firefox, for uploading JabRef layouts, and for picking/unpicking posts for the basket. Several smaller bugfixes also made it into the release.

As always a small sidenote: although we tested the code, it might contain bugs we did not find. So if you think you've found an error, don't hesitate to contact us!


Tuesday, 8 September 2009

Tagging for Championship

BibSonomy is a social bookmarking system, so assigning tags to resources is one of its most important and frequent processes. For a while now, users have been assisted by a set of recommended tags, as shown in Figure 1.



The Challenge


Recommender systems are a subject of active research, and different approaches have emerged. In the context of this year's ECML PKDD Discovery Challenge, BibSonomy's tag recommendations were provided by 14 different recommender systems from 10 different research teams in 7 different countries during the last five weeks. The challenge consisted of three tasks: the first two dealt with fixed datasets obtained from BibSonomy, while the third was to provide tag recommendations to users in the running system.

Yesterday, during the ECML PKDD Discovery Challenge Workshop, the challenge's participants presented their recommender systems and discussed the different approaches, still unaware of the third task's winning team, which was finally announced in the evening during the conference's opening session.

Rating the Systems


Algorithms for tag recommendations are typically evaluated by computing some performance measure in an "off-line" setting, that is, by iterating over posts in a dataset, which was derived from a social bookmarking system, presenting only a user and a resource to the recommender system. Thus, for each post, the set of suggested tags can be compared with those the user had assigned. Participants in Task 1 and Task 2 were evaluated in such a setting.

But these "off-line" settings not only ignore some constraints of real-life applications (e.g., CPU usage and memory consumption), they also can't take into account the effect of presenting a set of recommended tags to the user. To evaluate these effects, we set up Task 3, where recommender systems were integrated into BibSonomy and had to deliver their tag recommendations within a timeout of 1000 ms.

To evaluate the different recommender systems (in the off-line settings as well as in Task 3), we calculated precision and recall for each system. Precision measures how many of the recommended tags were adequate, while recall measures how many of the tags the user actually assigned to the resource were recommended.

Figure 2 shows the final results of the on-line challenge (which is available here). For each recommender system, we calculated precision and recall, considering only the first n tags (for n = 1, 2, ..., 5) and averaging over all posts. The top blue graph, for example, shows that of the corresponding recommender system's five recommended tags (the rightmost point), around 18% were chosen by the user (precision 0.18), and around 23% of the tags which the user finally assigned to the resource were "predicted" by the recommender (recall 0.23).
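For readers who want to see the arithmetic behind these numbers, here is a minimal sketch of precision and recall over the first n recommended tags, averaged over posts. It is not the challenge's actual evaluation code. The function names and the toy data are made up, and it assumes that each recommendation list contains no duplicate tags.

```python
def precision_recall_at_n(recommended, assigned, n):
    """Precision and recall for the first n recommended tags of one post."""
    top_n = recommended[:n]
    assigned = set(assigned)
    hits = sum(1 for tag in top_n if tag in assigned)
    # Assumption: precision is taken over the tags actually recommended,
    # so a recommender returning fewer than n tags is not penalized twice.
    precision = hits / len(top_n) if top_n else 0.0
    recall = hits / len(assigned) if assigned else 0.0
    return precision, recall


def average_at_n(posts, n):
    """Average precision and recall at n over (recommended, assigned) pairs."""
    scores = [precision_recall_at_n(rec, act, n) for rec, act in posts]
    avg_precision = sum(p for p, _ in scores) / len(scores)
    avg_recall = sum(r for _, r in scores) / len(scores)
    return avg_precision, avg_recall


# Hypothetical toy data: (recommended tags in rank order, tags the user assigned).
posts = [
    (["web", "tagging", "semantics", "wiki", "blog"], ["tagging", "folksonomy", "web"]),
    (["python", "code"], ["java"]),
]

# One (precision, recall) point per n, like the points along one curve in the plot.
curve = [average_at_n(posts, n) for n in range(1, 6)]
```

Each entry of `curve` corresponds to one point on a recommender's line in a precision/recall plot, with n growing from 1 to 5.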



The winning teams are:

  • Task 1: Marek Lipczak, Yeming Hu, Yael Kollet, and Evangelos Milios (Paper)

  • Task 2: Steffen Rendle and Lars Schmidt-Thieme (Paper)

  • Task 3: Marek Lipczak, Yeming Hu, Yael Kollet, and Evangelos Milios (Paper)



We are happy to say that it was an interesting challenge which gave substantial insight into the performance of different approaches to the task of tag recommendation. We'd like to thank everybody who contributed to this challenge - last but not least, each of BibSonomy's users.


Wednesday, 26 August 2009

PUMA - Project on Academic Publication Management started on August 1st

BibSonomy technology will be used in a project that fosters the open access movement and better supports researchers' publication work. The project "PUMA - Academic Publication Management" is funded by the German Research Foundation (DFG) and started on August 1st, 2009. PUMA is a joint project of the University Library and the Knowledge & Data Engineering Group of the University of Kassel.

Open access is a publication model that allows authors to publish their articles free of charge, and users to freely access them. The costs are borne by the institution that provides the institutional repository. There are several reasons for this publication model. With reduced budgets and increased costs for journals, many university libraries can no longer afford subscriptions to all relevant journals. Furthermore, open access supports timely publication and broader visibility of articles, so that research results can be taken up earlier and by more researchers, thus decreasing the turnaround time of scientific results.

Even though many researchers support the open access movement in principle, they often do not contribute their publications to the institutional repository of their university. Key reasons are that they do not see an immediate benefit from this additional effort, and that the upload is not integrated into their usual workflow. PUMA therefore aims for an integrated solution, where the upload of a publication automatically results in an update of both the personal and the institutional homepage, the creation of an entry in BibSonomy, an entry in the academic reporting system of the university, and its publication in the institutional repository. At the time of upload, metadata from several data sources (SHERPA/RoMEO list, online library catalogue, BibSonomy) will be collected automatically in order to support the user. Further, PUMA aims to provide a publication management platform for all researchers and students to be used on a daily basis, which reduces not only the open access publication effort but also the effort of managing one's own publications.

The PUMA platform will be based on BibSonomy technology and will be hosted by the University Library; it will be set up in a Web 2.0 style. The platform will include all features known from BibSonomy, like tagging of publications, ease of use, an API, and scalability. BibSonomy will continue to be run by the Knowledge & Data Engineering Group. As a showcase, PUMA will be integrated with the open access repository platform DSpace, the library system PICA, the Typo3 content management system, and BibSonomy. The system is open for adaptation to other standard systems. The project results will be published as open source software. This implies that the complete BibSonomy source code will become available under an open source licence at the end of the project.


Friday, 24 July 2009

Feature of the week: Stay tuned to interesting content by following interesting users

Other users with similar interests are a good starting point when searching for interesting resources in BibSonomy. In a prior post, we showed how BibSonomy can help you to discover these similar users. We're now happy to announce a new feature which makes it easy for you to keep track of these people's interesting resources - you can now just follow them!

The basic idea is this: Once you stumble upon a user who seems to be interesting, you can use the follow link on their user page to add them to your list of followed users. Think of this list as a buddy list of people with interests similar to yours. Here are two examples where you can find this link (on the user page and on the personalized user page):

On the followers page, you then find a list of all users you are following (and all users following you :) ). This page summarizes all recent posts of all users you are following, ranked personally for you. So the posts most relevant to you are shown at the top of the resource lists (we compute relevance based on the tags you use). Here is what this page looks like:
You can also add and remove users from your list of followed users on this page. In addition, you can change some settings of the applied ranking algorithm and see which method is best in finding the most relevant posts for you.
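To give a rough feeling for what ranking based on the tags you use can mean, here is a small sketch. It is purely illustrative and not BibSonomy's actual ranking algorithm. It assumes that a post's relevance is the share of your own tag usage covered by the post's tags, and all names and data are made up.

```python
from collections import Counter


def rank_posts(my_tags, posts):
    """Sort posts so that those sharing the most of my frequently used tags come first."""
    profile = Counter(my_tags)      # how often I used each tag
    total = sum(profile.values())

    def score(post):
        if not total:
            return 0.0
        # Share of my tag usage that is covered by this post's tags.
        return sum(profile[tag] for tag in set(post["tags"])) / total

    # sorted() is stable, so equally relevant posts keep their original order.
    return sorted(posts, key=score, reverse=True)


# Hypothetical data: my tag history and recent posts of users I follow.
my_tags = ["tagging", "tagging", "semantics", "web"]
followed_posts = [
    {"title": "A", "tags": ["cooking"]},
    {"title": "B", "tags": ["tagging", "web"]},
    {"title": "C", "tags": ["semantics"]},
]
ranked_titles = [post["title"] for post in rank_posts(my_tags, followed_posts)]
```

Here post B comes first because it carries two tags I use often, followed by C and then A, which shares none of my tags.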

Feel free to play around with this feature - we hope it can help you to "dig" through the resources of users with similar interests and finally find some pretty cool and relevant stuff for you!

Best,
Dominik