Friday, December 11, 2015

Investigating The Use of Social Tagging in Image Description




            It is the goal of all information professionals, whether in a library, archive, or museum, to provide exceptional access to their materials through meaningful description, also known as metadata (data about data).  One way this is done is through the use of controlled vocabularies such as the Library of Congress Subject Headings (LCSH), Getty’s Art & Architecture Thesaurus (AAT), or the Thesaurus for Graphic Materials (TGM).  A relatively new approach is to add user-generated content, or “tags,” to materials in hopes of enriching the database and increasing access to materials.  This paper discusses several examples of user-generated tags in digital image description in order to better understand some of the potential benefits such an approach may offer.
According to Taylor and Joudrey (2009), tagging is defined as “a process by which a distributed mass of users applies keywords to various types of Web-based resources for the purposes of collaborative information organization and retrieval” (p. 364).  For the purposes of this paper, “tags,” “social tags,” “user-tags,” and “user-generated tags” are treated as synonymous.  Tagging allows users to group similar things together by using their own terms and labels, with few or no restrictions (Taylor and Joudrey, 2009, p. 364).  Folksonomy (a blend of “folks” and “taxonomy”) is defined by Taylor and Joudrey (2009) as “the aggregation of these tags created by a large number of individual users” (p. 366).  Taylor and Joudrey (2009) highlight the importance of tagging, stating that it is an approach to subject metadata “by the people, for the people—without restrictions, unfamiliar jargon, and complex application rules” (pp. 366-367).  Their thoughts mirror those of Thomas Vander Wal, who argues that with tagging, “People are not so much categorizing, as providing a means to connect items…to provide their meaning in their own understanding” (as cited in Taylor and Joudrey, 2009, p. 366).
Describing and searching for images in particular creates an interesting challenge not found in cataloging books.  As Stvilia, Jörgensen, and Wu (2012) simply put it, “Unlike the content of text documents, raw image content is not linguistic” (p. 99).  A good description of the image is therefore all the more essential if that image is to be easily searchable in a database.  Unfortunately, the terms in controlled taxonomies like LCSH are often so specialized that they do not serve the needs of the everyday user (Menard and Smithglass, 2012, p. 294).  The vocabulary used in controlled taxonomies does not necessarily align with that of “normal speech.”  This becomes a particular problem when an “everyday user” attempts to search for an image using their own “normal speech” terms.  This is where user tags could play an important role, by adding the searchable terms a user is more likely to try.
In 2011 Elaine Menard and Margaret Smithglass conducted the first phase of a research project aimed at developing a bilingual taxonomy for the description of digital images (Menard and Smithglass, 2012, p. 291).  In this initial stage of the project, Menard and Smithglass evaluated a total of 150 resources for organizing and describing images: 70 that used controlled vocabularies and 80 that used user-generated tags.  The goal was to assess existing standards for image description to see how they might be integrated into the development of this new bilingual taxonomy (Menard and Smithglass, 2012, p. 291).
            First, they looked at traditional image description using controlled vocabularies.  To do this Menard and Smithglass examined 70 image collections held by four different types of organizations: 21 from libraries, 16 from museums, 18 from image search engines, and 15 from commercial web sites (Menard and Smithglass, 2012, pp. 291, 294).  Comparing the metadata associated with images from libraries and museums, Menard and Smithglass found that a number of description elements appeared consistently: title, date, creator, subject, original source, and collection (p. 295).  For the analysis of image search engines, they looked at such sites as ARTstor, Wellcome Images, Art Images for College Teaching, Index of Christian Art, and Beazley Classical Archive.  Among these they found consistency in displaying information for title, date, creator, subject, original source, and collection (Menard and Smithglass, 2012, p. 296).  For commercial sites, they analyzed only sites that did not actually sell any images; the images featured on these sites served to sell other items.  With this group, Menard and Smithglass found that the website taxonomies were unique to each resource, providing only a localized vocabulary and not one that could be transferred to another site (p. 296).
            After looking at the descriptive metadata formed with controlled vocabularies in these four types of institutions, Menard and Smithglass concluded that:
Successful retrieval depends on the presence of a particular term in the collective description of an item and, where images are concerned, this does not (for this resource group) include what might be considered the obvious components of colour or shape.  User interfaces for basic and advanced searching have been built on traditional descriptive practice, which is text-based and does not therefore accommodate a more intuitive and/or visual approach.  (p. 296)

Menard and Smithglass did find that all four types of organizations had comparable description elements; however, as is alluded to above, these may not necessarily include some of the most obvious elements that the everyday person might use when searching for an image.
For the second step in this study, Menard and Smithglass explored the use of user-generated tags in 80 image-sharing systems accessed online.  Thirty-two of these were image upload and sharing sites, 14 were image hosting sites, and 34 were stock photography sites (Menard and Smithglass, 2012, p. 297).  For the image upload and sharing sites, Menard and Smithglass examined sites such as Photobucket and Shutterfly.  These sites allow users to tag their own images during or after the upload process; other users can then retrieve those images in a search by using those same tags (Menard and Smithglass, 2012, p. 298).  Among the 32 sites they analyzed in this category, they found that every site offered the following metadata options for uploaded images: user/creator, albums, comments, ratings, and links to social network sites (Menard and Smithglass, 2012, p. 298).  Additionally, “25 percent also provided discrete, predetermined categories for image organization (e.g. travel, family, pets, vacation), none of which were mandated and all of which were unique to the site under consideration” (Menard and Smithglass, 2012, p. 298).  Menard and Smithglass also note that tag clouds were present in 90 percent of these sites and provide an efficient method for retrieving images with the same tag; however, they point out that this method was far from precise due to the “lack of internal web site structure, and the inconsistency of tags chosen by individual users” (Menard and Smithglass, 2012, p. 298).
The exploration of image hosting sites proved of little use to Menard and Smithglass.  They found that these types of sites tracked very little metadata and, surprisingly, did not always provide the option for users to tag images (Menard and Smithglass, 2012, p. 298).  The stock photography sites Menard and Smithglass analyzed were commercial ventures that sold the images described on their websites.  Examples from their study include Stock Xchng, iStockphoto, and Freepixels.  Menard and Smithglass found tag clusters and predetermined categories to be the norm in this group, and noted that many are detailed and hierarchical in nature, functioning much like a taxonomy (p. 298).  Although the organizational structures are unique to each site, Menard and Smithglass found that they accommodate tagging and intuitive user interfaces for both display and retrieval, and determined this group to be the most intuitive and precise of the three in the tagging category (Menard and Smithglass, 2012, p. 298).
Analysis of the uncontrolled vocabulary group provided Menard and Smithglass with two observations:
(1) in image-upload systems, there was a virtual absence of mandated structure beyond user name and tags; and
(2) in stock photography resources, we encountered a hybrid of taxonomies working in combination with user tags. (p. 299)

Menard and Smithglass found that the combination of the structure of a taxonomy and the flexibility of user tagging (found in stock photography sites) lends itself to a more appropriate description model for visual materials (2012, p. 299).  They write, “We believe the best practices associated with stock photography web sites provide a potentially useful model for future image taxonomies and retrieval” (Menard and Smithglass, 2012, p. 299).
            Overall, the examination of both controlled and uncontrolled vocabularies in digital image description left Menard and Smithglass with the sense that there could be a real future in combining these two methods.  In moving forward with their project to create a new bilingual taxonomy, they write, “The hypothesis of our research project supposes that the combination of ‘classical’ terminologies used by indexing specialists and terms extracted from innovative approaches such as image tagging could facilitate access to images by producing metadescriptors useful in all retrieval environments” (Menard and Smithglass, 2012, p. 302).
            Menard and Smithglass are certainly not alone in this view; over the years, many studies have discussed the potential benefits of a combined system of description for various materials.  One such discussion followed Abebe Rorissa’s content analysis comparing a sample of 975 Flickr images (with 4,159 Flickr tags) to 996 images from the University of St. Andrews Library Photographic Archive (with 3,709 index terms) (Rorissa, 2010, p. 2234).  The findings of this study showed that there are fundamental differences between the underlying structures of user-generated tags and professionally assigned index terms, and that these differences could be a good thing (Rorissa, 2010, p. 2239).  Rorissa writes, “The prevailing recommendation is that social tagging and traditional/professional indexing should be used together to complement each other” (Rorissa, 2010, p. 2239).  Where one system falls short, the other can compensate.  Rorissa also notes that traditional description and indexing of visual materials is often a time-consuming and labor-intensive task; he suggests that social and collaborative tagging could help ease that burden (Rorissa, 2010, p. 2239).
            It would be a serious omission to discuss social tagging in digital image description without at least briefly mentioning one very important project: the Flickr Commons.  On January 16, 2008, the pilot project of the Flickr Commons was launched in partnership with the Library of Congress (Flickr).  Like nearly every other cultural heritage organization, big or small, the Library of Congress faces several challenges when it comes to increasing discovery and use of its collections (Springer et al., 2008, p. iii).  Even at the Library of Congress “resources are limited to provide detailed descriptions and historical context for the many thousands of items in research collections” (Springer et al., 2008, p. iii).  One solution to this problem was the pilot program with Flickr.
            In January 2008 two collections of historical photographs from the Library of Congress were made available on Flickr for the public to view and tag descriptively (Springer et al., 2008, p. iv).  Within the first 10 months of the project, the results had already exceeded what anyone expected.  Springer et al. explain:
--As of October 23, 2008, there have been 10.4 million views of the photos on Flickr.
--79% of the 4,615 photos have been made a “favorite” (i.e., are incorporated into personal Flickr collections).
--Over 15,000 Flickr members have chosen to make the Library of Congress a “contact,” creating a photostream of Library images on their own accounts.
--For Bain images placed on Flickr, views/downloads rose approximately 60% for the period January-May 2008, compared to the same time period in 2007. Views/downloads of FSA/OWI image files placed on Flickr rose approximately 13%.
--7,166 comments were left on 2,873 photos by 2,562 unique Flickr accounts.
--67,176 tags were added by 2,518 unique Flickr accounts.
--4,548 of the 4,615 photos have at least one community-provided tag.
--Less than 25 instances of user-generated content were removed as inappropriate.
--More than 500 Prints and Photographs Online Catalog (PPOC) records have been enhanced with new information provided by the Flickr Community.
--Average monthly visits to all PPOC Web pages rose 20% over the five month period of January-May 2008, compared to the same period in 2007.                                     
                                                                                                                    (p. iv).

On the potential benefits of having users tag these historic photographs, Springer et al. (2008) remark, “The contribution of additional information to thousands of photographs was invaluable” (p. iv).  In June of 2011 there were 52 participating institutions representing ten countries (Menard and Smithglass, 2012, p. 300); as of December 2015, there are 108 from all over the world (Flickr).  The program is still thriving and continues to pursue its two main objectives:
  1. To increase access to publicly-held photography collections, and
  2. To provide a way for the general public to contribute information and knowledge. (Then watch what happens when they do!)
                                                                                                                    (Flickr).
            During the summer of 2009 Jason Vaughan conducted a web-based survey of all the institutions participating in the Flickr Commons (there were 27 at the time).  The survey consisted of 21 questions about the institutions’ experience with the Flickr Commons, structured into five sections: background, institutional staff involvement, social interactions, statistics, and assessment (Vaughan, 2010, p. 188).  Results of this survey reflected an overwhelmingly positive experience on the part of the participating institutions.  Vaughan writes, “From the institutional perspective, exposing their collections to a broader audience, building online communities, enhancing their own knowledge of their collections, and testing the ‘no known copyright restrictions’ license were all positive outcomes of joining The Commons” (Vaughan, 2010, p. 199).  He goes on to add, “Concerns observed by The Commons’ members were basically nonexistent and were strongly outweighed by the benefits” (Vaughan, 2010, p. 199).
Since 2005 a growing number of institutions have been allowing members of the general public to tag digital objects in their collections with descriptive keywords (Cairns, 2013, p. 109).  Examples of social tagging in digital image description, like the ones mentioned in this paper, highlight some of the potential benefits of using such a system in our cultural heritage institutions.  Although further study is needed, there appears to be real potential for enriching metadata and expanding user access to materials by integrating such a system with existing controlled index terms.  It will be exciting to see where the future of social tagging takes us!


Reference List

--Cairns, S. (2013). Mutualizing Museum Knowledge: Folksonomies and the Changing Shape of Expertise. Curator, 56(1), 107-119.
--Flickr. The Commons. Retrieved from: https://www.flickr.com/commons/ (accessed 10 December 2015).
--Menard, E., & Smithglass, M. (2012). Digital image description: a review of best practices in cultural institutions. Library Hi Tech, 30(2), 291-309.
--Rorissa, A. (2010). A comparative study of Flickr tags and index terms in a general image collection. Journal of the American Society for Information Science & Technology, 61(11), 2230-2242. doi:10.1002/asi.21401
--Springer, M., Dulabahn, B., Michel, P., Natanson, B., Reser, D., Woodward, D., & Zinkham, H. (2008). For the common good: The Library of Congress Flickr pilot project. Retrieved from: www.loc.gov/rr/print/flickr_report_final.pdf (accessed 10 December 2015).
--Stvilia, B., Jörgensen, C., & Wu, S. (2012). Establishing the value of socially-created metadata to image indexing. Library and Information Science Research, 34, 99-109. doi:10.1016/j.lisr.2011.07.011
--Taylor, A. G. & Joudrey, D. N. (2009). The Organization of Information. Westport, CT: Libraries Unlimited.
--Vaughan, J. (2010). Insights into The Commons on Flickr. Portal: Libraries and the Academy, (2), 185-214.
