Bulletin, June/July 2008
Situating Relevance Through Task-Genre Relationships
by Luanne Freund
Luanne Freund is an assistant professor at the School of Library, Archival and Information Studies at the University of British Columbia in Vancouver, Canada. She can be reached at luanne.freund<at>ubc.ca.
One of the most discussed concepts in information science is relevance, a relationship in which one thing has a direct bearing on another. We assume that people seek out relevant information to deal with a question or problem, and we design information systems to identify the most relevant information for any given query. Yet assessing the relevance of information is far from straightforward in real-life situations. Several decades of user studies have shown that people use a wide range of criteria to select one information source over another, such as subject matter, style, format, novelty, quality and authorship. Furthermore, perceptions of relevance change from person to person, over time and from one situation to the next. In information seeking, as in yard sales, one man’s treasure is another man’s trash.
The complexity and subjectivity of real-world relevance led early information retrieval researchers to focus on the relatively static and measurable concept of logical or topical relevance (is a given document about the topic of a searcher’s information need?), rather than the dynamic and subjective concept of utility or situational relevance (is a given document suited to a particular searcher’s situation at a given point in time and space?). As Cooper noted in 1971:
“The system designer cannot currently do much to ensure that only the most credible material will be retrieved: such a goal lies largely beyond the state of the art at present. He cannot gauge the relative importance to the user of different component statements either….Logical relevance is almost the only factor in utility which the designer does know how to deal with very effectively at present. This suggestion, if true, would help to explain why topic relevance and not utility has received the most attention in the literature.” [1, p. 36]
However, it is now almost 40 years later. Search systems have developed rapidly since the 1970s, and yet the state of the art still pays very little attention to the impact of the searcher’s situation on relevance. Search engines retrieve thousands of documents for every query and depend upon the searcher to sift through and find the treasures among the trash. Despite their indiscriminating nature, search systems are the most effective tools available to deal with the swelling sea of digital information. In the face of our growing dependency upon them, it is essential that we begin to develop systems capable of discriminating between documents that are simply on topic and those which are genuinely useful to the searcher. In other words, it is time for information professionals and system designers to come to grips with situational relevance.
Matching Documents to a Searcher’s Situation
In order to design systems capable of measuring and making use of situational relevance, we need to be able to model the searcher’s situation, identify the associated document features and find a way to match these up. By and large, topical relevance is measured by comparing search queries with documents and identifying the degree of term overlap. The strength of the query-document match varies according to the frequency and location of the overlapping terms. If we apply the same logic to situational relevance, we should look for evidence of situational overlap between searchers and documents. The concept of situation is not as well-defined as topic or subject, but refers to some combination of personal, social, organizational and physical factors that exist at a given point in time and influence a person’s behavior. Of these factors, the task in which a searcher is engaged has been identified as a dominant feature of the situation. So one way in which overlapping situations are expressed in information seeking is that, in a given task situation, a searcher will prefer document genres designed to support that task. For example, a homeowner installing a dishwasher is likely to prefer a set of instructions rather than a product review or technical specifications. In the workplace, a project manager preparing a marketing presentation for a line of dishwashers is likely to prefer promotional materials or other marketing presentations over user manuals or installation instructions.
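The term-overlap view of topical relevance described above can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the function names, the `title_weight` parameter (a crude stand-in for the effect of term location) and the two sample documents are hypothetical, and real search engines use considerably more refined weighting schemes.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def topical_score(query, title, body, title_weight=2.0):
    """Sum the frequency of each distinct query term in the document.

    A match in the title counts title_weight times as much as a match
    in the body (an assumed, simplified model of term location).
    """
    title_counts = Counter(tokenize(title))
    body_counts = Counter(tokenize(body))
    score = 0.0
    for term in set(tokenize(query)):
        score += title_weight * title_counts[term] + body_counts[term]
    return score


# Hypothetical documents: name -> (title, body).
docs = {
    "install-guide": (
        "Dishwasher installation instructions",
        "Connect the dishwasher drain hose, then level the dishwasher.",
    ),
    "review": (
        "Kitchen appliance review",
        "This dishwasher is quiet and efficient.",
    ),
}

for name, (title, body) in docs.items():
    print(name, topical_score("dishwasher installation", title, body))
```

Both documents are on topic to some degree, but nothing in this score reflects whether the searcher is installing a dishwasher or deciding which one to buy. That gap is what a measure of situational overlap would need to fill.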
Of course, these genre preferences are secondary and supplementary to topical relevance. Information that is not about the specific brand of dishwasher in question will be largely irrelevant, regardless of the searcher’s task and the document genre. And, if only a handful of documents about the specific dishwasher are available, then distinguishing between them on the basis of situational relevance would not add much value. Nevertheless, given that most search queries are very brief (one to two terms) and retrieve many more documents than searchers need or want, techniques to further refine search results are sorely needed. The remainder of this article will expand upon this notion of using task-genre relationships to determine situational relevance.
Task as a Situational Model for the Searcher
Tasks are activities undertaken to achieve goals. All of us are engaged in tasks, both assigned and self-motivated, in the course of our daily lives. These may be work or leisure tasks, for example, researching a new drug under development or designing a poster for a community dance. In the course of carrying out work or leisure tasks, problems and questions arise that prompt more specific tasks in the form of information tasks and search tasks – activities focused on the acquisition and use of information. Some common types of information tasks are learning, making decisions, finding facts, carrying out a process and solving problems. So, for example, in order to design a poster (work task), it might be necessary to find facts (information task) on the location and venue for the dance and to learn (information task) to use a graphic design software application. These information tasks are motivated by and nested in the broader work and/or leisure tasks. Together they provide a task framework for information behavior.
The value of focusing on such a task framework is that it provides a means of distinguishing patterns of information behavior at an intermediate level instead of generalizing at the level of the entire population or trying to tease out differences among individuals. One of the main problems with search engines is that they are designed based on the assumption that everyone who submits the query “border collie” wants to see the same set of results (in fact, they assume that we all want to see the Wikipedia page of that name). The other extreme is personalization systems, designed on the assumption that each individual has unique and abiding personal preferences for “border collie” documents, based on his or her interests and abilities. The former approach is excessively generic and the latter is likely to require more intrusive personal data collection than most of us would be willing to allow. A middle ground is to assume that searchers’ preferences vary according to the task situation. A set of common tasks associated with the query “border collie” might include purchasing, caring for and learning about, and different sets of results would be best suited to each. So the task framework of a search can serve as a simple model of the searcher’s situation, which takes us part way in coming to grips with situational relevance.
Genre as a Situational Model for Documents
But how can we determine which results are best suited to these task scenarios? This is where genre steps up to the plate. Genre variation is a prominent feature of most large, mixed-content digital document collections, such as the Internet, organizational information spaces and digital libraries. Although the concept of genre has been around for a very long time, current genre research focusing on these large digital collections emphasizes the social and communicative functions of genre, defined as “typified communicative actions characterized by similar substance and form and taken in response to recurrent situations” [2, p. 299]. Genres emerge within communities of authors and readers as rhetorical devices to facilitate and enrich text-based communication. Because genres arise out of recurring situations, they carry the stamp of those situations and convey additional contextual information to the reader.
Consider the executive summary as a genre. It emerged within organizational settings to meet the need to summarize and highlight the major points of much longer reports. Without even looking at it, the reader knows a number of things about the document – why it was written, the audience for whom it was intended, that it is part of a larger document, that it will be relatively easy to read and brief and, importantly, that it presents a carefully constructed scale version of the truth. So genres serve as cues, informing a familiar reader about the situation in which a document was created, the intent(s) of the author, the structure of the document, the style of language and the uses to which it is suited. Within any given community or domain, repertoires of genres emerge in response to common needs and situations, become more familiar and formalized through use and may evolve or die out in response to changing needs. So like task, genre serves as a situational model. It offers a means of subdividing large sets of documents about a given topic into intermediate groups based on genre types, which vary as to their fit with a given situation.
The Relationship of Task and Genre
The next step is to determine in what sense task and genre “overlap.” Of course, there is a certain obvious connection between them, which is clear from some of the examples above. Who wouldn’t prefer a nice, clear set of step-by-step instructions to a 250-page user manual when trying to install a new piece of software? However, to move beyond these obvious individual cases and consider applying this idea to system design requires a deeper look at the nature of task-genre associations.
Our studies of task-genre associations in a software engineering work environment provided evidence that task and genre are, indeed, related. One study examined a database of shared documents, which had been indexed by the employees using task and genre descriptors. We found that the descriptors overlapped in non-random patterns throughout the dataset. For example, documents with the descriptor “cookbook” (step-by-step instructions) also tended to be assigned the descriptor “software installation” more often than other task descriptors. Another study asked participants to assign usefulness scores to 16 genres with respect to five information tasks. Again, scores varied significantly across tasks and genres, and it was interesting to see that for each task type, a different genre received the highest average usefulness score, suggesting a strong alignment (see Table 1).
Table 1. The highest scoring genre types for common information tasks
|Information Task Types|Genre Types|
|decision making task| |
|procedural (how-to) task| |
|problem solving task|product documentation|
Based on these studies, we were able to identify some patterns in task-genre associations:
- Genre repertoires of varying sizes are associated with specific work and information tasks and, at a broader level, with various professional roles performed in the workplace.
- Task and genre are associated on the basis of the level of detail in the work and the information content. High-level, conceptual tasks, such as design or planning, are associated with genre types with a broader scope and more abstract content; the opposite is true for low-level technical tasks.
- There are varying degrees of strength and scope in the associations of a given genre with workplace tasks. For example, software manuals were found to be useful in almost all task situations, while demos were associated with a very limited number of tasks.
- Some genres are more strongly associated with information tasks and others with work tasks. Some were not strongly associated with any tasks in this setting.
- Some genres, such as design patterns, clearly emerged to support a particular work task (software architecture), while other genres were co-opted from other contexts and used in task situations for which they were not originally intended.
These observations all point to the complex, dynamic and somewhat messy nature of the task-genre relationship. Therein lies its strength, as an organic expression of real-world information practices; and therein lies its weakness, as finding a way to harness this relationship is a challenging proposition.
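One way to look for the kind of non-random overlap between task and genre descriptors described in the first study is to compare how often a pair co-occurs with how often it would co-occur if the two descriptors were independent. The sketch below uses lift for this purpose. This is an assumed measure chosen for illustration, not the analysis used in the studies, and the records are invented; apart from "cookbook" and "software installation", the pairings shown are hypothetical.

```python
from collections import Counter

# Hypothetical (task descriptor, genre descriptor) pairs, one per indexed document.
records = [
    ("software installation", "cookbook"),
    ("software installation", "cookbook"),
    ("software installation", "manual"),
    ("architecture design", "design pattern"),
    ("architecture design", "manual"),
    ("troubleshooting", "manual"),
]


def lift_table(pairs):
    """Return lift = P(task, genre) / (P(task) * P(genre)) for each observed pair."""
    n = len(pairs)
    pair_counts = Counter(pairs)
    task_counts = Counter(task for task, _ in pairs)
    genre_counts = Counter(genre for _, genre in pairs)
    return {
        (task, genre): (count * n) / (task_counts[task] * genre_counts[genre])
        for (task, genre), count in pair_counts.items()
    }


lifts = lift_table(records)
for (task, genre), value in sorted(lifts.items(), key=lambda kv: -kv[1]):
    print(f"{task:22s} {genre:15s} lift={value:.2f}")
```

A lift above 1 means the pair occurs more often than independence would predict, and a lift below 1 means less often. In this toy data the widely used "manual" genre shows weak or neutral lift with most tasks, which loosely mirrors the observation that some genres are useful almost everywhere while others are tied to a narrow set of tasks.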
Making Use of Task-Genre Relationships
Returning to situational relevance, it seems evident that an association exists between a searcher’s task frameworks and document genre types that represents some measure of situational relevance. However, there is still considerable work to be done before this approach can be implemented in search systems, not least of which is to develop more robust methods for automatic genre classification. We also need to find a scalable method of measuring strengths of associations between tasks and genre types in a given domain. These tasks are not trivial, but as the articles in this special issue indicate, considerable research attention is being turned toward genre, and the same is true of task. In fact, there are a number of reasons why I am more optimistic about the role that genre can play in information retrieval than some of my fellow authors.
Genre can be used explicitly in a search system by allowing searchers to either pre-select genre limits when they issue queries or post-select genre categories to limit or cluster their search results. Either method requires that the searcher be familiar with the genre types in the collection, be cognizant of which genre will best serve his or her purpose and be willing to spend the time to provide this input. It also requires that the collection be accurately classified by genre and that genre type labels be familiar and recognizable to users. I am not confident that these conditions can ever be met satisfactorily, and as Dillon notes in his article in this section, the benefits of such a system over the status quo are not likely to be substantial.
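The two explicit uses of genre described above, pre-selecting a genre limit and post-selecting genre categories, can be sketched in a few lines. The result list, the genre labels attached to it and the product name "X100" are all hypothetical; the sketch simply assumes that every document already carries one accurate genre label, which is exactly the condition questioned above.

```python
from collections import defaultdict

# Hypothetical ranked search results: (document title, genre label).
results = [
    ("Installing the X100 dishwasher", "instructions"),
    ("X100 dishwasher: our verdict", "product review"),
    ("X100 technical data sheet", "technical specifications"),
    ("X100 quick-start steps", "instructions"),
]


def filter_by_genre(hits, allowed_genres):
    """Pre-selection: keep only hits whose genre the searcher chose up front."""
    return [hit for hit in hits if hit[1] in allowed_genres]


def cluster_by_genre(hits):
    """Post-selection: group hits under their genre label, preserving rank order."""
    clusters = defaultdict(list)
    for title, genre in hits:
        clusters[genre].append(title)
    return dict(clusters)


print(filter_by_genre(results, {"instructions"}))
print(cluster_by_genre(results))
```

Both functions are trivial once the labels exist. The difficulty lies in producing labels that are accurate and that searchers recognize, since a single wrong label here hides a document entirely or files it under the wrong heading.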
However, as I have suggested here, genre can also be used implicitly, through its association with task, to influence the ranking of search results based on situational relevance. This approach requires that the searcher’s task be specified or inferred, but does not require that the searcher be familiar with specific genres or labels. In this type of system, genre classification can be fuzzy and approximate, since genre only serves as one of many sources of evidence used to rank documents, rather than as a display or filtering feature that requires a high level of accuracy. I believe that this type of search system is not unobtainable, and given that we have already built a prototype using this approach, it may not be too far over the horizon.
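A minimal sketch of this implicit approach follows. It is not the prototype mentioned above, whose design is not described here; it only illustrates the general idea under stated assumptions. The association strengths, the genre probabilities standing in for a fuzzy classifier's output, the linear combination and the `genre_weight` parameter are all hypothetical.

```python
# Hypothetical task-genre association strengths in the range 0 to 1.
TASK_GENRE_ASSOCIATION = {
    "procedural": {"instructions": 0.9, "manual": 0.6, "product review": 0.1},
    "decision making": {"instructions": 0.1, "manual": 0.3, "product review": 0.9},
}


def situational_score(task, genre_probs):
    """Expected task-genre association under a fuzzy genre classification.

    genre_probs maps genre labels to the (assumed) classifier's probabilities.
    """
    associations = TASK_GENRE_ASSOCIATION.get(task, {})
    return sum(p * associations.get(genre, 0.0) for genre, p in genre_probs.items())


def rerank(hits, task, genre_weight=0.5):
    """Sort hits by topical score plus a weighted situational score.

    Each hit is a tuple (doc_id, topical_score, genre_probs).
    """
    def combined(hit):
        _, topical, genre_probs = hit
        return topical + genre_weight * situational_score(task, genre_probs)

    return sorted(hits, key=combined, reverse=True)


# Hypothetical hits with similar topical scores but different likely genres.
hits = [
    ("review-1", 0.80, {"product review": 0.8, "manual": 0.2}),
    ("howto-1", 0.75, {"instructions": 0.7, "manual": 0.3}),
]

print([doc_id for doc_id, _, _ in rerank(hits, "procedural")])
print([doc_id for doc_id, _, _ in rerank(hits, "decision making")])
```

Because the genre evidence only shifts scores rather than hiding documents, a misclassified document is demoted slightly instead of disappearing, which is why approximate genre classification may be tolerable in this setting. The same query yields a different ordering for each task, and the searcher never has to name a genre.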