1.3. Design Decisions in Organizing Systems

A set of resources is transformed by an Organizing System when the resources are described or arranged to enable interactions with them. Explicitly or by default, this requires many interdependent decisions about the identities of resources; their names, descriptions and other properties; the classes, relations, structures and collections in which they participate; and the people or technologies interacting with them.

One important contribution of the idea of the Organizing System is that it moves beyond the debate about the definitions of “things,” “documents,” and “information,” with the unifying concept of “resource” while acknowledging that “what is being organized” is just one of the questions or dimensions that need to be considered. These decisions are deeply intertwined, but it is easier to introduce them as if they were independent.

We introduce six groups of design questions, itemizing the most important dimensions in each group:

  • What is being organized? What is the scope and scale of the domain? What is the mixture of physical things, digital things, and information about things in the Organizing System? Is the Organizing System being designed to create a new resource collection, catalog an existing and closed resource collection, or manage a collection in which resources are continually added or deleted? Are the resources unique, or are they interchangeable members of a category? Do they follow a predictable “life cycle” with a “useful life”? (§1.3.2)

  • Why is it being organized? What interactions or services will be supported, and for whom? Are the uses and users known or unknown? Are the users primarily people or computational processes? Does the Organizing System need to satisfy personal, social, or institutional goals? (§1.3.3)

  • How much is it being organized? What is the extent, granularity, or explicitness of description, classification, or relational structure being imposed? What organizing principles guide the organization? Are all resources organized to the same degree, or is the organization sparse and non-uniform? (§1.3.4)

  • When is it being organized? Is the organization imposed on resources when they are created, when they become part of the collection, when interactions occur with them, just in case, just in time, all the time? Is any of this organizing mandated by law or shaped by industry practices or cultural tradition? (§1.3.5)

  • How or by whom, or by what computational processes, is it being organized? Is the organization being performed by individuals, by informal groups, by formal groups, by professionals, by automated methods? Are the organizers also the users? Are there rules or roles that govern the organizing activities of different individuals or groups? (§1.3.6)

  • Where is it being organized? Is the resource location constrained by design or by regulation? Are the resources positioned in a static location? Are the resources in transit or in motion? Does their location depend on other parameters, such as time? (§1.3.7)

How well these decisions coalesce in an Organizing System depends on the requirements and goals of its human and computational users, and on understanding the constraints and tradeoffs that any set of requirements and goals imposes. How and when these constraints and tradeoffs are handled can depend on the legal, business, and technological contexts in which the Organizing System is designed and deployed; on the relationship between the designers and users of the Organizing System (who may be the same people or different ones); on the economic or emotional or societal purpose of the Organizing System; and on numerous other design, deployment, and use factors.

1.3.1. Organizing Systems in a “Design Space”

Classifying Organizing Systems according to the kind of resources they contain is the most obvious and traditional approach. We can also classify Organizing Systems by their dominant purposes, by their intended user community, or in other ways. No single fixed set of categories is sufficient by itself to capture the commonalities and contrasts between Organizing Systems.

We can augment the categorical view of Organizing Systems by thinking of them as existing in a multi-faceted or multi-dimensional design space in which we can consider many types of collections at the same time.

1.3.1.1. Conventional Ways to Classify Organizing Systems

We distinguish law libraries from software libraries, knowledge management systems from data warehouses, and personal stamp collections from coin collections primarily because they contain different kinds of resources. Similarly, we distinguish document collections by resource type, contrasting narrative document types like novels and biographies with transactional ones like catalogs and invoices, with hybrid forms like textbooks and encyclopedias in between.

But there are three other conventional ways to classify Organizing Systems. A second way to distinguish Organizing Systems is by their dominant purposes or the priority of their common purposes. For example, libraries, museums, and archives are often classified as “memory institutions” to highlight their primary emphasis on resource preservation. In contrast, “management information systems” or “business systems” are categories that include the great variety of software applications that implement the Organizing Systems needed to carry out day-to-day business operations.

A third conventional approach for classifying Organizing Systems is according to the nature or size of the intended user community. This size or scope can range from personal Organizing Systems created and used by a single person; to “community-based” Organizing Systems used by informal social groups; to those used by the employees, customers or stakeholders of an enterprise; to those used by an entire community or nation; to global ones potentially used by anyone in the world.

A fourth way to distinguish Organizing Systems is according to the technology used to implement them. Large businesses use different software applications for inventory management, records management, content management, knowledge management, customer relationship management, data warehousing and business intelligence, e-mail archiving, and other subcategories of collections.14[Com]

[14][Com] Sometimes many of these Organizing Systems and their associated applications are implemented using a unified storage foundation provided by an enterprise content management (ECM) or enterprise data management (EDM) system. An integrated storage tier can improve the integrity and quality of the information but is invisible to users of the applications.

We can become overwhelmed by this proliferation of ways to classify collections of resources, especially when the classification is not clearly based on just one of these many approaches. For example, the list of “library types” used by the International Federation of Library Associations to organize its activities includes resource-based distinctions (e.g., art libraries, law libraries, social science libraries), purpose-based ones (e.g., academic and research libraries), and user-based distinctions (e.g., public libraries, school libraries, libraries serving persons with print disabilities).15[LIS]

1.3.1.2. A Multifaceted or Multidimensional View

A type of resource and its conventional Organizing System are often the focal point of a discipline. Category labels such as library, museum, zoo, and data repository have core meanings and many associated experiences and practices. Specialized concepts and vocabularies often evolve to describe these. The richness that follows from this complex social and cultural construction makes it difficult to define category boundaries precisely. Consider Borgman’s commonly accepted definition of libraries as institutions that “select, collect, organize, conserve, preserve, and provide access to information on behalf of a community of users.” Many Organizing Systems are described as libraries, although they differ from traditional libraries in important respects. (See the sidebar, What Is a Library?)

We can always create new categories by stretching the conventional definitions of “library” or other familiar Organizing Systems and adding modifiers, as when Flickr is described as a web-based photo-sharing library. But whenever we define an Organizing System with respect to a familiar category, the typical or mainstream instances and characteristics of that category that are deeply embedded in language and culture are reinforced, and those that are atypical are marginalized. In the Flickr case, this means we suggest features that are not there (like authoritative classification) or omit the features that are distinctive (like tagging by users).

Another way to apply existing categories to understand or design an Organizing System is by using metaphor. Instead of stretching a category just a little bit by qualifying it, as we do when we describe something as a tool library or seed library, using a metaphor creates a new category by intersecting two existing ones that are quite different, thereby drawing attention to the small subset of their common properties. For example, the coach of a sports team might describe one player as a beast and another as a fox to highlight their dominant capabilities, “strong and ferocious” versus “quick and clever.” If we describe an organization as a machine, this metaphor evokes precision, reliability, and efficiency and suggests that the workers perform specialized and routinized tasks with little discretion because they are viewed as interchangeable components. If instead we describe an organization as an organism, this metaphor draws attention to what the organization needs to survive and thrive in its environment, helps us realize why different organizational “species” are adapted to different environments, and why and how they are born, grow, develop, and die during their typical lifetimes. However, just as with category labels, there are limitations when metaphors guide the design of Organizing Systems. Even though the metaphor is only meant to highlight the attributes at the category intersection, it is hard to ignore the often evocative imagery that comes from non-intersecting ones. A football player described as a beast might be strong and ferocious, but is definitely not a dangerous four-footed animal.16a[Bus]

[16a][Bus] This idea of using metaphor to describe the organization of businesses is developed in a bestselling book, Images of Organization (Morgan, et al 1997). Morgan discusses the machine, organism, brain, culture, political system, and other metaphors as they apply to business organizations. He notes, “Metaphor is inherently paradoxical. It can create powerful insights that also become distortions, as the way of seeing created through a metaphor becomes a way of not seeing.”

More generally, a categorical view of Organizing Systems makes it matter greatly which category is used to anchor definitions or comparisons. The Google Books project makes out-of-print and scholarly works vastly more accessible, but framing it in library terms to suggest it is a public good upsets many people with a more traditional sense of what the library category implies. We can readily identify design choices in Google Books that are more characteristic of the Organizing Systems in business domains, and the project might have been perceived more favorably had it been described as an online bookstore that offered many beneficial services for free.

A complementary perspective on Organizing Systems is that they exist in a multi-faceted or multi-dimensional design space. This framework for describing and comparing Organizing Systems overcomes some of the biases and conservatism built into familiar categories like libraries, museums, and archives, while enabling us to describe them as design patterns that embody characteristic configurations of design choices. We can then use these patterns to support inter-disciplinary work that cuts across categories and applies knowledge about familiar domains to unfamiliar ones. A dimensional perspective makes it easier to translate between category- and discipline-specific vocabularies so that people from different disciplines can have mutually intelligible discussions about their organizing activities. They might realize that they have much in common, and they might be working on similar or even the same problems.

A faceted or dimensional perspective acknowledges the diversity of instances of collection types and provides a generative, forward-looking framework for describing hybrid types that do not cleanly fit into the familiar categories. Even though it might differ from the conventional categories on some dimensions, an Organizing System can be designed and understood by its family resemblance on the basis of its similarities on other dimensions to a familiar type of resource collection.
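The dimensional view can be sketched in code. The following is a minimal illustration, assuming that each Organizing System is summarized by one free-text label per design question; the class name, the field names, and the example values are all hypothetical and are not taken from any standard. It treats "family resemblance" crudely, as the fraction of dimensions on which two systems agree.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DesignPoint:
    """One Organizing System located on the six design dimensions.

    The field values are free-text labels chosen for illustration only.
    """
    what: str      # what is being organized
    why: str       # why it is being organized
    how_much: str  # how much it is being organized
    when: str      # when it is being organized
    who: str       # how or by whom it is being organized
    where: str     # where it is being organized


def shared_dimensions(a: DesignPoint, b: DesignPoint) -> list[str]:
    """Names of the dimensions on which two systems have the same value."""
    return [f.name for f in fields(a) if getattr(a, f.name) == getattr(b, f.name)]


def resemblance(a: DesignPoint, b: DesignPoint) -> float:
    """Fraction of dimensions shared, a crude stand-in for family resemblance."""
    return len(shared_dimensions(a, b)) / len(fields(a))


# Hypothetical descriptions, not authoritative characterizations.
library = DesignPoint("books", "lending", "detailed catalog",
                      "on acquisition", "professionals", "fixed location")
tool_library = DesignPoint("tools", "lending", "brief inventory",
                           "on acquisition", "volunteers", "fixed location")

print(shared_dimensions(library, tool_library))  # ['why', 'when', 'where']
print(resemblance(library, tool_library))        # 0.5
```

In this sketch the two systems differ on what is organized, how much, and by whom, yet resemble each other on the remaining dimensions, which is the sense in which a hybrid type can be understood by its similarity to a familiar one. A real comparison would need richer values than single labels.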

Thinking of Organizing Systems as points or regions in a design space makes it easier to invent new or more specialized types of collections and their associated interactions. If we think metaphorically of this design space as a map of Organizing Systems, the empty regions or “white space” between the densely-populated centers of the traditional categories represent Organizing Systems that do not yet exist. We can consider the properties of an Organizing System that could occupy that white space and analyze the technology, process, or policy innovations that might be required to let us build it there. We can reason by analogy to identify and apply the principles used in one Organizing System to understand or design others. For example: Google Books is to Library as ? is to Natural History Museum.17[LIS]

[17][LIS] Depending on which characteristics of Google Books and libraries you think about, you might complete this analogy with an animal theme park like Sea World (http://www.seaworld.com/) or a private hunting reserve that creates personalized “big game” hunts. Or maybe you can invent something completely new.

But even though digital technology is radically subdividing the traditional categories of collections by supporting new kinds of specialized information-intensive applications, an opposite and somewhat paradoxical trend has emerged. Jennifer Trant argues that the common challenges of “going digital,” and the architectural and functional constraints imposed by web implementations, are causing some convergence in the operation of libraries, museums, and archives. Similarly, Anne Gilliland suggests that giving every physical resource in a collection a digital surrogate or proxy that is searchable and viewable in a web browser is “erasing the distinctions between custodians of information and custodians of things.”18[LIS]

Taken together, these two trends have one profound implication. If the traditional categories for thinking about collections are splintering in some respects and converging in others, they are less useful in describing innovative collections and their associated interactions. Thus, we need a new concept, the Organizing System, that

  • applies comprehensively and consistently to collections of resources of any type;

  • reuses familiar categories where they are appropriate, but does not impose them on new types of collections and services where they do not fit well;

  • makes it easier to trace the connections between specific requirements or constraints and particular functions or implementation choices.

1.3.2. What Is Being Organized?

“What is difficult to identify is difficult to describe and therefore difficult to organize.”

Before we can begin to organize any resource we often need to identify it. It might seem straightforward to devise an Organizing System around tangible resources, but we must be careful not to assume what a resource is. In different situations, the same “thing” can be treated as a unique item, one of many equivalent members of a broad category, or a component of an item rather than as an item on its own. For example, in a museum collection, a handmade, carved chess piece might be a separately identified item, identified as part of a set of carved chess pieces, or treated as one of the 33 unidentified components of an item identified as a chess set (including the board). When merchants assign a stock-keeping unit (SKU) to identify the things they sell, that SKU can be associated with a unique item, sets of items treated as equivalent for inventory or billing purposes, or intangible things like warranties.

You probably do not have explicit labels on the cabinets and drawers in your kitchen or clothes closet, but department stores and warehouses have signs in the aisles and on the shelves because of the larger number of things a store needs to organize. As a collection of resources grows, it often becomes necessary to identify each one explicitly; to create surrogates like bibliographic records or descriptions that distinguish one resource from another; and to create additional organizational mechanisms like shelf labels, store directories, library card catalogs and indexes that facilitate understanding the collection and locating the resources it contains. These organizational mechanisms often suggest or parallel the organizing principles used to organize the collection itself.

Organization mechanisms like aisle signs, store directories and library card catalogs are embedded in the same physical environment as the resources being organized. But when these mechanisms or surrogates are digitized, the new capabilities that they enable create design challenges. This is because a digital Organizing System can be designed and operated according to more abstract and less constraining principles than an Organizing System that only contains physical resources. A single physical resource can only be in one place at a time, and interactions with it are constrained by its size, location, and other properties. In contrast, digital copies and surrogates can exist in many places at once and enable searching, sorting, and other interactions with an efficiency and scale impossible for tangible things.

When the resources being organized consist of information content, deciding on the unit of organization is challenging because it might be necessary to look beyond physical properties and consider conceptual or intellectual equivalence. A high school student told to study Shakespeare’s play Macbeth might treat any printed copy or web version as equivalent, and might even try to outwit the teacher by watching a film adaptation of the play. To the student, all versions of Macbeth seem to be the same resource, but librarians and scholars make much finer distinctions.19[LIS]

[19][LIS] Organizing Systems that follow the rules set forth in the Functional Requirements for Bibliographic Records (FRBR) (Tillett 2005) treat all instances of Macbeth as the same “work.” However, they also enforce a hierarchical set of distinctions for finer-grained organization. FRBR views books and movies as different “expressions,” different print editions as “manifestations,” and each distinct physical thing in a collection as an “item.” This Organizing System thus encodes the degree of intellectual equivalence while enabling separate identities where the physical form is important, which is often the case for scholars.
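The hierarchy of distinctions described in this note can be illustrated with a small data structure. This is a simplified sketch, not the FRBR data model itself: the class `Item`, the function `same_at`, and the edition and copy labels are hypothetical names chosen for the example. Two copies count as equivalent at a level only if they also agree at every level above it.

```python
from dataclasses import dataclass

# Levels from most abstract to most concrete, following the note above.
LEVELS = ["work", "expression", "manifestation", "item_id"]


@dataclass(frozen=True)
class Item:
    """One physical or digital thing in a collection (simplified)."""
    work: str           # the intellectual creation, e.g. "Macbeth"
    expression: str     # how it is realized, e.g. "text" or "film"
    manifestation: str  # a particular edition or release
    item_id: str        # one distinct copy


def same_at(level: str, a: Item, b: Item) -> bool:
    """True if a and b agree at `level` and at every level above it."""
    depth = LEVELS.index(level) + 1
    return all(getattr(a, name) == getattr(b, name) for name in LEVELS[:depth])


# Hypothetical holdings used only to exercise the function.
print_a = Item("Macbeth", "text", "edition A", "copy-1")
print_b = Item("Macbeth", "text", "edition B", "copy-2")
film = Item("Macbeth", "film", "release X", "disc-1")

print(same_at("work", print_a, film))              # True
print(same_at("expression", print_a, film))        # False
print(same_at("manifestation", print_a, print_b))  # False
```

The student in the example above is in effect comparing at the "work" level, while a scholar who cares about a particular edition compares at the "manifestation" or "item" level.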

Archival Organizing Systems implement a distinctive answer to the question of what is being organized. Archives are a type of collection that focuses on resources created by a particular person, organization, or institution, often during a particular time period. This means that archives have themselves been previously organized as a result of the processes that created and used them. The “original order” of the resources in an archive embodies the implicit or explicit Organizing System of the person or entity that created the documents; it is treated as an essential part of the meaning of the collection. As a result, the unit of organization for archival collections is the fonds (the original arrangement or grouping, preserving any hierarchy of boxes, folders, envelopes, and individual documents), and thus they are not re-organized according to other (perhaps more systematic) classifications.20[Arc]

[20][Arc] Typical examples of archives might be national or government document collections or the specialized Julia Morgan archive at the University of California, Berkeley (http://www.oac.cdlib.org/findaid/ark:/13030/tf7b69n9k9/), which houses documents by the famous architect who designed many of the university’s most notable buildings as well as the famous Hearst Castle along the central California coast. The “original order” organizing principle of archival Organizing Systems was first defined by 19th-century French archivists and is often described as “respect pour les fonds.”

Some Organizing Systems contain legal, business or scientific documents or data that are the digital descendants of paper reports or records of transactions or observations. These Organizing Systems might need to deal with legacy information that still exists in paper form or in electronic formats like image scans that are different from the structural digital format in which more recent information is likely to be preserved. When legacy conversions from printed information artifacts are complete or unnecessary, an Organizing System no longer deals with any of the traditional tangible artifacts. Digital libraries dispense with these artifacts, replacing them with the capability to print copies if needed. This enables libraries of digital documents or data collections to be vastly larger and more accessible across space and time than any library that stores tangible, physical items could ever be.

An increasing number of Organizing Systems handle resources that are born digital. Ideally, digital texts can be encoded with explicit markup that captures structural boundaries and content distinctions, which can be used to facilitate organization, retrieval, or both. In practice the digital representations of texts are often just image scans that do not support much processing or interaction. A similar situation exists for the digital representations of music, photographs, videos, and other non-text content like sensor data, where the digital formats are structurally and semantically opaque.

This book does not emphasize systems that organize people, but it would be remiss not to mention them. Businesses organize their employees, schools organize their faculties and students, sports leagues and teams organize their players, and governments organize their citizens and residents to enable them to vote, drive, attend schools, and receive medical care and other benefits. Cemeteries organize people after they have died.

We often think and talk about time as a resource, and time fits the definition of “anything of value that supports goal-oriented activity” from §1.2.1. Furthermore, we could think of the calendar and clock as Organizing Systems that define time at different levels of granularity to support different kinds of interactions. However, it is probably more useful to think of time as a constraint that influences how and how much to organize.

If you're sorting your own mail, you can question whether the time you spend on sorting is worth the time you save on searching. But at scale—imagine 10 million books in a library—the considerable effort required to organize resources saves vastly more time for the many users of the system over its lifetime. Note the inherent tradeoff between time spent on organizing versus retrieval; this will be a recurring theme throughout this book. In a personal context the tradeoff is a matter of individual need or preference, but in social or institutional contexts organization and retrieval are generally done by different people, and their time is likely valued in different ways by the system owner.
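The tradeoff between organizing time and retrieval time can be made concrete with simple arithmetic. The sketch below assumes a deliberately crude model in which organizing has a one-time cost and saves a fixed amount of time on every later search; the function names and the numbers are illustrative assumptions, not measurements.

```python
import math


def net_time_saved(organizing_cost: float, saving_per_search: float,
                   searches: int) -> float:
    """Total time saved by organizing, after subtracting the up-front cost.

    All times must use the same unit (minutes in the examples below).
    """
    return saving_per_search * searches - organizing_cost


def break_even_searches(organizing_cost: float, saving_per_search: float) -> int:
    """Smallest number of searches at which organizing pays for itself."""
    if saving_per_search <= 0:
        raise ValueError("organizing must save some time per search")
    return math.ceil(organizing_cost / saving_per_search)


# Illustrative personal case: 60 minutes of sorting, 2 minutes saved per search.
print(break_even_searches(60, 2))  # 30
print(net_time_saved(60, 2, 10))   # -40: not yet worth it
print(net_time_saved(60, 2, 100))  # 140
```

With few searches the effort is a net loss, which is the personal case; with very many searches by many users, the same up-front effort is repaid many times over, which is the institutional case. The model ignores the point made above that organizers and searchers may be different people whose time is valued differently.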

1.3.3. Why Is It Being Organized?

“The central purpose of systems for organizing information [is] bringing like things together and differentiating among them.”

Almost by definition, the essential purpose of any Organizing System is to describe or arrange resources so they can be located and accessed later. The organizing principles needed to achieve this goal depend on the types of resources or domains being organized, and on the personal, social, or institutional setting in which organization takes place.

“Bringing like things together” is an informal organizing principle for many Organizing Systems. Almost as soon as libraries were invented over two thousand years ago, the earliest librarians saw the need to develop systematic methods for arranging and inventorying their collections.24[LIS] The invention of mechanized printing in the fifteenth century, which radically increased the number of books and periodicals, forced libraries to begin progressively more refined efforts to state the functional requirements for their Organizing Systems and to be explicit about how they met those requirements.

Today, the Organizing Systems in a large academic research library must also support many functions and services other than those that directly support search and location of resources in their collections. In these respects, the Organizing Systems in non-profit libraries have much in common with those in government agencies, corporate information repositories, and business applications. (See the sidebar, Library {and, or, vs.} Business Organizing Systems.)

Preserving documents in their physical or original form is the primary purpose of archives and similar Organizing Systems that contain culturally, historically, or economically significant documents that have value as long-term evidence. Preservation is also an important motivation for the Organizing Systems of information- and knowledge-intensive firms. Businesses and governmental agencies are usually required by law to keep records of financial transactions, decision-making, personnel matters, and other information essential to business continuity, compliance with regulations and legal procedures, and transparency. As with archives, it is sometimes critical that these business knowledge or records management systems can retrieve the original documents, although digital copies that can be authenticated are increasingly being accepted as legally equivalent.

This discussion of the requirements for organizing resources in memory institutions and businesses might convey the impression that storing and retrieving resources efficiently are paramount goals, and indeed they are in many contexts. But there are many other reasons for organizing resources, as is easily seen when we look at personal Organizing Systems. And there are many other ways to compare Organizing Systems than just how efficiently they enable storage and retrieval functions.

An overarching goal when people are organizing their personal resources is to minimize the effort needed to find the resources. But unlike the finding task in institutional Organizing Systems, which is generally facilitated with external resource descriptions, finding aids, classifications, search engines, and orientation and navigation mechanisms, the finding task in personal Organizing Systems is primarily a cognitive one: you need to remember where the resources are and how they are arranged. Because each person has unique experiences and preferences, it is not surprising that people often organize the same types of resources in different ways to make the organization easier to perceive and remember. The resulting resource arrangements often emphasize aesthetic or emotional goals, as when books or clothes are arranged by color or preference, or behavioral goals, as when most frequently used condiments and spices are kept on the kitchen counter rather than stored in a pantry.

When individuals manage their papers, books, documents, record albums, compact discs, DVDs, and other information resources, their Organizing Systems can vary greatly. This is in part because the content of the resources being organized becomes a consideration. Furthermore, many of the Organizing Systems used by individuals are implemented by web applications, and this makes them more accessible because their resources can be accessed from anywhere with a web browser.22[Web]

[22][Web] For example, many people manage their digital photos with Flickr, their home libraries with Library Thing, and their preferences for dining and shopping with Yelp. It is possible to use these “tagging” sites solely in support of individual goals, as tags like “my family,” “to read,” or “buy this” clearly demonstrate. But maintaining a personal Organizing System with these web applications potentially augments the individual’s purpose with social goals like conveying information to others, developing a community, or promoting a reputation. Furthermore, because these community or collaborative applications aggregate and share the tags applied by individuals, they shape the individual Organizing Systems embedded within them when they suggest the most frequent tags for a particular resource.

Put another way, an information resource inherently has more potential uses than resources like forks or frying pans, so it is not surprising that the Organizing Systems in offices are even more diverse than those in kitchens.

The fine distinctions between Organizing Systems that have many characteristics in common reflect subtle differences in the priority of their shared goals. For example, many Organizing Systems create collections and enable interactions with the goals of supporting scientific research, public education, and entertainment. We can contrast zoos, animal theme parks, and wild animal preserves in terms of the absolute and relative importance of these three goals with respect to animal resources.21[Cog]

[21][Cog] Seeking absolute boundaries between types of Organizing Systems is an impossible quest because how we define them varies with context or point of view. Zoos, animal theme parks, and wild animal parks all contain live animals, so we might conclude that they are more similar to each other than to a natural history museum in which the animals are all dead. Colonial Williamsburg (http://www.colonialwilliamsburg.com) has people re-enacting 18th-century Virginia and describes itself as a “living history museum,” but could it not be considered an animal theme park that has human animals? Is a cemetery in some ways a natural history museum?

When the scale of the collection or the number of intended users increases, not everyone is likely to share the same goals and design preferences for the Organizing System. If you share a kitchen with housemates, you might have to negotiate and compromise on some of the decisions about how the kitchen is organized so you can all get along. In more formal or institutional Organizing Systems conflicts between stakeholders can be much more severe, and the organizing principles might even be specified in commercial contracts or governed by laws or standards. For example, Bowker and Star note that physicians view the creation of patient records as central to diagnosis and treatment, insurance companies think of them as evidence needed for payment and reimbursement, and researchers think of them as primary data. Not surprisingly, policy making and regulations about patient records are highly contentious.23[LIS]

Once we acknowledge that stakeholders might not share the same goals, it is clear that efficiency is too narrow a measure for evaluating Organizing Systems. The ways that resources are organized embody the priorities and values of those doing the organizing, yielding arrangements and interactions designed to control or change the behaviors of the users. Put more bluntly, resources are always organized in ways that are designed to allocate value for some people (e.g., the owners of the resources, or the most frequent users of them) and not for others. From the perspective of the other types of user trying to interact with the system, this organization will likely seem unfair. In this way, organizing resources can often be seen as creating winners and losers, providing benefits to the former and imposing costs or constraints on the latter.

The emerging field of applied behavioral economics, popularized in books like Freakonomics and Nudge, explains how subtle differences in resource arrangement, the number and framing of choices, and default values can have substantial effects on the decisions people make. Consider the arrangement of salads, pasta dishes, bread, fish, meat, desserts and other types of food in a self-serve cafeteria buffet. In a school setting, the food might be organized and presented to encourage healthier eating, perhaps by making the fatty french fries and high-calorie desserts hard to reach or by providing smaller trays and plates. The same foods would likely be organized differently in an all-you-can-eat restaurant, where the goal is to minimize food costs, with less expensive items like salads at the front of the line to ensure that trays and plates will already be full when the customer gets to the more expensive items at the end of the line.23b[Bus]

The organization of cafeteria buffets to shape user behavior might not seem sinister. However, Organizing Systems can control behavior in ways that create or perpetuate inequities among their users. This unfairness is a matter of degree: a person who does not own a computer who goes to the public library to check out a popular book loses out when the library enables patrons with computer access to check out books online and assumes that everyone has an equal shot at accessing books via the Internet.

Looking to a much more insidious Organizing System, when the South African government adopted Apartheid policies to classify and segregate people by race, it systematized economic and political discrimination and great suffering for the nonwhite population. (See the sidebar, Power and Politics in Organizing.)

Chapter 7, “Classification: Assigning Resources to Categories,” more fully explains the different purposes for Organizing Systems, the organizing principles they embody, and the methods for assigning resources to categories.

1.3.4. How Much Is It Being Organized?

“It is a general bibliographic truth that not all documents should be accorded the same degree of organization.”

Not all resources should be accorded the same degree of organization. In this section we will briefly unpack this notion of degree of organization into three important and related dimensions: the amount of description detail or organization applied to each resource, the amount of organization of resources into classes or categories, and the overall extent to which interactions in and between Organizing Systems are shaped by resource description and arrangement. (Chapter 4 and Chapter 6 more thoroughly address these questions about the nature and extent of description in Organizing Systems.)

It is important to note that this section is not asking “how much stuff is being organized?” but rather “to what degree is the stuff being organized?” Another way to ask the same question is “how many organizing principles are at work in this Organizing System?” Your closet might be arranged only by body part covered and season; an online music store will likely organize resources and resource descriptions by genre, artist name, band name, album name, popularity, date released, and maybe others. So we would say that the online music store is organized much more than the closet, because more organizing principles are at work.

Not all resources in a collection require the same degree of description for the simple reason we discussed in §1.3.3, “Why Is It Being Organized?”: Organizing Systems exist for different purposes and to support different kinds of interactions or functions. Let us contrast two ends of the “degree of description” continuum. Many people use “current events awareness” or “news feed” applications that select news stories whose titles or abstracts contain one or more keywords (Google Alert is a good example). This exact match algorithm is easy to implement, but its all-or-none and one-item-at-a-time comparison misses any stories that use synonyms of the keyword, that are written in languages different from that of the keyword, or that are otherwise relevant but do not contain the exact keyword in the limited part of the document that is scanned. However, users with current events awareness goals do not need to see every news story about some event, and this limited amount of description for each story and the simple method of comparing descriptions are sufficient.
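The exact-match selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `matches_alert` and `select_stories` and the sample stories are hypothetical, and no real news feed application is claimed to work this way.

```python
import re

def matches_alert(keywords, title, abstract=""):
    """Return True if any keyword appears as a whole word in the scanned text.

    Only the title and abstract are scanned, mirroring the "limited part of
    the document" mentioned in the text.
    """
    scanned = f"{title} {abstract}".lower()
    words = set(re.findall(r"[a-z0-9']+", scanned))
    return any(keyword.lower() in words for keyword in keywords)

def select_stories(keywords, stories):
    """Keep the stories whose title or abstract contains an exact keyword."""
    return [s for s in stories
            if matches_alert(keywords, s["title"], s["abstract"])]

# Hypothetical stories about the same event.
stories = [
    {"title": "Earthquake strikes coastal city", "abstract": "Damage reported downtown."},
    {"title": "Strong tremor shakes coastal city", "abstract": "Residents evacuated."},
    {"title": "Terremoto sacude la costa", "abstract": ""},
]

selected = select_stories(["earthquake"], stories)
```

Only the first story is selected. The second uses a synonym and the third is in another language, so both are missed, which is acceptable for awareness purposes but not for comprehensive retrieval.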

On the other hand, this simple Organizing System is inadequate for the purpose of comprehensive retrieval of all documents that relate to some concept, event, or problem. This is a critical task for scholars, scientists, inventors, physicians, attorneys and similar professionals who might need to discover every relevant document in some domain. Instead, this type of Organizing System needs rich bibliographic and semantic description of each document, most likely assigned by professional catalogers, and probably using terms from a controlled vocabulary to enforce consistency in what descriptions mean.

Similarly, different merchants or firms might make different decisions about the extent or granularity of description when they assign SKUs because of differences in suppliers, targeted customers, or other business strategies. If you take your car to the repair shop because windshield wiper fluid is leaking, you might be dismayed to find that the broken rubber seal that is causing the leak cannot be ordered separately and you have to pay to replace the wiper fluid reservoir for which the seal is a minor but vital part. Likewise, when two business applications try to exchange and merge customer information, integration problems will arise if one describes a customer as a single “NAME” component while the other separates the customer’s name into “TITLE,” “FIRSTNAME,” and “LASTNAME.”
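The customer name example shows why differences in granularity are asymmetric. The following Python sketch is illustrative only; the function names and the naive splitting rule are assumptions, not a description of how any real business application behaves.

```python
def merge_name(title, firstname, lastname):
    """Fine-grained to coarse: joining the components into one NAME is easy."""
    return " ".join(part for part in (title, firstname, lastname) if part)

def split_name(name):
    """Coarse to fine-grained: a naive guess that treats the first token as
    FIRSTNAME and everything else as LASTNAME. It cannot recognize a title."""
    parts = name.split()
    if not parts:
        return {"TITLE": "", "FIRSTNAME": "", "LASTNAME": ""}
    return {"TITLE": "", "FIRSTNAME": parts[0], "LASTNAME": " ".join(parts[1:])}

name = merge_name("Dr.", "Mary", "Van Buren")
guess = split_name(name)
```

Here `name` is the single string "Dr. Mary Van Buren", but splitting it back yields "Dr." as the first name and "Mary Van Buren" as the last name. Going from the finer description to the coarser one is mechanical; going the other way requires knowledge that the coarser description never recorded.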

Even when faced with the same collection of resources, people differ in how much organization they prefer or how much disorganization they can tolerate. A classic study by Tom Malone of how people organize their office workspaces and desks contrasted the strategies and methods of “filers” and “pilers.” Filers maintain clean desktops and systematically organize their papers into categories, while pilers have messy work areas and make few attempts at organization. This contrast has analogues in other Organizing Systems and we can easily imagine what happens if a “neat freak” and “slob” become roommates.25[Cog]

[25][Cog] (Malone 1983) is the seminal research study, but individual differences in organizing preferences were the basis of Neil Simon’s Broadway play The Odd Couple in 1965, which then spawned numerous films and TV series.

An equally wide range, from a little organization to a lot, can be seen in the Organizing Systems for businesses, armies, governments, or any other institutional Organizing Systems for people. Organizations with broad scope and many people usually have deep hierarchies and explicit reporting relationships with the CEO, general, or president at the top with numerous layers of vice presidents, directors, department heads, and managers (or colonels, majors, captains, lieutenants, and sergeants). Smaller organizations are more varied, with some embodying multi-layered management, and some embracing a flatter arrangement with fewer management levels, wider spans of authority, and more autonomy for individual workers. Many start-up firms try to grow without any management structure at all in the belief that it makes them more innovative and nimble, but evidence suggests that when no one is responsible for making decisions, the lack of accountability results in poor decisions, or in no decisions at all even when some were sorely needed.25a[Bus]

In any case, when people have to do it, describing and organizing resources is work. Stakeholders in an Organizing System often disagree about how much organization is necessary because of the implications for who performs the work and who derives the benefits, especially the economic ones. Physicians prefer narrative descriptions and broad classification systems because they make it easier to create patient notes. In contrast, insurance companies and researchers want fine-grained “form-filling” descriptions and detailed classifications that would make the physician’s work more onerous.26[Com]

[26][Com] See Grudin’s classic work on non-technological barriers to the successful adoption of collaboration technology (Grudin 1994).

The cost-effectiveness of creating systematic and comprehensive descriptions of the resources in an information collection has been debated for nearly two centuries, beginning in 1841 when Sir Anthony Panizzi proposed rules for cataloguing the British Library. In the last half century, the scope of the debate grew to consider the role of computer-generated resource descriptions.27[LIS]

[27][LIS] Panizzi is most often associated with the origins of modern library cataloging. He (Panizzi 1841) published 91 cataloging rules for the British Library that defined authoritative forms for titles and author names, but the complexity of the rules and the resulting resource descriptions were widely criticized. For example, the famous author and historian Thomas Carlyle argued that a library catalog should be nothing more than a list of the names of the books in it. Standards for bibliographic description are essential if resources are to be shared between libraries. See (Denton 2007), (Anderson and Perez-Carballo 2001a, 2001b).

The amount of resource description is always shaped by the currently available technology for capturing, storing, and making use of it. Nineteenth century geologists and paleontologists typically recorded only general information about the depth and surrounding geological features when they found fossils because they had no technology for making more precise measurements and everything they noted they had to record by hand. Today, vastly more detailed information is recorded by instruments and exploited by sophisticated techniques for carbon dating and 3D reconstruction.27a[LIS]

Automatically generated descriptions are increasingly an alternative or complement to man-made ones. “Smart” resources use sensors to capture information about themselves and their environments (see §3.3.4). Our own computers and phones record information about our keystrokes, clicks, communications, and locations. Business and government computers analyze and index most of the text and speech content that flows through and between our personal phones and computers. These indexes typically assign weights to the terms according to calculations that consider the frequency and distribution of the terms in both individual documents and in the collection as a whole to create a description of what the documents are about. These descriptions of the documents in the collection are more consistent than those created by human organizers. They allow for more complex query processing and comparison operations by the retrieval functions in the Organizing System. For example, query expansion mechanisms can automatically add synonyms and related terms to the search. Additionally, retrieved documents can be arranged by relevance, while “citing” and “cited-by” links can be analyzed to find related relevant documents.
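One common way to compute term weights of this kind is TF-IDF (term frequency times inverse document frequency). The text does not name a specific formula, so the Python sketch below is an assumption chosen for illustration, using one simple variant: relative frequency within the document multiplied by the logarithm of the collection size divided by the number of documents containing the term. The function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Weight each term in each document.

    docs is a list of token lists. The result is one {term: weight} dict per
    document. A term scores high when it is frequent in the document but
    rare in the collection as a whole.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

docs = [
    ["the", "library", "catalog"],
    ["the", "wine", "list"],
    ["the", "library", "shelf"],
]
weights = tf_idf(docs)
```

In this tiny collection "the" appears in every document, so its weight is zero everywhere, while "catalog", which appears in only one document, outweighs "library" in the first document. The resulting weights serve as a computed description of what each document is about.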

It is important to recognize the potential downside to automated resource description. A detailed description produced by sensors or computers can seem more accurate or authoritative than a simpler one created by a human observer, even if the latter would be more useful for the intended purposes. Moreover, the more detailed the description, the greater the opportunity to use it for new purposes. This might be desirable, as when a company realizes that it can cross- and up-sell because it has been tracking every click in a web store. But it could be undesirable, because detailed transaction data can be used to violate privacy and civil rights.

A second constraint on the degree of organization comes from the size of the collection within the scope of the Organizing System. Organizing more resources requires more descriptions to distinguish any particular resource from the rest, and more constraining organizing principles. Similar resources need to be grouped or classified to emphasize the most important distinctions among the complete set of resources in the collection. A small neighborhood restaurant might have a short wine list with just ten wines, arranged in two categories for “red” and “white” and described only by the wine’s name and price. In contrast, a gourmet restaurant might have hundreds of wines in its wine list, which would subdivide its “red” and “white” high-level categories into subcategories for country, region of origin, and grape varietal. The description for each wine might in addition include a specific vineyard from which the grapes were sourced, the vintage year, ratings of the wine, and tasting notes.
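The contrast between the two wine lists can be expressed as the number of organizing principles applied to the same kind of resource description. The Python sketch below is a minimal illustration; the function `organize` and the wine records are hypothetical.

```python
def organize(resources, principles):
    """Nest resources into categories, one level per organizing principle.

    principles is an ordered list of property names. With no principles
    left, the resources are returned as a flat list.
    """
    if not principles:
        return list(resources)
    first, rest = principles[0], principles[1:]
    groups = {}
    for resource in resources:
        groups.setdefault(resource[first], []).append(resource)
    return {key: organize(members, rest) for key, members in groups.items()}

# Hypothetical wine descriptions.
wines = [
    {"name": "Wine A", "color": "red", "country": "France", "region": "Bordeaux"},
    {"name": "Wine B", "color": "red", "country": "Italy", "region": "Tuscany"},
    {"name": "Wine C", "color": "white", "country": "France", "region": "Burgundy"},
    {"name": "Wine D", "color": "red", "country": "France", "region": "Bordeaux"},
]

short_list = organize(wines, ["color"])
long_list = organize(wines, ["color", "country", "region"])
```

The neighborhood restaurant's list needs only one principle, so `short_list` has two categories. The gourmet restaurant's list applies three, so `long_list` nests region within country within color. The deeper structure is possible only because each wine's description records the additional properties.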

At some point a collection grows so large that it is not economically feasible for people to create bibliographic descriptions or to classify each separate resource, unless there are so many users of the collection that their aggregated effort is comparably large; this is organizing by “crowdsourcing.” (See the sidebar on “Web 2.0” in §1.3.6). This leaves two approaches that can be done separately or in tandem.

  • The simpler approach is to describe sets of resources or documents as a set or group, which is especially sensible for archives, with their emphasis on the fonds (see §1.3.2, “What Is Being Organized?”).

  • The second approach is to rely on automated and more general-purpose organizing technologies that organize resources through computational means. Search engines are familiar examples of computational organizing technology, and §7.6, “Computational Classification” describes other common techniques in machine learning, clustering, and discriminant analysis that can be used to create a system of categories and to assign resources to them.

Finally, we must acknowledge the ways in which information processing and telecommunications technologies have transformed and will continue to transform Organizing Systems in every sphere of economic and intellectual activity. A century ago, when the telegraph and telephone enabled rapid communication and business coordination across large distances, these new technologies enabled the creation of massive vertically integrated industrial firms. In the 1920s, the Ford Motor Company owned coal and iron mines, rubber plantations, railroads, and steel mills so it could manage every resource needed in automobile production and reduce the costs and uncertainties of finding suppliers, negotiating with them, and ensuring their contractual compliance. Adam Smith’s invisible hand of the market as an organizing mechanism had been replaced by the visible hand of hierarchical management to control what Ronald Coase in 1937 termed “transaction costs” in The Nature of the Firm.

But in recent decades a new set of information and computing technologies enabled by Moore’s law (unlimited computing power, effectively free bandwidth, and the Internet) have turned Coase upside down, leading to entirely new forms of industrial organization made possible as transaction costs plummet. When computation and coordination costs drop dramatically, it becomes possible for small firms and networks of services (provided by people or by computational processes) to out-compete large corporations through more efficient use of information resources and services, and through more effective information exchange with suppliers and customers, much of it automated. Herbert Simon, a pioneer in artificial intelligence, decision making, and human-computer interaction, recognized the similarities between the design of computing systems and human organizations and developed principles and mechanisms that could apply to both.28[Bus]

[28][Bus] Coase won the 1991 Nobel Prize in economics for his work on transaction costs, which he first published as a graduate student (Coase 1937). Berkeley business professor Oliver Williamson received the prize in 2009 for work that extended Coase’s framework to explain the shift from the hierarchical firm to the network firm (Williamson 1975, 1998). The notion of the “visible hand” comes from (Chandler 1977). Simon won the Nobel Prize in economics in 1978, but if there were Nobel Prizes in computer science or management theory he surely would have won them as well. Simon was the author or co-author of four books that have each been cited over 10,000 times, including (Simon 1997, 1996) and (Newell and Simon 1972).

Chapter 8, “The Forms of Resource Descriptions,” focuses on the representation of resource descriptions, taking a more technological or implementation perspective. Chapter 9, “Interactions with Resources,” discusses how the nature and extent of descriptions determine the capabilities of the interactions that locate, compare, combine, or otherwise use resources in information-intensive domains.

1.3.5. When Is It Being Organized?

“Because bibliographic description, when manually performed, is expensive, it seems likely that the ‘pre’ organizing of information will continue to shift incrementally toward ‘post’ organizing.”

The Organizing System framework recasts the traditional tradeoff between information organization and information retrieval as the decision about when the organization is imposed. We can contrast organization imposed on resources “on the way in” when they are created or made part of a collection with “on the way out” organization imposed when an interaction with resources takes place.

When an author writes a document, he or she gives it some internal organization via title, section headings, typographic conventions, page numbers, and other mechanisms that identify its parts and their significance or relationship to each other. The document could also have some external organization implied by the context of its publication, such as the name of its author and publisher, its web address if it is online or has a website, and citations or links to other documents or web pages.

Digital photos, videos, and documents are generally organized to some minimal degree when they are created because some descriptions, notably time and location, are assigned automatically to these types of resources by the technology used to create them. At a minimum, these descriptions include the resource’s creation time, storage format, and chronologically ordered, auto-assigned filename (IMG00001.JPG, IMG00002.JPG, etc.), but often are much more detailed.29[Com]

[29][Com] Most digital cameras annotate each photo with detailed information about the camera and its settings in the Exchangeable Image File Format (EXIF), and many mobile phones can associate their location with any digital object they create. Nevertheless, these descriptions are not always correct. For example, Microsoft Office applications extract the author name from any template associated with a document, presentation, or spreadsheet and then embed it in the new documents. And if you have not set the time correctly in your digital camera, any timestamp it associates with a photo will be wrong.

Digital resources created by automated processes generally exhibit a high degree of organization and structure because they are generated automatically in conformance with data or document schemas. These schemas implement the business rules and information models for the orders, invoices, payments, and the numerous other types of document resources that are created and managed in business Organizing Systems.

Before a resource becomes part of a library collection, its author-created organization is often supplemented by additional information supplied by the publisher or other human intermediaries, such as an International Standard Book Number (ISBN) or Library of Congress Call Number (LOC-CN) or Library of Congress Subject Headings (LOC-SH).

In contrast, Google and other search engines apply massive computational power to analyze the contents and associated structures (like links between web pages) to impose organization on resources that have already been published or made available so that they can be retrieved in response to a user’s query “on the way out.” Google makes use of existing organization within and between information resources when it can, but its unparalleled technological capabilities and scale yield competitive advantage in imposing organization on information that was not previously organized digitally. Indeed, Geoff Nunberg criticized Google for ignoring or undervaluing the descriptive metadata and classifications previously assigned by people and replacing them with algorithmically assigned descriptors, many of which are incorrect or inappropriate.30[Com] One reaction to the poor quality of some computational description has been the call for libraries to put their authoritative bibliographic resources on the open web, which would enable reuse of reliable information about books, authors, publishers, places, and subject classifications. This “linked data” movement is slowly gathering momentum.31[Web]

[30][Com] (Nunberg 2009) calls Google’s Book Search a “disaster for scholars” and a “metadata train wreck.” He lists scores of errors in titles, publication dates, and classifications. For example, he reports that a search on “Internet” in books published before 1950 yields 527 results. The first 10 hits for Whitman’s Leaves of Grass are variously classified as Poetry, Juvenile Nonfiction, Fiction, Literary Criticism, Biography & Autobiography, and Counterfeits and Counterfeiting.

Google makes almost all of its money through personalized ad placement, so much of the selection and ranking of search results is determined “on the way out” in the fraction of a second after the user submits a query by using information about the user’s search history and current context. Of course, this “on the way out” organization is only possible because of the more generic organization that Google’s algorithms have imposed, but that only reminds us of how much the traditional distinction between information organization and information retrieval is no longer defensible.

In many Organizing Systems the nature and extent of organization changes over time as the resources governed by the Organizing System are used. The arrangement of resources in a kitchen or office changes incrementally as frequently used things end up in the front of the pantry, drawer, shelf or filing cabinet or on the top of a pile of papers. Printed books or documents acquire margin notes, underlining, turned down pages or coffee cup stains that differentiate the most important or most frequently used parts. Digital documents do not take on coffee cup stains, but when they are edited, their new revision dates put them at the top of directory listings.

The scale of emergent organization of websites, photos on Flickr, blog posts, and other resources that can be accessed and used online dwarfs the incremental evolution of individual Organizing Systems. This organization is clearly visible in the pattern of links, tags, or ratings that are explicitly associated with these resources, but search engines and advertisers also exploit the less visible organization created over time by information about which resources were viewed and which links were followed.

The sort of organic or emergent change in Organizing Systems that takes place over time contrasts with the planned and systematic maintenance of Organizing Systems described as curation or governance, two related but distinct activities. Curation usually refers to the methods or systems that add value to and preserve resources, while the concept of governance more often emphasizes the institutions or organizations that carry out those activities. The former is most often used for libraries, museums, or archives and the latter for enterprise or inter-enterprise contexts. (For more discussion, see §2.5.4, “Governance.”)

The Organizing Systems for businesses and industries often change because of the development of de facto or de jure standards, or because of regulations, court decisions, or other events or mandates from entities with the authority to impose them.

1.3.6. How (or by Whom) Is It Organized?

“The rise of the Internet is affecting the actual work of organizing information by shifting it from a relatively few professional indexers and catalogers to the populace at large. An important question today is whether the bibliographic universe can be organized both intelligently (that is, to meet the traditional bibliographic objectives) and automatically.”

In the preceding quote, Svenonius identifies three different ways for the “work of organizing information” to be performed: by professional indexers and catalogers, by the populace at large, and by automated (computerized) processes. Our notion of the Organizing System is broader than her “bibliographic universe,” making it necessary to extend her taxonomy. Authors are increasingly organizing the content they create, and it is important to distinguish users in informal and formal or institutional contexts. We have also introduced the concept of an organizing agent (§1.2.3.1) to unify organizing done by people and by computer algorithms.

Professional indexers and catalogers undergo extensive training to learn the concepts, controlled descriptive vocabularies, and standard classifications in the particular domains in which they work. Their goal is not only to describe individual resources, but to position them in the larger collection in which they reside.32[LIS] They can create and maintain Organizing Systems with consistent high quality, but their work often requires additional research, which is costly.

[32][LIS] This is an important distinction in library science education and library practice. Individual resources are described (“formal” cataloging) using “bibliographic languages” and their classification in the larger collection is done using “subject languages” (Svenonius 2000, Ch. 4 and Ch. 8, respectively). These two practices are generally taught in different library school courses because they use different languages, methods and rules and are generally carried out by different people in the library. In other organizations, the resource description (both formal and subject) is created in the same step and by the same person.

The class of professional organizers also includes the employees of commercial information services like Westlaw and LexisNexis, who add controlled and, often, proprietary metadata to legal and government documents and other news sources. Scientists and scholars with deep expertise in a domain often function as the professional organizers for data collections, scholarly publications and proceedings, and other specialized information resources in their respective disciplines. The National Association of Professional Organizers (NAPO) claims several thousand members who will organize your media collection, kitchen, closet, garage or entire house or help you downsize to a smaller living space.33[Bus]

[33][Bus] NAPO: http://www.napo.net The name and scope of this organization seems a bit odd given how much professional organizing takes place in business, science, government, medicine, education, and other domains where closets and garages are not the most important focus.

Many of today’s content creators are unlikely to be professional organizers, but presumably the author best understands why something was created and the purposes for which it can be used. To the extent that authors want to help others find a resource, they will assign descriptions or classifications that they expect will be useful to those users. But unlike professional organizers, most authors are unfamiliar with controlled vocabularies and standard classifications, and as a result their descriptions will be more subjective and less consistent.

Similarly, most of us do not hire professionals to organize the resources we collect and use in our personal lives, and thus our Organizing Systems reflect our individual preferences and idiosyncrasies.

Non-author users in the “populace at large” are most often creating organization for their own benefit. These ordinary users are unlikely to use standard descriptors and classifications, and the organization they impose sometimes so closely reflects their own perspective and goals that it is not useful for others. Fortunately most users of “Web 2.0” or “community content” applications at least partly recognize that the organization of resources emerges from the aggregated contributions of all users, which provides incentive to use less egocentric descriptors and classifications. The staggering number of users and resources on the most popular applications inevitably leads to “tag convergence” simply because of the statistics of large sample sizes.
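The way organization emerges from aggregated contributions can be sketched by counting the tags that many users independently assign to one resource. The Python example below is illustrative only: the function `aggregate_tags`, the normalization rule (trimming whitespace and lowercasing), and the sample tag assignments are assumptions, not a description of any particular application.

```python
from collections import Counter

def aggregate_tags(tag_assignments):
    """Count how many distinct users applied each tag to a resource.

    tag_assignments is an iterable of (user, tag) pairs. Tags are trimmed
    and lowercased, and a user who repeats a tag is counted only once.
    """
    normalized = {(user, tag.strip().lower()) for user, tag in tag_assignments}
    return Counter(tag for _, tag in normalized)

# Hypothetical tags applied to a single photo.
assignments = [
    ("u1", "Sunset"),
    ("u2", "sunset"),
    ("u3", "beach"),
    ("u4", "sunset "),
    ("u5", "my trip 2011"),
    ("u1", "sunset"),
]

tag_counts = aggregate_tags(assignments)
top_tag = tag_counts.most_common(1)
```

The idiosyncratic tag "my trip 2011" is useful only to the person who assigned it, while "sunset" rises to the top because three different users chose it. With enough users, the most frequent tags come to act as a shared description even though no one coordinated the vocabulary.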

Finally, the vast size of the web and the even greater size of the “deep” or invisible web, composed of the information stores of business and proprietary information services, makes it impossible to imagine today that it could be organized by anything other than the massive computational power of search engine providers like Google and Microsoft.35[Web]

[35][Web] (He et al. 2007) estimate that there are hundreds of thousands of websites and databases whose content is accessible only through query forms and web services, and there are over a million of those. The amount of content in this hidden web is many hundreds of times larger than that accessible in the surface or visible web.

See http://www.worldwidewebsize.com/ for estimates of the size of the visible web calculated from comparisons of results from search engines.

Nevertheless, in the earliest days of the web, significant human effort was applied to organize it. Most notable is Yahoo!, founded by Jerry Yang and David Filo in 1994 as a directory of favorite websites. For many years the Yahoo! homepage was the best way to find relevant websites by browsing the extensive system of classification. Today’s Yahoo! homepage emphasizes a search engine that makes it appear more like Google or Microsoft Bing, but the Yahoo! directory can still be found if you search for it.

1.3.7. Where Is It Being Organized?

“Bibliographic control requires fixing a document in the bibliographic universe by its space-time coordinates.”

Having identified the resources, reasoned about our motivations, limited the scope and scale, and determined when and by whom the organization will occur, we come finally to the question of where the resources are being organized.

In ordinary use, “where” refers to a physical location. But the answer to “where?” often depends on whether we are asking about the current location, a past location, or an intended destination for resources that are in transit or in process. The answer to the question “where?” can take a lot of different forms. We can talk about an abstract space like “a library shelf” or we can talk about “the hidden compartment in Section XY at the Library of Congress,” as depicted in the 2004 movie “National Treasure.” We can answer “where?” with a description of a set of environmental conditions that best suit a class of wildlife, or a tire, or a sleeping bag. We can answer “where?” with “Renaissance Europe” or “Colonial Williamsburg.” “Where?” can be a place in a mental construct, or even a place in an imagined location.

In the architectural design of an Organizing System, its physical location is usually not a primary concern. In most Organizing Systems, the matter of where the Organizing System and the resources are located can be abstracted away. So, in practice, resource location often is not as important as the other questions here. Physical constraints of the storage location should generally be relegated to an implementation concern rather than an architectural one. The construction of a special display structure for a valuable resource is not an independent design dimension; it is just the implementation of the user interface. (See §5.7, “The Implementation Perspective.”)

Physical resources are often stored where it is convenient and efficient to do so, whether in ordinary warehouses, offices, storerooms, shelves, cabinets, or closets. It can be necessary to adapt an Organizing System to characteristics of its physical environment, but this could undermine architectural thinking and make it harder to maintain the organization over time, as the collection evolves in scope and scale. (See §2.3.1, “Organizing Physical Resources”)

Digital resources, on the other hand, are increasingly organized and stored “in the cloud” and their actual locations are invisible, indeterminate, and generally irrelevant, except in situations where the servers and the information they hold may be subject to laws or practices of their physical location. For example, a controversy arose in Canada in 2013 when researchers discovered that internet service providers were, for various technical and business reasons, routinely routing trans-Canada web traffic through the United States. Because Canada has no jurisdiction over data traveling through cables and servers in another country, there was considerable outcry among Canadians who were concerned that their personal information was being subjected to the privacy laws and practices of another country without their knowledge or consent.

Sometimes location functions as an organizing principle in its own right, which in practice essentially collapses many of these architectural distinctions. This is frequently the case in our personal Organizing Systems, where we may exploit the innate human capability for spatial memory by always putting specific things like keys, eyeglasses, and cell phones in the same place, which makes them easy to find. But we can also see this happening in systems as complex and varied as: real estate information systems; wayfinding systems, such as road signage or mile markers; standardized international customs forms with position-specific data fields; geographic information systems; air, ground, sea, and space traffic control systems; and historic landmark preservation.

In §2.3.2, “Organizing Places” we consider the organization of the land, built environments, and wayfinding systems. §5.5, “The Structural Perspective” discusses the structural perspective on resource relationships, and in some systems, it may be very significant where resources are located in relation to one another. In The Barnes Collection, for example, works of art are physically grouped to enunciate common characteristics. Conversely, zoos do not mix the kangaroos with the wild dogs, and the military does not mix the ingredients for chemical weapons (at least, not until they plan to use them). There are also circumstances where resources can only exist in (or are particularly suited to) particular environments, such as the conditions required to grow wine grapes or mushrooms, or store spent nuclear fuel. UPS advises companies on where to put their warehouses and shipment centers. These questions are more substantial than questions of presentation, but it is debatable whether they fall under the storage tier or the logic tier (you could have the principle of “keep the mushrooms somewhere moist” while not dictating where in particular).
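The distinction between a locational principle and a specific location can be illustrated with a minimal sketch. Everything in it is hypothetical and introduced only for illustration: the `Location` class, the `needs_moisture` principle, the `choose_storage` helper, the candidate places, and the 0.8 humidity threshold are all assumptions. It shows one way a logic-tier rule (“keep the mushrooms somewhere moist”) might be stated without naming any particular storage place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """A hypothetical candidate storage place."""
    name: str
    humidity: float  # relative humidity from 0.0 to 1.0 (illustrative values)


def needs_moisture(location: Location, minimum: float = 0.8) -> bool:
    """Logic-tier principle: 'keep the mushrooms somewhere moist'.

    The 0.8 threshold is an assumption made for this example.
    """
    return location.humidity >= minimum


def choose_storage(candidates, principle):
    """Storage-tier decision: keep every concrete place that satisfies the principle."""
    return [loc for loc in candidates if principle(loc)]


candidates = [
    Location("cellar", 0.90),
    Location("attic", 0.30),
    Location("greenhouse", 0.85),
]

suitable = choose_storage(candidates, needs_moisture)
print([loc.name for loc in suitable])  # prints ['cellar', 'greenhouse']
```

The principle says nothing about which place is used; any location that satisfies it is acceptable, so the choice among the suitable places remains an implementation decision.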

Sometimes the location of an Organizing System seems particularly salient, as in the design of cities where the street plan can be essential for orientation and navigation, and is embodied in zoning, voting, and other explicit organization, as well as in informal organization like neighborhood identity. But even here, it is really the people who live in the city who are being organized and whose interactions with the city and with each other are being encouraged or discouraged, not the physical location on which they live.

Indeed, in designing an Organizing System you will often find that questions about location tumble naturally out of the other five design dimensions. For instance, questions about “when,” “what,” and “where” are often inseparable, particularly when an Organizing System is subject to outside regulations, which tend to have geographical jurisdictions. “Where” is also commonly bound up with “who” and “why,” when locational challenges or opportunities faced by a system’s creators or users necessitate special design consideration. (See §3.5.2, “Effectivity”)

Location can be critically important to an Organizing System: too important, in fact, to be considered alone. The question of “where?” is best considered in the context of the other five design dimensions as a whole; a narrow focus on where the resources are being organized too often privileges past convention over architectural thinking and perpetuates legacy issues and poorly organized systems.
