2

Institutional repositories intellectual property

Abstract:

This chapter provides an overview of intellectual property and copyright law as it pertains to institutional repositories. Issues related to collecting both published and unpublished works are addressed, along with author publication agreements, repository submission agreements, and licensing options for repository content. The chapter concludes with an examination of issues related to sharing data through a repository.

Key words

institutional repository

intellectual property

copyright

author agreements

data

open data

Intellectual property

Institutional repositories exist for the sole purpose of archiving and sharing intellectual property. As defined by the World Intellectual Property Organization (WIPO), intellectual property refers to “creations of the human mind”:

“Intellectual property relates to items of information or knowledge, which can be incorporated in tangible objects at the same time in an unlimited number of copies at different locations anywhere in the world. The property is not in those copies but in the information or knowledge reflected in them.”

(WIPO, n.d., pp. 3–4)

In 1967, the Convention Establishing the World Intellectual Property Organization outlined the categories of creations that are granted intellectual property rights – for example, “literary, artistic and scientific works”, “industrial designs”, and “scientific discoveries” (WIPO, n.d., p. 3). WIPO divides these types of creations into two broad classifications: industrial property and copyright. Industrial property refers to inventions, trademarks and other related ideas, while copyright refers to literary or artistic creations and, importantly, protects the expression of ideas rather than the ideas themselves (by contrast, the idea behind an invention may be protected by a patent) (WIPO, n.d.). Both categories of intellectual property are granted protections by various national laws and by international treaties that govern the relationships between intellectual property created in, or by citizens of, different countries. In general, the owner of intellectual property – whether a patented or a copyrighted work – holds exclusive rights to the use of that work, and uses by anyone other than the rightsholder either require permission or must be specifically provided for in law.

The vast majority of works deposited in institutional repositories are literary or artistic (in the broadest sense of the terms) creations. So while it is possible that institutional repository managers may occasionally need to interact with materials for which patent or trademark law is an issue, every repository manager will encounter questions related to copyright law. A basic understanding of what copyright law is, what it intends to do, and what rights and responsibilities it places on creators and users of copyrighted works, is integral to the work of repository managers.

U.S. copyright law

In the U.S.A., the foundation for copyright law is found in Article I, Section 8, Clause 8 of the U.S. Constitution: “The Congress shall have Power […] To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries;” (U.S. Constitution, 1787). The core idea expressed is that those who create intellectual property should have the exclusive opportunity to profit from their creations – but that the exclusive nature of the opportunity should be limited. That limitation provides an opportunity for the public to use, and build on, the work of others.

U.S. copyright law is codified in Title 17 of the U.S. Code. The Copyright Act of 1976 provided the most significant recent revisions to U.S. law, though further amendments have been made since the passage of the Act. Under current law,

(a) Copyright protection subsists, in accordance with this title, in original works of authorship fixed in any tangible medium of expression, now known or later developed, from which they can be perceived, reproduced, or otherwise communicated, either directly or with the aid of a machine or device. Works of authorship include the following categories:

(1) literary works;¹

(2) musical works, including any accompanying words;

(3) dramatic works, including any accompanying music;

(4) pantomimes and choreographic works;

(5) pictorial, graphic, and sculptural works;

(6) motion pictures and other audiovisual works;

(7) sound recordings; and

(8) architectural works.

(b) In no case does copyright protection for an original work of authorship extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery, regardless of the form in which it is described, explained, illustrated, or embodied in such work.

(U.S.C. Title 17, Sec. 102)

In brief, copyright protection is afforded to any original work if that work is recorded (physically or electronically) in a manner in which it may be perceived by a person. Works that contain no original content/authorship are not eligible for copyright protection. Furthermore, copyright does not protect an idea (like a patent does); it only protects the particular original expression of that idea.

Box 2.1   International copyright law

As may be expected, intellectual property and copyright law varies by country. However, in order to ensure that creators’ copyrights are globally respected, the international community has created treaties to protect works that are copyrighted in another country. The two primary treaties that govern international copyright are the Berne Convention for the Protection of Literary and Artistic Works (1886) and the Universal Copyright Convention (1952). Countries that are signatories to these treaties must have domestic copyright laws that are consistent with the treaties and must offer foreign nationals’ works the same protections that are afforded to that country’s citizens (Heller, 2004).

A significant difference between copyright law in the U.S.A. and in other countries is the treatment of moral rights. “Moral rights” generally refer to the right of the author of a work to be identified as the author of that work, and to preserve the integrity of the work. For example, in some European countries, moral rights are granted to an author separately from the copyrights, which are designed to protect the economic interests of the author. However, in the U.S.A., moral rights are not explicitly granted to authors/creators of copyrightable works (except in the case of visual art); copyright law focuses primarily on property (economic) rights (Bird and Ponte, 2006).

Another notable distinction is seen in the ideas of “fair use” in the U.S.A. and “fair dealing” in the U.K., Canada, and other Commonwealth countries. Both are exceptions to the exclusive rights of copyright holders to make use of their works in specific ways, but the parameters of what is defined as “fair” utilization of copyrighted material differ. (The concept of “fair use” will be discussed in more detail throughout this book.)

It is not uncommon for faculty and researchers at higher education institutions to collaborate with peers from other countries. When these collaborations result in intellectual property that is subsequently submitted to an institutional repository, it may be advisable for repository managers to confirm the expectations of coauthors/cocreators from other countries with regard to their copyrights, and how their works may end up being used if they are made available through the repository. It should be noted, though, that significant conflicts related to international copyright law will likely occur infrequently – if at all – for most institutional repositories.

Copyright law: understanding the basics

While copyright law is complex both in its essence and in its application, a basic understanding of the characteristics of copyright should be sufficient for scholarly communication librarians or institutional repository managers to be able to identify when there may be a potential issue with a submission. At that point, most institutions have a copyright officer or legal counsel from whom additional advice may be sought.

Copyright protection does not require registration, notice, or publication. If a work qualifies for copyright protection (i.e., it is a work of original authorship), such protection is in place from the moment the work is “fixed in any tangible medium of expression” (U.S.C. Title 17, Sec. 102). In other words, as soon as a photograph is taken, a word-processing document created, or original ideas written in a notebook, the copyright in that work is owned by the author of the work under copyright law. The absence of a copyright statement on the work, or failure to register the work with the U.S. Copyright Office, does not negate the author’s rights. In addition, copyright applies to both published and unpublished works. There is no requirement that a work must be formally published in order to receive copyright protection.

Copyright grants specific rights to the copyright holder. As intended by the U.S. Constitution, copyright law provides the copyright holder – for the term of the copyright – with exclusive rights to use of his/her work. These include the right to:

• reproduce the work in copies or phonorecords;

• prepare derivative works based upon the work;

• distribute copies or phonorecords of the work to the public by sale or other transfer of ownership, or by rental, lease, or lending;

• perform the work publicly, in the case of literary, musical, dramatic, and choreographic works, pantomimes, and motion pictures and other audiovisual works;

• display the work publicly, in the case of literary, musical, dramatic, and choreographic works, pantomimes, and pictorial, graphic, or sculptural works, including the individual images of a motion picture or other audiovisual work; and

• perform the work publicly (in the case of sound recordings) by means of a digital audio transmission.

(U.S.C. Title 17, Sec. 106)

In the case of an institutional repository, a copy (reproduction) of a work is usually distributed to the public through the repository. This clearly involves two of these exclusive rights, which generally means the copyright holder must, at minimum, authorize the repository’s institution to both make the copy and to disseminate it.

Use of copyrighted material generally requires permission. Because copyright holders are granted exclusive rights, if any individual other than the copyright holder wishes to exercise any of the rights listed above, he/she must usually seek permission from the copyright holder. (It is important to note that the author of a published work may not always be the copyright holder; copyright ownership should be verified so that permission requests are sent to the appropriate party.) Permission is not required only when the proposed use is covered by one of the limitations on the exclusive rights that are built into copyright law (see below).

There are limitations to the copyright holder’s exclusive rights. In the interest of balancing an author’s right to profit from his/her work with the public’s ability to use that work for the progress of art and science, copyright law includes constraints on the exclusive rights listed above. These limitations on the exclusivity of copyright grant educational institutions, libraries, and others, specific rights in the use of copyrighted works. For example, Sec. 108 (U.S.C. Title 17) provides libraries with the ability to engage in the copying and distribution of copyrighted materials for the purposes of interlibrary loan and preservation. If a library’s use falls within the scope of prescribed activities, it does not need to seek permission from the copyright holder for that use.

Fair use is the most generous limitation to the copyright holder’s exclusive rights. While limitations on the exclusivity of copyrights such as those in Sec. 108 allow for a wider variety of uses to be considered noninfringing, most limitations in copyright law are quite specific as to the range of activities they will permit. However, Sec. 107 (U.S.C. Title 17) contains a very broad, and frequently misunderstood, exception: fair use. (It is worth reiterating here that “fair use” is applicable only in the U.S. context; the concept of “fair dealing” in some other countries is more restrictive.)

The concept of “fair use” is intended to limit the exclusive right of the copyright holder to make copies of (reproduce) his/her work or to give permission to others to do the same:

“[T]he fair use of a copyrighted work, including such use by reproduction in copies or phonorecords or by any other means specified by that section, for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research, is not an infringement of copyright.”

(U.S.C. Title 17, Sec. 107)

Whether a specific use of a copyrighted work is considered “fair” is determined by consideration of four different factors:

(1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;

(2) the nature of the copyrighted work;

(3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and

(4) the effect of the use upon the potential market for or value of the copyrighted work.

(U.S.C. Title 17, Sec. 107)

Although the U.S. Congress, academic institutions, and professional organizations and societies have all developed more prescriptive interpretations of fair use (usually called “guidelines” or “best practices”), statutory copyright law (U.S.C. Title 17) does not include any further specification as to what constitutes fair use. Therefore, it is left to the individual (or institution) to make a reasonable determination on a case-by-case basis as to whether a particular use of copyrighted material may constitute fair use. Ultimately, only a court has the authority to evaluate the specific use and determine if, indeed, the use in question is fair. While current case law and precedent provide some guidance as to what a court may consider to be fair use, the unique circumstances of each use should always be considered when deciding whether to seek permission from the copyright holder or to make a determination that the use is fair.

Copyright may be transferred. Though the author of a work is usually granted ownership of the copyright in that work by default (except in the case of works for hire, works by federal government employees, and other limited exceptions), the author does not always remain the copyright holder. For example, in scholarly journal publishing, the author usually transfers the copyright to the publisher. The copyright holder of a work may choose, at any time, to transfer one or all of the exclusive rights provided by law. In practical terms, this means that the party to whom the copyrights are transferred is able to realize the economic benefit of owning exclusive rights to that intellectual property. However, unless moral rights are specifically transferred, the author still retains the right to be identified as the creator of the work. This is why, as noted above, it is important not to assume that the author of record for a work is necessarily the copyright holder when pursuing copyright permissions.

Not all works that appear eligible for copyright have copyright protection. Even though a work may seem to qualify for copyright protection (in that it is a work of original authorship), some works do not fall under copyright. In some cases, this may be because copyright protection has expired. However, the majority of materials added to an institutional repository will have been created recently, which means their copyright protection will likely not expire for over 100 years (information on copyright terms may be found in the U.S. Copyright Office’s Circular 15a, Duration of Copyright). In other cases, the work may not actually have been eligible for copyright; for example, works created by the federal government are generally not afforded copyright protection. Finally, the owner of the copyright in a work may have elected to release his/her exclusive rights in the work and make it openly available to use. Works such as these that are not under copyright are considered to be in the public domain. This means that anyone may exercise any of the rights normally granted to a copyright holder, and there is no need to seek permission to do so.

Copyright and institutional repositories

It is clear that most materials in institutional repositories are governed by copyright law. Therefore, librarians and library staff who manage institutional repositories must ensure that their dissemination of copyrighted materials through the repository does not infringe on the exclusive copyrights of the rightsholders. With this aim in mind, it is important to understand:

• the types of works that may be submitted to the repository, and their concomitant issues;

• local institutional policy surrounding the dispensation of copyright in faculty and student work;

• how best the library/institution can support the fair use of copyrighted materials; and

• relevant laws and necessary policies and practices related to potential infringement by materials hosted in the repository.

With regard to the first point, the most basic consideration to guide policy and practice is the respective treatment of copyrighted works that are either published or unpublished. The latter three points are of primary relevance when considering the inclusion of unpublished works, and will be discussed in that framework following discussion of published works.

Copyright considerations for published works

Most institutional repositories have a stated objective of collecting and disseminating faculty journal articles as a means of opening up access to the scholarly literature. For such articles, copyright is usually transferred by the author(s) to the publisher, which means that, if necessary, permission requests to post the article in the repository must be directed to the publisher. However, as discussed later in this chapter (see “Contracts and licenses”), most publishers will explicitly grant limited rights to the authors of scholarly articles to allow the authors to share copies of their articles either informally (with colleagues) or more broadly (through faculty websites or institutional repositories).

The exact rights granted by a publisher to an author are usually dependent on the version of the scholarly article in question. Though terminology varies by publisher, there are generally considered to be three versions of a scholarly manuscript:

• Preprint: This is the original manuscript as submitted by the author to the journal.²

• Postprint: This is the manuscript after it has undergone peer review, the author has made revisions recommended by the reviewers, and the manuscript has been accepted for publication by the journal. Some publishers refer to this version as the accepted manuscript.

• Published: This is the manuscript after it has been copyedited by the publisher and has been laid out in the format in which it will be published. It is the same version that anyone who subscribes to the journal/pays for the article would be able to download from the publisher’s website. This version is often referred to as the publisher’s PDF.

The vast majority of scholarly publishers will allow an author to post the preprint version of a manuscript in that author’s institutional repository (or personal website). It is considered best practice (and required by some publishers) to place a citation on the posted manuscript that states it has been “Submitted to: Journal XYZ.” If the article is accepted for publication, many publishers will allow the author to keep the preprint version available online; however, the citation on the article must be changed to indicate that the manuscript has been published, and that the final version of record is available from the publisher. For example:

This is the pre–peer reviewed version of the following article: Smith, John (2012) The giving tree in academia: The economics of scholarly publishing. Journal of Unequal Transactions, 3(2): 195–207, which has been published in final form at doi: 10.1000/jut.0.0014.

The exact format for this citation will vary by publisher; it should also be noted that publishers vary as to whether or not they will allow a preprint to be posted online prior to the journal publishing the final article. Either the repository manager or the author should confirm the exact policies of the publisher in question prior to posting a preprint to the repository.

An increasing number of scholarly publishers (including major commercial publishers like Elsevier and Taylor & Francis) will also allow authors to deposit the postprint version of their manuscripts into institutional repositories. As with preprint manuscripts, most publishers have a specific citation that they require be placed on the postprint manuscript in order to direct readers to the publisher’s version of record. For example:

This is a peer-reviewed, electronic version of an article published in Smith, John (2012) The giving tree in academia: The economics of scholarly publishing. Journal of Unequal Transactions, 3(2): 195–207. Journal of Unequal Transactions is available online at: http://www.jut.com/jut.0.0014

Depending on the journal and publisher, there may be an embargo imposed on the posting of the postprint version of the manuscript in an institutional repository. An embargo refers to a delayed release (usually of 6–12 months); if an embargo exists, the manuscript may not be made available until after the embargo period is over. For example, Taylor & Francis sets different embargo periods depending on the journal’s discipline:

“The right […] is subject always to an embargo of 12 months after first Publication (be it online or in print) in STM (science, technology and medicine) subjects and the behavioural sciences and of 18 months after first publication for SSH (social science, arts and humanities) journals.”

(Taylor & Francis Group, n.d.)
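The embargo arithmetic in a policy like the one quoted above can be sketched in a few lines of Python. This is only an illustration, not a publisher tool: the function names and the "STM"/"SSH" keys are hypothetical, the month counts are taken from the quoted policy, and the sketch assumes that a manuscript may be released on the day the embargo period ends. The publisher's current policy should always be checked before relying on any computed date.

```python
import calendar
from datetime import date

# Embargo lengths in months, as stated in the Taylor & Francis policy quoted above.
# The dictionary keys are illustrative labels, not publisher-defined codes.
EMBARGO_MONTHS = {"STM": 12, "SSH": 18}


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    If the target month is shorter than the start day (e.g. 31 August
    plus 18 months), the day is clamped to the last day of that month.
    """
    total = start.month - 1 + months
    year = start.year + total // 12
    month = total % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def embargo_end(first_publication: date, discipline: str) -> date:
    """Date on which the embargo for the given discipline group ends."""
    return add_months(first_publication, EMBARGO_MONTHS[discipline])


def may_release(first_publication: date, discipline: str, today: date) -> bool:
    """True once the embargo period is over (assumption: release is
    allowed on the end date itself)."""
    return today >= embargo_end(first_publication, discipline)
```

For an SSH article first published on 15 March 2012, `embargo_end(date(2012, 3, 15), "SSH")` gives 15 September 2013; a repository workflow could store that date with the record and make the file public only once `may_release` returns true.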

The ability of an author to deposit the postprint version of his/her article in an institutional repository may also be affected by local institutional policy. For example, at the time of writing, Elsevier’s policy regarding an “author accepted manuscript” (postprint) reads as follows:

“Elsevier believes that individual authors should be able to distribute their AAMs for their personal voluntary needs and interests, e.g. posting to their websites or their institution’s repository, e-mailing to colleagues. However, our policies differ regarding the systematic aggregation or distribution of AAMs to ensure the sustainability of the journals to which AAMs are submitted. Therefore, deposit in, or posting to, subject-oriented or centralized repositories (such as PubMed Central), or institutional repositories with systematic posting mandates is permitted only under specific agreements between Elsevier and the repository, agency or institution, and only consistent with the publisher’s policies concerning such repositories.” [emphasis added]

(Elsevier, n.d.)

According to this policy, faculty at institutions that have a mandate (open-access policy) for faculty to deposit their scholarly journal manuscripts in an institutional repository are unable to post Elsevier postprints absent a specific agreement between Elsevier and their institution. The practical impact and interpretation of this policy vary, however, especially because most institutional faculty mandates include an opt-out clause for faculty whose publishers will not allow such posting.

Box 2.2   University open-access policies

In an effort to increase access to their work, faculty at dozens of universities have created and implemented policies (known as “open-access mandates”) which require that they and their colleagues deposit their scholarly articles in their institutions’ repositories. Following the model established by the Harvard University Faculty of Arts and Sciences, many policies at U.S. institutions require faculty to grant a nonexclusive license to the institution to archive and distribute the final, peer-reviewed version of their manuscripts.

A best practices guide for the development and implementation of university open access policies has been created by scholarly communication leaders. The guide is hosted by the Harvard Open Access Project and is available from: http://bit.ly/goodoa

Though many publishers will allow deposit of preprint and postprint article versions in an institutional repository, far fewer traditional publishers will permit authors to post the final published version of their articles (open-access publishers are more liberal in their policies). This is not surprising, as a freely available copy of the published version of an article represents direct economic competition with the copy of the article that the publisher is selling, either via subscription or pay per view. However, it is highly recommended that – absent a posted policy to the contrary – repository managers contact publishers to enquire about the possibility of posting the published version in the repository.

Regardless of the version of the manuscript that is finally posted in the repository, it is extremely important, from both a practical and a legal standpoint, to ensure that appropriate metadata are included with the repository record for the manuscript. In addition to the citation required by the publisher, a statement of copyright ownership should be added on the manuscript itself (for preprint and postprint versions that may be edited by the repository manager) and in the repository metadata record. This is vital not only for complying with a publisher’s policy (or, absent a requirement, demonstrating a good-faith effort at recognizing the publisher’s ownership), but also for alerting users to the copyright status of the article. Though the manuscript may be freely available to download, an individual user’s further use of the manuscript is governed by the copyright owned by the publisher. If the repository wishes to be absolutely clear, it can place a statement such as this on all published work it posts:

This article is posted with the permission of the copyright holder. Further use that extends beyond personal or fair use may require permission from XYZ Publisher.
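One way a repository might keep the required citation and the rights statement together in its metadata is sketched below in Python. The `RepositoryRecord` class and its field names are hypothetical and do not correspond to any particular repository platform's schema; the statement template simply reuses the wording suggested above, and the version labels follow the three manuscript versions described earlier in this chapter.

```python
from dataclasses import dataclass

# Wording taken from the sample statement above; {publisher} is filled in per record.
RIGHTS_TEMPLATE = (
    "This article is posted with the permission of the copyright holder. "
    "Further use that extends beyond personal or fair use may require "
    "permission from {publisher}."
)

# The three manuscript versions described earlier in this chapter.
VALID_VERSIONS = ("preprint", "postprint", "published")


@dataclass
class RepositoryRecord:
    """Hypothetical metadata record for a deposited article."""
    title: str
    version: str             # one of VALID_VERSIONS
    publisher: str           # the copyright holder named in the statement
    publisher_citation: str  # citation text required by the publisher
    rights_statement: str = ""

    def __post_init__(self) -> None:
        if self.version not in VALID_VERSIONS:
            raise ValueError(f"unknown manuscript version: {self.version!r}")
        # Fall back to the generic statement when none is supplied.
        if not self.rights_statement:
            self.rights_statement = RIGHTS_TEMPLATE.format(publisher=self.publisher)


record = RepositoryRecord(
    title="The giving tree in academia: The economics of scholarly publishing",
    version="postprint",
    publisher="XYZ Publisher",
    publisher_citation="Journal of Unequal Transactions, 3(2): 195-207",
)
```

Generating the statement from the publisher name keeps the wording consistent across records, while still allowing a publisher-mandated statement to be supplied explicitly where one exists.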

Box 2.3   SHERPA/RoMEO

A significant challenge in determining what articles may legally be deposited in an institutional repository is identifying the publisher’s policy on self-archiving. While it is always best to examine the actual contract signed by the author, it is also helpful to review the publisher’s standard policy and publication agreement. The University of Nottingham’s SHERPA (Securing a Hybrid Environment for Research Preservation and Access) initiative hosts a directory, RoMEO, for precisely this purpose: “RoMEO is a searchable database of publisher’s policies regarding the self-archiving of journal articles on the web and in Open Access repositories” (description from SHERPA/RoMEO FAQ). RoMEO may be searched using a journal title or ISSN, as well as by a publisher’s name. An API is also available to support custom automated workflows.

SHERPA/RoMEO is available from: http://www.sherpa.ac.uk/romeo/index.php

Because repositories have focused primarily on collecting scholarly journal literature, there is a commonly understood set of issues and processes for addressing them. However, the approach to posting other complete³ published works in the repository – whether book chapters, images, or other works – should start with a similar process: identify the copyright holder and seek permission for the desired use.

Copyright considerations for unpublished works

As academic libraries and university administrators have recognized the potential for institutional repositories to showcase a broad variety of student and faculty work, repositories are increasingly used to disseminate unpublished works that would have formerly been shared only with an isolated audience – for example, at a disciplinary conference or within a classroom setting. Though this represents an opportunity for such work to have a greater impact, open dissemination of the work also creates opportunities for potential infringement of copyrights or privacy rights. The potential for disseminating either defamatory or obscene content also exists, though given the nature of work shared through most repositories, this is unlikely.

Similar to published work submitted to the repository, verifying the copyright ownership of unpublished work should be the first consideration for repository managers. It is the owner of the work who has the authority to deposit it in the repository, and the owner should also be the individual(s) taking responsibility for the content of the work.

As with published works, the owner may not necessarily be the author(s), especially depending on the type of work. At academic institutions, the dispensation of copyright in eligible works is usually determined by an institutional copyright policy. Though the exact policy will vary by institution, it is likely that the policy will address, at minimum, the following issues:

• Works made for hire: in U.S. copyright law (U.S.C. Title 17, Sec. 101 and Sec. 201(b)), if a copyrightable work is created within the scope of an individual’s job, or is specifically commissioned by his/her employer, copyright ownership in the work belongs to the employer. At academic institutions, this means that – as a rule – copyright in works created by staff (nonfaculty) is owned by the institution, unless there exists a specific agreement to the contrary.

• Faculty scholarship: scholarly works created by faculty members (e.g., journal articles, books, etc.) are usually not considered works made for hire, and are the sole property of the faculty members who create them (until such time as the faculty member transfers copyright to a publisher).

• Course content: there is greater variability between institutions as to the copyright ownership of course content (e.g., lecture materials or online courses). While some institutions view course materials as being the same as faculty scholarship, other institutions take whole or partial copyright ownership of such materials. In instances where an institutional policy delineates an institutional right in such works, the scope/nature of institutional resources that the faculty member used to create the work is often a deciding factor as to in whom copyright ultimately vests.

• Student work: if an institutional copyright policy does not explicitly address the ownership of works created by students while enrolled at the university, then if the work created by the student is otherwise eligible for copyright protection under law, that copyright is owned by the student. This includes everything from normal course work to culminating projects such as theses and dissertations.

Beyond establishing/confirming the ownership of an unpublished work, it is also necessary to ensure (to the greatest degree possible) that the work will not place the institution at undue risk for legal action due to the inclusion in the work of copyrighted materials owned by a third party.

Before going further, it must be made clear that it is neither practical nor desirable for the library and the repository manager to become the “copyright police” for unpublished work submitted to the repository. This is not a sustainable practice and, furthermore, removes responsibility from where it should rest – with the author(s) – and places it on the institution. The best way to make certain that responsibility remains with the author is to implement a clear and comprehensive repository submission agreement (discussed further later in this chapter) that requires the author to provide legally binding assurances that his or her work does not infringe on others’ rights.

However, while the repository manager may not take an active role in reviewing the entire substance of all works submitted to the repository, it is both possible and likely that potential infringement will come to light while processing submissions. This is particularly true in cases where the repository manager takes an active role in helping a student or faculty member to format a work prior to dissemination through the repository.

In order to guide case-by-case decisions on how to deal with the potentially infringing inclusion of copyrighted material – whether that inclusion is known or unknown to the repository manager – in student or faculty work, it is vital that the library develop a policy that addresses the fair use of third-party copyrighted materials in unpublished work submitted to the repository.

There are two opposite and extreme positions that libraries could take: posting no work that includes third-party copyrighted material unless documented permission has been obtained by the submitting author, or simply posting all work that is submitted in the belief that all submitting authors understand, and are appropriately applying, the fair use defense. Neither of these positions is desirable. The former would have a chilling effect on the promotion of fair use as a tool for students, educators, and scholars, and the latter could potentially open an institution (and the work’s creator) up to infringement claims. However, in between these two extremes is a combination of policies and practices that libraries are able to adopt in order to support ethical and fair use of copyrighted materials.

Policy statement in support of fair use. The first, and most important, step is to codify the library’s support for its users’ application of the fair use defense. As noted by ARL (2012, p. 23), “Librarians can and should respect the integrity of deposited materials that include selections from copyright[ed] works incorporated in reliance of fair use.” For example, the repository policy at the author’s institution states:

“The University Library does not review for compliance with copyright law the content of all such scholarly or creative unpublished materials that are submitted to CommonKnowledge. Furthermore, the Library fully supports the right of our community members to make fair use of copyrighted materials (as outlined in Title 17, Section 107 of the U.S. Code) in the creation of their own works. CommonKnowledge administrators will not make a fair use determination of submitted work; such a determination is considered to be the responsibility of the creator(s). However, in instances when it is readily and reasonably apparent that copyright law would be violated by posting a work, CommonKnowledge administrators may request that the creator(s) obtain permission from any relevant copyright holder(s).”

This policy makes it clear that it is the responsibility of the student or faculty member submitting the work to make a reasonable determination (as required by law) as to whether or not his/her use of copyrighted materials should be considered fair use. However, it also leaves open the possibility that, when copyright infringement is evident – as in the inclusion of consumable works, which would likely not be considered fair use – the library may request that permission be obtained for the use.

Box 2.4   Case study: fair use

A faculty member presented a paper at a philosophy conference that discussed artistic choices in representing a historical event. In the paper, he examined how a sculptor had created a piece of art based on an iconic photograph from recent U.S. history. The sculptor had taken artistic license with the work to make it more inclusive – while the original photograph had included only white individuals, the artist chose to make the sculpted figures representative of different races and ethnicities.

When the faculty member submitted this paper for inclusion in his institution’s repository, he included both a cropped version of the iconic photograph and a photograph of the sculptor’s work so that readers would understand what the paper was discussing. The copyright for the iconic photograph was owned by a newspaper, and the copyright for the photograph of the sculptor’s work was owned by the sculptor’s studio. Licensing options were available for the newspaper photograph, but only for limited periods of time (licensing was tailored to reproduction in other newspapers or magazines), and would not have allowed long-term archiving. The newspaper would not grant permission for use of the photograph, instead referring the faculty member to the licensing agency. Licensing options did not appear to be available for the photograph of the sculptor’s work.

The repository manager presented the faculty member with two options: either determine that his proposed use was fair, or remove the images from the paper. The faculty member was comfortable with the fair use defense and requested that the paper be posted with the photographs. Full attribution and statements of copyright ownership were added to the paper for both photographs prior to posting in the repository (see note 4; analysis of case provided in chapter endnotes, p. 57).


4. Analysis: in this particular case, the repository manager believed that the faculty author had a reasonable case for believing that the proposed use was fair. The photographs were integral to the faculty member’s critique and analysis; for readers, seeing the images discussed is far more valuable than having them simply described. Due to the nature of the photographs, there were no available substitutes that could have been used. The photographs were used for noncommercial criticism, and the image files used were of low resolution – they would not present market competition for higher resolution images (in addition, the iconic photograph was cropped, so the whole image was not used). It was also noted that the iconic image was available on Wikipedia, and the Wikipedia rationale for use of nonfree content (see http://en.wikipedia.org/wiki/Wikipedia:Non-free_content for a complete description) was used by both the faculty member and the repository manager in considering whether the proposed use was fair.

Policy and process for responding to infringement claims. In the U.S.A., the Digital Millennium Copyright Act (DMCA) provides a limitation to the liability of online service providers (e.g., institutions hosting an online repository) for infringing materials that are placed online at the request/direction of individual users. This limitation applies only if the service provider/institution in question, among other requirements, (a) is not aware of the infringing nature of the content and (b) responds expeditiously to appropriate requests from copyright holders to remove infringing content (U.S. Copyright Office, 1998).

With regard to (a) above, if a library decides to support its student and faculty authors’ inclusion of copyrighted material in their new works upon the basis of fair use, there are no grounds (outside clearly infringing activity) for the library to believe that it is disseminating infringing content, unless the submitting author explicitly indicates that he/she does not believe his/her use of copyrighted material is fair use. However, with regard to (b), there is a clear responsibility for the library to establish a “takedown” procedure. This responsibility (found in U.S.C. Title 17, Sec. 512(c)(3)) is discussed in greater detail in Chapter 9.

Though the DMCA itself is only relevant for U.S. institutions, all institutions regardless of location should have a policy and process in place for responding to claims of copyright infringement. It is highly recommended that the library consult with legal counsel (and, perhaps, university information technology officers) as to relevant laws and practices.

Fair use education for faculty and students. A policy in support of fair use and a process for complying with takedown notices are both necessary for any library that includes unpublished work in the repository. However, these are meaningless without a systematic effort to educate students and faculty about how to appropriately apply fair use in their creation of new works. There are two primary avenues for providing this education:

• Develop an intentional education/outreach program: when repository managers are developing new relationships with departments, schools, or other units that want to contribute materials to the repository, the manager should take the opportunity to suggest scheduling a presentation/workshop to review copyright, fair use, and best practices for the use of others’ copyrighted materials.

• Provide “point-of-care” user education: not all submissions to a repository will come through regularly scheduled deposits (e.g., theses/dissertations). For individual faculty members or others who submit unpublished work to the repository, a brief email thanking them for their submission and alerting them to potential copyright issues when posting work online can help educate users and encourage ethical use of copyrighted materials. And, if a highly questionable (or infringing, as in the case of consumable works) inclusion of third-party copyrighted material is noticed during ingest, a friendly “We would encourage you to consider seeking permission or revising this before we post it …” email is entirely appropriate.

As noted in the Code of Best Practices in Fair Use for Academic and Research Libraries (ARL, 2012), any educational efforts should include: discussion of the differences in the application of fair use within and outside an educational setting; advice about providing appropriate attribution when including portions of third-party copyrighted materials in a new work; and general guidelines for coming to a fair use determination.

By ensuring that they have policies, processes, and educational offerings in place to support the ethical application of fair use by students and faculty, libraries are able to demonstrate both respect for copyright holders and support for the balance of exclusive and shared rights that fair use represents.

Contracts and licenses

Institutional repository managers must be conversant in both the evaluation and creation of contracts and license agreements that govern the intellectual property in their repositories. Repository managers and librarians need to be able to understand publishers’ copyright transfer and publication agreements, and to be able to communicate that understanding to faculty authors. Libraries must also be able to construct clear and effective submission agreements that will grant the repository the necessary rights to disseminate an author’s work while affording the institution a measure of protection against submitted content that may violate legal or ethical boundaries.

Box 2.5   Contract or license?

Although the terms are sometimes used interchangeably when referring to certain documents, it is important to understand that a contract and a license are not the same thing.

A contract is an agreement between two parties in which both parties make specific promises to one another. A legally binding contract must include, at minimum, four components: “offer, acceptance, intention and consideration” (Glassie et al., 2012, p. 61). In other words, one party must propose terms for the agreement; the other party must accept those terms (either with or without negotiation between the parties); both parties must intentionally enter into the agreement; and there must be an exchange of promises (the “consideration”). For example, in an author publication agreement, the author gives the publisher the right to use his or her article in specific ways in exchange for the privilege of being published in the journal.

A license, on the other hand, is not an agreement, but is rather a form of permission. In the context of intellectual property, a license gives the licensee permission to exercise certain intellectual property rights that are owned by the licensor (Glassie et al., 2012). A license is often included within a contract – for example, in an author publication agreement, the author often grants the publisher either an exclusive or nonexclusive license to use the author’s copyrights. However, licenses may also exist outside a contract – for example, the GNU General Public License (GPL) is simply a license. As noted by those familiar with the GPL, this distinction is important (for any license, not just the GPL) because breaches of contracts and licenses are treated differently (Jones, 2003). A breach of contract is governed by contract law, whereas the breach of a license means that the license is revoked and that an infringement claim may be pursued under intellectual property law (e.g., copyright law) (Jones, 2003).

Scholarly communication librarians, repository managers, and other library staff working with a repository or library publishing services should be familiar with the basic construction – and implications – of both contracts and licenses. Even when a document appears to be a license (e.g., a nonexclusive license agreement) it may actually be a contract as well.

Author publication agreements

When an author’s article is accepted for publication by a scholarly journal (or, sometimes, at the point of initial submission), the author is usually required to sign a publication agreement that includes, among other points, a guarantee from the author(s) that:

• the article is the original work of the author(s);

• the article has not been submitted to another publication simultaneously (see note 5);

• the article does not infringe on any existing copyrights;

• the article does not include any material that would be considered defamatory or otherwise unethical.

Along with these assurances, authors publishing in traditional (non–open access) journals are usually asked to transfer their exclusive copyrights in the article to the publisher. This gives the journal the ability to manage access to the article, as well as to profit from its position as the exclusive owner of the copyrights in the article. As discussed earlier, this transfer of copyright impacts the ability of the author to submit his/her article for inclusion in an institutional repository. The author no longer has the right to grant the repository permission to distribute his/her article, because the publisher now owns that right.

However, most publishers do grant to authors certain rights in the use of their own articles, including the right to post them in various formats (preprint, postprint, or published version) in institutional repositories. Often, authors are uncertain as to which rights they have retained – or the author may work with several different publishers, who each have slightly different contract language. In either case, it is important for the repository manager to work closely with authors to determine exactly what version of an article the author is allowed to deposit, and what other restrictions (like an embargo) may be in place.

With regard to the disposition of copyrights and related rights, there are two sections of the author agreement that merit attention: copyright ownership/transfer of copyright and author/contributor rights. Depending on the agreement, these sections may be labeled differently (or not labeled at all). To aid in recognizing them, two examples of each are provided in Boxes 2.6 and 2.7.

Box 2.6   Copyright ownership/transfer of copyright language

Optical Society of America

In exchange for OSA accepting the work for reviewing, editing, and possible first publication on an exclusive basis and for other good and valuable consideration, the receipt and sufficiency of which is hereby acknowledged, the Author(s) hereby transfer to the Optical Society of America (OSA) full ownership throughout the world of all rights, titles, and interests, including all copyrights and renewals and extensions thereof, in and to the above-titled Work and including the title and abstract of the Work, effective as of date of acceptance of this Work for publication in the above-named Publication. OSA shall have the right to register copyright to the Work and the accompanying abstract in its name as claimant, whether separately or as part of the journal issue or other medium in which such work is included.

American Sociological Association

Whereas the American Sociological Association is undertaking to publish the above-named article, of which the undersigned is Author, the Author transfers and assigns to the ASA for the full term of copyright as may now or hereafter exist, all rights, title and interest, including copyright, including but not limited to the sole and exclusive right to print, publish, license and otherwise sell your work in whole or in part in all media in all languages and all editions throughout the world and the exclusive rights to license or exercise throughout the world all subsidiary rights, including electronic formats, whether now in existence or hereafter invented.

Source: American Sociological Association. Copyright transfer agreement. Available from: http://www.asanet.org/images/journals/docs/pdf/NewTOCForm.pdf

Optical Society of America. Copyright transfer agreement. Available from: http://www.opticsinfobase.org/submit/forms/copyxfer.pdf

Box 2.7   Author/contributor rights language

American Anthropological Association

The Association will grant you the right to use your article without charge as indicated below in the section on “Author’s Rights”.

Author’s Rights

The Author is hereby reserving the rights to use his or her article in the following ways, as long as Author acknowledges the published original in standard bibliographic citation form and does not sell it or give it away in a manner which would conflict directly with the business interests of the American Anthropological Association: 1) To use the article for educational or other scholarly purposes of Author’s own institution or company; 2) To post the article on Author’s personal or institutional website; 3) To post the article on free, discipline-specific public servers of preprints and/or postprints; and 4) to publish the article or permit it to be published by other publishers, as part of any book or anthology, of which he or she is the author or editor, subject only to his or her giving proper credit to the original publication by the American Anthropological Association, unless the anthology is drawn primarily from American Anthropologist.

Source: American Anthropological Association. Copyright transfer agreement. Available from: http://www.americanethnologist.org/wp-content/uploads/2012/08/ae-author-agreement-rev-2012.pdf

American Physical Society

The author(s) […] shall have the following rights (the “Author Rights”):

(1) All proprietary rights other than copyright, such as patent rights.

(2) The nonexclusive right, after publication by APS, to give permission to third parties to republish print versions of the Article or a translation thereof, or excerpts therefrom, without obtaining permission from APS, provided the APS-prepared version is not used for this purpose, the Article is not republished in another journal, and the third party does not charge a fee. If the APS version is used, or the third party republishes in a publication or product charging a fee for use, permission from APS must be obtained.

(3) The right to use all or part of the Article, including the APS-prepared version without revision or modification, on the author(s)’ web home page or employer’s website and to make copies of all or part of the Article, including the APS-prepared version without revision or modification, for the author(s)’ and/or the employer’s use for educational or research purposes.

(4) The right to post and update the Article on free-access e-print servers as long as files prepared and/or formatted by APS or its vendors are not used for that purpose. Any such posting made or updated after acceptance of the Article for publication shall include a link to the online abstract in the APS journal or to the entry page of the journal. If the author wishes the APS-prepared version to be used for an online posting other than on the author(s)’ or employer’s website, APS permission is required; if permission is granted, APS will provide the Article as it was published in the journal, and use will be subject to APS terms and conditions.

[…]

All copies of part or all of the Article made under any of the Author Rights shall include the appropriate bibliographic citation and notice of the APS copyright.

Source: American Physical Society. Copyright transfer agreement. Available from: http://publish.aps.org/authors/transfer-of-copyright-agreement

The effect of both copyright transfer agreements in Box 2.6 is the same: the publisher is given complete ownership of all exclusive rights granted to a copyright holder, and is able to register copyright for the work in the publisher’s name. This does not mean that the author loses the moral right to be named as the author; in fact, this is often explicitly addressed elsewhere in the agreement.

It should be noted that the ability for an author to transfer his/her copyright to the publisher is contingent on whether or not the author owned the copyright in the first place. For example, if the author is a U.S. federal government employee who has created the work within the scope of his/her employment, copyright does not exist in that work, because federal government works are not copyrightable under U.S. law (see note 6). If a work is created jointly by both government employees and nongovernment individuals, however, copyrights may exist in the work, and ownership should be clearly established prior to posting the work in a repository (CENDI, 2008). Another potential situation that complicates ownership/transfer of copyright, though less common with scholarly articles, would be the case of a work made for hire. In this case, the employer would own the copyright in the work, and an authorized representative of the employer would need to sign the publisher’s agreement.

The language used by the American Anthropological Association demonstrates the lack of clarity/specificity that is often found with regard to author rights to certain versions of the article. An initial reading of the language would appear to allow the author to use any version (preprint, postprint, or final publisher’s PDF) of the article in the ways listed. However, a closer examination reveals a potential difference between the use of the article on institutional websites (which would be assumed to include institutional repositories) and use of the article in disciplinary repositories. Because “discipline-specific public servers” is qualified with “of preprints and/or postprints”, it is reasonable to assume that only preprint or postprint versions of the article may be posted to disciplinary repositories. No such qualification is made when institutional websites are mentioned, though, which would appear to make it possible to post the final published article in an institutional repository. Ultimately, if the posting rights granted to the author are not clear (as in this case), it may be necessary to contact the publisher directly for verification.

Fortunately, a growing number of author agreements do provide explicit language about what versions of a work may be shared through an institutional repository. The American Physical Society agreement specifically refers to the “APS-prepared version” of the article when granting permission to post the article on an employer’s website (which would reasonably seem to include an institutional repository). Other journals remove any ambiguity about website/repository posting by explicitly mentioning institutional repositories:

“Journal of Psychiatric Practice will permit the author(s) to deposit for display a “post-print” (the final manuscript after peer-review and acceptance for publication but prior to the publisher’s copyediting, design, formatting, and other services) 12 months after publication of the final article on his/her personal web site, university’s institutional repository or employer’s intranet, subject to the following [ …].” (JPP, n.d.).

When determining what, if any, version of a scholarly article may be posted to a repository, it is important to be certain the library has access to the version of the publisher agreement that was signed by the author. Particularly for older works, repository-posting rights may not be explicitly (or implicitly) addressed in the agreement that the author signed. If this is the case, and if more liberal terms exist in the publisher’s current agreement, the repository manager should confirm with the publisher (in writing) whether or not the current terms will be honored for prior publications – and specifically for the article(s) in question. Absent a written amendment to the original publishing agreement, or other explicit written permission, the repository manager should never assume that the publisher’s current practices apply to older articles.

Repository submission agreements

Beyond needing to confirm that an author has the right to post his/her article (and in what form), distributing published works through a repository poses very little risk to either the author or the institution. Assuming that the author has submitted either a postprint or published version of an article, there is a reasonable degree of certainty that factual or ethical issues (such as copyright infringement) that may have existed in the work have been identified and corrected by the publisher. With unpublished works submitted to the repository, however, it is necessary for the library to obtain assurances from the author(s) that no such issues exist – and that, if they do, the author(s) will assume full responsibility. Because the work has not been previously published, and the copyright still remains with the author(s), the library must also obtain permission from the author(s) to distribute the work through the repository.

The best mechanism for achieving these ends is the use of a submission agreement. Although there are other important components of a submission agreement that will be discussed later (see Chapter 5), the core of the agreement is focused on the intellectual property present in the submission. The items that should be addressed are similar to those in a journal’s publication agreement; the submitting author must provide assurances that:

• the submission is the original work of the author(s); and

• the submission does not infringe on any existing copyrights or other intellectual property rights.

Unlike most traditional journal publication agreements, however, there is one key difference in the repository submission agreement: there is no transfer of copyright from the author to the repository or library. It should be made explicit in the submission agreement that the author retains the copyright in the work that is being submitted. This means that the author will keep his/her exclusive rights to copy, distribute, display, perform, or create derivative works from the submitted work. Because the author is retaining these rights, however, he/she must grant the library/institution the right to distribute the work through the repository, as well as any associated rights that may be needed to properly curate or disseminate the work. This grant of rights usually takes the form of a nonexclusive license (see Chapter 5 for examples).

As with journal publication agreements, there are a variety of ways to structure and phrase a repository submission agreement – and it is also likely that the content of submission agreements will vary for different collections of content within the repository. For example, collections of previously published faculty scholarship will require only a brief agreement that addresses the right of the author to deposit the work (i.e., the deposit does not violate an existing publisher copyright or contract) – but collections of student theses will require a longer agreement with necessary assurances that the content of the thesis is both legal and ethical.

The mechanism for obtaining and recording submission agreements will differ based on the repository platform being used and the preferred workflow of the library. Some institutions may prefer to use printed agreements that require either an ink or electronic signature from the submitting author; others may use an online “clickwrap” (or “clickthrough”) agreement similar to end user license agreements that accompany software. The medium of agreement that is most appropriate at a given institution will likely be governed by the preferences of the institution’s legal counsel.

If a clickwrap submission agreement is used, these issues should be considered (additional issues with clickwrap agreements are discussed in Chapter 5):

• Is adequate information gathered by the repository platform to allow reasonable certainty as to which individual completed the agreement?

• If work is submitted on behalf of an author by another person (e.g., an assistant), does the language in the submission agreement allow that proxy to agree on behalf of the author?

• If mediated deposits are made directly in the system by the repository manager or other staff (thereby bypassing the submission agreement) on behalf of faculty or students, a separate “print” (either paper or digital) agreement should be obtained from the author. (This is important primarily for the submission of unpublished works.)

Regardless of the purpose or mechanism of the agreement, it is advisable to have legal counsel from the institution review the agreement before implementation to ensure that it is legally sound and includes standard required language.
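The clickwrap considerations listed above can be illustrated with a short Python sketch of the information a repository platform might record when an agreement is accepted. It is only a sketch under stated assumptions: the class, field, and function names are hypothetical and do not correspond to any particular repository platform, and comparing names is a deliberately simplistic way of detecting a proxy.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class SubmissionAgreementRecord:
    """Hypothetical record of one accepted clickwrap submission agreement."""
    work_title: str
    author_name: str
    agreed_by: str             # person/account that actually accepted the agreement
    agreement_version: str     # identifies which agreement text was displayed
    proxy_authorized: bool = False  # agreement language lets agreed_by act for the author
    agreed_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def is_proxy(self) -> bool:
        """True when someone other than the author accepted the agreement."""
        return self.agreed_by != self.author_name


def needs_separate_agreement(record: Optional[SubmissionAgreementRecord]) -> bool:
    """Decide whether a separate paper or digital agreement should be obtained.

    A missing record (None) models a mediated deposit made by staff that
    bypassed the clickwrap step entirely.
    """
    if record is None:
        return True
    return record.is_proxy() and not record.proxy_authorized


if __name__ == "__main__":
    by_assistant = SubmissionAgreementRecord(
        work_title="Example thesis",
        author_name="Pat Doe",
        agreed_by="Sam Lee",
        agreement_version="2013-01",
    )
    print(needs_separate_agreement(by_assistant))  # proxy without authorization
```

Recording the agreement version alongside the identity of the person who accepted it is one way to keep reasonable certainty about what was agreed to and by whom; whether such a record is legally sufficient remains a question for institutional counsel.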

Creative Commons licensing

One of the goals of institutional repositories is to make scholarly and creative works available for others to use in their own scholarly or creative pursuits. While allowing users to freely download materials hosted in repositories provides necessary access, the ability to reuse and incorporate these materials into new works is constrained by copyright law. For works in the repository in which copyright is owned by publishers (e.g., scholarly articles), there is little that may be done to proactively remove the constraints on reuse imposed by copyright law. However, for works submitted to the repository for which the submitting author(s) still holds the copyright, it is possible to offer authors the option of posting their works under more permissive terms of use.

There are both philosophical and practical benefits to providing this option to authors. From a philosophical standpoint, the application of a permissive end user license grants users greater flexibility in their use of copyrighted works, while retaining the original author’s right to be acknowledged for his/her creation. From a practical standpoint, applying a more liberal license to a work reduces the likelihood that the repository manager will need to field permission requests from end users and – particularly in the case of students who have since graduated – take the time to locate the author and relay the permission request.

The most widely accepted mechanism that allows authors to distribute their works with more permissive rights for end users attached is the Creative Commons (http://creativecommons.org) license. Creative Commons licenses are not a substitute for copyright; rather, they eliminate the need for a user to seek permission from the copyright holder prior to exercising specific exclusive rights owned by the copyright holder. By applying a Creative Commons license to a copyrighted work, the copyright holder explicitly tells any potential users of his/her work what they may do without asking permission. Any uses of a work beyond those allowed by the specific license (or by an exception in copyright law, such as fair use) still require the permission of the copyright holder. There are six Creative Commons licenses; each allows a slightly different set of uses:

• Attribution (CC BY)

• Attribution-NoDerivs (CC BY-ND)

• Attribution-NonCommercial-ShareAlike (CC BY-NC-SA)

• Attribution-ShareAlike (CC BY-SA)

• Attribution-NonCommercial (CC BY-NC)

• Attribution-NonCommercial-NoDerivs (CC BY-NC-ND)

(Creative Commons, n.d.)

Though the basic terms of the licenses – attribution, commercial v. noncommercial use, creation of derivative works – are easily understood, there are actually complex legal instruments underlying the basic licenses. If an author has concerns about what an end user may be able to do with a Creative Commons–licensed work, the formal legal instrument (the “legal code”, in Creative Commons’ parlance) should be examined so that the exact terms (and recourse) are known to the author. (By the same token, an author who incorporates others’ Creative Commons–licensed material into his/her own work should read the complete license terms, particularly because there may be slight differences between international licenses and those configured for specific jurisdictions.)

No registration is required to use a Creative Commons license; a statement need only be placed on the work and/or in the metadata for the work that specifies the license being used. Within a repository or other online material, it is also preferable to link back to the complete terms of the license (both in plain language and the formal legal instrument), which are hosted online by Creative Commons.

There are a variety of ways in which institutional repositories may make this licensing option available to submitting authors. If an online submission form is used, a “Rights” field with a dropdown menu that allows the author to select a Creative Commons license, if desired, is probably the simplest. The choice can also be incorporated into either a paper or online submission form by adding an area for authors to indicate if they would like to apply such a license. The final step for the repository manager is ensuring that the appropriate license is displayed on, or along with, the work when it is posted in the repository.
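As an illustration only, the “Rights” dropdown and the resulting license statement might be modelled along the following lines. This is a minimal sketch rather than a feature of any particular repository platform: the names `CC_LICENSES`, `RightsField`, and `build_rights_field` are hypothetical, the wording of the generated statement is an assumption, and the default license version (4.0) is an assumption that a repository would set according to its own policy. The two links correspond to the plain-language summary and the formal legal code hosted by Creative Commons.

```python
from dataclasses import dataclass
from typing import Optional

# The six Creative Commons licenses, keyed by the short code used in license URLs.
CC_LICENSES = {
    "by": "Attribution (CC BY)",
    "by-nd": "Attribution-NoDerivs (CC BY-ND)",
    "by-nc-sa": "Attribution-NonCommercial-ShareAlike (CC BY-NC-SA)",
    "by-sa": "Attribution-ShareAlike (CC BY-SA)",
    "by-nc": "Attribution-NonCommercial (CC BY-NC)",
    "by-nc-nd": "Attribution-NonCommercial-NoDerivs (CC BY-NC-ND)",
}


@dataclass
class RightsField:
    """Hypothetical metadata record for the 'Rights' field of one deposit."""
    statement: str                  # text displayed on, or along with, the work
    summary_url: Optional[str]      # plain-language summary of the license
    legal_code_url: Optional[str]   # formal legal instrument


def build_rights_field(choice: Optional[str], version: str = "4.0") -> RightsField:
    """Turn the author's dropdown choice into a displayable rights statement."""
    if not choice:
        # Assumed default wording when the author selects no open license.
        return RightsField("Copyright held by the author(s); no open license selected.",
                           None, None)
    code = choice.lower()
    if code not in CC_LICENSES:
        raise ValueError(f"Unknown Creative Commons license code: {choice!r}")
    summary = f"https://creativecommons.org/licenses/{code}/{version}/"
    return RightsField(
        statement=f"Licensed under Creative Commons {CC_LICENSES[code]}, version {version}.",
        summary_url=summary,
        legal_code_url=summary + "legalcode",
    )
```

A submission form could call `build_rights_field("by-nc")` when the author picks that option and store the result with the item’s metadata, so that the statement and both links are displayed wherever the work appears.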

It should be noted that Creative Commons licenses are not the only open-licensing models available. While Creative Commons is the most frequently used across a wide variety of content types, different licenses (like the GNU General Public License for software or the Open Database License for data) may be more appropriate for specific content types.

Research data

Though the traditional outputs of scholarship – journal articles, conference papers, posters, theses and dissertations, monographs, and photographs and other creative works – will likely continue to comprise the bulk of institutional repositories’ collections, the last decade has witnessed an explosion of interest in archiving and sharing another type of intellectual property: research data. Along with scholars, both publishers and funding agencies have recognized the value of sharing “raw” data (as opposed to the summaries and discussions of data that articles and reports provide), and have created policies that either recommend or mandate that authors or grantees make the data underlying their publications publicly available (Borgman, 2007; Nelson, 2009). In the U.S.A., one of the most visible mandates has been that of the National Science Foundation (NSF), which required that, as of 2011, all NSF award recipients provide a data management plan that included a description of how their results would be shared (NSF, 2011).

However, even absent such mandates, independent scholars and other organizations have proactively developed mechanisms and venues to promote the sharing of research data, including disciplinary repositories (Nelson, 2009) like Dryad (http://datadryad.org/) and DataONE (http://www.dataone.org/). The primary objectives of such data-sharing efforts are to capitalize on the economy of scale present when thousands of researchers pool data and share the task of analysis, as well as to better enable researchers to reproduce – and thus confirm or refute – the results of others’ work (Nelson, 2009; Stodden, 2009).

In order to effectively share research data and preserve the integrity of the data, “active curation” (Borgman, 2007, p. 134) is necessary. As institutions with significant skill and experience in the collection and curation of intellectual property, academic libraries have become logical leaders in this effort (e.g., Newton et al., 2011; Parham et al., 2012; Peters and Dryden, 2011; Treloar et al., 2007), seen in efforts like Purdue University Library’s Digital Data Curation Center, Cornell University Library’s Data Working Group, and Cambridge University Library’s involvement in the Incremental Project. In general, libraries have identified three primary ways that they can support researchers in sharing data: provide training and guidance in the creation of data management plans (e.g., the DMPTool, https://dmp.cdlib.org); assist in the organization and preparation of data to make it “share ready”; and, finally, collect, preserve, and disseminate research data through institutional repositories.

Similar to the unique considerations for published and unpublished scholarly works, responsibly sharing research data in an institutional repository requires attention to the specific characteristics, and attendant legal and ethical issues, of data.

Defining data

“Data” may be defined both broadly and narrowly; the scope of digital objects that may be considered data in general is far wider than what a specific discipline may consider to be data. In order to understand data at its most generic, the following definitions are instructive:

“The terms ‘data’ and ‘information’ can be interchangeable. ‘Data’ refers to research results, facts, and statistical or survey information, including text, numbers, images, audio and video recordings, software, animations, metadata and model simulations. In the digital context, ‘data’ refers to any information that can be stored in digital form.”

(Fitzgerald et al., 2007, p. 16)

“A reinterpretable representation of information in a formalized manner suitable for communication, interpretation, or processing.”

(CCSDS, 2002, pp. 1–9)

These definitions (and others, such as that in the Concise Oxford English Dictionary) draw out four important attributes of data:

• data are usually factual (as opposed to being works of fiction);

• data are pieces of information that are collected/recorded;

• data are not limited to information expressed as numbers or text;

• data are “raw” materials that may be organized, processed, or analyzed.

It should be noted as well that data may not always originate from a research activity, but may be gathered for administrative or other reasons (PMSEIC, 2006). Data from medical and educational records, though not generated through research, are valuable resources for new research endeavors.

Beyond this broad definition of data, it is important for institutional repository managers and others involved in data curation to understand the specific nature of data present within different departments and schools at their own institutions. The characteristics, formats, and potential uses of data will vary both between and within the humanities, social sciences, natural sciences, and clinical or professional programs. The Data Curation Profiles (http://datacurationprofiles.org/) project at Purdue University Libraries provides one way of developing this discipline-specific or researcher-specific understanding of data.

Data, databases, and intellectual property

Usually, research data are collected into datasets (Fitzgerald et al., 2007). For example, a psychological research project may examine the relationship between college students’ consumption of popular media and the students’ body self-image. The number of hours that Student A spends reading tabloid magazines would be one piece of data; the same student’s responses on the Multidimensional Body–Self Relations Questionnaire would be additional pieces of data. These data points, together with data from all other student participants, would constitute a dataset. In the same way, a collection of individual telescope images (data) of the night sky would also constitute a dataset.

A more complex organization of data, or of multiple datasets, is considered a database:

“ ‘Database’ refers to a collection of data and datasets, often compiled from a range of sources and usually organised to permit data to be readily retrieved, managed and updated. Typically databases involve software programs which enable the data to be collected, copied, stored, retrieved and distributed.”

(Fitzgerald et al., 2007, p. 19)

This distinction between data and datasets/databases is vital when discussing the intellectual property considerations surrounding the use and reuse of data. It is important to distinguish between the data themselves and (a) the manner in which the data are compiled, arranged, and presented and (b) any supporting or explanatory documentation tied to the data.

Under U.S. copyright law (and internationally), data (“raw facts”) are not eligible for copyright protection; however, “original products derived from those facts are copyrightable” (Stodden, 2009, p. 35). This means that if a database is compiled in such a way that it constitutes an original selection, coordination, or arrangement of the data, that database may be eligible for copyright protection:

“A ‘compilation’ is a work formed by the collection and assembling of preexisting materials or of data that are selected, coordinated, or arranged in such a way that the resulting work as a whole constitutes an original work of authorship. The term ‘compilation’ includes collective works.”

(U.S.C. Title 17, Sec. 101)

While a database may be copyrighted, though, the raw facts that are included in the database remain unaffected, and may be used without permission (assuming, of course, that it is possible to access the database legally in order to extract the raw data – which may not be the case for databases protected by digital rights management (Reichman and Uhlir, 2003)):

“The copyright in a compilation or derivative work extends only to the material contributed by the author of such work, as distinguished from the preexisting material employed in the work, and does not imply any exclusive right in the preexisting material. The copyright in such work is independent of, and does not affect or enlarge the scope, duration, ownership, or subsistence of, any copyright protection in the preexisting material.” [emphasis added]

(U.S.C. Title 17, Sec. 103(b))

Unfortunately, determination as to what constitutes sufficient original selection, coordination, and arrangement of data to receive copyright protection as a compilation remains largely unclear. In U.S. law, the defining case on the matter, Feist Publications, Inc. v. Rural Telephone Service Co. (1991), which established the necessity of originality, has been followed by inconsistent application of the standard in succeeding cases (Bitton, 2011).

Making the issue more complex – and less clear – is the fact that data themselves that are considered to be compilations, or otherwise exhibit original expression, may be protected by copyright. In CDN Inc. v. Kapes (1999), the court held that data which are “created, not discovered”, are copyrightable (in the case, the data in question were prices which had been estimated by the copyright owner). This presents another potential barrier to use of data from a dataset or database, unless it is clear that the data constitute “discovered” facts, and are not original creations of the dataset/database owner.

Despite these uncertainties, U.S. copyright law is clear that, in general, raw data by themselves are not protected by copyright. While European copyright law is consistent in this regard, European law has created additional protections for databases that present barriers to use of the data they hold. In 1996, a new sui generis database right was created by the European Parliament and the Council of the European Union, and is enshrined in Directive 96/9/EC (1996), which requires member states to adopt this exclusive right for database owners:

“a right for the maker of a database which shows that there has been qualitatively and/or quantitatively a substantial investment in either the obtaining, verification or presentation of the contents to prevent extraction and/or re-utilization of the whole or of a substantial part, evaluated qualitatively and/or quantitatively, of the contents of that database.”

(European Union, 1996)

This right directly affects the ability of an end user to extract and reuse raw data from a protected database by making these actions (“extraction and/or re-utilization”) the exclusive right of the database owner. Therefore, even though the raw data in a database may not be protected by copyright law (assuming the data consist of facts, or of ideas expressed in the only way possible), it is not lawful to extract and reuse the whole or a substantial part of the data without permission – even if it is technologically possible. As Reichman and Uhlir (2003, p. 387) note, the European directive varies from existing intellectual property law in important ways:

• the sui generis database right is given in recognition of investment, not “creative contribution”;

• the right grants exclusive control over previously “unprotectible raw material”;

• the duration of the right allows it to exist “in perpetuity”, unlike copyright, which lasts for a fixed term.

In practical terms, the implications of the European Directive for institutional repositories in the U.S.A. are limited; databases created in the U.S.A. are not subject to the European database rights. However, as collaborative research continues to grow, it is entirely possible that U.S. institutional repository managers and/or data curators will encounter researchers at their institutions who have collaborated with European peers. For databases created in Europe that are under consideration for deposit in U.S. repositories, it is vital that ownership of the databases (and sources of data in the database) is established so that appropriate permissions may be secured for the deposit.

Ownership and licensing of data

If there is one thing that should be abundantly clear about copyright and related rights in data and databases, it is this: nothing is clear. For academic libraries exploring data curation services, and disseminating data through a repository, these issues require careful navigation. However, there are two ways to eliminate uncertainty when it comes to the legal use of data:

• establish clear ownership of the data (and/or database) in question; and

• recommend that the data owner(s) apply a license to the data to make the terms of use clear (for both the repository and for end users).

Ownership in a specific dataset or database may vest in multiple parties, and the extent of ownership rights present in data will vary based on the context in which the data were created, and on the type of intellectual property present in the data. For example, it is possible that under the terms of employment, or of a grant or contract, data may be owned not by the person who gathered/created the data, but by the employer or funding agency (DaWG, 2008; ICPSR, 2012). As discussed earlier, it is also possible that parts of the dataset/database may be copyrighted and, if so, copyright ownership – which may be distinct from contractual ownership of the data – must also be established. Furthermore, depending on the nature of the data, patent rights may exist (DaWG, 2008). In the case of data collected for digital humanities research, which could include images or cultural artifacts, copyrights or other rights may exist in the data themselves – and may be owned by someone other than the researcher (Borgman, 2007). The bottom line is this: before data can be deposited in an institutional repository, it must be clear that the person making the deposit has the right to do so (DaWG, 2008). To that end, there are five questions that should guide any exploration of data ownership:

1. Does copyright, or any other intellectual property right, exist in the data themselves?7

2. Does copyright exist in the dataset or database in which the data are housed? And, if there are accompanying materials (codebooks, etc.), what is their copyright status?

3. Was the data/database created under terms of employment, a grant, or a contract? And, if so, what does that agreement specify about the ownership of the data and related assets?

4. Was the data/database created through a collaborative process? If more than one individual was involved, is there a preexisting agreement that specifies ownership of intellectual property or other rights in the data and related assets?

5. Has the individual making the deposit been authorized by all others with an ownership interest (or other right existing in the data/dataset) to make said deposit?
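For repositories that track deposits electronically, the five questions above could be recorded as a simple pre-deposit checklist. The following is a minimal sketch under stated assumptions: the class `OwnershipReview`, its field names, and the helper functions are hypothetical, and each field records only whether the corresponding question has been answered and documented to the repository’s satisfaction, not what the answer was.

```python
from dataclasses import dataclass
from typing import List

# One entry per question in the list above, keyed by a hypothetical field name.
QUESTIONS = {
    "rights_in_data": "1. Rights existing in the data themselves",
    "rights_in_database": "2. Copyright in the dataset/database and accompanying materials",
    "employment_or_funding_terms": "3. Terms of employment, grant, or contract",
    "collaborator_agreement": "4. Agreements among collaborators",
    "depositor_authorized": "5. Authorization of the depositor by all rights holders",
}


@dataclass
class OwnershipReview:
    """True means the question has been investigated and the outcome documented."""
    rights_in_data: bool = False
    rights_in_database: bool = False
    employment_or_funding_terms: bool = False
    collaborator_agreement: bool = False
    depositor_authorized: bool = False


def unresolved_questions(review: OwnershipReview) -> List[str]:
    """Return the questions that still need an answer before deposit."""
    return [label for name, label in QUESTIONS.items() if not getattr(review, name)]


def ready_for_deposit(review: OwnershipReview) -> bool:
    """A deposit proceeds only when no ownership question is outstanding."""
    return not unresolved_questions(review)
```

Calling `unresolved_questions` on a partially completed review gives the repository manager a list of the points to raise with the depositor before the submission agreement is signed.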

Once ownership of the data has been established, the next step should be the application of an appropriate license and/or the development of a usage rights statement for the data (DataONE, n.d.). As with other works posted in the repository, the application of a standard license allows end users to know what use of the data is permitted and what use requires further permission from the copyright holder. Even if raw, factual data themselves are not copyrightable (and so not subject to a license which depends on the existence of copyright as its basis), licensing a dataset or database will eliminate end users’ questions about what individual components or aspects of a database may be protected.

Ideally, data owners will elect to apply an open license to their databases; expansive reuse rights for data are integral to a database’s utility to other researchers. If those researchers are unable to legally extract, reanalyze, or otherwise work with raw data (and the accompanying assets necessary for them to understand how the data were created and organized), the fact that the data are freely accessible means very little. Fortunately, there are good options for data owners who wish to openly license their databases:

• Creative Commons Public Domain Dedication (CC0)

• Open Data Commons Public Domain Dedication and License (PDDL)

• Open Data Commons Attribution License (ODC-BY)

• Open Data Commons Open Database License (ODbL).

While a standard Creative Commons license could be applied to a database, it is generally considered an ill-advised approach (e.g., Rochkind, 2008). Often, researchers may seek to incorporate data from multiple databases into one larger database, or into a meta-analysis. If copyrightable elements of different databases all hold different variations of Creative Commons license requirements (e.g., Attribution, Share-Alike, Non-Commercial), it may be impossible for the researcher to satisfy the terms of each discrete database’s license. Electing to dedicate a database to the public domain with a CC0 or PDDL license removes the potential for conflicting licenses when databases are aggregated or otherwise reused.
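The aggregation problem can be made concrete with a small sketch. What follows is a deliberately simplified model, not a statement of how the licenses operate in law: it assumes that the merged database is a derivative work of every source, it reduces each license to its headline conditions, and the names `CONDITIONS` and `aggregation_conflicts` are hypothetical. Under those assumptions, a NoDerivs source blocks the merge outright, and a ShareAlike source requires the merged database to carry its own terms, which fails if another source imposes a condition those terms do not include.

```python
from typing import Dict, FrozenSet, Iterable, List

# Headline conditions of each license (simplified; see the legal code for exact terms).
CONDITIONS: Dict[str, FrozenSet[str]] = {
    "CC0": frozenset(),
    "PDDL": frozenset(),
    "CC BY": frozenset({"BY"}),
    "CC BY-SA": frozenset({"BY", "SA"}),
    "CC BY-NC": frozenset({"BY", "NC"}),
    "CC BY-NC-SA": frozenset({"BY", "NC", "SA"}),
    "CC BY-ND": frozenset({"BY", "ND"}),
    "CC BY-NC-ND": frozenset({"BY", "NC", "ND"}),
}


def aggregation_conflicts(licenses: Iterable[str]) -> List[str]:
    """List the reasons the given source licenses cannot all be satisfied at once."""
    sources = sorted(set(licenses))
    problems: List[str] = []

    # NoDerivs sources do not permit a merged (derivative) database at all.
    for name in sources:
        if "ND" in CONDITIONS[name]:
            problems.append(f"{name} does not permit derivative works")

    share_alike = [name for name in sources if "SA" in CONDITIONS[name]]
    if len(share_alike) > 1:
        # Each ShareAlike license demands that the result carry its own terms.
        problems.append("conflicting ShareAlike terms: " + ", ".join(share_alike))
    elif len(share_alike) == 1:
        governing = share_alike[0]
        carried = CONDITIONS[governing] - {"SA"}
        for name in sources:
            extra = CONDITIONS[name] - {"SA", "ND"} - carried
            if extra:
                problems.append(
                    f"{name} imposes {', '.join(sorted(extra))}, which {governing} does not carry"
                )
    return problems
```

In this model, `aggregation_conflicts(["CC0", "PDDL"])` returns an empty list, because public domain dedications impose no conditions, whereas combining a CC BY-SA source with a CC BY-NC-SA source reports a conflict.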

An additional benefit to using either the Creative Commons CC0 license or the Open Data Commons PDDL license is that they specifically account for the European sui generis database rights. Both licenses explicitly waive the licensor’s copyrights and database rights, effectively dedicating the work and all related rights to the public domain. The other Creative Commons and Open Data Commons licenses approach database rights differently. Version 3.0 Creative Commons licenses either do not license or waive the sui generis rights (depending on whether an international or territorial license is in use); Open Data Commons ODbL and ODC-BY licenses explicitly grant the end user a license for those database rights.

If data owners prefer not to use a standard license, at minimum they should work with the repository manager to develop a usage rights statement to be incorporated into the metadata for the dataset/database in the repository. For example:

“This data set and accompanying metadata may be used for noncommercial academic, research, and other professional purposes. Permission to use the data is granted to the Data User subject to the following terms: 1) Data User will cite the data set owner in derivative works or publications that use the data set. 2) Data User will share any derivative works for non-commercial academic, research, and other professional purposes. 3) Data User will notify users that such derivative work is a modified version and not the original data and documentation distributed by the data set owner.”

(Cornell University, n.d.)

Regardless of what usage restrictions or conditions are placed on the data for end users, it is important for the data owner to grant the library a nonexclusive license that will allow the repository to disseminate the data/database, to migrate formats or make copies as needed for preservation, and to exercise any other rights governed by copyright that are necessary for the proper curation of the data (DaWG, 2008; Fitzgerald et al., 2007). As with other materials added to the repository, this grant of rights is usually most effectively accomplished through a carefully constructed repository submission agreement.

Beyond copyright: other data considerations

While copyright (and sui generis database rights) have been the focus of this chapter, they are not the only issues that must be given consideration when undertaking data curation activities. Other forms of intellectual property (e.g., patents or trademarks) may exist in datasets or databases, and the rights in those types of property must be addressed with the same level of care as copyright. It is also important to note that there may be circumstances in which open sharing of data (regardless of the licensing terms) may not be appropriate – and, in fact, could be legally or ethically untenable. In those cases, it may be necessary to create restricted use collections or “data enclaves” to closely monitor the use of sensitive data (ICPSR, 2012, p. 39). However, this level of data curation and restricted access is beyond the reach of what most institutional repository programs are currently prepared to offer, and may in fact be best served through the creation of a separate, secure data repository. As such, further discussion of restricted data collections is outside the scope of this book.

Beyond intellectual property, there are other legal and ethical issues inherent in sharing research data. For example, if the data are derived from human subjects, privacy must be appropriately protected; this requires compliance with applicable legal requirements and ethical guidelines. Data from research with animals or biological materials have their own considerations as well. This does not mean that such data cannot be shared openly; however, assurances must be in place to make certain that data derived from unethical research, or data that have not been properly anonymized to preserve subjects’ privacy, are not disseminated. These issues will be explored in greater depth in Chapters 3 and 4.


1 “Literary” works are any works “expressed in words, numbers, or other verbal or numerical symbols or indicia” (U.S.C. Title 17, Sec. 101).

2 Depending on the publisher, the author may still own the copyright in the manuscript at this point in the publication process (i.e., has not signed a copyright transfer agreement for the publisher); however, even if this is the case, it is advisable to comply with a publisher’s stated policy regarding preprints if the author wishes to be published by that publisher.

3 A distinction is made here between posting complete published works, and the use of portions of previously published work in the creation of new works. Such use is discussed below in relation to fair use.

5 As will be discussed later, this is not a universal requirement; different disciplines have different stances on this issue.

6 Although U.S. federal government works are generally not copyrightable under U.S. law, “the work may be protected under the copyright laws of other jurisdictions when used in these jurisdictions. The U.S. government may assert copyright outside of the United States for U.S. government works” (Copyright and other rights …, 2013).

7 Special care should be taken when assessing ownership of qualitative data (Corti et al., 2000); it should be confirmed with the owner of the dataset that participants/interviewees do not own any of the copyright in the data they contributed.
