3

Research ethics

Abstract:

This chapter examines the issues related to disseminating original research results through an institutional repository, with a particular focus on the results of human subject research. The chapter includes an overview of the ethical review process for research and a discussion of how libraries can collaborate with research review committees, and concludes with ethical considerations for data repositories and the issue of retracted publications.

Key words

research ethics

institutional review board

human subject research

institutional repository

research data

retractions

Within the scholarly communication system, the results of scientific research have traditionally been communicated formally through publication in scholarly journals or presentations at disciplinary conferences. The peer review processes in place for these respective venues usually include some level of assurance that the researchers have conducted their investigations in an ethical manner, and are not sharing data that they are not legally or ethically sanctioned to share. As institutional repositories become another venue through which the results of original research may be disseminated, it is imperative that repository managers are aware of the ethical issues present in scientific research and that a reasonable effort is made to exclude work from the repository that does not meet basic ethical standards.

It is not reasonable, of course, to expect repository managers to either police or detect all forms of scientific misconduct (e.g., Martinson et al., 2005) – much in the same way that libraries should not be in the position of making fair use determinations for work deposited in the repository. To do so would be to place expectations on repositories that even scholarly journals are not able to meet, as evidenced by the number of published articles retracted (for both honest and dishonest errors) (Van Noorden, 2011). However, libraries do need to be able to identify the types of submissions that may raise ethical concerns. This knowledge will allow repository managers to ensure that they are (a) asking the right questions about potential submissions, (b) receiving appropriate assurances from submitting researchers, and, most importantly, (c) not disseminating materials that will place their institutions at undue legal risk or violate others’ rights.

In general, the types of submissions that will warrant the most attention in this regard are those that present the results of original research; such submissions may come from either faculty or students, and may be in the form of manuscripts, research posters, datasets, or other common research outputs. And while ethical issues such as plagiarism, fabrication, or defamation may be present in scholarship across disciplines (and are worthy of attention), research in the natural, social, and clinical sciences is deserving of special attention, particularly that involving human subjects.

Human subject research

In the U.S.A., ethical guidelines for research with human subjects are codified in the Code of Federal Regulations, both in relation to research governed by the Department of Health and Human Services (Title 45 CFR Part 46) and the Food and Drug Administration (Title 21 CFR Part 56). The HHS regulations are informally called the “Common Rule”, due to their adoption by over a dozen other federal agencies; the FDA maintains its own regulations, though they are largely similar to the Common Rule (Amdur and Bankert, 2011). As a practical matter, the Common Rule guidelines for human subject research (and the FDA guidelines for research involving medical devices or pharmaceuticals) must be followed by any institution that receives any sort of federal funding or is otherwise subject to federal jurisdiction or review. An academic institution does not need to receive federal research funding to be subject to the regulations; for example, any institution that participates in the Federal Work-Study Program should comply with the regulations. By the same token, any institution that does not comply with the guidelines would be in danger of sanctions or, in the most egregious cases, funding removal.

Belmont Report

The human subject research guidelines are governed by principles laid out in a document from the National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research called the Belmont Report. The three core principles of the Belmont Report are respect for persons, beneficence, and justice. In sum:

• Respect for persons: every person’s autonomy must be respected; if someone has diminished autonomy (e.g., a child, a prisoner, or someone who is physically or mentally incapacitated), extra care must be taken to ensure they are not exploited. (In practice, this means that a research subject must be fully informed about the risks and benefits of the research, and that they must have the opportunity to voluntarily join the research, or to withdraw from it at any time. For those subjects with diminished autonomy, researchers have to take extra steps to ensure that their participation in research is appropriate and that they are properly protected from risks or harm.)

• Beneficence: researchers must design their studies, and their subjects’ participation, in such a way that they cause no harm to the research subjects, and that the risk of harm is reduced and the potential benefits of the research are maximized. (In practice, there is often no way for research to be carried out in a way that carries no risk at all for participants, even if that risk is relatively minor. Additionally, there are often research projects that hold no direct benefit for the participants, but hold potential benefit for society at large. This means that, in all cases, researchers must carefully balance the risks and benefits of the given study and must expose participants to no greater risk than is reasonably justified by the direct or indirect benefits of the research.)

• Justice: in relation to research, justice refers to a need to make sure that the risks and benefits of research are distributed equally and equitably between individuals and groups of people. (In practice, this means that subjects who are selected to participate in research must be appropriate given the content and scope of the study. As the Report notes, there is a danger of certain groups of people being overrepresented in studies simply because they are convenient to use or easily taken advantage of. If a specific group of people is heavily represented in a study, it must only be because that research holds a potential benefit that is unique to the group. This is, for example, why only certain types of research involving prisoners may be approved.)

The full Belmont Report is available online from the Office for Human Research Protections: http://www.hhs.gov/ohrp/policy/belmont.html

Institutional review boards

While the principles of the Belmont Report provide a philosophical framework for the ethical conduct of human subject research, the federal regulations provide a concrete process in which the principles must be enacted. The Common Rule (and the FDA regulations) lays out a specific set of procedures through which proposed research must be reviewed and approved prior to the study commencing. At the core of these procedures is the institutional review board.

An institutional review board (IRB) is a committee that must review all human subject research proposals at an institution to ensure that the proposed research complies with the federal guidelines. For a project to fall under the IRB’s jurisdiction, it must either involve a clinical investigation of a medical device or drug or meet the Common Rule definition of human subject research:

“Human subject: a living individual about whom an investigator (whether professional or student) conducting research obtains (1) Data through intervention or interaction with the individual, or (2) Identifiable private information.”

(45 CFR 46.102(f))

“Research: a systematic investigation, including research development, testing and evaluation, designed to develop or contribute to generalizable knowledge.”

(45 CFR 46.102(d))

There are many research-related activities that take place at educational institutions that do not constitute human subject research. For example, explorations of pedagogy that are intended only to inform an individual faculty member’s teaching, surveys conducted by the institution for the purposes of staff development, and internal quality improvement studies would likely not qualify as human subject research. Although they might involve collecting data from, or about, individuals, because they are intended to inform local praxis and not contribute to generalizable knowledge, they would not be under the IRB’s jurisdiction. However, depending on the institution, some IRBs may suggest that such projects be submitted to the IRB so that there can be an official acknowledgment that no formal IRB review was required.

Research that does fall under the IRB’s jurisdiction must be reviewed for compliance with federal regulations. Depending on the type of study, the review may fall into one of three categories: exempt, expedited, or full board review.

A human subject research proposal may be deemed exempt from full IRB review and oversight if it (a) presents minimal risk to participants and (b) falls into one of several allowable categories of research, which include, among others, pedagogical research, survey or interview research, research using educational tests, or research using existing data (see all categories of exempt research at 45 CFR 46.101(b) and 21 CFR 56.104). It is likely that these types of exempt research will constitute the bulk of research activities at most colleges and universities. For example, due to the growth of online survey tools like SurveyMonkey, SurveyGizmo, and Qualtrics, survey research is simple and inexpensive to carry out – which has made it much more prevalent.

Box 3.1   Minimal risk

The concept of “minimal risk” is extremely important in evaluating proposed human subject research. Minimal risk constitutes a threshold of acceptable risk for research subjects; if research presents more than minimal risk to subjects, that risk must be justified and weighed carefully against the potential benefit of the study (for subjects and for others). As defined by federal regulations, minimal risk “means that the probability and magnitude of harm or discomfort anticipated in the research are not greater in and of themselves than those ordinarily encountered in daily life or during the performance of routine physical or psychological examinations or tests” (45 CFR 46.102). In accordance with this definition, it is not possible for a research study to present “no risk”; as an expression of the risk present in daily life, “minimal risk” is the lowest possible risk.

While exempt research is plentiful, its classification often causes the most confusion for researchers who are unfamiliar with the IRB process, and repository managers who receive original human subject research outputs for deposit should be especially careful if the materials are presented as being the result of exempt research. The most important point to remember is that a researcher cannot be the one to determine whether or not the exempt classification is appropriate for his/her own research. The institution must designate someone in an official capacity to make this determination; often, this will be a member of the IRB. Once that determination has been made, and the research has been officially registered as being exempt, IRB oversight for the project (which would otherwise include continuing review until the close of the project) ceases (Amdur and Bankert, 2011). But unless the official determination is made, the research should not be conducted and the results should certainly not be disseminated through the repository (or elsewhere).

If a human subject research proposal is determined to be ineligible for an exemption, it will be classified for either expedited or full board review. Expedited review, as the name implies, is a quicker review process that only requires one or two IRB members to review and approve the proposal. To be eligible for expedited review, a study must (a) present no more than minimal risk to participants and (b) fall into specific categories of research approved by federal regulations (45 CFR 46.110(a) or 21 CFR 56.110, depending on the nature of the research). If a study involves more than minimal risk to participants, and/or the participants involved are part of a protected or nonautonomous population, the proposal must be reviewed at a meeting of the full board. Proposals that are approved through the expedited or full board process are also subject to continuing review over the life of the project, particularly if the project extends beyond the original time period specified for the study.
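The three review categories described above can be summarized as a short decision sketch. This is an illustration only, not a substitute for the official determination, which must be made by the person the institution has designated for that purpose. The function and parameter names are hypothetical, and the final fallback (a minimal-risk study that fits neither an exempt nor an expedited category goes to the full board) is an assumption that the text above does not state explicitly.

```python
from enum import Enum


class ReviewCategory(Enum):
    EXEMPT = "exempt"
    EXPEDITED = "expedited"
    FULL_BOARD = "full board"


def likely_review_category(
    minimal_risk: bool,
    protected_population: bool,
    in_exempt_category: bool,
    in_expedited_category: bool,
) -> ReviewCategory:
    """Return the review category a proposal would likely fall into.

    Hypothetical helper for illustration; the real determination
    belongs to the IRB or another institutional designee.
    """
    # More than minimal risk, or a protected or nonautonomous
    # population, requires review at a meeting of the full board.
    if not minimal_risk or protected_population:
        return ReviewCategory.FULL_BOARD
    # Minimal risk plus an allowable exempt category.
    if in_exempt_category:
        return ReviewCategory.EXEMPT
    # Minimal risk plus an approved expedited category.
    if in_expedited_category:
        return ReviewCategory.EXPEDITED
    # Assumption: anything that fits no category goes to the full board.
    return ReviewCategory.FULL_BOARD
```

A repository manager would not run such a check in place of the IRB; the sketch simply makes explicit that risk level and subject population are considered before the category of research.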

When reviewing any proposal, regardless of the category into which it falls, IRB members are primarily concerned with the elements outlined in the Belmont Report. This means that, in general, they are interested in:

• whether the subject pool is appropriate for the research;

• whether the risk to the participants is appropriate given the potential direct or indirect benefits of the study;

• whether appropriate measures have been taken to mitigate any risks; and

• whether the participants will be adequately informed about the nature of their participation, any risks they will be exposed to, and their ability to withdraw from the study at any time without penalty.

IRB review provides an objective lens to ensure that the rights of individuals who agree to participate as research subjects are protected. If human subject research is performed without the review and approval of an IRB, it should be considered unethical and should not be disseminated in any fashion. To share such work through a repository would not only condone unethical practice, but would also open the institution up to governmental sanctions or other legal action.

It is worth noting that, while the discussion here has focused solely on the IRB, there are other comparable review boards that govern the ethical conduct of other types of research. Two of the most common at U.S. institutions are the Institutional Animal Care and Use Committee (IACUC), which oversees research involving animals, and the Institutional Biosafety Committee (IBC), which oversees recombinant DNA research (NIH, 2011).

IRBs: considerations for institutional repositories

Even though – or, perhaps, especially because – not every library will have a faculty or staff member on its institution’s ethical review board, it is important for the library to establish a working relationship with the IRB (or IACUC, IBC, etc.). Without a clear understanding of local expectations regarding the review/approval of research activities, the repository manager may inadvertently post materials online that were either (a) not approved or (b) not submitted for review even though such submission was required. Conversely, it is also important for the IRB committee chairs and administrators to be aware of the institutional repository, specifically with regard to its role as a means of disseminating research results.

During a meeting with review board chairs or administrators, there are several key issues that repository managers should raise.

What works are required to be submitted to the committee? Although federal regulations provide clear guidance as to which works fall under an IRB’s purview, some institutions may endow their review boards with expanded oversight roles. For example, though certain types of activities involving human subjects may not qualify as research (according to the federal definition), an institution may require that the IRB – not the investigator – make that determination. This would ensure that there was a formal record stating that the activity did not fall within the IRB’s jurisdiction and did not require oversight. Types of work that could fall into this category might include program evaluations or quality improvement studies intended only for the benefit of the organization under evaluation, case studies that do not present generalizable information, or an individual teacher’s assessment of his/her own pedagogical effectiveness.

An additional area to consider, and seek clarification on, is collaborative research. Technology and online capacity have made it much easier for researchers to collaborate with colleagues at other institutions, and collaborative or multisite research studies are quite common. Depending on the agreements in place between institutions (and between IRBs), review and approval from more than one IRB may be required. If a multi-institutional research work is submitted to the repository, the repository manager must confirm with the local IRB whether or not its review/approval was required.

How is the approval or registration of reviewed research/projects documented? After determining what work must be submitted to the IRB – either for review and approval or for a decision that formal oversight is not required – it is important for the repository manager to find out how to verify whether a work has been properly reviewed. For research that requires IRB review, it is likely that a proposal number or other IRB identifier will have been assigned. For activities that did not require formal IRB review, there may be a similar identifier, or there may simply be a record (a letter, email, etc.) from the IRB or other institutional official stating that the project did not require IRB review. (If the institution allows investigators to make their own determination as to whether or not IRB review is required, there obviously would be no documentation of such a decision.)

For work submitted to the repository, how is it recommended that IRB approval be confirmed? Graduate school theses, dissertations, and other culminating works that involve formal faculty oversight or advisory committees should not require any additional verification by the repository manager. The research informing these works would not have been conducted, and the final paper certainly not approved, without evidence that appropriate ethics board approval had been obtained.

However, it is likely that the repository will receive materials from students and faculty who have conducted research either outside the framework of a culminating project or as part of their own independent research activities. Because such works are usually not governed by the same level of policy and oversight as theses or dissertations, it is recommended that the repository manager confirm ethical review status at the time of deposit.

The simplest means of achieving this is to add fields to the repository submission form. For online submission forms, a dropdown menu with the following options would be appropriate:

— Ethical review board (IRB, IACUC, IBC) approval obtained

— Ethical review board approval not required

— Unsure if ethical review board approval was required

Following that menu, one or more text fields may be added to allow entry of the IRB name, a proposal number, or similar identifier; for example:

IRB/IACUC/IBC name (if applicable): _____________________________

IRB/IACUC/IBC proposal number (if applicable): ___________________

This information would not be displayed with the other metadata on the public side of the repository interface, but would be stored as administrative metadata on the back end of the system. When reviewing the submission, the repository manager would be able to quickly assess the status of the project and, if desired, follow up with the appropriate ethical review board.
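The submission-form fields described above might be modeled as follows. This is a minimal sketch under stated assumptions: the class, field, and method names are hypothetical and do not correspond to any particular repository platform, and the follow-up rule at the end is one possible local policy, not a requirement drawn from the regulations.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EthicsReviewStatus(Enum):
    # The three dropdown options suggested in the text.
    APPROVAL_OBTAINED = "Ethical review board (IRB, IACUC, IBC) approval obtained"
    NOT_REQUIRED = "Ethical review board approval not required"
    UNSURE = "Unsure if ethical review board approval was required"


@dataclass
class Submission:
    title: str
    author: str
    review_status: EthicsReviewStatus
    board_name: Optional[str] = None       # optional free-text field
    proposal_number: Optional[str] = None  # optional free-text field

    def public_metadata(self) -> dict:
        # Ethics review details are deliberately left out of the
        # record shown on the public side of the repository.
        return {"title": self.title, "author": self.author}

    def administrative_metadata(self) -> dict:
        # Stored on the back end for the repository manager only.
        return {
            "review_status": self.review_status.value,
            "board_name": self.board_name,
            "proposal_number": self.proposal_number,
        }

    def needs_follow_up(self) -> bool:
        # Assumed policy: follow up when the depositor is unsure, or
        # when approval is claimed without a board name and number.
        if self.review_status is EthicsReviewStatus.UNSURE:
            return True
        if self.review_status is EthicsReviewStatus.APPROVAL_OBTAINED:
            return not (self.board_name and self.proposal_number)
        return False
```

Keeping the review status in a separate administrative record, as sketched here, lets the repository manager screen submissions without exposing IRB identifiers to public users.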

The repository manager will not have the time, nor should have the responsibility, to check on every submission. Because of this, it is advisable to discuss with the IRB administrators which sources of submissions they anticipate will be most problematic. For example, the IRB may suspect that human subject research is taking place as part of certain undergraduate courses. But because IRBs do not have a proactive policing function, they are unable to seek out and investigate suspected review board shirkers. However, if research results from the college or department offering those courses are submitted to the repository, the manager will know that verification of status with the IRB is advisable prior to posting.

Does the IRB have an accurate understanding of the role of dissemination? As part of the proposal submission, most IRBs will ask investigators to identify the intended use/audience for the final study results. Depending on the nature of the project, this could range from internal use by an organization to publication in a scholarly journal. In asking this question, the IRB is trying to determine the nature of the project: is the primary intent to contribute to generalizable scientific evidence or is it simply to inform local/individual practice? Unfortunately, some IRBs may place too much weight on this question due to a common misconception: that a determining factor as to whether an activity constitutes human subject research (and therefore requires IRB review/approval) is whether the results of that activity are publicly disseminated. This misconception is important to clarify because it may lead some IRBs to believe that research-related activities that were not previously required to undergo IRB review – simply because there had been no prior intent to publicly disseminate the results – would now require review in order to be posted to the repository.

The misunderstanding about dissemination is related to the regulatory definition of “research” in 45 CFR 46, which references intent to “contribute to generalizable knowledge”. It is an inaccurate reading of the definition to assume that because something is disseminated, it is intended to contribute to generalizable knowledge. In fact, there is clear guidance from the U.S. Department of Health and Human Services to the contrary:

“[T]he intent to publish is an insufficient criterion for determining whether a quality improvement activity involves research. The regulatory definition under 45 CFR 46.102(d) is ‘Research means a systematic investigation, including research development, testing and evaluation, designed to develop or contribute to generalizable knowledge.’ Planning to publish an account of a quality improvement project does not necessarily mean that the project fits the definition of research; people seek to publish descriptions of nonresearch activities for a variety of reasons, if they believe others may be interested in learning about those activities. Conversely, a quality improvement project may involve research even if there is no intent to publish the results.”

(USDHHS, 2010)

It is important to emphasize to the IRB that dissemination of results from a project or research-related activity does not, on its own, create an intent to contribute to generalizable knowledge: in other words, sharing work through a repository does not change the nature of that work. If it was previously deemed “not research” (under the regulatory definition), its addition to the repository does not change that determination. And, for new or ongoing projects, an intent to share it through the repository does not necessarily mean that it requires review/approval by an ethics committee before such dissemination can occur.

For example, suppose a faculty member had collected student evaluative data on his use of a new teaching tool for the purpose of improving his own instruction. According to the 45 CFR 46 definition, this would not constitute research, because there was no intent to gather data that could be analyzed and applied by faculty members at other institutions. In other words, there was no intent to develop generalizable conclusions or recommendations. If the faculty member then creates a poster to share his experience with other faculty in his department, and subsequently uploads that poster to his institution’s repository, his data are now publicly disseminated. However, the simple act of dissemination does not change the nature of the data: they were gathered to inform local practice, and the faculty member makes no attempt on his poster to suggest that what he has done would be applicable in another setting.

In short, if a research-related activity (like a quality improvement project or program evaluation) would otherwise be deemed to be outside the IRB’s jurisdiction, the dissemination of that work through an institutional repository should not impact classification of the work with regard to whether it constitutes human subject research.

Will the IRB recommend/request additional descriptive metadata for certain types of submissions prior to posting? While dissemination of a work through an institutional repository does not magically convert nongeneralizable scholarship into generalizable knowledge, the IRB may still have a legitimate concern that readers who access the project will not make that distinction. Therefore, it may be appropriate to offer to place a “disclaimer” or similar descriptive metadata within the repository record for the results of research-related activities that did not undergo IRB review; for example:

This report is based on an internal evaluation of factors impacting employee satisfaction with Acme Corporation’s recognition and merit pay programs. Information in this report is not designed or intended to be generalized to other organizational settings.

or

This poster describes the creation and evaluation of a teaching tool in the author’s classroom. The results of the evaluation should not be considered generalizable to other settings or student populations.

While such disclaimers do not change the project or results in any way, they do alert the reader to the author’s intent and to the limited applicability of the study. As with a case study, the information is shared to inform others of local practice, but the data were not gathered with the intent of creating a model or generating statistically significant results that could be applied to a broader population. Such a disclaimer should allay any IRB concerns that a reader would perceive that, because the work is being broadly disseminated, it is also intended to be broadly used/generalized from.

How can the library partner with the IRB to provide education for students and faculty? This final question is intended to open up new collaborative opportunities for the library and the repository manager. One of the greatest challenges for IRBs is educating all students and faculty about their responsibilities as researchers – particularly those students and faculty who don’t think of themselves as such, but who do conduct research. Anything that libraries can do to aid in this educational effort will be welcome.

One of the simplest ways to support the IRB’s efforts is to incorporate information about human subject research, and the responsible conduct of research, in presentations and marketing collateral about the repository. By presenting the repository as one stage in the research cycle (or two stages, if both discovery and dissemination are considered), it will become natural to include a discussion about how the work that is added to the repository is generated. If time, or space on the page, is limited, even including simple one-line items in repository submission checklists or agreements is helpful; for example:

If my research involved gathering data from, or about, humans, did I receive approval from the IRB?

In a similar way, the IRB can help educate students and faculty about the services provided by the institutional repository. Importantly, the context in which to provide this information is an integral component of the research process: the informed consent process. Federal and disciplinary guidelines require that participants in research be fully informed about the nature of their participation. This includes being provided with information about how the results of research will be used, and often also being given the opportunity to examine those results. If research results will be shared through an institutional repository (as is most often the case for theses/dissertations), participants should be provided with that information, as well as information that will allow them to access those results when they are made available.

Whether through an online FAQ or boilerplate language in study proposal templates, the IRB can remind investigators of two things. First, that the repository exists as a means of sharing their research results. Second, that if investigators choose to (or are required to) use the repository to disseminate their work, they should inform their research subjects of that decision. For example, at the University of Oxford, the following guidance is provided:

“Where the research will be written up as a student’s thesis that will be published online, the participant information and consent form should explain how the personal data included in that thesis will be published and stored. The following is sample text for inclusion in participant information.

The University of Oxford is committed to the dissemination of its research for the benefit of society and the economy and, in support of this commitment, has established an online archive of research materials. This archive includes digital copies of student theses successfully submitted as part of a University of Oxford postgraduate degree programme. Holding the archive online gives easy access for researchers to the full text of freely available theses, thereby increasing the likely impact and use of that research.

If you agree to participate in this project, the research will be written up as a thesis. On successful submission of the thesis, it will be deposited both in print and online in the University archives, to facilitate its use in future research. The thesis will be published with [insert here information on the type of access and what that means, e.g. open or restricted access (or closed or embargoed access), open access meaning available to every internet user].”

(University of Oxford, 2010)

From the IRB’s perspective, sharing research results, reports, or papers through an institutional repository should be seen as an overwhelmingly positive action. The majority of research is published in journals that research subjects are not aware of, or to which they would be unable to afford access. Disseminating results through a repository (e.g., as an article postprint or as an unpublished research report) provides greater access to those who contributed to the research as participants. This is consistent with the spirit of the Belmont Report, and with the intent of informed consent to be an ongoing dialogue, not a one-time signature.

Furthermore, positioning the repository as the central institutional platform for sharing research results should help alleviate concerns about results being posted, unknown and unmonitored, on individuals’ websites. By communicating these benefits to the IRB, and proactively addressing the potential concerns discussed here, the library can create a strong partnership in support of ethical research and open knowledge.

Box 3.2   Case study: original research

A U.S. faculty member at a medical college entered into a collaboration with a professional peer at a university in Iran. As part of this collaboration, the U.S. faculty member helped his Iranian colleague by conducting literature reviews and helping write up the results of human subject research conducted at the Iranian university. Both men were interested in sharing the results of the research and planned to submit it to peer-reviewed journals. However, they were also interested in sharing the results informally by posting the original manuscripts in the U.S. faculty member’s institutional repository.

When the manuscripts were presented to the institutional repository manager, it was noted that there was no mention in the study description as to whether or not the research had been approved by an appropriate ethical review board in Iran. The research appeared to constitute minimal risk for the subjects. However, the repository manager informed the authors that the work could not be posted without some confirmation that appropriate ethical review had taken place. The Iranian researcher informed both his colleague and the repository manager that approval had been received by an ethics committee for one study but that, due to the nature of the other studies (observational/noninvasive), ethics review was not required by his institution. However, he noted that for all of his studies (whether or not ethics review was required), he had obtained informed consent from the subjects and had followed the principles of the Declaration of Helsinki. The repository manager determined that, due to the nature of the research, and the assurances that local protocols for the protection of human subjects had been observed, it was appropriate to post the manuscripts.1 (analysis of case provided in chapter endnotes, p. 83)


1 Analysis: This case presents a more difficult dilemma than is usually encountered when deciding whether or not original, unpublished human subject research should be posted in a repository. When the research is conducted locally – or even domestically, within the U.S. – it is fairly simple to determine whether appropriate ethical review has taken place. However, although there are global standards for the ethical conduct of research (such as the Declaration of Helsinki), the implementation of human subject protections varies from country to country. It can be especially difficult to determine what local protocols are if governmental or institutional websites are in a language other than English – and if there is a further language barrier with the international researcher(s). In this case, the repository manager considered three essential questions: (a) what is the Iranian researcher’s local protocol for the protection of research subjects, (b) was that protocol followed, and (c) what, if any, responsibilities did the local faculty member have related to ethical review? According to the Iranian researcher, the type of research conducted did not require review by the local ethics committee. However, the researcher indicated that he had followed an internationally accepted standard for conducting the research – the World Medical Association’s Declaration of Helsinki: Ethical Principles for Medical Research Involving Human Subjects. Finally, based on the local faculty member’s involvement, he was not considered by the local IRB to be engaged in the conduct of human subject research himself – therefore, his institution was not engaged, and there was no requirement for the local IRB to provide approval for the research or his role in it. Based on these findings, the determination was made to post the manuscripts. At least two of the manuscripts were later published in peer-reviewed journals, at which point the repository records were updated to indicate the location of the final versions of record.

Research data: special considerations

Assuming that research has been conducted ethically, there should be no concerns about posting articles, summary reports, and other aggregated presentations of research results in an institutional repository. However, the treatment and disposition of datasets derived from human subject research is of particular concern to ethical review boards and should receive special consideration from repository managers.

Fortunately, data-sharing is not a new practice – as examples, see the Inter-University Consortium for Political and Social Research (ICPSR) repository in the U.S.A. or the U.K. Data Archive – and existing guidelines and best practices provide a clear path for institutional repositories in addressing data deposits. When developing local policies and practices, there are three primary areas to consider: informed consent, confidentiality, and access (Bishop, 2009; Carusi and Jirotka, 2009; Van den Eynden et al., 2011). It should be noted that the approach to these three areas may be influenced by relevant laws, such as the U.K. Data Protection Act, the forthcoming European General Data Protection Regulation, or – in the case of U.S. medical records – the Health Insurance Portability and Accountability Act (HIPAA, which will be discussed in more depth in Chapter 4). Regardless of the local legal framework, though, the ethical principles of consent, confidentiality, and access will be relevant and should be largely consistent with the application of data privacy laws.

Informed consent

The issue of informed consent will be addressed long before any data are submitted for inclusion in the repository. However, it is important for the repository manager to (a) be able to understand what consent requirements should be met prior to depositing data and (b) communicate to researchers and ethical review boards how data can be shared through the repository so that the consent process will be as informed as possible for research subjects. If the academic library in question is also involved in helping researchers create data management plans, that presents an excellent opportunity for the latter conversation to take place.

At its core, the informed consent process in research is intended to recognize the personal agency of each research subject, and to respect the subject’s right to make an informed decision about whether or not to participate in a given study. Although discussion of the “informed consent form” is common, consent is not meant to be simply a one-time signature, but an ongoing process wherein the subject is continually informed as to the nature of his/her participation and is able to withdraw that participation at any point in the process.

An important component of informed consent is providing the subject with information about how the data collected from the subject will be used, and how the integrity of the data will be protected. It is generally recommended that researchers should use consent language that leaves open the possibility that collected data may be shared with other researchers (ICPSR, 2012; Van den Eynden et al., 2011) and, more specifically, should: “inform participants how research data will be stored, preserved and used in the long-term; inform participants how confidentiality will be maintained, e.g. by anonymising data; [and] obtain informed consent, either written or verbal, for data sharing” (Van den Eynden et al., 2011, p. 23). For example, the following language could be used on a consent form or associated research information sheet for data that will be shared through an open-access repository:

Your data, along with data collected from other participants, will be stored in a publicly accessible online archive so that other researchers and scholars who could benefit from using the data will be able to do so. Before your data are placed online, all individually identifiable information (e.g., your name, birth date, etc.) will be removed from the data file. We will also remove any pieces of data that someone could potentially use in combination to identify you. If you would like to see the data file with your data included, you may visit the DataBank repository at this address: [URL here]. The data will be preserved in the DataBank indefinitely by the University of Data, although permanent preservation cannot be absolutely guaranteed due to the nature of digital storage media.

Although such a statement may seem clear and straightforward to researchers or library professionals, there is a legitimate question as to exactly how “informed” research subjects can really be about the nature of online data archiving (Bishop, 2009; Carusi and Jirotka, 2009). One means of providing better information may be to provide subjects with an opportunity to see what their final archived data might look like online, or to explain how their data could be reused in the future (Carusi and Jirotka, 2009). Even if more detailed information is provided, there can also potentially be cultural or language barriers to understanding the exact nature of a data archive. To address this, the U.K. Economic and Social Research Council provides guidance about using appropriate language with participants:

“Fieldworkers use locally relevant language, images and concepts when explaining complex notions. For example, when explaining archiving, they reassure participants about anonymity, and that identifying features of places, people and organizations are disguised in preparing data for archiving. For example, in Peru, the term ‘un archivo’ is understood, since almost all villages and communities own archives with documentation regarding the village, which are for public consultation. In Vietnam, researchers used the word for ‘storage’ – pack and store away – pointing to a cupboard or box.”

(ESRC, 2010, p. 50)

But perhaps a more complex issue than understanding the concept of an online archive is ensuring that – to the extent possible – research subjects are informed about potential reuses of their data. As Bishop (2009, p. 262) notes, it is “logically impossible” for researchers to be able to predict how data may be used by others after it is deposited in a repository. Although it has been proposed that mechanisms be put in place to allow research subjects to prospectively track use of the dataset in which they are included (Carusi and Jirotka, 2009), this may not actually be feasible – particularly for an open-access repository. And while it may be possible for owners of restricted access datasets to set parameters for reuse by other researchers, it is more reasonable (and realistic) for researchers to inform participants that they are unable to predict future uses of data stored in a repository.

Confidentiality

As noted above, part of the information provided to research subjects during the consent process should describe the steps that will be taken by the researcher to ensure that the subjects’ identifiable information will remain confidential. In the majority of cases where data are deposited in an open repository, this will mean that all data have been anonymized (or “de-identified”) prior to deposit (unless subjects have explicitly agreed to allow deposit of identifiable information). Anonymization (rendering individual subjects unidentifiable from their data) requires that all direct identifiers be removed from the dataset – and, in some cases, may also mean that certain indirect identifiers should be removed as well (Van den Eynden et al., 2011). Direct identifiers are those that are unique to an individual subject, while indirect identifiers could reasonably be used in combination (or with other information) to identify a unique individual (see Box 3.3 for examples).

Box 3.3   Identifiers

Direct Identifiers

• Names

• Addresses, including ZIP codes

• Telephone numbers, including area codes

• Social Security numbers

• Other linkable numbers such as driver license numbers, certification numbers, etc.

Indirect identifiers

• Detailed geographic information (e.g., state, county, or census tract of residence)

• Organizations (to which the respondent belongs)

• Educational institutions (from which the respondent graduated and year of graduation)

• Exact occupations

• Place where respondent grew up

• Exact dates of events (birth, death, marriage, divorce)

• Detailed income

• Offices or posts held by respondent

Examples from ICPSR Data Management: Confidentiality (ICPSR, n.d.).

Anonymization is usually a straightforward process with quantitative data, and may be quite easy if the researcher has used a separate master key to link subject identities with information, and included no identifiers in the actual dataset. However, anonymization can become much more complex when dealing with qualitative data such as narrative transcripts or audio/video materials (Corti et al., 2000).

Ideally, regardless of the data type, the researcher should be responsible for anonymizing his or her data prior to deposit. But it is up to an individual institution, library, or repository to decide if it wants to provide active support for researchers in preparing data for sharing. ICPSR, for example, reviews all data submitted for inclusion in its repository and works with the submitting researcher to make sure that the data do not present a risk of breaching research subject confidentiality. Staff may even recode data to lessen risk, which can entail converting individual data points to ranges (e.g., if a subject’s age is 33, that data point could be converted to the range 30–39) (ICPSR, 2011a). Though this level of review may extend beyond what an academic library is able to provide, this presents an opportunity to partner with a research office or ethics review committee to establish a process for reviewing datasets prior to deposit in an institutional repository. Involving disciplinary faculty on a data review committee will not only bring necessary expertise to evaluating disparate data types, but will also create an opportunity to familiarize faculty with the tools and services available through the repository.
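As a rough illustration of the recoding described above, the following Python sketch removes direct identifiers from a single record and converts an exact age to a ten-year range. The field names (name, address, phone, ssn, age) and both helper functions are hypothetical, chosen only for this example; they do not describe ICPSR’s actual tools or procedures, and a real review would also need to weigh the indirect identifiers listed in Box 3.3.

```python
# Hypothetical field names; a real dataset would need its own list,
# agreed with the researcher and any data review committee.
DIRECT_IDENTIFIERS = {"name", "address", "phone", "ssn"}


def age_to_range(age, width=10):
    """Convert an exact age to a range label, e.g. 33 becomes '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def anonymize_record(record):
    """Return a copy of the record with direct identifiers removed
    and the exact age replaced by an age range."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "age" in cleaned:
        cleaned["age_range"] = age_to_range(cleaned.pop("age"))
    return cleaned


subject = {"name": "Jane Doe", "ssn": "000-00-0000", "age": 33, "response": "agree"}
print(anonymize_record(subject))
```

A sketch of this kind handles only the mechanical part of the task. Deciding which combinations of remaining fields could identify a subject still calls for human judgment.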

Access

For some datasets, particularly those containing qualitative data, it may be impossible to adequately anonymize data and protect subject identities without losing valuable meaning (Carusi and Jirotka, 2009). In these cases, it will likely be more appropriate to consider another option for preserving confidentiality: access restrictions. When access (and use) restrictions are put into place, identifiable data are shared with people outside the original research team, but there is control over whom the data are shared with, and what they are allowed to do with those data.

Depending on the local context, the types of data most commonly submitted to the repository, and individual researchers’ needs, different types of access restriction are possible. The restrictions may be administered by the repository manager, or by the data owner (Corti et al., 2000), and range from a minimal requirement of accepting a terms of use/license agreement to a more comprehensive user application process that examines the qualifications of the user and the intended use of the data. As an example of the former, the U.K. Data Archive End User License Agreement is a legally binding document that all users must agree to prior to gaining access to data. Among other stipulations, it requires users to agree to:

“[P]reserve at all times the confidentiality of information pertaining to individuals and/or households in the data collections where the information is not in the public domain. Not to use the data to attempt to obtain or derive information relating specifically to an identifiable individual or household, nor to claim to have obtained or derived such information. In addition, to preserve the confidentiality of information about, or supplied by, organisations recorded in the data collections. This includes the use or attempt to use the data collections to compromise or otherwise infringe the confidentiality of individuals, households or organisations.”

(ESDS, 2008)

A license/terms of use agreement such as this, or that used by ICPSR for its restricted use collections, may be used on its own (as the sole barrier to access), or in conjunction with other restrictions. For example, if evidence of qualifications or researcher credentials is first required, or if potential users must have their proposed use of the data vetted by the data owner (Corti et al., 2000), a data use agreement may ultimately be used as the final condition of access.
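The layering of conditions described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a description of how ICPSR, the U.K. Data Archive, or any repository platform implements access control; the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DatasetPolicy:
    """Hypothetical per-dataset access settings."""
    requires_credentials: bool = False   # evidence of qualifications needed first
    requires_use_approval: bool = False  # proposed use vetted by the data owner


@dataclass
class AccessRequest:
    """Hypothetical record of what a prospective user has completed."""
    accepted_agreement: bool             # agreed to the data use agreement
    credentials_verified: bool = False
    use_approved: bool = False


def may_access(request, policy):
    """Apply any additional restrictions, then the agreement as the final condition."""
    if policy.requires_credentials and not request.credentials_verified:
        return False
    if policy.requires_use_approval and not request.use_approved:
        return False
    return request.accepted_agreement


# Agreement as the sole barrier to access
print(may_access(AccessRequest(accepted_agreement=True), DatasetPolicy()))
# Agreement accepted, but the proposed use has not yet been vetted
print(may_access(AccessRequest(accepted_agreement=True),
                 DatasetPolicy(requires_use_approval=True)))
```

Keeping the policy with the dataset rather than with the user would allow a repository to hold open and restricted collections side by side, although whether that is practical depends on local staffing and technology.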

If a repository is considering using data use agreements to demarcate proper use of restricted datasets, it may also be prudent to consider requiring data users to provide a “data protection plan”, much like that required by ICPSR. Researchers who wish to access restricted use ICPSR data must write a protection plan that includes information on where they will store the data, whether it will exist in a local or networked environment, the security measures they will take, how data printouts will be handled, and how data will be shared with other members of the research team, among other items (ICPSR, 2011b). Similarly, the Social Research Association’s guidelines for data security, though intended to address the requirements of the U.K. Data Protection Act (1998), include questions that would be reasonable to include in any required data protection plan:

• Are the automated systems protected by a level of security appropriate to the data held?

• Are technical measures in place to restrict access to systems holding personal data?

• Are technical measures in place to secure data during transit (e.g., to subcontractors and interviewers)?

• How is the data stored by your sub-contractors and interviewers – is it adequate and appropriate?

• Are the premises on which the data is held secure?

• Is access to the premises restricted?

• If the data is held on non-automated systems e.g. paper files, discs, microfilm, and microfiche, is access still restricted or secure?

• Are copies of printouts, obsolete back-up tapes etc. disposed [of] securely?

• Is obsolete hardware and software from which data could be recovered disposed of securely?

• Is there an auditable data retention and destruction policy?

• Are staff trained and made aware of their responsibilities to safeguard the personal data?

(SRA, 2005, pp. 50–51)

Ultimately, each library and repository will need to determine both its technological and its staffing capacity to support varying levels of access to data collections held in the repository. It may be that the requirements of securing and administering restricted access data are too significant, in which case only data that could ethically (and legally) be shared publicly would be included in the repository.

Regardless of the types of data that are archived in the repository, the repository manager will need to develop policies and procedures to address the issues of consent and confidentiality. Methods for confirming appropriate consent could range from simply receiving a researcher’s assurance via a deposit agreement to requiring deposit of the consent template used with research subjects as a supplemental file. Receipt of the consent language would allow verification that subjects had been properly informed of the location and conditions of their data’s deposit. If subjects have been promised anonymization of data prior to deposit, the repository (and institution) must decide if researcher assurances are sufficient or if additional review (such as by the faculty review committee suggested earlier) is required.

Once the repository has determined what its capacity is for handling research data, and what its requirements will be for researchers who wish to deposit, the repository manager should contact the local ethics review committee and provide this information. As Carusi and Jirotka (2009) observe, there is an opportunity, and need, to educate ethics boards about the realities of data-sharing. This, in turn, enables the committee to provide better guidance to researchers and to identify whether researchers’ plans – or promises to research subjects regarding disposition of their data – are realistic and appropriate.

Article retractions and corrections

For research data and other materials, most of the considerations for repository managers deal with whether or not an item should be deposited in the repository in the first place. However, a unique area of concern is material that is discovered to be erroneous (or unethical) after it has been posted. If, in the case of student work or other unpublished materials, the only “home” for the work is the local repository, it is likely that the repository manager will be informed of the update and can respond appropriately. (What is appropriate, of course, will depend on the repository’s collection management policy, which will be discussed in Chapter 5.) But for work that has been published elsewhere, particularly for journal articles, there is less certainty that the repository manager will learn of the error, correction, or misconduct.

When a significant error is found in a published journal article, it is possible that either a correction or retraction will be issued. According to the Committee on Publication Ethics (COPE), a retraction:

“[I]s a mechanism for correcting the literature and alerting readers to publications that contain such seriously flawed or erroneous data that their findings and conclusions cannot be relied upon. Unreliable data may result from honest error or from research misconduct.

Retractions are also used to alert readers to cases of redundant publication (i.e. when authors present the same data in several publications), plagiarism, and failure to disclose a major competing interest likely to influence interpretations or recommendations.”

(COPE, 2009, p. 2)

As the rate of retractions has grown over the past decade (Van Noorden, 2011), the issue of how to handle retracted articles has received a great deal of scrutiny. Sometimes, articles that are found (or suspected) to contain erroneous information are slow to be (or are never) retracted due to legal concerns of the journal that originally published the article (Couzin and Unger, 2006). And even when articles are retracted, citations to the articles still persist (Budd et al., 2011). The treatment of retractions – from how journals decide to retract an article, to how they notify readers of retractions, to the amount of information they provide about the reason for the retraction – is likely a major contributor to the persistence of retracted work. But perhaps the single most challenging obstacle to effectively retracting (or correcting) a published article is the proliferation of versions and instances of that article.

Whereas previously librarians were able to stamp a print article with a retraction notice (Curry, 2005), and individual subscribers could hardly miss an editor’s note when the next issue arrived, digital storage and access make it virtually impossible to properly label each existing copy of an article. The version of record is on the publisher’s website, full-text database aggregators have one copy, individual researchers have stored other copies on hard drives, and still other copies may be posted on websites or file-sharing services. And, unless appropriate measures are taken, institutional repositories could become yet another contributing factor to this inherent difficulty in correcting the scholarly record.

Fortunately, the accepted best practices of the repository community already address the need to appropriately treat different versions of scholarly articles – and dealing with corrected or retracted articles should be little different. For example, when the postprint of a published article is added to a repository, it is considered best practice to add a complete citation (including a DOI, when available) to the published version of the article on the publisher’s website. If nothing else were done, this alone would exist as a pointer to the version of record which, if retracted/corrected, would hopefully bear a visible notice of that fact. However, when made aware of a retraction/correction, it is most responsible for repository managers to also add information directly to the citation on the postprint to indicate that the final published article has been retracted or corrected. If the repository has adopted a standard metadata vocabulary for designating journal article versions (e.g., the NISO Journal Article Version recommendations; NISO/ALPSP, 2008), it could use that vocabulary within the citation. For example:

This is the accepted author manuscript for an article published in its final form in the Journal of Geocentric Studies:

Urban, P. (1993) Explorations of the center of the universe: the Sun’s revolution around the Earth [Corrected Version of Record]. Journal of Geocentric Studies, 5(2): 24–47. http://dx.doi.org/10.1000/jgs.1993.10012

In addition to updating the citation on the postprint itself, the postprint metadata can also be revised. Adding a “[Retracted]” or “[Corrected]” tag to the beginning of the title metadata field would provide an easily visible marker:

Title: [Retracted] Explorations of the center of the universe: the Sun’s revolution around the Earth.

The benefit of amending the title metadata is that the retraction or correction tag will be immediately visible in search engine results. However, if the repository manager is concerned that the revised citation or postprint metadata may still be missed by readers, the most visible form of notification is to add a large watermark directly to the postprint PDF that spans each page.
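For repositories that update metadata in batches, the title tagging described above could be scripted. The Python sketch below is one possible approach under stated assumptions: the function name and the two permitted tags are hypothetical and are not drawn from any repository platform or metadata standard.

```python
# The two status tags suggested in the text.
STATUS_TAGS = ("[Retracted]", "[Corrected]")


def tag_title(title, status):
    """Prefix a title with a status tag, e.g. '[Retracted] ...'.

    Tagging is skipped if the title already begins with the same tag,
    so running the update twice does not duplicate it.
    """
    tag = f"[{status.capitalize()}]"
    if tag not in STATUS_TAGS:
        raise ValueError(f"Unknown status: {status!r}")
    if title.startswith(tag):
        return title
    return f"{tag} {title}"


title = "Explorations of the center of the universe: the Sun's revolution around the Earth."
print(tag_title(title, "retracted"))
```

Because the check for an existing tag makes the operation safe to repeat, the same routine could be run each time the repository manager learns of a new retraction or correction.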

The case for persistence

Regardless of the metadata or watermarks added to a postprint (or published version) in a repository, questions may remain about what level of access it is appropriate to provide for retracted articles. While the rationale for maintaining a metadata record in order to avoid broken hyperlinks and to point to an article’s version of record is understandable, it is reasonable to question whether knowingly maintaining full-text access to the postprint of a retracted article is ethically responsible. In some cases, particularly where the content of the article could impact medical care (Couzin and Unger, 2006), it may not be. However, it has been suggested that increased visibility of journal articles is a key factor in improving and correcting the scholarly record (Cokol et al., 2007). If this is the case, by providing open access (and better discovery) to a scholarly article that might otherwise be locked behind a paywall, the repository is actually providing a service to the greater community by bringing greater visibility to the fact that the article has been retracted. Furthermore, maintaining access to a retracted article is fully in keeping with the COPE guidelines for journals: “Retracted articles should not be removed from printed copies of the journal (e.g. in libraries) nor from electronic archives but their retracted status should be indicated as clearly as possible” (COPE, 2009, p. 2).

Finally, while retracted articles may be clearly erroneous (particularly in cases of scientific misconduct), the possibility always exists that there is value in articles that are initially deemed unfit. Curry (2005) expresses this well:

“Should we really make all unreliable research disappear? If we do remove it from our libraries, how can anyone know what the fraudulent research said? Many times, research from the past that was alleged to be untrue, either because it was claimed to be slanderous or fabricated, fudged or fraudulent, turned out to have a grain of truth within it, a grain that grew through further research into solid results.”

(Curry, 2005, pp. 33–4)

Questions about the impact on intellectual freedom aside, the matter of how a repository chooses to handle retracted journal articles is largely academic unless the repository manager is made aware of the retraction. Absent such notification, the best preemptive approach for repositories to take is to create clear links between article versions posted in the repository and the version of record on the publishers’ websites – and along with those links, an explicit reminder for readers to take some responsibility for ensuring the accuracy of their sources: “Readers should refer to the published version of record for citation purposes and to check for any post-publication corrections.” Because while the repository may provide responsible curation and access, it is the critical analysis of the individuals who make up the scholarly community that will ultimately provide the best assurance that ethical standards are observed, policed, and corrected.


2 The published version of this article has been retracted by the publisher.
