Nursing Research
Methods and Critical Appraisal for Evidence-Based Practice
NINTH EDITION
Geri LoBiondo-Wood, PhD, RN, FAAN, Professor and Coordinator, PhD in Nursing Program, University of Texas Health Science Center at Houston, School of Nursing, Houston, Texas
Judith Haber, PhD, RN, FAAN
The Ursula Springer Leadership Professor in Nursing, New York University, Rory Meyers College of Nursing, New York, New York
Table of Contents
Cover image
Title page
Copyright
About the authors
Contributors
Reviewers
To the faculty
To the student
Acknowledgments
I. Overview of Research and Evidence-Based Practice
Introduction
References
1. Integrating research, evidence-based practice, and quality improvement processes
References
2. Research questions, hypotheses, and clinical questions
References
3. Gathering and appraising the literature
References
4. Theoretical frameworks for research
References
II. Processes and Evidence Related to Qualitative Research
Introduction
References
5. Introduction to qualitative research
References
6. Qualitative approaches to research
References
7. Appraising qualitative research
Critique of a qualitative research study
References
III. Processes and Evidence Related to Quantitative Research
Introduction
References
8. Introduction to quantitative research
References
9. Experimental and quasi-experimental designs
References
10. Nonexperimental designs
References
11. Systematic reviews and clinical practice guidelines
References
12. Sampling
References
13. Legal and ethical issues
References
14. Data collection methods
References
15. Reliability and validity
References
16. Data analysis: Descriptive and inferential statistics
References
17. Understanding research findings
References
18. Appraising quantitative research
Critique of a quantitative research study
Critique of a quantitative research study
References
IV. Application of Research: Evidence-Based Practice
Introduction
References
19. Strategies and tools for developing an evidence-based practice
References
20. Developing an evidence-based practice
References
21. Quality improvement
References
Example of a randomized clinical trial (Nyamathi et al., 2015): Nursing case management, peer coaching, and hepatitis A and B vaccine completion among homeless men recently released on parole
Example of a longitudinal/cohort study (Hawthorne et al., 2016): Parent spirituality, grief, and mental health at 1 and 3 months after their infant's/child's death in an intensive care unit
Example of a qualitative study (van Dijk et al., 2015): Postoperative patients' perspectives on rating pain: A qualitative study
Example of a correlational study (Turner et al., 2016): Psychological functioning, post-traumatic growth, and coping in parents and siblings of adolescent cancer survivors
Example of a systematic review/meta-analysis (Al-Mallah et al., 2015): The impact of nurse-led clinics on the mortality and morbidity of patients with cardiovascular diseases
Glossary
Index
Special features
Copyright
3251 Riverport Lane, St. Louis, Missouri 63043
NURSING RESEARCH: METHODS AND CRITICAL APPRAISAL FOR EVIDENCE-BASED PRACTICE, NINTH EDITION
ISBN: 978-0-323-43131-6
Copyright © 2018 by Elsevier, Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies, and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing.
As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.
With respect to any drug or pharmaceutical products identified, readers are advised to check the most current information provided (i) on procedures featured or (ii) by the manufacturer of each product to be administered, to verify the recommended dose or formula, the method and duration of administration, and contraindications. It is the responsibility of practitioners, relying on their own experience and knowledge of their patients, to make diagnoses, to determine dosages and the best treatment for each individual patient, and to take all appropriate safety precautions.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.
Previous editions copyrighted 2014, 2010, 2006, 2002, 1998, 1994, 1990, 1986.
Library of Congress Cataloging-in-Publication Data
Names: LoBiondo-Wood, Geri, editor. | Haber, Judith, editor.
Title: Nursing research : methods and critical appraisal for evidence-based practice / [edited by] Geri LoBiondo-Wood, Judith Haber.
Other titles: Nursing research (LoBiondo-Wood)
Description: 9th edition. | St. Louis, Missouri : Elsevier, [2018] | Includes bibliographical references and index.
Identifiers: LCCN 2017008727 | ISBN 9780323431316 (pbk. : alk. paper)
Subjects: | MESH: Nursing Research—methods | Research Design | Evidence-Based Nursing—methods
Classification: LCC RT81.5 | NLM WY 20.5 | DDC 610.73072—dc23
LC record available at https://lccn.loc.gov/2017008727
Executive Content Strategist: Lee Henderson
Content Development Manager: Lisa Newton
Content Development Specialist: Melissa Rawe
Publishing Services Manager: Jeff Patterson
Book Production Specialist: Carol O’Connell
Design Direction: Renee Duenow
Printed in China
Last digit is the print number: 9 8 7 6 5 4 3 2 1
About the authors
Geri LoBiondo-Wood, PhD, RN, FAAN, is Professor and Coordinator of the PhD in Nursing Program at the University of Texas Health Science Center at Houston, School of Nursing (UTHSC-Houston) and former Director of Research and Evidence-Based Practice Planning and Development at the MD Anderson Cancer Center, Houston, Texas. She received her Diploma in Nursing at St. Mary’s Hospital School of Nursing in Rochester, New York; Bachelor’s and Master’s degrees from the University of Rochester; and a PhD in Nursing Theory and Research from New York University. Dr. LoBiondo-Wood teaches research and evidence-based practice principles to undergraduate, graduate, and doctoral students. At MD Anderson Cancer Center, she developed and implemented the Evidence-Based Resource Unit Nurse (EB-RUN) Program. She has extensive national and international experience guiding nurses and other health care professionals in the development and utilization of research. Dr. LoBiondo-Wood is an Editorial Board member of Progress in Transplantation and a reviewer for Nursing Research, Oncology Nursing Forum, and Oncology Nursing. Her research and publications focus on chronic
illness and oncology nursing. Dr. LoBiondo-Wood has received funding from the Robert Wood Johnson Foundation Future of Nursing Scholars program for the past several years to fund full-time doctoral students.
Dr. LoBiondo-Wood has been active locally and nationally in many professional organizations, including the Oncology Nursing Society, Southern Nursing Research Society, the Midwest Nursing Research Society, and the North American Transplant Coordinators Organization. She has received local and national awards for teaching and contributions to nursing. In 1997, she received the Distinguished Alumnus Award from New York University, Division of Nursing Alumni Association. In 2001 she was inducted as a Fellow of the American Academy of Nursing and in 2007 as a Fellow of the University of Texas Academy of Health Science Education. In 2012 she was appointed as a Distinguished Teaching Professor of the University of Texas System and in 2015 received the John McGovern Outstanding Teacher Award from the University of Texas Health Science Center at Houston School of Nursing.
Judith Haber, PhD, RN, FAAN, is the Ursula Springer Leadership Professor in Nursing at the Rory Meyers College of Nursing at New York University. She received her undergraduate nursing education at Adelphi University in New York, and she holds a Master’s degree in Adult Psychiatric–Mental Health Nursing and a PhD in Nursing Theory and Research from New York University. Dr. Haber is internationally recognized as a clinician and educator in psychiatric–mental health nursing. She was the editor of the award-winning classic textbook, Comprehensive Psychiatric Nursing, published for eight editions and translated into five languages. She has extensive clinical experience in psychiatric nursing, having been an advanced practice psychiatric nurse in private practice for over
30 years, specializing in treatment of families coping with the psychosocial impact of acute and chronic illness. Her NIH-funded program of research addressed physical and psychosocial adjustment to illness, focusing specifically on women with breast cancer and their partners and, more recently, breast cancer survivorship and lymphedema prevention and risk reduction. Dr. Haber is also committed to an interprofessional program of clinical scholarship related to interprofessional education and improving oral-systemic health outcomes and is the Executive Director of a national nursing oral health initiative, the Oral Health Nursing Education and Practice (OHNEP) program, funded by the DentaQuest and Washington Dental Service Foundations.
Dr. Haber is the recipient of numerous awards, including the 1995 and 2005 APNA Psychiatric Nurse of the Year Award, the 2005 APNA Outstanding Research Award, and the 1998 ANA Hildegarde Peplau Award. She received the 2007 NYU Distinguished Alumnae Award, the 2011 Distinguished Teaching Award, and the 2014 NYU Meritorious Service Award. In 2015, Dr. Haber received the Sigma Theta Tau International Marie Hippensteel Lingeman Award for Excellence in Nursing Practice. Dr. Haber is a Fellow in the American Academy of Nursing and the New York Academy of Medicine. Dr. Haber has consulted, presented, and published widely on evidence-based practice, interprofessional education and practice, as well as oral-systemic health issues.
Contributors
Terri Armstrong, PhD, ANP-BC, FAANP, Senior Investigator, Neuro-oncology Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
Julie Barroso, PhD, ANP, RN, FAAN, Professor and Department Chair, Medical University of South Carolina, Charleston, South Carolina
Carol Bova, PhD, RN, ANP, Professor of Nursing and Medicine, Graduate School of Nursing, University of Massachusetts, Worcester, Massachusetts
Dona Rinaldi Carpenter, EdD, RN, Professor and Chair, University of Scranton, Department of Nursing, Scranton, Pennsylvania
Maja Djukic, PhD, RN, Assistant Professor, Rory Meyers College of Nursing, New York University, New York, New York
Mei R. Fu, PhD, RN, FAAN, Associate Professor, Rory Meyers College of Nursing, New York University, New York, New York
Mattia J. Gilmartin, PhD, RN, Senior Research Scientist, Executive Director, NICHE Program, Rory Meyers College of Nursing, New York University, New York, New York
Deborah J. Jones, PhD, MS, RN, Margaret A. Barnett/PARTNERS Professorship, Associate Dean for Professional Development and Faculty Affairs, Associate Professor, University of Texas Health Science Center at Houston, School of Nursing, Houston, Texas
Carl Kirton, DNP, RN, MBA, Chief Nursing Officer, University Hospital, Newark, New Jersey; Adjunct Faculty, Rory Meyers College of Nursing, New York University, New York, New York
Barbara Krainovich-Miller, EdD, RN, PMHCNS-BC, ANEF, FAAN, Professor, Rory Meyers College of Nursing, New York University, New York, New York
Elaine Larson, PhD, RN, FAAN, CIC, Anna C. Maxwell Professor of Nursing Research, Associate Dean for Research, Columbia University School of Nursing, New York, New York
Melanie McEwen, PhD, RN, CNE, ANEF, Professor, University of Texas Health Science Center at Houston, School of Nursing, Houston, Texas
Gail D’Eramo Melkus, EdD, ANP, FAAN, Florence & William Downs Professor in Nursing Research, Associate Dean for Research, Rory Meyers College of Nursing, New York University, New York, New York
Susan Sullivan-Bolyai, DNSc, CNS, RN, FAAN, Associate
Professor, Rory Meyers College of Nursing, New York University, New York, New York
Marita Titler, PhD, RN, FAAN, Rhetaugh G. Dumas Endowed Professor, Department Chair, Department of Systems, Populations and Leadership, University of Michigan School of Nursing, Ann Arbor, Michigan
Mark Toles, PhD, RN, Assistant Professor, University of North Carolina at Chapel Hill, School of Nursing, Chapel Hill, North Carolina
Reviewers
Karen E. Alexander, PhD, RN, CNOR, Program Director RN-BSN, Assistant Professor, Department of Nursing, University of Houston Clear Lake-Pearland, Houston, Texas
Donelle M. Barnes, PhD, RN, CNE, Associate Professor, College of Nursing, University of Texas, Arlington, Arlington, Texas
Susan M. Bezek, PhD, RN, ACNP, CNE, Assistant Professor, Division of Nursing, Keuka College, Keuka Park, New York
Rose M. Kutlenios, PhD, MSN, MN, BSN, ANCC Board Certification, Adult Psychiatric/Mental Health Clinical Specialist, ANCC Board Certification, Adult Nurse Practitioner, Nursing Program Director and Associate Professor, Department of Nursing, West Liberty University, West Liberty, West Virginia
Shirley M. Newberry, PhD, RN, PHN, Professor, Department of Nursing, Winona State University, Winona, Minnesota
Sheryl Scott, DNP, RN, CNE, Assistant Professor and Chair, School of Nursing, Wisconsin Lutheran College, Milwaukee, Wisconsin
To the faculty
Geri LoBiondo-Wood, Geri.L.Wood@uth.tmc.edu, Judith Haber, jh33@nyu.edu
The foundation of the ninth edition of Nursing Research: Methods and Critical Appraisal for Evidence-Based Practice continues to be the belief that nursing research is integral to all levels of nursing education and practice. Over the past three decades since the first edition of this textbook, we have seen the depth and breadth of nursing research grow, with more nurses conducting research and using research evidence to shape clinical practice, education, administration, and health policy.
The National Academy of Medicine has challenged all health professionals to provide team-based care based on the best available scientific evidence. This is an exciting challenge. Nurses, as clinicians and interprofessional team members, are using the best available evidence, combined with their clinical judgment and patient preferences, to influence the nature and direction of health care delivery and document outcomes related to the quality and cost-effectiveness of patient care. As nurses continue to develop a unique body of nursing knowledge through research, decisions about clinical nursing practice will be increasingly evidence based.
As editors, we believe that all nurses need not only to understand the research process but also to know how to critically read, evaluate, and apply research findings in practice. We realize that understanding research, as a component of evidence-based practice and quality improvement practices, is a challenge for every student, but we believe that the challenge can be accomplished in a
stimulating, lively, and learner-friendly manner. Consistent with this perspective is an ongoing commitment to
advancing implementation of evidence-based practice. Understanding and applying research must be an integral dimension of baccalaureate education, evident not only in the undergraduate nursing research course but also threaded throughout the curriculum. The research role of baccalaureate graduates calls for evidence-based practice and quality improvement competencies; central to this are critical appraisal skills—that is, nurses should be competent research consumers.
Preparing students for this role involves developing their critical thinking skills, thereby enhancing their understanding of the research process, their appreciation of the role of the critiquer, and their ability to actually critically appraise research. An undergraduate research course should develop this basic level of competence, an essential requirement if students are to engage in evidence-informed clinical decision making and practice, as well as quality improvement activities.
The primary audience for this textbook remains undergraduate students who are learning the steps of the research process, as well as how to develop clinical questions, critically appraise published research literature, and use research findings to inform evidence-based clinical practice and quality improvement initiatives. This book is also a valuable resource for students at the master’s, DNP, and PhD levels who want a concise review of the basic steps of the research process, the critical appraisal process, and the principles and tools for evidence-based practice and quality improvement.
This text is also an important resource for practicing nurses who strive to use research evidence as the basis for clinical decision making and development of evidence-based policies, protocols, and standards or who collaborate with nurse-scientists in conducting clinical research and evidence-based practice. Finally, this text is an important resource for considering how evidence-based practice, quality improvement, and interprofessional collaboration are essential competencies for students and clinicians practicing in a transformed health care system, where nurses and their interprofessional team members are accountable for the quality and cost-effectiveness of care provided to their patient population.
Building on the success of the eighth edition, we reaffirm our commitment to introducing evidence-based practice, quality improvement processes, and research principles to baccalaureate students, thereby providing a cutting-edge, research consumer foundation for their clinical practice. Nursing Research: Methods and Critical Appraisal for Evidence-Based Practice prepares nursing students and practicing nurses to become knowledgeable nursing research consumers by doing the following:
• Addressing the essential evidence-based practice and quality improvement role of the nurse, thereby embedding evidence-based competencies in clinical practice.
• Demystifying research, which is sometimes viewed as a complex process.
• Using a user-friendly, evidence-based approach to teaching the fundamentals of the research process.
• Including an exciting chapter on the role of theory in research and evidence-based practice.
• Providing a robust chapter on systematic reviews and clinical guidelines.
• Offering two innovative chapters on current strategies and tools for developing an evidence-based practice.
• Concluding with an exciting chapter on quality improvement and its application to practice.
• Teaching the critical appraisal process in a user-friendly progression.
• Promoting a lively spirit of inquiry that develops critical thinking and critical reading skills, facilitating mastery of the critical appraisal process.
• Developing information literacy, searching, and evidence-based practice competencies that prepare students and nurses to
effectively locate and evaluate the best research evidence.
• Emphasizing the role of evidence-based practice and quality improvement initiatives as the basis for informing clinical decisions that support nursing practice.
• Presenting numerous examples of recently published research studies that illustrate and highlight research concepts in a manner that brings abstract ideas to life for students. These examples are critical links that reinforce evidence-based concepts and the critiquing process.
• Presenting five published articles, including a meta-analysis, in the Appendices, the highlights of which are woven throughout the text as exemplars of research and evidence-based practice.
• Showcasing, in four new inspirational Research Vignettes, the work of renowned nurse researchers whose careers exemplify the links among research, education, and practice.
• Introducing new pedagogical chapter features for interprofessional education (IPE Highlights and IPE Critical Thinking Challenges) and quality improvement (QSEN Evidence-Based Practice Tips).
• Integrating stimulating pedagogical chapter features that reinforce learning, including Learning Outcomes, Key Terms, Key Points, Critical Thinking Challenges, Helpful Hints, Evidence-Based Practice Tips, Critical Thinking Decision Paths, and numerous tables, boxes, and figures.
• Featuring a revised section titled Appraising the Evidence, accompanied by an updated Critiquing Criteria box in each chapter that presents a step of the research process.
• Offering a student Evolve site with interactive review questions that provide chapter-by-chapter review in a format consistent with that of the NCLEX® Examination.
• Offering a Student Study Guide that promotes active learning and assimilation of nursing research content.
• Presenting Faculty Evolve Resources that include a test bank, TEACH lesson plans, PowerPoint slides with integrated audience response system questions, and an image collection. Evolve resources for both students and faculty also include a research article library with appraisal exercises for additional practice in reviewing and critiquing, as well as content updates.
The ninth edition of Nursing Research: Methods and Critical Appraisal for Evidence-Based Practice is organized into four parts. Each part is preceded by an introductory section and opens with an engaging Research Vignette by a renowned nurse researcher.
Part I, Overview of Research and Evidence-Based Practice, contains four chapters: Chapter 1, “Integrating Research, Evidence-Based Practice, and Quality Improvement Processes,” provides an excellent overview of research and evidence-based practice processes that shape clinical practice. The chapter speaks directly to students and highlights critical reading concepts and strategies, facilitating student understanding of the research process and its relationship to the critical appraisal process. The chapter introduces a model evidence hierarchy that is used throughout the text. The style and content of this chapter are designed to make subsequent chapters user friendly. The next two chapters address foundational components of the research process. Chapter 2, “Research Questions, Hypotheses, and Clinical Questions,” focuses on how research questions and hypotheses are derived, operationalized, and critically appraised. Students are also taught how to develop clinical questions that are used to guide evidence-based inquiry, including quality improvement projects. Chapter 3, “Gathering and Appraising the Literature,” showcases cutting-edge information literacy content and provides students and nurses with the tools necessary to effectively search, retrieve, manage, and evaluate research studies and their findings. Chapter 4, “Theoretical Frameworks for Research,” is a user-friendly theory chapter that provides students with an understanding of how theories provide the foundation of research studies and evidence-based practice projects.
Part II, Processes and Evidence Related to Qualitative Research, contains three interrelated qualitative research chapters. Chapter 5,
“Introduction to Qualitative Research,” provides an exciting framework for understanding qualitative research and the significant contribution of qualitative research to evidence-based practice. Chapter 6, “Qualitative Approaches to Research,” presents, illustrates, and showcases major qualitative methods using examples from the literature as exemplars. This chapter highlights the questions most appropriately answered using qualitative methods. Chapter 7, “Appraising Qualitative Research,” synthesizes essential components of and criteria for critiquing qualitative research reports using a published qualitative research study.
Part III, Processes and Evidence Related to Quantitative Research, contains Chapters 8 to 18. This group of chapters delineates essential steps of the quantitative research process, with published clinical research studies used to illustrate each step. These chapters are streamlined to make the case for linking an evidence-based approach with essential steps of the research process. Students are taught how to critically appraise the strengths and weaknesses of each step of the research process in a synthesized critique of a study. The steps of the quantitative research process, evidence-based concepts, and critical appraisal criteria are synthesized in Chapter 18 using two published research studies, providing a model for appraising strengths and weaknesses of studies, and determining applicability to practice. Chapter 11, a unique chapter, addresses the use of the types of systematic reviews that support an evidence-based practice as well as the development and application of clinical guidelines.
Part IV, Application of Research: Evidence-Based Practice, contains three chapters that showcase evidence-based practice models and tools. Chapter 19, “Strategies and Tools for Developing an Evidence-Based Practice,” is a revised, vibrant, user-friendly, evidence-based toolkit with exemplars that capture the essence of high-quality, evidence-informed nursing care. It “walks” students and practicing nurses through clinical scenarios and challenges them to consider the relevant evidence-based practice “tools” to develop and answer questions that emerge from clinical situations.
Chapter 20, “Developing an Evidence-Based Practice,” offers a dynamic presentation of important evidence-based practice models that promote evidence-based decision making. Chapter 21, “Quality Improvement,” is an innovative, engaging chapter that outlines the quality improvement process with information from current guidelines. Together, these chapters provide an inspirational conclusion to a text that we hope motivates students and practicing nurses to advance their evidence-based practice and quality improvement knowledge base and clinical competence, positioning them to make important contributions to improving health care outcomes as essential members of interprofessional teams.
Stimulating critical thinking is a core value of this text. Innovative chapter features such as Critical Thinking Decision Paths, Evidence-Based Practice Tips, Helpful Hints, Critical Thinking Challenges, IPE Highlights, and QSEN Evidence-Based Practice Tips enhance critical thinking, promote the development of evidence-based decision-making skills, and cultivate a positive value about the importance of collaboration in promoting evidence-based, high-quality, and cost-effective clinical outcomes.
Consistent with previous editions, we promote critical thinking by including sections called “Appraising the Evidence,” which describe the critical appraisal process related to the focus of the chapter. Critiquing Criteria are included in this section to stimulate a systematic and evaluative approach to reading and understanding qualitative and quantitative research and evaluating its strengths and weaknesses. Extensive resources are provided on the Evolve site that can be used to develop critical thinking and evidence-based competencies.
The development and refinement of an evidence-based foundation for clinical nursing practice is an essential priority for the future of professional nursing practice. The ninth edition of Nursing Research: Methods and Critical Appraisal for Evidence-Based Practice will help students develop a basic level of competence in understanding the steps of the research process that will enable them to critically analyze research studies, judge their merit, and judiciously apply evidence in clinical practice. To the extent that this goal is accomplished, the next generation of nursing professionals will include a cadre of clinicians who inform their
practice using theory, research evidence, and clinical judgment, as they strive to provide high-quality, cost-effective, and satisfying health care experiences in partnership with individuals, families, and communities.
To the student
Geri LoBiondo-Wood, Geri.L.Wood@uth.tmc.edu, Judith Haber, jh33@nyu.edu
We invite you to join us on an exciting nursing research adventure that begins as you turn the first page of the ninth edition of Nursing Research: Methods and Critical Appraisal for Evidence-Based Practice. The adventure is one of discovery! You will discover that the nursing research literature sparkles with pride, dedication, and excitement about the research dimension of professional nursing practice. Whether you are a student or a practicing nurse whose goal is to use research evidence as the foundation of your practice, you will discover that nursing research and a commitment to evidence-based practice position our profession at the forefront of change. You will discover that evidence-based practice is integral to being an effective member of an interprofessional team prepared to meet the challenge of providing quality whole-person care in partnership with patients, their families/significant others, as well as with the communities in which they live. Finally, you will discover the richness in the “Who,” “What,” “Where,” “When,” “Why,” and “How” of nursing research and evidence-based practice, developing a foundation of knowledge and skills that will equip you for clinical practice and making a significant contribution to achieving the Triple Aim, that is, contributing to high-quality and cost-effective patient outcomes associated with satisfying patient experiences!
We think you will enjoy reading this text. Your nursing research course will be short but filled with new and challenging learning
experiences that will develop your evidence-based practice skills. The ninth edition of Nursing Research: Methods and Critical Appraisal for Evidence-Based Practice reflects cutting-edge trends for developing evidence-based nursing practice. The four-part organization and special features in this text are designed to help you develop your critical thinking, critical reading, information literacy, interprofessional, and evidence-based clinical decision-making skills, while providing a user-friendly approach to learning that expands your competence to deal with these new and challenging experiences. The companion Study Guide, with its chapter-by-chapter activities, serves as a self-paced learning tool to reinforce the content of the text. The accompanying Evolve website offers review questions to help you reinforce the concepts discussed throughout the book.
Remember that evidence-based practice skills are used in every clinical setting and can be applied to every patient population or clinical practice issue. Whether your clinical practice involves primary care or critical care and provides inpatient or outpatient treatment in a hospital, clinic, or home, you will be challenged to apply your evidence-based practice skills and use nursing research as the foundation for your evidence-based practice. The ninth edition of Nursing Research: Methods and Critical Appraisal for Evidence-Based Practice will guide you through this exciting adventure, where you will discover your ability to play a vital role in contributing to the building of an evidence-based professional nursing practice.
Acknowledgments
Geri LoBiondo-Wood, Judith Haber
No major undertaking is accomplished alone; there are those who contribute directly and those who contribute indirectly to the success of a project. We acknowledge with deep appreciation and our warmest thanks the help and support of the following people:
• Our students, particularly the nursing students at the University of Texas Health Science Center at Houston School of Nursing and the Rory Meyers College of Nursing at New York University, whose interest, lively curiosity, and challenging questions sparked ideas for revisions in the ninth edition.
• Our chapter contributors, whose passion for research, expertise, cooperation, commitment, and punctuality made them a joy to have as colleagues.
• Our vignette contributors, whose willingness to share evidence of their research wisdom made a unique and inspirational contribution to this edition.
• Our colleagues, who have taken time out of their busy professional lives to offer feedback and constructive criticism that helped us prepare this ninth edition.
• Our editors, Lee Henderson, Melissa Rawe, and Carol O’Connell, for their willingness to listen to yet another creative idea about teaching research in a meaningful way and for their expert help
with manuscript preparation and production.
• Our families: Rich Scharchburg; Brian Wood; Lenny, Andrew, Abbe, Brett, and Meredith Haber; and Laurie, Bob, Mikey, Benjy, and Noah Goldberg for their unending love, faith, understanding, and support throughout what is inevitably a consuming—but exciting—experience.
PART I
Overview of Research and Evidence-Based Practice
Research Vignette: Terri Armstrong
OUTLINE
Introduction
1. Integrating research, evidence-based practice, and quality improvement processes
2. Research questions, hypotheses, and clinical questions
3. Gathering and appraising the literature
4. Theoretical frameworks for research
Introduction
Research vignette
With a little help from my friends
Terri Armstrong, PhD, ANP-BC, FAANP, FAAN
Senior Investigator, Neuro-Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, Maryland

I grew up surrounded by family and strong role models of
women working in health care in a small town in Ohio. When I was in college, the three most important women in my life (my mom, grandmother, and great-grandmother) were all diagnosed with cancer. This led me to seek out a nursing position in oncology, and over time, I was able to be actively involved in their care. This experience taught me so much and led to the desire to do more to make the daily lives of people with cancer better. After obtaining a master's degree in oncology and a post-master's nurse practitioner certificate, an opportunity became available to work with Dr. M. Gilbert, a well-known, caring physician and a great mentor who specialized in the care and treatment of patients with central nervous system (CNS) tumors, and so my work with people with CNS tumors began.
After several years, I realized that the quality of life of the brain tumor patients and families was significantly impacted by the symptoms they experienced. Over 80% were unable to return to work from the time of diagnosis, and their daily lives (and those of their families) were often consumed with managing the neurologic
and treatment-related symptoms. I realized that obtaining my PhD would be an important step to learn the skills I would need to try to find answers to solve the problems CNS tumor patients were facing.
At that time, many of the conceptual models identified solitary symptoms and their impact on the person. I learned from my experience and in caring for patients that symptoms seldom occurred in isolation and that the meaning the symptoms had for patients’ daily lives was important, as was learning about the patients’ perception of that impact. I developed a conceptual model to identify those relationships and guide my research (Armstrong, 2003). My focus since then has been on patient-centered outcomes research, focusing on the impact of symptoms on the illness trajectory, tolerance of therapy, and potential to influence survival. My work is never done in isolation. I have been fortunate to work with research teams, including those who work alongside me and important collaborators across disciplines and the world. Team research, in which the views of various disciplines are brought together, is important in every step of research—from the hypothesis to study design and finally interpretation of the results.
My work is interconnected, but I believe it can be categorized into three general areas:
1. Improving assessment and our understanding of the experience of patients with CNS tumors.
Patients with primary brain tumors are highly symptomatic, with implications for functional status, and this information is used in making treatment decisions. I led a team that developed the M.D. Anderson Symptom Inventory for Brain Tumors (MDASI-BT) (Armstrong et al., 2005; Armstrong et al., 2006) and for spinal cord tumors (MDASI-Spine) (Armstrong, Gning, et al., 2010). We have completed studies showing that symptoms are associated with tumor progression (Armstrong et al., 2011). We have also been able to quantify limitations of patients’ functional status (Armstrong et al., 2015), in a way that caregivers report is congruent with the patient, and have found that electronic technology (such as iPads) can be used for this (Armstrong et al., 2012). Our work with the Collaborative Ependymoma Research
Organization (CERN, www.cern-foundation.org) has allowed us to reach out to patients with this rarer tumor to understand the natural history and impact of the disease and its treatment on patients around the world (Armstrong, Vera-Bolanos, et al., 2010; Armstrong, Vera-Bolanos, & Gilbert, 2011). Based on these surveys, we have developed materials to inform patients and are launching an expansion of this project, in which we will evaluate risk factors (both based on history and genetics) for the occurrence of these tumors in both adults and children.
2. Incorporation of clinical outcomes assessment into brain tumor clinical trials.
Clinical trials often assess the impact of therapy on how the tumor appears on imaging or on survival, but the impact on the person is often not assessed. I have been fortunate to work with Dr. M. Gilbert and Dr. J. Wefel to incorporate these outcomes into large clinical trials, providing clear evidence that it was feasible to incorporate patient outcomes measures and that the results of these evaluations could impact the interpretation of the clinical trial (Armstrong et al., 2013; Gilbert et al., 2014). As a result of my involvement in these efforts, I recently chaired a daylong workshop exploring the use of clinical outcomes assessments (COAs) in brain tumor trials, a workshop cosponsored by the FDA and the Jumpstarting Brain Tumor Drug Development (JSBTDD) consortium that also included members of the academic community, patient advocates, the pharmaceutical industry, and the NIH. This successful workshop has resulted in a series of recently published white papers on the importance of including these assessments in clinical trials (Armstrong, Bishof, et al., 2016; Helfer et al., 2016).
3. Identification of clinical and genomic predictors of toxicity.
Toxicity associated with treatment also impacts the patient. For example, temozolomide, the most common agent used in the treatment of brain tumors, has a low overall incidence of myelotoxicity (impact on blood counts that help to fight infection or clot the blood). However, in the select patients who develop
toxicity, there are significant clinical implications (treatment holds or cessation, and even death). I work with an interdisciplinary group that began to explore the clinical predictors of this toxicity and then explored genomic changes associated with risk (Armstrong et al., 2009). Currently, I am also working with a research team exploring risk factors and pathogenesis of radiation-induced fatigue and sleepiness, which is a major symptom in a large percentage of patients undergoing cranial radiotherapy for their brain tumor (Armstrong, Shade, et al., 2016). The ultimate goal of this part of my research is to begin to uncover phenotypes associated with symptoms and to uncover the underlying biologic processes, so that we can initiate measures prior to the occurrence of symptoms, rather than waiting for them to occur and then trying to mitigate them.
In addition to conducting focused outcomes research as outlined previously, I have over 25 years’ dedication to the clinical care of persons with tumors of the CNS. This work is the best part of my job and is a critical linkage and inspiration in my research, with the goal of improving the daily life of patients and improving our understanding of the underlying biology of symptoms and experience that our patients have.
References
1. Armstrong T. S. Symptoms experience: a concept analysis. Oncology Nursing Forum 2003;30(4):601-606.
2. Armstrong T. S., Cohen M. Z., Eriksen L., Cleeland C. Content validity of self-report measurement instruments: an illustration from the development of the Brain Tumor Module of the M. D. Anderson Symptom Inventory. Oncology Nursing Forum 2005;32(3):669-676.
3. Armstrong T. S., Mendoza T., Gning I., et al. Validation of the M. D. Anderson Symptom Inventory Brain Tumor Module (MDASI-BT). Journal of Neuro-Oncology 2006;80(1):27-35.
4. Armstrong T. S., Cao Y., Scheurer M. E., et al. Risk analysis of severe myelotoxicity with temozolomide: the effects of clinical and genetic factors. Neuro-Oncology 2009;11(6):825-832.
5. Armstrong T. S., Gning I., Mendoza T. R., et al. Reliability and validity of the M. D. Anderson Symptom Inventory-Spine Tumor Module. Journal of Neurosurgery: Spine 2010;12(4):421-430.
6. Armstrong T. S., Vera-Bolanos E., Bekele B. N., et al. Adult ependymal tumors: prognosis and the M. D. Anderson Cancer Center experience. Neuro-Oncology 2010;12(8):862-870.
7. Armstrong T. S., Vera-Bolanos E., Gilbert M. R. Clinical course of adult patients with ependymoma: results of the Adult Ependymoma Outcomes Project. Cancer 2011;117(22):5133-5141.
8. Armstrong T. S., Vera-Bolanos E., Gning I., et al. The impact of symptom interference using the MD Anderson Symptom Inventory-Brain Tumor Module (MDASI-BT) on prediction of recurrence in primary brain tumor patients. Cancer 2011;117(14):3222-3228.
9. Armstrong T. S., Wefel J. S., Gning I., et al. Congruence of primary brain tumor patient and caregiver symptom report. Cancer 2012;118(20):5026-5037.
10. Armstrong T. S., Wefel J. S., Wang M., et al. Net clinical benefit analysis of Radiation Therapy Oncology Group 0525: a phase III trial comparing conventional adjuvant temozolomide with dose-intensive temozolomide in patients with newly diagnosed glioblastoma. Journal of Clinical Oncology 2013;31(32):4076-4084.
11. Armstrong T. S., Vera-Bolanos E., Acquaye A. A., et al. The symptom burden of primary brain tumors: evidence for a core set of tumor- and treatment-related symptoms. Neuro-Oncology 2015;18(2):252-260. Epub August 19, 2015.
12. Armstrong T. S., Bishof A. M., Brown P. D., et al. Determining priority signs and symptoms for use as clinical outcomes assessments in trials including patients with malignant gliomas: panel 1 report. Neuro-Oncology 2016;18(Suppl. 2):ii1-ii12.
13. Armstrong T. S., Shade M. Y., Breton G., et al. Sleep-wake disturbance in patients with brain tumors. Neuro-Oncology 2016, in press.
14. Gilbert M. R., Dignam J. J., Armstrong T. S., et al. A randomized trial of bevacizumab for newly diagnosed glioblastoma. New England Journal of Medicine 2014;370(8):699-708.
15. Helfer J. L., Wen P. Y., Blakeley J., et al. Report of the Jumpstarting Brain Tumor Drug Development Coalition and FDA clinical trials clinical outcome assessment endpoints workshop (October 15, 2014, Bethesda, MD). Neuro-Oncology 2016;18(Suppl. 2):ii26-ii36.
CHAPTER 1
Integrating research, evidence-based practice, and quality improvement processes
Geri LoBiondo-Wood, Judith Haber
Learning outcomes
After reading this chapter, you should be able to do the following:
• State the significance of research, evidence-based practice, and quality improvement (QI).
• Identify the role of the consumer of nursing research.
• Define evidence-based practice.
• Define QI.
• Discuss evidence-based and QI decision making.
• Explain the difference between quantitative and qualitative research.
• Explain the difference between the types of systematic reviews.
• Identify the importance of critical reading skills for critical appraisal of research.
• Discuss the format and style of research reports/articles.
• Discuss how to use an evidence hierarchy when critically appraising research studies.
KEY TERMS
abstract
clinical guidelines
consensus guidelines
critical appraisal
critical reading
critique
evidence-based guidelines
evidence-based practice
integrative review
levels of evidence
meta-analysis
meta-synthesis
quality improvement
qualitative research
quantitative research
research
systematic review
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
We invite you to join us on an exciting nursing research adventure that begins as you read the first page of this chapter. The adventure is one of discovery! You will discover that the nursing research literature sparkles with pride, dedication, and excitement about this dimension of professional practice. As you progress through your educational program, you are taught how to ensure quality and safety in practice through acquiring knowledge of the various sciences and health care principles. A critical component of clinical knowledge is understanding research as it applies to practicing from a base of evidence.
Whether you are a student or a practicing nurse whose goal is to use research as the foundation of your practice, you will discover that research, evidence-based practice, and quality improvement (QI) position our profession at the cutting edge of change and improvement in patient outcomes. You will also discover the cutting-edge “who,” “what,” “where,” “when,” “why,” and “how” of nursing research, and develop a foundation of evidence-based practice knowledge and competencies that will equip you for your clinical practice.
Your nursing research adventure will be filled with new and challenging learning experiences that develop your evidence-based practice skills. Your critical thinking, critical reading, and clinical decision-making skills will expand as you develop clinical questions, search the research literature, evaluate the research evidence found in the literature, and make clinical decisions about applying the “best available evidence” to your practice. For example, you will be encouraged to ask important clinical questions, such as, “What makes a telephone education intervention more effective with one group of patients with a diagnosis of congestive heart failure but not another?” “What is the effect of computer learning modules on self-management of diabetes in children?” “What research has been conducted in the area of identifying barriers to breast cancer screening in African
American women?” “What is the quality of studies conducted on telehealth?” “What nursing-delivered smoking cessation interventions are most effective?” This book will help you begin your adventure into evidence-based practice by developing an appreciation of research as the foundation for evidence-based practice and QI.
Nursing research, evidence-based practice, and quality improvement
Nurses are challenged to stay abreast of new information to provide the highest quality of patient care (Institute of Medicine [IOM], 2011). Nurses are challenged to expand their “comfort zone” by offering creative approaches to old and new health problems, as well as designing new and innovative programs that make a difference in the health status of our citizens. This challenge can best be met by integrating rapidly expanding research and evidence-based knowledge about biological, behavioral, and environmental influences on health into the care of patients and their families.
It is important to differentiate between research, evidence-based practice, and QI. Research is the systematic, rigorous, critical investigation that aims to answer questions about nursing phenomena. Researchers follow the steps of the scientific process, outlined in this chapter and discussed in detail in each chapter of this textbook. There are two types of research: quantitative and qualitative. The methods used by nurse researchers are the same methods used by other disciplines; the difference is that nurses study questions relevant to nursing practice. Published research studies are read and evaluated for use in clinical practice. Study findings provide evidence that is evaluated, and their applicability to practice is used to inform clinical decisions.
Evidence-based practice is the collection, evaluation, and integration of valid research evidence, combined with clinical expertise and an understanding of patient and family values and preferences, to inform clinical decision making (Sackett et al., 2000). Research studies are gathered from the literature and assessed so that decisions about application to practice can be made,
culminating in nursing practice that is evidence based. ➤ Example: To help you understand the importance of evidence-based practice, think about the systematic review and meta-analysis from Al-Mallah and colleagues (2015), which assessed the impact of nurse-led clinics on the mortality and morbidity of patients with cardiovascular disease (see Appendix E). Based on their synthesis of the literature, they put forth several conclusions regarding the implications for practice and further research for nurses working in the field of cardiovascular care.
QI is the systematic use of data to monitor the outcomes of care processes as well as the use of improvement methods to design and test changes in practice for the purpose of continuously improving the quality and safety of health care systems (Cronenwett et al., 2007). While research supports or generates new knowledge, evidence-based practice and QI use currently available knowledge to improve health care delivery. When you first read about these three processes, you will notice they have similarities. Each begins with a question. The difference is that in a research study the question is tested with a design appropriate to the question and specific methodology (i.e., sample, instruments, procedures, and data analysis) used to test the research question and contribute to new, generalizable knowledge. In the evidence-based practice and QI processes, a question is used to search the literature for already completed studies in order to bring about improvements in care.
All nurses share a commitment to the advancement of nursing science by conducting research and using research evidence in practice. Research promotes accountability, which is one of the hallmarks of the nursing profession and a fundamental concept of the American Nurses Association (ANA) Code for Nurses (ANA, 2015). There is a consensus that the research role of the baccalaureate and master’s graduate calls for critical appraisal skills. That is, nurses must be knowledgeable consumers of research, who can evaluate the strengths and weaknesses of research evidence and use existing standards to determine the merit and readiness of research for use in clinical practice. Therefore, to use research for an evidence-based practice and to practice using the highest quality processes, you do not have to conduct research; however, you do need to understand and appraise the steps of the
research process in order to read the research literature critically and use it to inform clinical decisions.
As you venture through this text, you will see the steps of the research, evidence-based practice, and QI processes. The steps are systematic and relate to the development of evidence-based practice. Understanding the processes that researchers use will help you develop the assessment skills necessary to judge the soundness of research studies.
Throughout the chapters, terminology pertinent to each step is identified and illustrated with examples. Five published studies are found in the appendices and used as examples to illustrate significant points in each chapter. Judging the study’s strength and quality, as well as its applicability to practice, is key. Before you can judge a study, it is important to understand the differences among studies. There are different study designs that you will see as you read through this text and the appendices. There are standards not only for critiquing the soundness of each step of a study, but also for judging the strength and quality of evidence provided by a study and determining its applicability to practice.
This chapter provides an overview of research study designs and appraisal skills. It introduces the overall format of a research article and provides an overview of the subsequent chapters in the book. It also introduces the QI and evidence-based practice processes, a level of evidence hierarchy model, and other tools for helping you evaluate the strength and quality of research evidence. These topics are designed to help you read research articles more effectively and with greater understanding, so that you can make evidence-based clinical decisions and contribute to quality and cost-effective patient outcomes.
Types of research: Qualitative and quantitative
Research is classified into two major categories: qualitative and quantitative. A researcher chooses between these categories based on the question being asked. That is, a researcher may wish to test a cause-and-effect relationship, or to assess if variables are related, or may wish to discover and understand the meaning of an experience
or process. A researcher would choose to conduct a qualitative research study if the question is about understanding the meaning of a human experience such as grief, hope, or loss. The meaning of an experience is based on the view that meaning varies and is subjective. The context of the experience also plays a role in qualitative research. That is, the experience of loss as a result of a miscarriage would be different than the experience of losing a parent.
Qualitative research is generally conducted in natural settings and uses words or text, rather than numbers, as data to describe the experiences being studied. Qualitative studies are guided by research questions, and data are collected from a small number of subjects, allowing an in-depth study of a phenomenon. ➤ Example: van Dijk et al. (2016) explored how patients assign a number to their postoperative pain experience (see Appendix C). Although qualitative research is systematic in its method, it uses a subjective approach. Data from qualitative studies help nurses understand experiences or phenomena that affect patients; these data also assist in generating theories that lead clinicians to develop improved patient care and stimulate further research. Highlights of the general steps of qualitative studies and the journal format for a qualitative article are outlined in Table 1.1. Chapters 5 through 7 provide an in-depth view of qualitative research underpinnings, designs, and methods.
TABLE 1.1 Steps of the Research Process and Journal Format: Qualitative Research
Research process steps and/or format issues, with their usual location in journal headings or subheadings:

Identifying the phenomenon: Abstract and/or in introduction
Research question/study purpose: Abstract and/or in beginning or end of introduction
Literature review: Introduction and/or discussion
Design: Abstract and/or in introductory section, or under method section entitled “Design,” or stated in method section
Sample: Method section labeled “Sample” or “Subjects”
Legal-ethical issues: Data collection or procedures section, or in sample section
Data collection procedure: Data collection or procedures section
Data analysis: Methods section under subhead “Data Analysis” or “Data Analysis and Interpretation”
Results: Stated in separate heading: “Results” or “Findings”
Discussion and recommendation: Combined in separate section: “Discussion” or “Discussion and Implications”
References: At end of article
Whereas qualitative research looks for meaning, quantitative research encompasses the study of research questions and/or hypotheses that describe phenomena, test relationships, assess differences, seek to explain cause-and-effect relationships between variables, and test for intervention effectiveness. The numeric data in quantitative studies are summarized and analyzed using statistics. Quantitative research techniques are systematic, and the methodology is controlled. Appendices A, B, and D illustrate examples of different quantitative approaches to answering research questions. Table 1.2 indicates where each step of the research process can usually be located in a quantitative research article and where it is discussed in this text. Chapters 2, 3, and 8 through 18 describe processes related to quantitative research.
TABLE 1.2 Steps of the Research Process and Journal Format: Quantitative Research
(Each research process step or format issue is followed by its usual location in a journal article and the text chapter in which it is discussed.)

Research problem: Abstract and/or in article introduction, or separately labeled: "Problem" (Chapter 2)
Purpose: Abstract and/or in introduction, or at end of literature review or theoretical framework section, or labeled separately: "Purpose" (Chapter 2)
Literature review: At end of heading "Introduction" but not labeled as such, or labeled as separate heading: "Literature Review," "Review of the Literature," or "Related Literature"; or not labeled, with the variables reviewed appearing as headings or subheadings (Chapter 3)
TF and/or CF: Combined with "Literature Review" or found in separate section as TF or CF; or each concept used in the TF or CF may appear as a separate subheading (Chapters 3 and 4)
Hypothesis/research questions: Stated or implied near end of introduction; may be labeled or found in separate heading or subheading: "Hypothesis" or "Research Questions"; or reported for first time in "Results" (Chapter 2)
Research design: Stated or implied in abstract or introduction or in "Methods" or "Methodology" section (Chapters 8–10)
Sample, type and size: "Size" may be stated in abstract, in methods section, or as separate subheading under methods section as "Sample," "Sample/Subjects," or "Participants"; "type" may be implied or stated in any of the previous headings described under size (Chapter 12)
Legal-ethical issues: Stated or implied in sections: "Methods," "Procedures," "Sample," or "Subjects" (Chapter 13)
Instruments: Found in sections: "Methods," "Instruments," or "Measures" (Chapter 14)
Validity and reliability: Specifically stated or implied in sections: "Methods," "Instruments," "Measures," or "Procedures" (Chapter 15)
Data collection procedure: In methods section under subheading "Procedure" or "Data Collection," or as separate heading: "Procedure" (Chapter 14)
Data analysis: Under subheading: "Data Analysis" (Chapter 16)
Results: Stated in separate heading: "Results" (Chapters 16 and 17)
Discussion of findings and new findings: Combined with results or as separate heading: "Discussion" (Chapter 17)
Implications, limitations, and recommendations: Combined in discussion or as separate major headings (Chapter 17)
References: At end of article (Chapter 4)
Communicating research results: Research articles, poster, and paper presentations (Chapters 1 and 20)

CF, Conceptual framework; TF, theoretical framework.
The primary difference is that a qualitative study seeks to interpret meaning and phenomena, whereas quantitative research seeks to test a hypothesis or answer research questions using statistical methods. Remember as you read research articles that, depending on the nature of the research problem, a researcher may vary the steps slightly; however, all of the steps should be addressed systematically.
Critical reading skills
To develop expertise in evidence-based practice, you will need to be able to critically read all types of research articles. As you read a research article, you may be struck by the difference in style or format between a research article and a clinical article. The terms of a research article are new, and the content is different. You may also be thinking that the research article is hard to read or that it is technical and boring. You may simultaneously wonder, "How will I possibly learn to appraise all the steps of a research study, the terminology, and the process of evidence-based practice? I'm only on Chapter 1. This is not so easy; research is as hard as everyone says."
Remember that learning occurs with time and help. Reading research articles can be difficult and frustrating at first, but the best
way to become a knowledgeable research consumer is to use critical reading skills when reading research articles. As a student, you are not expected to understand a research article or critique it perfectly the first time. Nor are you expected to develop these skills on your own. An essential objective of this book is to help you acquire critical reading skills so that you can use research in your practice. Becoming a competent critical thinker and reader of research takes time and patience.
Learning the research process further develops critical appraisal skills. You will gradually be able to read a research article and reflect on it by identifying assumptions, key concepts, and methods, and by determining whether the conclusions are based on the study's findings. Once you have attained this critical appraisal competency, you will be ready to synthesize the findings of multiple studies for use in developing an evidence-based practice. This will be a very exciting and rewarding process for you. Analyzing a study critically can require several readings. As you review and synthesize a study, you will begin an appraisal process to help you determine the study's worth. An illustration of how to use critical reading strategies is provided in Box 1.1, which contains excerpts from the abstract, introduction, literature review, theoretical framework, and methods and procedures sections of a quantitative study (Nyamathi et al., 2015) (see Appendix A). Note that in this article there are both a literature review and a theoretical framework section that clearly support the study's objectives and purpose. Also note that parts of the text of the article were deleted to offer a number of examples within this chapter.

BOX 1.1 Example of Critical Appraisal Reading Strategies
Introductory Paragraphs, Study’s Purpose and Aims
Globally, incarcerated populations encounter a host of public health care issues; two such issues—HAV and HBV diseases—are vaccine preventable. In addition, viral hepatitis disproportionately impacts the homeless because of increased risky sexual behaviors and drug use (Stein, Andersen, Robertson, & Gelberg, 2012), along with substandard living conditions (Hennessey, Bangsberg, Weinbaum, & Hahn, 2009).

Purpose
Despite awareness of risk factors for HBV infection, intervention programs designed to enhance completion of the three-series Twinrix HAV/HBV vaccine and identification of prognostic factors for vaccine completion have not been widely studied. The purpose of this study was first to assess whether seronegative parolees previously randomized to any one of three intervention conditions were more likely to complete the vaccine series, as well as to identify the predictors of HAV/HBV vaccine completion.
Literature Review—Concepts
Despite the availability of the HBV vaccine, there has been a low rate of completion for the three-dose core of the accelerated vaccine series (Centers for Disease Control and Prevention, 2012). Among incarcerated populations, HBV vaccine coverage is low; in a study among jail inmates, 19% had past HBV infection, and 12% completed the HBV vaccination series (Hennessey, Kim, et al., 2009). Although the HBV vaccine is well accepted behind bars—because of the lack of funding and focus on prevention as a core in the prison system—few inmates complete the series (Weinbaum, Sabin, & Santibanez, 2005). In addition, prevention may not be a priority.
Preventable disease vaccinations
Homelessness
Authors contend that, although the HBV vaccine is cost-effective, it is underutilized among high-risk (Rich et al., 2003) and incarcerated populations (Hunt & Saab, 2009). For homeless men on parole, vaccination completion may be affected by level of custody; generally, the higher the level of custody, the higher the risk an inmate poses.
Conceptual Framework
The comprehensive health seeking and coping paradigm (Nyamathi, 1989), adapted from a coping model (Lazarus & Folkman, 1984), and the health seeking and coping paradigm (Schlotfeldt, 1981) guided this study and the variables selected (see Fig. 1.1). The comprehensive health seeking and coping paradigm has been successfully applied by our team to improve our understanding of HIV and HBV/hepatitis C virus (HCV) protective behaviors and health outcomes among homeless adults (Nyamathi, Liu, et al., 2009)—many of whom had been incarcerated (Nyamathi et al., 2012).
Methods/Design The study used a randomized clinical trial. Specific Aims and Hypotheses
In this model, a number of factors are thought to relate to the outcome variable, completion of the HAV/HBV vaccine series. These factors include sociodemographic factors, situational factors, personal factors, social factors, and health seeking and coping responses.
Subject Recruitment and Accrual
This was an RCT in which 600 male parolees participating in an RDT program were randomized into one of three intervention conditions aimed at assessing program efficacy in reducing drug use and recidivism at 6 and 12 months, as well as vaccine completion in eligible subjects. There were four inclusion criteria for recruitment in assessing program efficacy in reducing drug use and recidivism: (1) a history of drug use prior to the latest incarceration, (2) age between 18 and 60 years, (3) residing in the participating RDT program, and (4) designation as homeless as noted on the prison or jail discharge form.
Procedure
The study was approved by the University of California, Los Angeles Institutional Review Board and registered with ClinicalTrials.gov. Building upon previous studies, we developed varying levels of peer-coached and nurse-led programs designed to improve HAV/HBV vaccine receptivity at 12-month follow-up among homeless offenders recently released to parole. See Appendix A for details in the "Interventions" section.
Intervention Fidelity
Several strategies were used to ensure treatment fidelity, including the study design, interventionist training, and standardization of the interventions. See the "Interventions" section in Appendix A.
HAV, Hepatitis A virus; HBV, hepatitis B virus; RCT, randomized clinical trial.
HIGHLIGHT Start an IPE Journal Club with students from other health professions programs on your campus. Select a research study to read, understand, and critically appraise together. It is always helpful to collaborate on deciding whether the findings are applicable to clinical practice.
Strategies for critiquing research studies
Evaluation of a research article requires a critique. A critique is the process of critical appraisal that objectively and critically evaluates a research report's content for scientific merit and application to practice. It requires knowledge of the subject matter and of how to read critically and apply critical appraisal criteria. You will find:
• Summarized examples of critical appraisal criteria for qualitative studies and an example of a qualitative critique in Chapter 7
• Summarized critical appraisal criteria and examples of a quantitative critique in Chapter 18
• An in-depth exploration of the criteria for evaluation required in quantitative research critiques in Chapters 8 through 18
• Criteria for qualitative research critiques presented in Chapters 5 through 7
• Principles for qualitative and quantitative research in Chapters 1 through 4
Critical appraisal criteria are the standards, appraisal guides, or questions used to assess an article. In analyzing a research article, you must evaluate each step of the research process and ask questions about whether each step meets the criteria. For instance, the critical appraisal criteria in Chapter 3 ask whether "the literature review identifies gaps and inconsistencies in the literature about a subject, concept, or problem," and whether "all of the concepts and variables are included in the review." These two questions relate to critiquing the research question and the literature review components of the research process. Box 1.1 lists several gaps identified in the literature by Nyamathi and colleagues (2015) and shows how the study intended to fill these gaps by conducting research for the stated objective and purpose (see Appendix A). Remember that when doing a critique, you are pointing out strengths as well as weaknesses. Standardized critical appraisal tools, such as the Center for Evidence-Based Medicine (CEBM) Critical Appraisal Tools (www.cebm.net/critical-appraisal), can be used to systematically appraise the strength and quality of evidence provided in research articles (see Chapter 20).
Critiquing can be thought of as looking at a completed jigsaw puzzle. Does it form a comprehensive picture, or is a piece out of place? What is the level of evidence provided by the study and the findings? What is the balance between the risks and benefits of the findings that contribute to clinical decisions? How can I apply the evidence to my patient, to my patient population, or in my setting? When reading several studies for synthesis, you must assess the interrelationship of the studies, as well as the overall strength and quality of evidence and applicability to practice. Reading for synthesis is essential in critiquing research. Appraising a study helps with the development of an evidence table (see Chapter 20).
Overcoming barriers: Useful critiquing strategies
Throughout the text, you will find features that will help you refine the skills essential to understanding and using research in your practice. A Critical Thinking Decision Path related to each step of the research process in each chapter will sharpen your decision-making skills as you critique research articles. Look for Internet resources in chapters that will enhance your consumer skills. Critical Thinking Challenges, which appear at the end of each chapter, are designed to reinforce your critical reading skills in relation to the steps of the research process. Helpful Hints, designed to reinforce your understanding, appear at various points throughout the chapters. Evidence-Based Practice Tips, which will help you apply evidence-based practice strategies in your clinical practice, are provided in each chapter.
When you complete your first critique, congratulate yourself; mastering these skills is not easy. Best of all, you can look forward to discussing the points of your appraisal, because your critique will be based on objective data, not just personal opinion. As you continue to use and perfect critical analysis skills by critiquing studies, remember that these skills are an expected competency for delivering evidence-based and quality nursing care.
Evidence-based practice and research
Along with gaining comfort while reading and critiquing studies, there is one final step: deciding how, when, and if to apply the studies to your practice so that your practice is evidence based. Evidence-based practice allows you to systematically use the best available evidence, integrated with individual clinical expertise and the patient's values and preferences, in making clinical decisions (Sackett et al., 2000). Evidence-based practice involves processes and steps, as does the research process. These steps are presented throughout the text. Chapter 19 provides an overview of evidence-based practice steps and strategies.
When using evidence-based practice strategies, the first step is to be able to read a study and understand how each section is linked to the steps of the research process. The following section introduces you to the research process as presented in published articles. Once you read a study, you must decide which level of evidence the study provides and how well the study was designed and executed. Fig. 1.1 illustrates a model for determining the levels of evidence associated with a study's design, ranging from systematic reviews of randomized clinical trials (RCTs) (see Chapters 9 and 10) to expert opinions. The rating system, or evidence hierarchy model, presented here is just one of many; numerous hierarchies for assessing the relative worth of both qualitative and quantitative designs are available. Early in the development of evidence-based practice, evidence hierarchies were thought to be very inflexible, with systematic reviews or meta-analyses at the top and qualitative research at the bottom. When assessing a clinical question that measures cause and effect, this may be true; however, nursing and health care research address a broader range of problems, and assessing the worth of a study in the wider context of applying evidence to practice requires a more flexible view.
FIG 1.1 Levels of evidence: Evidence hierarchy for rating levels of evidence associated with a study's design. Evidence is assessed at a level according to its source.
The meaningfulness of an evidence rating system will become clearer as you read Chapters 8 through 11. ➤ Example: The Nyamathi et al. (2015) study is Level II because of its experimental, randomized controlled trial design, whereas the vanDijk et al. (2016) study is Level VI because it is a qualitative study. The level itself does not tell a study's worth; rather, it is another tool that helps you think about a study's strengths and weaknesses and the nature of the evidence provided in the findings and conclusions. Chapters 7 and 18 will provide an understanding of how studies can be assessed for use in practice. You will use the evidence hierarchy presented in Fig. 1.1 throughout the book as you develop your research consumer skills, so become familiar with its content.
This rating system represents levels of evidence for judging the strength of a study's design, which is just one level of assessment that influences the confidence one has in the conclusions the researcher has drawn. Assessing the strength of scientific evidence or potential research bias provides a vehicle to guide evaluation of research studies for their applicability in clinical decision making. In addition to identifying the level of evidence, one needs to grade the strength of a body of evidence, incorporating the domains of quality, quantity, and consistency (Agency for Healthcare Research and Quality, 2002); a brief illustrative sketch follows the list below.
• Quality: Extent to which a study’s design, implementation, and analysis minimize bias.
• Quantity: Number of studies that have evaluated the research question, including overall sample size across studies, as well as the strength of the findings from data analyses.
• Consistency: Degree to which studies with similar and different designs investigating the same research question report similar findings.
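To make these ideas concrete, here is a minimal sketch of how a single appraisal might be recorded as a design-based level plus the three domains above. It is an illustration only, not part of this text or any published rating tool; the seven-level design mapping and the scoring conventions are assumptions made for the example.

```python
from dataclasses import dataclass

# Illustrative design-to-level mapping (an assumed seven-level hierarchy;
# as the text notes, many rating systems exist and they differ).
LEVEL_BY_DESIGN = {
    "systematic review of RCTs": "I",
    "randomized controlled trial": "II",
    "quasi-experimental study": "III",
    "nonexperimental/correlational study": "IV",
    "systematic review of qualitative studies": "V",
    "single qualitative or descriptive study": "VI",
    "expert opinion": "VII",
}

@dataclass
class EvidenceAppraisal:
    design: str
    quality: str       # e.g., "low risk of bias" (design/implementation/analysis)
    quantity: int      # number of studies addressing the question
    consistency: bool  # do studies of the same question report similar findings?

    def summary(self) -> str:
        level = LEVEL_BY_DESIGN[self.design]
        return (f"Level {level} design; quality: {self.quality}; "
                f"quantity: {self.quantity}; consistent: {self.consistency}")

# Example: an RCT such as Nyamathi et al. (2015) would sit at Level II.
print(EvidenceAppraisal("randomized controlled trial",
                        "low risk of bias", 1, True).summary())
```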
The evidence-based practice process steps are: ask, gather, assess and appraise, act, and evaluate (Fig. 1.2). These steps of asking clinical questions; identifying and gathering the evidence; critically appraising and synthesizing the evidence or literature; acting to change practice by coupling the best available evidence with your clinical expertise and patient preferences (e.g., values, setting, and resources); and evaluating if the use of the best available research evidence is applicable to your patient or organization will be discussed throughout the text.
FIG 1.2 Evidence-based practice steps.
To maintain an evidence-based practice, studies are evaluated using specific criteria. Completed studies are evaluated for strength, quality, and consistency of evidence. Before one can proceed with an evidence-based project, it is necessary to understand the steps of the research process found in research studies.
Research articles: Format and style
Before you begin reading research articles, it is important to understand their organization and format. Many journals publish research, either as the sole type of article or in addition to clinical or theoretical articles. Many journals have some common features but also unique characteristics. All journals have guidelines for manuscript preparation and submission. A review of these guidelines, which are found on a journal's website, will give you an idea of the format of articles that appear in specific journals.
Remember that even though each step of the research process is discussed at length in this text, you may find only a short paragraph or a sentence in an article that provides the details of a step. A publication is a shortened version of the researchers' completed work. You will also find that some researchers devote more space in an article to the results, whereas others present a longer discussion of the methods and procedures. Most authors give more emphasis to the method, results, and discussion of implications than to details of assumptions, hypotheses, or definitions of terms. Decisions about the amount of material presented for each step of the research process are bound by the following:
• A journal’s space limitations
• A journal’s author guidelines
• The type or nature of the study
• The researcher’s decision regarding which component of the study is the most important
The following discussion provides a brief overview of each step of the research process and how it might appear in an article. It is important to remember that a quantitative research article will differ from a qualitative research article. The components of qualitative research are discussed in Chapters 5 and 6, and are summarized in Chapter 7.
Abstract
An abstract is a short, comprehensive synopsis or summary of a study at the beginning of an article. An abstract quickly focuses the reader on the main points of a study. A well-presented abstract is accurate, self-contained, concise, specific, nonevaluative, coherent, and readable. Abstracts vary in word length. The length and format of an abstract are dictated by the journal's style. Both quantitative and qualitative research studies have abstracts that provide a succinct overview of the study. An example of an abstract can be found at the beginning of the study by Nyamathi et al. (2015) (see Appendix A). Their abstract follows an outline format that highlights the major steps of the study. It partially reads as follows:
Purpose/Objective: “The study focused on completion of the HAV and HBV vaccine series among homeless men on parole. The efficacy of the three levels of peer counseling (PC) and nurse delivered intervention was compared at 12 month follow up.”
In this example, the authors provide a view of the study variables. The remainder of the abstract provides a synopsis of the background of the study and the methods, results, and conclusions. The studies in Appendices A through D all have abstracts.
HELPFUL HINT An abstract is a concise overview that provides a reference to the research purpose, research questions and/or hypotheses, methodology, and results, as well as the implications for practice or future research.
Introduction
Early in a research article, in a section that may or may not be labeled "Introduction," the researcher presents a background picture of the area researched and its significance to practice (see Chapter 2).
Definition of the purpose
The purpose of the study is defined either at the end of the researcher's initial introduction or at the end of the "Literature Review" or "Conceptual Framework" section. The study's purpose may or may not be labeled (see Chapters 2 and 3), or it may be referred to as the study's aim or objective. The studies in Appendices A through D present specific purposes for each study in untitled sections that appear at the beginning of each article, as well as in the article's abstract.
Literature review and theoretical framework
Authors of studies present the literature review and theoretical framework in different ways. Many research articles merge the two. This section includes the main concepts investigated and may be called "Review of the Literature," "Literature Review," "Theoretical Framework," "Related Literature," "Background," or "Conceptual Framework," or it may not be labeled at all (see Chapters 2 and 3). By reviewing Appendices A through D, you will find differences in the headings used. Nyamathi et al. (2015) (see Appendix A) present the literature review without a label but do include a section labeled "Theoretical Framework," while the study in Appendix B has a literature review and a conceptual framework integrated at the beginning of the article. One style is not better than another; the studies in the appendices contain all the critical elements but present them differently.
HELPFUL HINT Not all research articles include headings for each step or component of the research process, but each step is presented at some point in the article.
Hypothesis/research question
A study's research questions or hypotheses can also be presented in different ways (see Chapter 2). Research articles often do not have separate headings for reporting the "Hypotheses" or "Research Questions." They are often embedded in the "Introduction" or "Background" section, or not labeled at all (e.g., as in the studies in the appendices). If a study uses hypotheses, the researcher may report whether the hypotheses were or were not supported toward the end of the article, in the "Results" or "Findings" section. Quantitative research studies have hypotheses or research questions. Qualitative research studies do not have hypotheses but do have research questions and purposes. The studies in Appendices A, B, and D have hypotheses. The study in Appendix C does not, since it is a qualitative study; rather, it has a purpose statement.
Research design
The type of research design can be found in the abstract, within the purpose statement, or in the introduction to the "Procedures" or "Methods" section, or it may not be stated at all (see Chapters 6, 9, and 10). For example, the studies in Appendices A, B, and D identify the design in the abstract.
One of your first objectives is to determine whether the study is qualitative (see Chapters 5 and 6) or quantitative (see Chapters 8, 9, and 10). Although the rigor of the critical appraisal criteria does not substantially change, some of the terminology of the questions differs for qualitative versus quantitative studies. Do not get discouraged if you cannot easily determine the design. One of the best strategies is to review the chapters that address designs. The following tips will help you determine whether the study you are reading employs a quantitative design (a short illustrative sketch follows the list):
• Hypotheses are stated or implied (see Chapter 2).
• The terms control and treatment group appear (see Chapter 9).
• The terms survey, correlational, case control, or cohort are used (see Chapter 10).
• The terms random or convenience are mentioned in relation to the sample (see Chapter 12).
• Variables are measured by instruments or scales (see Chapter 14).
• Reliability and validity of instruments are discussed (see Chapter 15).
• Statistical analyses are used (see Chapter 16).
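As a rough, study-skills illustration of how these textual cues can be used, the toy sketch below scans an article's text for the quantitative indicators just listed. It is not an appraisal tool; the keyword list is simply an assumption drawn from the bullets above.

```python
# Toy heuristic: flag likely-quantitative wording in an article's text.
QUANTITATIVE_CUES = [
    "hypothesis", "control group", "treatment group", "survey",
    "correlational", "case control", "cohort", "random", "convenience",
    "instrument", "scale", "reliability", "validity", "statistical",
]

def quantitative_cues_found(article_text):
    """Return the cues that appear in the (lowercased) article text."""
    text = article_text.lower()
    return [cue for cue in QUANTITATIVE_CUES if cue in text]

sample = ("Participants were randomly assigned to a treatment group or a "
          "control group; anxiety was measured with a 10-item scale.")
print(quantitative_cues_found(sample))
# -> ['control group', 'treatment group', 'random', 'scale']
```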
In contrast, qualitative studies generally do not focus on “numbers.” Some qualitative studies may use standard quantitative terms (e.g., subjects) rather than qualitative terms (e.g., informants). Deciding on the type of qualitative design can be confusing; one of the best strategies is to review the qualitative chapters (see Chapters 5 through 7). Begin trying to link the study’s design with the level of evidence associated with that design as illustrated in Fig. 1.1. This will give you a context for evaluating the strength and consistency of the findings and applicability to practice. Chapters 8 through 11 will help you understand how to link the levels of evidence with quantitative designs. A study may not indicate the specific design used; however, all studies inform the reader of the methodology used, which can help you decide the type of design the authors used to guide the study.
Sampling
The population from which the sample was drawn is discussed in the "Methods" or "Methodology" section under the subheading "Subjects" or "Sample" (see Chapter 12). Researchers should tell you both the population from which the sample was chosen and the number of subjects who participated in the study, as well as whether any subjects dropped out of the study. The authors of the studies in the appendices discuss their samples in enough detail so that the reader is clear about who the subjects are and how they were selected.
Reliability and validity
The discussion of the instruments used to study the variables is usually included in a “Methods” section under the subheading of “Instruments” or “Measures” (see Chapter 14). Usually each instrument (or scale) used in the study is discussed, as well as its reliability and validity (see Chapter 15). The studies in Appendices A, B, and D discuss each of the measures used in the “Methods” section under the subheading “Measures” or “Instruments.” The reliability and validity of each measure is also presented.
In some cases, the reliability and validity of commonly used, established instruments in an article are not presented, and you are referred to other references.
Procedures and collection methods
The data collection procedures, or the individual steps taken to gather measurable data (usually with instruments or scales), are generally found in the "Procedures" section (see Chapter 14). In the studies in Appendices A through D, the researchers indicate in detail how they conducted the study under the subheading "Procedure" or "Instruments and Procedures." Notice that the researchers in each study included in the appendices provided information that the studies were approved by an institutional review board (see Chapter 13), thereby ensuring that each study met ethical standards.
Data analysis/results
The data-analysis procedures (i.e., the statistical tests used and the results of descriptive and/or inferential tests applied in quantitative studies) are presented in the section labeled "Results" or "Findings" (see Chapters 16 and 17). Although qualitative studies do not use statistical tests, the procedures for analyzing the themes, concepts, and/or observational or print data are usually described in the "Method" or "Data Collection" section and reported in the "Results," "Findings," or "Data Analysis" section (see Appendix C and Chapters 5 and 6).
Discussion
The last section of a research study is the "Discussion" (see Chapter 17). In this section the researchers tie together all of the study's pieces and give a picture of the study as a whole. The researchers return to the literature reviewed and discuss how their study is similar to, or different from, other studies. Researchers may report the results and discussion in one section, but they usually report their results in separate "Results" and "Discussion" sections (see Appendices A through D). One method is no better than another. Journal and space limitations determine how these sections will be handled. Any new or unexpected findings are usually described in the "Discussion" section.
Recommendations and implications
In some cases, a researcher reports the implications and limitations based on the findings for practice and education, and recommends future studies in a separate section labeled "Conclusions"; in other cases, this content appears in several sections with titles such as "Discussion," "Limitations," "Nursing Implications," "Implications for Research and Practice," and "Summary." Again, one way is not better than the other—only different.
References
All of the references cited are included at the end of the article. The main purpose of the reference list is to support the material presented by identifying the sources in a manner that allows for easy retrieval. Journals use various referencing styles.
Communicating results
Communicating a study's results can take the form of a published article, poster, or paper presentation. All are valid ways of providing data and have the potential to effect high-quality, research-based patient care. Evidence-based nursing care plans and QI practice protocols, guidelines, or standards are outcome measures that effectively indicate communicated research.
HELPFUL HINT If you have to write a paper on a specific concept or topic that requires you to critique and synthesize the findings from several studies, you might find it useful to create an evidence table of the data (see Chapter 20). Include the following information: author, date, study type, design, level of evidence, sample, data analysis, findings, and implications. A minimal sketch of such a table in code follows.
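To make the hint concrete, here is a minimal sketch of an evidence table built in code, using the columns listed above. The single row is a hypothetical placeholder, not data from any actual study.

```python
import csv

# Columns suggested in the Helpful Hint above.
COLUMNS = ["author", "date", "study type", "design", "level of evidence",
           "sample", "data analysis", "findings", "implications"]

# Hypothetical placeholder entry; real rows come from your appraisal notes.
rows = [{
    "author": "Example, A.",
    "date": "2017",
    "study type": "quantitative",
    "design": "randomized controlled trial",
    "level of evidence": "II",
    "sample": "n = 120 adult inpatients",
    "data analysis": "t-tests, logistic regression",
    "findings": "intervention group improved on the primary outcome",
    "implications": "supports piloting the intervention on similar units",
}]

with open("evidence_table.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=COLUMNS)
    writer.writeheader()    # one header row with the column names
    writer.writerows(rows)  # one row per appraised study
print("Wrote evidence_table.csv")
```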
Systematic reviews: Meta-analyses, integrative reviews, and meta-syntheses

Systematic reviews
Other article types that are important to understand for evidence-based practice are review articles. Review articles include systematic reviews, meta-analyses, integrative reviews (sometimes called narrative reviews), meta-syntheses, and meta-summaries. A systematic review is a summation and assessment of a group of research studies that test a similar research question. Systematic reviews are based on a clear question, a detailed plan that includes a search strategy, and appraisal of a group of studies related to the question. If statistical techniques are used to summarize and assess studies, the systematic review is labeled a meta-analysis. A meta-analysis is a summary of a number of studies focused on one question or topic that uses a specific statistical methodology to synthesize the findings in order to draw conclusions about the area of focus. An integrative review is a focused review and synthesis of research or theoretical literature in a particular focus area that includes specific steps of literature integration and synthesis without statistical analysis; it can include both quantitative and qualitative articles (Cochrane Consumer Network, 2016; Uman, 2011; Whittemore, 2005). At times the terms systematic review and integrative review are used interchangeably. Both meta-synthesis and meta-summary are syntheses of a number of qualitative research studies on a focused topic using specific qualitative methodology (Kastner et al., 2016; Sandelowski & Barroso, 2007).
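Because a meta-analysis is defined by its statistical synthesis, a small numeric illustration may help. The sketch below pools three hypothetical study effect estimates using fixed-effect inverse-variance weighting, one common meta-analytic method; all numbers are invented for the example.

```python
# Fixed-effect inverse-variance pooling of hypothetical effect estimates.
# Each tuple is (effect estimate, standard error) from one study.
studies = [(0.30, 0.10), (0.45, 0.15), (0.25, 0.12)]

weights = [1 / se ** 2 for _, se in studies]  # weight = 1 / variance
pooled = sum(w * est for (est, _), w in zip(studies, weights)) / sum(weights)
pooled_se = (1 / sum(weights)) ** 0.5         # SE of the pooled estimate

print(f"Pooled effect: {pooled:.3f} (SE {pooled_se:.3f})")
# Larger, more precise studies (smaller SE) get more weight in the pool.
```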
The components of review articles will be discussed in greater detail in Chapters 6, 11, and 20. These articles take a number of studies related to a clinical question and, using a specific set of criteria and methods, evaluate the studies as a whole. While they may vary somewhat in approach, these reviews all help to better inform and develop evidence-based practice. The article in Appendix E is an example of a systematic review that is a meta-analysis.
Clinical guidelines
Clinical guidelines are systematically developed statements or recommendations that serve as a guide for practitioners. Two types of clinical guidelines will be discussed throughout this text: consensus, or expert-developed, guidelines and evidence-based guidelines. Consensus guidelines, or expert-developed guidelines, are developed by agreement of experts in the field. Evidence-based guidelines are those developed using published research findings. Guidelines are developed to help bridge practice and research and are produced by professional organizations, government agencies, institutions, or convened expert panels. Clinical guidelines provide clinicians with an algorithm for clinical management or decision making for specific diseases (e.g., breast cancer) or treatments (e.g., pain management). Not all clinical guidelines are well developed and, like research, they must be assessed before implementation. Though they are systematically developed and make explicit recommendations for practice, clinical guidelines may be formatted differently. Guidelines for practice are becoming more important as third-party and government payers require practice to be based on evidence. Guidelines should present the scope and purpose of the practice, detail who was included in the development group, demonstrate scientific rigor, be clear in their presentation, demonstrate clinical applicability, and demonstrate editorial independence (see Chapter 11).
Quality improvement
As a health care provider, you are responsible for continuously improving the quality and safety of health care for your patients and their families through systematic redesign of the health care systems in which you work. The Institute of Medicine (2001) defined quality health care as care that is safe, effective, patient-centered, timely, efficient, and equitable. Therefore, the goal of QI is to bring about measurable changes across these six domains by applying specific methodologies within a care setting. While several QI methods exist, the core steps for improvement commonly include the following (a brief numeric illustration follows the list):
• Conducting an assessment
• Setting specific goals for improvement
• Identifying ideas for changing current practice
• Deciding how improvements in care will be measured
• Rapidly testing practice changes
• Measuring improvements in care
• Adopting the practice change as a new standard of care
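As a small numeric illustration of the measurement steps above, the sketch below compares a hypothetical unit's central line-associated infection rate before and after a practice change; all counts are invented.

```python
# Hypothetical QI measure: infection rate per 1,000 central-line days.
before_infections, before_line_days = 12, 4000
after_infections, after_line_days = 6, 4200

rate_before = before_infections / before_line_days * 1000  # 3.00
rate_after = after_infections / after_line_days * 1000     # ~1.43

relative_reduction = (rate_before - rate_after) / rate_before * 100
print(f"Rate fell from {rate_before:.2f} to {rate_after:.2f} per 1,000 "
      f"line-days ({relative_reduction:.0f}% relative reduction).")
```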
Chapter 21 focuses on building your competence to participate in and lead QI projects by providing an overview of the evolution of QI in health care, including the nurse’s role in meeting current regulatory requirements for patient care quality. Chapter 19 discusses QI models and tools, such as cause-and-effect diagrams and process mapping, as well as skills for effective teamwork and leadership that are essential for successful QI projects.
As you venture through this textbook, you will be challenged to think not only about reading and understanding research studies, but also about applying the findings to your practice. Nursing has a rich legacy of research that has grown in depth and breadth. Producers of research and clinicians must engage in a joint effort to translate findings into practice that will make a difference in the care of patients and families.
Key points
• Research provides the basis for expanding the unique body of scientific evidence that forms the foundation of evidence-based nursing practice. Research links education, theory, and practice.
• As consumers of research, nurses must have a basic understanding of the research process and critical appraisal skills to evaluate research evidence before applying it to clinical practice.
• Critical appraisal is the process of evaluating the strengths and weaknesses of a research article for scientific merit and application to practice, theory, or education; the need for more research on the topic or clinical problem is also addressed at this stage.
• Critical appraisal criteria are the measures, standards, evaluation guides, or questions used to judge the worth of a research study.
• Critical reading skills will enable you to evaluate the appropriateness of the content of a research article, apply standards or critical appraisal criteria to assess the study’s scientific merit for use in practice, or consider alternative ways of handling the same topic.
• A level of evidence model is a tool for evaluating the strength (quality, quantity, and consistency) of a research study and its findings.
• Each article should be evaluated for the study’s strength and consistency of evidence as a means of judging the applicability of findings to practice.
• Research articles have different formats and styles depending on journal manuscript requirements and whether they are quantitative or qualitative studies.
• Evidence-based practice and QI begin with the careful reading and understanding of each article contributing to the practice of nursing, clinical expertise, and an understanding of patient values.
• QI processes are aimed at improving clinical care outcomes for patients and better methods of system performance.
Critical thinking challenges
• How might nurses discuss the differences between evidence-based practice and research with their colleagues in other professions?
• From your clinical practice, discuss several strategies nurses can undertake to promote evidence-based practice.
• What are some strategies you can use to develop a more comprehensive critique of an evidence-based practice article?
• A number of different components are usually identified in a research article. Discuss how these sections link with one another to ensure continuity.
• How can QI data be used to improve clinical practice?
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
References
1. Agency for Healthcare Research and Quality. Systems to rate the strength of scientific evidence. File inventory, Evidence Report/Technology Assessment No. 47. AHRQ Publication No. 02-E016; 2002.
2. Al-Mallah M.H., Farah I., Al-Madani W., et al. The impact of nurse-led clinics on the mortality and morbidity of patients with cardiovascular diseases: A systematic review and meta-analysis. Journal of Cardiovascular Nursing. 2015;31(1):89-95. Available at: doi:10.1097/JCN.00000000000000224.
3. American Nurses Association (ANA). Code of ethics for nurses with interpretive statements. Washington, DC: The Association; 2015.
4. Cochrane Consumer Network. The Cochrane Library; 2016. Available at: www.cochranelibrary.com.
5. Cronenwett L., Sherwood G., Barnsteiner J., et al. Quality and safety education for nurses. Nursing Outlook. 2007;55(3):122-131.
6. Institute of Medicine (IOM). The future of nursing: Leading change, advancing health. Washington, DC: National Academies Press; 2011.
7. Institute of Medicine Committee on Quality of Health Care in America. Crossing the quality chasm: A new health system for the 21st century. Washington, DC: National Academy Press; 2001.
8. Kastner M., Antony J., Soobiah C., et al. Conceptual recommendations for selecting the most appropriate knowledge synthesis method to answer research questions related to complex evidence. Journal of Clinical Epidemiology. 2016;73:43-49.
9. Nyamathi A., Salem B.E., Zhang S., et al. Nursing case management, peer coaching, and hepatitis A and B vaccine completion among homeless men recently released on parole. Nursing Research. 2015;64:177-189. Available at: doi:10.1097/NNR.0000000000000083.
10. Sackett D.L., Straus S., Richardson S., et al. Evidence-based medicine: How to practice and teach EBM. 2nd ed. London: Churchill Livingstone; 2000.
11. Sandelowski M., Barroso J. Handbook for synthesizing qualitative research. New York, NY: Springer Publishing Company; 2007.
12. Uman L.S. Systematic reviews and meta-analyses. Journal of the Canadian Academy of Child and Adolescent Psychiatry. 2011;20(1):57-59.
13. vanDijk J.F.M., Vervoort S.C.J.M., vanWijck A.J.M., et al. Postoperative patients' perspectives on rating pain: A qualitative study. International Journal of Nursing Studies. 2016;53:260-269.
14. Whittemore R. Combining evidence in nursing research. Nursing Research. 2005;54(1):56-62.
CHAPTER 2
Research questions, hypotheses, and clinical questions
Judith Haber
Learning outcomes
After reading this chapter, you should be able to do the following:
• Describe how the research question and hypothesis relate to the other components of the research process.
• Describe the process of identifying and refining a research question or hypothesis.
• Discuss the appropriate use of research questions versus hypotheses in a research study.
• Identify the criteria for determining the significance of a research question or hypothesis.
• Discuss how the purpose, research question, and hypothesis suggest the level of evidence to be obtained from the findings of a research study.
• Discuss the purpose of developing a clinical question.
• Discuss the differences between a research question and a clinical question in relation to evidence-based practice. • Apply critiquing criteria to the evaluation of a research question and hypothesis in a research report.
KEY TERMS
clinical question
complex hypothesis
dependent variable
directional hypothesis
hypothesis
independent variable
nondirectional hypothesis
population
purpose
research hypothesis
research question
statistical hypothesis
testability
theory
variable
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
At the beginning of this chapter, you will learn about research questions and hypotheses from the perspective of a researcher, which, in the second part of this chapter, will help you generate
your own clinical questions that you will use to guide the development of evidence-based practice projects. From a clinician’s perspective, you must understand the research question and hypothesis as it aligns with the rest of a study. As a practicing nurse, developing clinical questions (see Chapters 19, 20, and 21) is the first step of the evidence-based practice process for quality improvement programs like those that decrease risk for development of pressure ulcers.
When nurses ask questions such as, “Why are things done this way?” “I wonder what would happen if . . . ?” “What characteristics are associated with . . . ?” or “What is the effect of ____ on patient outcomes?”, they are often well on their way to developing a research question or hypothesis. Research questions are usually generated by situations that emerge from practice, leading nurses to wonder about the effectiveness of one intervention versus another for a specific patient population.
The research question or hypothesis is a key preliminary step in the research process. The research question presents the measurable relationship to be examined in a research study. The hypothesis predicts the outcome of a study.
Hypotheses can be considered intelligent hunches, guesses, or predictions that provide researchers with direction for the research design and the collection, analysis, and interpretation of data. Hypotheses are a vehicle for testing the validity of the theoretical framework assumptions and provide a bridge between theory (a set of interrelated concepts, definitions, and propositions) and the real world (see Chapter 4).
For a clinician making an evidence-informed decision about a patient care issue, a clinical question, such as whether chlorhexidine or povidone-iodine is more effective in preventing central line catheter infections, would guide the nurse in searching and retrieving the best available evidence. This evidence, combined with clinical expertise and patient preferences, would provide an answer on which to base the most effective decision about patient care for this population.
Often the research questions or hypotheses appear at the beginning of a research article, but may be embedded in the purpose, aims, goals, or even the results section of the research
report. This chapter provides you with a working knowledge of quantitative research questions and hypotheses. It also highlights the importance of clinical questions and how to develop them.
Developing and refining a research question: Study perspective
A researcher spends a great deal of time refining a research idea into a testable research question. Research questions or topics are not pulled from thin air. As Table 2.1 shows, practical experience, critical appraisal of the scientific literature, or interest in an untested theory can form the basis for the development of a research idea. The research question should reflect a refinement of the researcher's initial thinking. The evaluator of a research study should be able to identify that the researcher has:
• Defined a specific question area
• Reviewed the relevant literature
• Examined the question’s potential significance to nursing
• Pragmatically examined the feasibility of studying the research question
TABLE 2.1 How Practical Experience, Scientific Literature, and Untested Theory Influence the Development of a Research Idea

Area: Clinical experience
Influence: Clinical practice provides a wealth of experience from which research problems can be derived. The nurse may observe a particular event or pattern and become curious about why it occurs, as well as its relationship to other factors in the patient's environment.
Example: Health professionals observe that despite improvements in symptom management for cancer patients receiving chemotherapy, side effects remain highly prevalent. Symptoms such as nausea/vomiting, diarrhea, constipation, and fatigue are common, and patients report that they negatively affect functional status and quality of life, including costly and distressing hospitalizations. A study by Traeger et al. (2015) tested a model integrated into outpatient care for patients with breast cancer, lung cancer, and colorectal cancer, designed to reduce symptom burden, delivered by each patient's oncology team nurse practitioner, and including telephone follow-up, symptom assessment, advice, and triage according to actual clinical practice. The aim was to ensure optimal patient-NP management of side effects early in the course of care.

Area: Critical appraisal of the scientific literature
Influence: Critical appraisal of studies in journals may indirectly suggest a clinical problem by stimulating the reader's thinking. The nurse may observe the outcome data from a single study or a group of related studies that provide the basis for developing a pilot study, quality improvement project, or clinical practice guideline to determine the effectiveness of an intervention in their setting.
Example: At a staff meeting with members of an interprofessional team at a cancer center, it was noted that the center did not have a standardized clinical practice guideline for mucositis, a painful chemotherapy side effect involving the oral cavity that has a negative impact on nutrition, oral hygiene, and comfort. The team wanted to identify the most effective approaches for treating adults and children experiencing mucositis. Their search for, and critical appraisal of, existing research studies led the team to develop an interprofessional mucositis guideline that was relevant to their patient population and clinical setting (NYU Langone Medical Center, 2016).

Area: Gaps in the literature
Influence: A research idea may also be suggested by a critical appraisal of the literature that identifies gaps in the literature and suggests areas for future study. Research ideas also can be generated by research reports that suggest the value of replicating a particular study to extend or refine the existing scientific knowledge base.
Example: Obesity is a widely recognized risk factor for many conditions treated in primary care settings, including type 2 diabetes, cardiovascular disease, hypertension, and osteoarthritis. Although achieving a healthy weight for children and adults is a Healthy People 2020 goal and a national priority, the prevalence of obesity remains high, and there is little research on targeted interventions for weight loss in primary care settings. Therefore, the purpose of a study by Thabault, Burke, and Ades (2015) was to evaluate an NP-led motivational interviewing IBT program implemented in an adult primary care practice with obese patients to determine feasibility and acceptance of the intervention.

Area: Interest in untested theory
Influence: Verification of a theory and its concepts provides a relatively uncharted area from which research problems can be derived. Inasmuch as theories themselves are not tested, a researcher may consider investigating a concept or set of concepts related to a nursing theory or a theory from another discipline. The researcher would pose questions like, "If this theory is correct, what kind of behavior would I expect to observe in particular patients, and under which conditions?" or "If this theory is valid, what kind of supporting evidence will I find?"
Example: Bandura's (1997) health self-efficacy construct, an individual's confidence in the ability to perform a behavior, overcome barriers to that behavior, and exert control over the behavior through self-regulation and goal setting, was used by Richards, Ogata, and Cheng (2016) to investigate whether health-related self-efficacy provides the untested theoretical foundation for behavior change related to increasing physical activity using a dog walking (Dogs PAW) intervention.

IBT, Intensive behavioral therapy.
72
Defining the research question
Brainstorming with faculty or colleagues may provide valuable feedback that helps the researcher focus on a specific research question area. Example: ➤ Suppose a researcher told a colleague that her area of interest was health disparities, specifically the effectiveness of peer coaching or case management in improving health outcomes in challenging patient populations such as those who are homeless. The colleague may have asked, "What is it about the topic that specifically interests you?" This conversation may have initiated a chain of thought that resulted in a decision to explore the effectiveness of a nursing case management and peer coaching intervention on hepatitis A and B (HAV and HBV) vaccine completion rates among homeless men recently released on parole (Nyamathi et al., 2015) (see Appendix A). Fig. 2.1 illustrates how a broad area of interest (health disparities, nursing case management, peer coaching) was narrowed to a specific research topic (effectiveness of nursing case management and peer coaching on HAV and HBV vaccine completion among homeless men recently released on parole).
FIG 2.1 Development of a research question.
EVIDENCE-BASED PRACTICE TIP A well-developed research question guides a focused search for scientific evidence about assessing, diagnosing, treating, or providing patients with information about their prognosis related to a specific health problem.
Beginning the literature review
The literature review should reveal a relevant collection of studies and systematic reviews that have been critically examined. Concluding sections in such articles (i.e., the recommendations and implications for practice) often identify remaining gaps in the literature, the need for replication, or the need for additional knowledge about a particular research focus (see Chapter 3). In the previous example, the researcher may have conducted a preliminary review of books and journals for theories and research studies on factors apparently critical to vaccine completion rates for preventable health problems like HAV and HBV, as well as risk factors contributing to the disproportionate impact of HAV and HBV on the homeless, such as risky sexual activity, drug use, substandard living conditions, and older age. These factors, called variables, should be potentially relevant, of interest, and measurable.
EVIDENCE-BASED PRACTICE TIP The answers to questions generated by qualitative data reflect evidence that may provide the first insights about a phenomenon that has not been previously studied.
Other variables, called demographic variables, such as race, ethnicity, gender, age, education, and physical and mental health status, are also suggested as essential to consider. Example: ➤ Despite the availability of the HAV and HBV vaccines, there has been a low completion rate for the three-dose core of the accelerated vaccine series, particularly following release from prison. This information can then be used to further define the research question and continue the search of the literature to identify effective intervention strategies reported in other studies with similar high- risk populations (e.g., homeless) that could be applied to this population. Example: ➤ One study documented the effectiveness of a nurse case management program in improving vaccine completion rates in a group of homeless adults, but no studies were found about the effectiveness of peer coaching. At this point, the researcher could write the tentative research question: “What is the effectiveness of peer coaching and nursing case management on completion of an HAV and HBV vaccine series among homeless men on parole?” You can envision the interrelatedness of the initial
definition of the question area, the literature review, and the refined research question.
HELPFUL HINT Reading the literature review or theoretical framework section of a research article helps you trace the development of the implied research question and/or hypothesis.
Examining significance
When considering a research question, it is crucial that the researcher examine the question's potential significance for nursing. This is sometimes referred to as the "so what" question, because the research question should have the potential to contribute to and extend the scientific body of nursing knowledge. A research question selected for study should meet the following criteria:
• Patients, nurses, the medical community in general, and society will potentially benefit from the knowledge derived from the study.
• Results will be applicable for nursing practice, education, or administration.
• Findings will provide support or lack of support for untested theoretical concepts.
• Findings will extend or challenge existing knowledge by filling a gap or clarifying a conflict in the literature.
• Findings will potentially provide evidence that supports developing, retaining, or revising nursing practices or policies.
If the research question meets none of these criteria, the researcher is wise to extensively revise the question or discard it. Example: ➤ In the previously cited research question, the significance of the question includes the following facts:
• HAV and HBV are vaccine preventable.
• Viral hepatitis disproportionately impacts the homeless.
• Despite its availability, vaccine completion rates are low among high-risk and incarcerated populations.
• Accelerated vaccine programs have shown success in RCT studies.
• The use of nurse case management programs in accelerated vaccine programs also provides evidence of effectiveness.
• Little is known about vaccine completion among ex-offender populations on parole using varying intensities of nurse case management and peer coaches.
• This study sought to fill a gap in the related literature by assessing whether seronegative parolees randomized to one of three intervention conditions were more likely to complete the vaccine series as well as to identify predictors of HAV/HBV vaccine completion.
HIGHLIGHT It is helpful to collaborate with colleagues from other professions to identify an important clinical question that provides data for a quality improvement project on your unit.
The fully developed research question
When a researcher finalizes a research question, the following characteristics should be evident:
• It clearly identifies the variables under consideration.
• It specifies the population being studied.
• It implies the possibility of empirical testing.
Because each element is crucial to developing a satisfactory research question, the criteria will be discussed in greater detail. These elements can often be found in the introduction of the published article; however, they are not always stated explicitly.
Variables
Researchers call the properties that they study "variables." Such properties take on different values. Thus a variable, as the name suggests, is something that varies. Properties that differ from each other, such as age, weight, height, religion, and ethnicity, are examples of variables. Researchers attempt to understand how and why differences in one variable relate to differences in another variable. Example: ➤ A researcher may be concerned about the variable of pneumonia in postoperative patients on ventilators in critical care units. It is a variable because not all critically ill postoperative patients on ventilators have pneumonia. A researcher may also be interested in what other factors can be linked to ventilator-associated pneumonia (VAP). There is clinical evidence to suggest that elevation of the head of the bed and frequent oral hygiene are associated with decreasing risk for VAP. You can see that these factors are also variables that need to be considered in relation to the development of VAP in postoperative patients.
When speaking of variables, the researcher is essentially asking, “Is X related to Y? What is the effect of X on Y? How are X1 and X2 related to Y?” The researcher is asking a question about the relationship between one or more independent variables and a dependent variable. (Note: In cases in which multiple independent or dependent variables are present, subscripts are used to indicate the number of variables under consideration.)
An independent variable, usually symbolized by X, is the variable that has the presumed effect on the dependent variable. In experimental research studies, the researcher manipulates the independent variable (see Chapter 9). In nonexperimental research, the independent variable is not manipulated and is assumed to have occurred naturally before or during the study (see Chapter 10).
The dependent variable, represented by Y, varies with a change in the independent variable. The dependent variable is not manipulated; it is observed and assumed to vary with changes in the independent variable. Predictions are made from the independent variable to the dependent variable. It is the dependent variable that the researcher is interested in understanding, explaining, or predicting. Example: ➤ It might be assumed that the perception of pain intensity (the dependent variable) will vary in relation to a person's gender (the independent variable). In this case, we are trying to explain the perception of pain intensity in relation to gender (i.e., male or female). Although variability in the dependent variable is assumed to depend on changes in the independent variable, this does not imply that there is a causal relationship between X and Y, or that changes in variable X cause variable Y to change.
Table 2.2 presents a number of examples of research questions. Practice substituting other variables for the examples in Table 2.2. You will be surprised at the skill you develop in writing and critiquing research questions with greater ease.
TABLE 2.2 Research Question Format
HBHC, Hospital-based home care.
Although one independent variable and one dependent variable are used in the examples, there is no restriction on the number of variables that can be included in a research question. Research questions that include more than one independent or dependent variable may be broken down into subquestions that are more concise.
Finally, it should be noted that variables are not inherently independent or dependent. A variable that is classified as independent in one study may be considered dependent in another study. Example: ➤ A nurse may review an article about depression that identifies depression in adolescents as predictive of risk for suicide. In this case, depression is the independent variable. When another article about the effectiveness of antidepressant medication alone or in combination with cognitive behavioral therapy (CBT) in decreasing depression in adolescents is considered, change in depression is the dependent variable. Whether a variable is independent or dependent is a function of the role it plays in a particular study.
Population
The population is a well-defined set that has certain characteristics and is either clearly identified or implied in the research question. Example: ➤ In a retrospective cohort study examining the number of ED visits and hospitalizations in two different transition care programs, a research question may ask, "What is the differential effectiveness of a nurse-led versus physician-led intensive home visiting program providing transition care to patients with complex chronic conditions or receiving palliative care (Morrison, Palumbo, & Rambur, 2016)? Does a relationship exist between type of transition care model (nurse-led focused on chronic disease self-management or physician-led focused on palliative care and managing complex chronic conditions) and the number of ED visits and rehospitalizations 120 days pre- and posttransitional care interventions?" This question suggests that the population includes community-residing adults with complex chronic conditions or receiving palliative care who participated in either a nurse-led or physician-led transitional care program.
EVIDENCE-BASED PRACTICE TIP Make sure that the population of interest and the setting have been clearly described so that if you were going to replicate the study, you would know exactly who the study population needed to be.
Testability
The research question must imply that it is testable, that is, measurable by either qualitative or quantitative methods. Example: ➤ The research question "Should postoperative patients control how much pain medication they receive?" is stated incorrectly for a variety of reasons. One reason is that it is not testable; it represents a value statement rather than a research question. A scientific research question must propose a measurable relationship between an independent and a dependent variable. Many interesting and important clinical questions are not valid research questions because they are not amenable to testing.
HELPFUL HINT Remember that research questions are used to guide all types of research studies but are most often used in exploratory, descriptive, qualitative, or hypothesis-generating studies.
The question “What are the relationships between vaccine completion rates among the ex-offender population and use of varying intensities of nurse case management and peer coaches?” is a testable research question. It illustrates the relationship between the variables, identifies the independent and dependent variables, and implies the testability of the research question. Table 2.3 illustrates how this research question is congruent with the three research question criteria.
TABLE 2.3 Components of the Research Question and Related Criteria
This research question was originally derived from a general area of interest: health-seeking behavior and coping (HAV and HBV vaccine completion rates) in a high-risk population (ex-offenders on parole, homeless), factors related to vaccine completion (age, education, race/ethnicity, marital, and parental status), and potential strategies (nurse case management and peer coaching) to improve protective behaviors and health outcomes. The question crystallized further after a preliminary literature review (Nyamathi et al., 2015).
HELPFUL HINT
• Remember that research questions are often not explicitly stated. The reader has to infer the research question from the title of the report, the abstract, the introduction, or the purpose.
• Using your focused question, search the literature for the best available answer to your clinical question.
Study purpose, aims, or objectives
The purpose of the study encompasses the aims or objectives the investigator hopes to achieve with the research. These three terms are synonymous. The researcher selects verbs for the purpose statement that suggest the planned approach to be used when studying the research question, as well as the level of evidence to be obtained through the study findings. Verbs such as discover, explore, or describe suggest an investigation of an infrequently researched topic that might appropriately be guided by research questions rather than hypotheses. In contrast, verb statements indicating that the purpose is to test the effectiveness of an intervention or compare two alternative nursing strategies suggest a hypothesis-testing study for which there is an established knowledge base of the topic.
Remember that when the purpose of a study is to test the effectiveness of an intervention or compare the effectiveness of two or more interventions, the level of evidence is likely to have more strength and rigor than a study whose purpose is to explore or describe phenomena. Box 2.1 provides examples of purpose, aims, and objectives.
BOX 2.1 Examples of Purpose Statements
• The purpose of this study was to explore the relationship between future expectations, attitude toward use of violence to solve problems, and self-reported physical and relational bullying perpetration in a sample of seventh grade students (Stoddard, Varela, & Zimmerman, 2015).
• The aim of this study was to determine knowledge, awareness, and practices of Turkish hospital nurses in relation to cervical cancer, HPV, and the HPV vaccine (Koc & Cinarli, 2015).
• The purposes of this longitudinal study with a sample composed of Hispanic, Black non-Hispanic, and White non-Hispanic bereaved parents were to test the relationships between spiritual/religious coping strategies and grief, mental health, and personal growth for mothers and fathers at 1 and 3 months after the infant/child's death in the NICU/PICU (Hawthorne et al., 2016).
• The goals of the current study were to examine psychological functioning and coping in parents and siblings of adolescent cancer survivors (Turner-Sack et al., 2016).
EVIDENCE-BASED PRACTICE TIP The purpose, aims, or objectives often provide the most information about the intent of the research question and hypotheses, and suggest the level of evidence to be obtained from the findings of the study.
Developing the research hypothesis
Like the research question, hypotheses are often not stated explicitly in a research article. You will often find that hypotheses are embedded in the data analysis, results, or discussion section of the research report. Similarly, the population may not be explicitly stated, but will have been identified in the background, significance, and literature review. It is then up to you to figure out the hypotheses and population being tested. Example: ➤ In a study by Turner-Sack and colleagues (2016) (see Appendix D), the hypotheses are embedded in the "Data Analysis" and "Results" sections of the article. You must interpret the statement, "Independent sample t-tests were conducted to compare the survivors, siblings, and parents on measures of psychological distress, life satisfaction, posttraumatic growth (PTG), and that of their matched parents," to understand that it represents hypotheses used to compare psychological functioning, PTG, coping, and cancer-related characteristics of adolescent cancer survivors' parents and siblings.
Hypotheses flow from the study’s purpose, literature review, and theoretical framework. Fig. 2.2 illustrates this flow. A hypothesis is a declarative statement about the relationship between two or more variables. A hypothesis predicts an expected outcome of a study. Hypotheses are developed before the study is conducted because they provide direction for the collection, analysis, and interpretation of data.
FIG 2.2 Interrelationships of purpose, literature review, theoretical framework, and hypothesis.
HELPFUL HINT When hypotheses are not explicitly stated by the author at the end of the introduction section or just before the methods section, they will be embedded or implied in the data analysis, results, or discussion section of a research article.
Relationship statement
The first characteristic of a hypothesis is that it is a declarative statement that identifies the predicted relationship between two or more variables: the independent variable (X) and a dependent variable (Y). The direction of the predicted relationship is also specified in this statement. Phrases such as greater than, less than, positively, negatively, or difference in suggest the directionality that is proposed in the hypothesis. The following is an example of a directional hypothesis: "Nurse staff members' perception of transformational leadership among their nurse leaders (independent variable) is negatively associated with nurse staff burnout (dependent variable)" (Lewis & Cunningham, 2016). The dependent and independent variables are explicitly identified, and the relational aspect of the prediction in the hypothesis is contained in the phrase "negatively associated with."
The nature of the relationship, either causal or associative, is also implied by the hypothesis. A causal relationship is one in which the researcher can predict that the independent variable (X) causes a change in the dependent variable (Y). In research, it is rare that one is in a firm enough position to take a definitive stand about a cause-and-effect relationship. Example: ➤ A researcher might hypothesize that selected determinants of the decision-making process, specifically expectation, socio-demographic factors, and decisional conflict, would predict postdecision satisfaction and regret about choice of treatment for breast cancer in Chinese-American women (Lee & Knobf, 2015). It would be difficult for a researcher to predict a cause-and-effect relationship, however, because of the multiple intervening variables (e.g., values, culture, role, support from others, personal resources, language literacy) that might also influence the subject's decision making about treatment for their breast cancer diagnosis.
Variables are more commonly related in noncausal ways; that is, the variables are systematically related but in an associative way. This means that the variables change in relation to each other. Example: ➤ There is strong evidence that asbestos exposure is related to lung cancer. It is tempting to state that there is a causal relationship between asbestos exposure and lung cancer. Do not overlook the fact, however, that not all of those who have been exposed to asbestos will have lung cancer, and not all of those who have lung cancer have had asbestos exposure. Consequently, it would be scientifically unsound to take a position advocating the presence of a causal relationship between these two variables. Rather, one can say only that there is an associative relationship between the variables of asbestos exposure and lung cancer, a relationship in which there is a strong systematic association between the two phenomena.
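To make the distinction concrete, the following minimal sketch (in Python, with entirely hypothetical data; neither the numbers nor the variable names come from the text) computes a correlation coefficient, the kind of statistic that quantifies an associative relationship. A large coefficient signals systematic association only; it cannot, by itself, establish causation.

import numpy as np

# Hypothetical exposure scores and disease indicators (1 = disease present)
exposure = np.array([0.1, 0.4, 0.5, 1.2, 1.8, 2.3, 2.9, 3.5])
disease = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Pearson correlation: a measure of systematic association, not causation
r = np.corrcoef(exposure, disease)[0, 1]
print(f"Pearson r = {r:.2f}")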
Testability
The second characteristic of a hypothesis is its testability. This means that the variables of the study must lend themselves to observation, measurement, and analysis. The hypothesis is either supported or not supported after the data have been collected and analyzed. The predicted outcome proposed by the hypothesis will or will not be congruent with the actual outcome when the hypothesis is tested.
HELPFUL HINT When a hypothesis is complex (i.e., it contains more than one independent or dependent variable), it is difficult for the findings to indicate unequivocally that the hypothesis is supported or not supported. In such cases, the reader must infer which relationships are significant in the predicted direction from the findings or discussion section.
Theory base
The third characteristic is that the hypothesis is consistent with an existing body of theory and research findings. Whether a hypothesis is derived from a review of the literature or a clinical observation, it must be based on a sound scientific rationale. You should be able to identify the flow of ideas from the research idea to the literature review, to the theoretical framework, and through the research question(s) or hypotheses. Example: ➤ Nyamathi and colleagues (2015) (see Appendix A) investigated the effectiveness of a nursing case management intervention in comparison to a peer coaching intervention based on the comprehensive health-seeking and coping paradigm developed by Nyamathi in 1989, adapted from a coping model by Lazarus and Folkman (1984) and the health-seeking and coping paradigm by Schlotfeldt (1981), which is a useful theoretical framework for case management, peer coaching interventions, and vaccine completion outcomes.
Wording the hypothesis
As you read the scientific literature and become more familiar with it, you will observe a variety of ways to word a hypothesis, as described in Tables 2.4 and 2.5. Information about hypotheses may be further clarified in the instruments, sample, or methods sections of a research report (see Chapters 12 and 15).
TABLE 2.4 Examples of How Hypotheses are Worded
BP, Blood pressure; CRNA, Certified Nurse Anesthetists; DV, dependent variable; IV, independent variable; TM, telemonitoring; UC, usual care.
TABLE 2.5 Examples of Statistical Hypotheses
ANPs, Adult nurse practitioners; FNPs, family nurse practitioners; DV, dependent variable; IV, independent variable.
Statistical versus research hypotheses
You may observe that a hypothesis is further categorized as either a research or a statistical hypothesis. A research hypothesis, also known as a scientific hypothesis, consists of a statement about the expected relationship of the variables. A research hypothesis indicates what the outcome of the study is expected to be. A research hypothesis is also either directional or nondirectional. If the researcher obtains statistically significant findings for a research hypothesis, the hypothesis is supported. The examples in Table 2.4 represent research hypotheses.
A statistical hypothesis, also known as a null hypothesis, states that there is no relationship between the independent and dependent variables. The examples in Table 2.5 illustrate statistical hypotheses. If, in the data analysis, a statistically significant relationship emerges between the variables at a specified level of significance, the null hypothesis is rejected. Rejection of the statistical hypothesis is equivalent to acceptance of the research hypothesis.
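For readers who want to see how rejection of a null hypothesis plays out in practice, here is a minimal sketch in Python using SciPy's independent-samples t-test. The data and the 0.05 significance level are illustrative assumptions, not values from any study cited in this chapter.

from scipy import stats

# Hypothetical pain-intensity scores for two groups
group_a = [5.1, 6.0, 4.8, 5.5, 6.2, 5.9]
group_b = [4.2, 3.9, 4.5, 4.0, 4.8, 4.1]

# H0 (statistical/null hypothesis): no difference between the group means
t_stat, p_value = stats.ttest_ind(group_a, group_b)
alpha = 0.05  # the specified level of significance

if p_value < alpha:
    # Rejecting the null is equivalent to accepting the research hypothesis
    print(f"p = {p_value:.3f}: reject the null hypothesis")
else:
    print(f"p = {p_value:.3f}: fail to reject the null hypothesis")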
Directional versus nondirectional hypotheses
Hypotheses can be formulated directionally or nondirectionally. A directional hypothesis specifies the expected direction of the relationship between the independent and dependent variables. An example of a directional hypothesis is provided in a study by Parry and colleagues (2015) that investigated a novel noninvasive device to assess sympathetic nervous system functioning in patients with heart failure. The researchers hypothesized that participants with heart failure with reduced ejection fraction (HFrEF), who have internal cardiac defibrillators or CRT pacemakers, will have a decrease in pre-ejection period (reflective of increased sympathetic nervous system activity) and a decrease in left ventricular ejection time (reflective of an increased heart rate) with a postural change from sitting to standing.
In contrast, a nondirectional hypothesis indicates the existence of a relationship between the variables, but does not specify the anticipated direction of the relationship. Example: ➤ Rattanawiboon and colleagues (2016) evaluated the effectiveness of fluoride mouthwash delivery methods, swish, spray, or swab application, in raising salivary fluoride in comparison to conventional fluoride mouthwash, but did not predict which form of fluoride delivery would be most effective. Nurses who are learning to critically appraise research studies should be aware that both the directional and the nondirectional forms of hypothesis statements are acceptable.
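The directional/nondirectional distinction maps onto one-sided versus two-sided statistical tests. The sketch below uses hypothetical data and requires SciPy 1.6 or later for the `alternative` argument; it illustrates the statistical machinery only and does not reproduce any cited study's analysis.

from scipy import stats

treatment = [72, 75, 78, 74, 77, 76]  # hypothetical outcome scores
control = [70, 71, 69, 73, 70, 72]

# Nondirectional hypothesis: a difference exists (direction unspecified)
_, p_two = stats.ttest_ind(treatment, control, alternative="two-sided")

# Directional hypothesis: treatment scores are GREATER than control scores
_, p_one = stats.ttest_ind(treatment, control, alternative="greater")

print(f"two-sided p = {p_two:.3f}, one-sided p = {p_one:.3f}")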
Relationship between the hypothesis and the research design
Regardless of whether the researcher uses a statistical or a research hypothesis, there is a suggested relationship between the hypothesis, the design of the study, and the level of evidence provided by the results of the study. The type of design, experimental or nonexperimental (see Chapters 9 and 10), will influence the wording of the hypothesis. Example: ➤ When an experimental design is used, you would expect to see hypotheses that reflect relationship statements, such as the following:
• X1 is more effective than X2 on Y.
• The effect of X1 on Y is greater than that of X2 on Y.
• The incidence of Y will not differ in subjects receiving X1 and X2 treatments.
• The incidence of Y will be greater in subjects after X1 than after X2.
EVIDENCE-BASED PRACTICE TIP Think about the relationship between the wording of the hypothesis, the type of research design suggested, and the level of evidence provided by the findings of a study using each kind of hypothesis. You may want to consider which type of hypothesis potentially will yield the strongest results applicable to practice.
Hypotheses reflecting experimental designs also test the effect of the experimental treatment (i.e., independent variable X) on the outcome (i.e., dependent variable Y). This suggests that the strength of the evidence provided by the results is Level II (experimental design) or Level III (quasi-experimental design).
In contrast, hypotheses related to nonexperimental designs reflect associative relationship statements, such as the following:
• X will be negatively related to Y.
• There will be a positive relationship between X and Y.
This suggests that the strength of the evidence provided by the results of a study that examined hypotheses with associative relationship statements would be at Level IV (nonexperimental design).
Table 2.6 provides an example of this concept. The Critical Thinking Decision Path will help you determine the type of hypothesis or research question presented in a study.
TABLE 2.6 Elements of a Clinical Question
CAUTIs, Catheter-acquired urinary tract infections.
CRITICAL THINKING DECISION PATH Determining the Use of a Hypothesis or Research Question
Developing and refining a clinical question: A consumer's perspective
Practicing nurses, as well as students, are challenged to keep their practice up to date by searching for, retrieving, and critiquing research articles that apply to practice issues encountered in their clinical setting (see Chapter 20). Practitioners strive to use the current best evidence from research when making clinical and health care decisions. As research consumers, you are not conducting research studies; however, your need for information from clinical practice is converted into focused, structured clinical questions that are the foundation of evidence-based practice and quality improvement projects. Clinical questions often arise from clinical situations for which there are no ready answers. You have probably had the experience of asking, "What is the most effective treatment for . . . ?" or "Why do we still do it this way?"
Using similar criteria related to framing a research question, focused clinical questions form a basis for searching the literature to identify supporting evidence from research. Clinical questions have four components:
• Population
• Intervention
• Comparison
• Outcome
These components, known as PICO, provide an effective format for helping nurses develop searchable clinical questions. Box 2.2 presents each component of the clinical question.
BOX 2.2 Components of a Clinical Question Using the PICO Format
Population: The individual patient or group of patients with a particular condition or health care problem (e.g., adolescents age 13–18 with type 1 insulin-dependent diabetes)
Intervention: The particular aspect of health care that is of interest to the nurse or the health team (e.g., a therapeutic [inhaler or nebulizer for treatment of asthma], a preventive [pneumonia vaccine], a diagnostic [measurement of blood pressure], or an organizational [implementation of a bar coding system to reduce medication errors] intervention)
Comparison intervention: Standard care or no intervention (e.g., antibiotic in comparison to ibuprofen for children with otitis media); a comparison of two treatment settings (e.g., rehabilitation center vs. home care)
Outcome: More effective outcome (e.g., improved glycemic control, decreased hospitalizations, decreased medication errors)
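Because each PICO component must be made explicit before a search can begin, it can help to think of a clinical question as a structured record. The following minimal Python sketch (the class name and question template are illustrative assumptions, not part of the PICO standard) captures the four components for the nurse-led catheter rounds example developed later in this section.

from dataclasses import dataclass

@dataclass
class ClinicalQuestion:
    population: str     # P: patient population or problem
    intervention: str   # I: intervention of interest
    comparison: str     # C: comparison intervention
    outcome: str        # O: outcome

    def as_question(self) -> str:
        return (f"In {self.population}, does {self.intervention}, "
                f"compared with {self.comparison}, lead to {self.outcome}?")

cauti = ClinicalQuestion(
    population="hospitalized older adults with indwelling urinary catheters",
    intervention="daily nurse-led catheter rounds",
    comparison="usual care",
    outcome="a decrease in CAUTIs",
)
print(cauti.as_question())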
The significance of the clinical question becomes obvious as research evidence from the literature is critically appraised. Research evidence is used together with clinical expertise and the patient's perspective to confirm, develop, or revise nursing standards, protocols, and policies that are used to plan and implement patient care (Cullum, 2000; Sackett et al., 2000; Thompson et al., 2004). Issues or questions can arise from multiple clinical and managerial situations. Using the example of catheter-acquired urinary tract infections (CAUTIs), a team of staff nurses working on a medical unit in an acute care setting were reviewing their unit's quarterly quality improvement data and observed that the number of CAUTIs had increased by 25% over the past 3 months. The nursing staff reviewed the unit's standard of care and noted that although nurses were able to discontinue an indwelling catheter according to a set of criteria and without a physician order, catheters were remaining in place for what they thought was too long, potentially contributing to an increase in the prevalence of CAUTIs. To focus the nursing staff's search of the literature, they developed the following question: Does the use of daily nurse-led catheter rounds in hospitalized older adults with indwelling urinary catheters lead to a decrease in CAUTIs? Sometimes it is helpful for nurses who develop clinical questions from a quality improvement perspective to consider three elements as they frame their focused question: (1) the situation, (2) the intervention, and (3) the outcome.
• The situation is the patient or problem being addressed. This can be a single patient or a group of patients with a particular health problem (e.g., hospitalized adults with indwelling urinary catheters).
• The intervention is the dimension of health care interest, and often asks whether a particular intervention is a useful treatment (e.g., daily nurse-led catheter rounds).
• The outcome addresses the effect of the treatment (e.g., intervention) for this patient or patient population in terms of quality and cost (e.g., decreased CAUTIs). It essentially answers whether the intervention makes a difference for the patient population.
The individual parts of the question are vital pieces of information to remember when it comes to searching for evidence in the literature. One of the easiest ways to do this is to use a table, as illustrated in Table 2.6. Examples of clinical questions are highlighted in Box 2.3. Chapter 3 provides examples of how to effectively search the literature to find answers to questions posed by researchers and research consumers.
BOX 2.3 Examples of Clinical Questions
• Does using a Discharge Bundle combined with Teach-Back Methodology reduce pediatric readmissions? (Shermont et al., 2016)
• What is the most effective IV insulin practice guideline for cardiac surgery patients? (Westbrook et al., 2016)
• Does using a structured content and electronic nursing handover reduce patient clinical management errors? (Johnson et al., 2016)
• What is the impact of nursing teamwork on nurse-sensitive quality indicators? (Rahn, 2016)
• Do PCMH access and care coordination measures reflect the contributions of all team members? (Annis et al., 2016)
• Is a patient-family-staff partnership video the most effective approach for preventing falls in hospitalized patients? (Silkworth et al., 2016)
• What is the impact of prompt nutrition care on patient outcomes and health care costs? (Meehan et al., 2016)
PCMH, Patient-centered medical home.
EVIDENCE-BASED PRACTICE TIP You should be formulating clinical questions that arise from your clinical practice. Once you have developed a focused clinical question using the PICO format, you will search the literature for the best available evidence to answer your clinical question.
Appraisal for evidence-based practice: The research question and hypothesis
When you begin to critically appraise a research study, consider the care the researcher takes when developing the research question or hypothesis; it is often representative of the overall conceptualization and design of the study. In a quantitative research study, the remainder of the study revolves around answering the research question or testing the hypothesis. In a qualitative research study, the objective is to answer the research question. Because this text focuses on you as a research consumer, the following sections will primarily pertain to the evaluation of research questions and hypotheses in published research reports.
Critiquing the research question and hypothesis
The following Critical Appraisal Criteria box provides several criteria for evaluating the initial phase of the research process: the research question or hypothesis. Because the research question or hypothesis guides the study, it is usually introduced at the beginning of the research report to indicate the focus and direction of the study. You can then evaluate whether the rest of the study logically flows from its foundation, the research question or hypothesis. The author will often begin by identifying the background and significance of the issue that led to crystallizing development of the research question or hypothesis. The clinical and scientific background and/or significance will be summarized, and the purpose, aim, or objective of the study is then identified.
Often the research question or hypothesis will be proposed before or after the literature review. Sometimes you will find that the research question or hypothesis is not specifically stated. In some cases, it is only hinted at or is embedded in the purpose statement, and you are challenged to identify the research question or hypothesis. In other cases, the research question is embedded in the findings toward the end of the article. To some extent, this depends on the style of the journal.
Although a hypothesis can legitimately be nondirectional, it is preferable, and more common, for the researcher to indicate the direction of the relationship between the variables in the hypothesis. Quantifiable words such as “greater than,” “less than,” “decrease,” “increase,” and “positively,” “negatively,” or “related” convey the idea of objectivity and testability. You should immediately be suspicious of hypotheses or research questions that are not stated objectively. You will find that when there is a lack of data available for the literature review (i.e., the researcher has chosen to study a relatively undefined area of interest), a nondirectional hypothesis or research question may be appropriate.
You should recognize that how the proposed relationship of the hypothesis or research question is phrased suggests the type of research design that will be appropriate for the study, as well as the level of evidence to be derived from the findings. Example: ➤ If a hypothesis proposes that treatment X1 will have a greater effect on Y than treatment X2, an experimental (Level II evidence) or quasi-experimental design (Level III evidence) is suggested (see Chapter 9). If a research question asks if there will be a positive relationship between variables X and Y, a nonexperimental design (Level IV evidence) is suggested (see Chapter 10).
Hypotheses and research questions are never proven beyond the shadow of a doubt. Researchers who claim that their data have “proven” the validity of their hypothesis or research question should be regarded with grave reservation. You should realize that, at best, findings that support a hypothesis or research question are considered tentative. If repeated replication of a study yields the same results, more confidence can be placed in the conclusions advanced by the researchers.
When critically appraising clinical questions, think about the fact that the clinical question should be focused and specify the patient population or clinical problem being addressed, the intervention, and the outcome for a particular patient population. There should be evidence that the clinical question guided the literature search and that appropriate types of research studies are retrieved in terms of the study design and level of evidence needed to answer the clinical question.
CRITICAL APPRAISAL CRITERIA Developing Research Questions and Hypotheses
The research question
1. Does the research question express a relationship between two or more variables, or at least between an independent and a dependent variable, implying empirical testability?
2. How does the research question specify the nature of the population being studied?
3. How has the research question been supported with adequate experiential and scientific background material?
4. How has the research question been placed within the context of an appropriate theoretical framework?
5. How has the significance of the research question been identified?
6. Have pragmatic issues, such as feasibility, been addressed?
7. How have the purpose, aims, or goals of the study been identified?
The hypothesis
1. Is the hypothesis concisely stated in a declarative form?
2. Are the independent and dependent variables identified in the statement of the hypothesis?
3. Is each hypothesis specific to one relationship so that each hypothesis can be either supported or not supported?
4. Is the hypothesis stated in such a way that it is testable?
5. Is the hypothesis stated objectively, without value-laden words?
6. Is the direction of the relationship in each hypothesis clearly stated?
7. How is each hypothesis consistent with the literature review?
8. How is the theoretical rationale for the hypothesis made explicit?
9. Given the level of evidence suggested by the research question, hypothesis, and design, what is the potential applicability to practice?
The clinical question
1. Does the clinical question specify the patient population, intervention, comparison intervention, and outcome?
2. Does the clinical question address an outcome applicable to practice?
Key points
• Developing the research question and stating the hypothesis are key preliminary steps in the research process.
• The research question is refined through a process that proceeds from the identification of a general idea of interest to the definition of a more specific and circumscribed topic.
• A preliminary literature review reveals related factors that appear critical to the research topic of interest and helps further define the research question.
• The significance of the research question must be identified in terms of its potential contribution to patients, nurses, the medical community in general, and society. Applicability of the question for nursing practice, as well as its theoretical relevance, must be established. The findings should also have the potential for formulating or altering nursing practices or policies.
• The final research question is a statement about the relationship of two or more variables. It clearly identifies the relationship between the independent and dependent variables, specifies the nature of the population being studied, and implies the possibility of empirical testing.
• Research questions that are nondirectional may be used in exploratory, descriptive, or qualitative research studies.
• Research questions can be directional, depending on the type of study design being used.
• Focused clinical questions arise from clinical practice and guide the literature search for the best available evidence to answer the clinical question.
• A hypothesis is a declarative statement about the relationship between two or more variables that predicts an expected outcome. Characteristics of a hypothesis include a relationship statement, implications regarding testability, and consistency with a defined theory base.
• Hypotheses can be formulated in a directional or a nondirectional manner and be further categorized as either research or statistical hypotheses.
• The purpose, aims, or objectives provide information about the intent of the research question and hypothesis and suggest the level of evidence to be obtained from the study findings.
• The interrelatedness of the research question or hypothesis, the literature review, and the theoretical framework should be apparent.
• The appropriateness of the research design suggested by the research question or hypothesis is also evaluated.
Critical thinking challenges
• Discuss how the wording of a research question or hypothesis suggests the type of research design and level of evidence that will be provided.
• Using the study by Hawthorne, Youngblut, and Brooten (2016) (see Appendix B), describe how the background, significance, and purpose of the study are linked to the research questions.
• The prevalence of catheter-acquired urinary tract infections (CAUTIs) has increased on your hospital unit by 10% in the last two quarters. As a member of the Quality Improvement (QI) Committee on your unit, collaborate with your committee colleagues from other professions to develop an interprofessional action plan. Deliberate to develop a clinical question to guide the QI project.
• A nurse is in charge of discharge planning for frail older adults with congestive heart failure. The goal of the program is to promote self-care and prevent rehospitalizations. Using the PICO approach, the nurse wants to develop a clinical question for an evidence-based practice project to evaluate the effectiveness of discharge planning for this patient population. How can the nurse accomplish that objective?
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
References
1. Angelhoff C, Edell-Gustafsson U, Morelius E. Sleep of parents living with a child receiving hospital-based home care. Nursing Research 2015;64(5):372-380.
2. Annis A.M, Harris M, Robinson C.H, Krein S.L. Do patient- centered medical home access and care coordination measures reflect the contributions of all team members? A systematic review. Journal of Nursing Care Quality 2016;31(4):357-366.
3. Bandura A. Self-efficacy: The exercise of control. New York: Freeman; 1997.
4. Cullum N. User's guides to the nursing literature: An introduction. Evidence-Based Nursing 2000;3(2):71-72.
5. Hawthorne D.M, Youngblut J.M, Brooten D. Parent spirituality, grief, and mental health at 1 and 3 months after their infant's/child's death in an intensive care unit. Journal of Pediatric Nursing 2016;31:73-80.
6. Johnson M, Sanchez P, Zheng C. Reducing patient clinical management errors using structured content and electronic nursing handover. Journal of Nursing Care Quality 2016;31(3):245-253.
7. Lazarus R.S, Folkman S. Stress, appraisal, and coping. New York, NY: Springer; 1984.
8. Lee S.C, Knobf M.T. Primary breast cancer decision-making among Chinese American women. Nursing Research 2015;64(5):391-401.
9. Lewis H.S, Cunningham C.J.L. Linking nurse leadership and work characteristics to nurse burnout and engagement. Nursing Research 2016;65(1):13-23.
10. Meehan A, Loose C, Bell J, et al. Impact of prompt nutrition care on patient outcomes and health care costs. Journal of Nursing Care Quality 2016;31(3):217-223.
11. Morrison J, Palumbo M.V, Rambur B. Reducing preventable hospitalizations with two models of transitional care. Journal of Nursing Scholarship 2016;48(3):322-329.
12. Nyamathi A, Salem B.E, Zhang S, et al. Nursing case management, peer coaching, and hepatitis A and B vaccine completion among homeless men recently released on parole: Randomized clinical trial. Nursing Research 2015;64(3):177-189.
13. NYU Langone Medical Center. New York, NY: Personal Communication; 2016.
14. Parry M, Nielson C.A, Muckle F, et al. A novel noninvasive device to assess sympathetic nervous system function in patients with heart failure. Nursing Research 2015;64(5):351-360.
15. Rahn D. Transformational teamwork: Exploring the impact of nursing teamwork on nurse-sensitive quality indicators. Journal of Nursing Care Quality 2016;31(3):262-268.
16. Rattanawiboon C, Chaweewannakorn C, Saisakphong T, et al. Effective fluoride mouthwash delivery methods as an alternative to rinsing. Nursing Research 2016;65(1):68-75.
17. Richards E.A, Ogata N, Cheng C. Evaluation of the dogs, physical activity, and walking dogs (Dogs PAW) intervention. Nursing Research 2016;65(3):191-201.
18. Sackett D, Straus S.E, Richardson W.S, et al. Evidence-based medicine: How to practice and teach EBM. London: Churchill Livingstone; 2000.
19. Schlotfeldt R. Nursing in the future. Nursing Outlook 1981;29:295-301.
20. Shermont H, Pignataro S, Humphrey K, Bukoye B. Reducing pediatric readmissions: Using a discharge bundle combined with teach-back methodology. Journal of Nursing Care Quality 2016;31(3):224-232.
21. Silkworth A.I, Baker J, Ferrara J, et al. Nursing staff develop a video to prevent falls: A quality improvement project. Journal of Nursing Care Quality 2016;31(1):217-223.
22. Stoddard S.A, Varela J.J, Zimmerman M. Future expectations, attitude toward violence, and bullying perpetration during adolescence: A mediation evaluation. Nursing Research 2015;64(6):422-433.
23. Thabault P.J, Burke P.J, Ades P.A. Intensive behavioral treatment weight loss program in an adult primary care practice. Journal of the American Association of Nurse Practitioners 2015;28:249-257.
24. Thompson C, Cullum N, McCaughan D, et al. Nurses, information use, and clinical decision-making: The real world potential for evidence-based decisions in nursing. Evidence-Based Nursing 2004;7(3):68-72.
25. Traeger L, McDonnell T.M, McCarty C.E, et al. Nursing intervention to enhance outpatient chemotherapy symptom management: Patient-reported outcomes of a randomized controlled trial. Cancer Nursing 2015;121(21):3905-3913.
26. Turner-Sack A.M, Menna R, Setchell S.R, et al. Psychological functioning, post-traumatic growth, and coping in parents and siblings of adolescent cancer survivors. Oncology Nursing Forum 2016;43(1):48-56.
27. Westbrook A, Sherry D, McDermott M, et al. Examining IV insulin practice guidelines: Nurses evaluating quality outcomes. Journal of Nursing Care Quality 2016;31(4):344-349.
CHAPTER 3
Gathering and appraising the literature
Barbara Krainovich-Miller
Learning outcomes
After reading this chapter, you should be able to do the following:
• Discuss the purpose of a literature review in a research study.
• Discuss the purpose of reviewing the literature for an evidence-based practice and quality improvement (QI) project.
• Differentiate the purposes of a literature review from the evidence-based practice and the research perspective.
• Differentiate between primary and secondary sources.
• Differentiate between systematic reviews/meta-analyses and preappraised synopses.
• Discuss the purpose of reviewing the literature for developing evidence-based practice and QI projects.
• Use the PICO format to guide a search of the literature.
• Conduct an effective search of the literature.
• Apply critical appraisal criteria for the evaluation of literature reviews in research studies.
KEY TERMS
Boolean operator
citation management software
controlled vocabulary
electronic databases
electronic search
Grey literature
literature review
preappraised synopses
primary source
refereed, or peer-reviewed, journals
secondary source
web browser
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
You may wonder why an entire chapter of a research text is devoted to gathering and appraising the literature. The main reason is that searching for, retrieving, and critically appraising the literature is a key step for researchers and for practitioners who are basing their practice on evidence. Searching for, retrieving, critically appraising, and synthesizing research evidence is essential to support an evidence-based practice (EBP). A question you might ask is, "Will knowing more about how to search efficiently and critically appraise research really help me as a student and as a practicing nurse?" The answer is, "Yes, it most certainly will!" Your ability to locate, retrieve, critically appraise, and synthesize research articles will enable you to determine whether or not you have the best available evidence to inform your clinical practice (CP).
The critical appraisal of research studies is an organized, systematic approach to evaluating a research study or group of studies using a set of standardized critical appraisal criteria. The criteria are used to objectively determine the strength, quality, quantity, and consistency of evidence provided by the available literature to determine its applicability to practice, policy, and education (see Chapters 7, 11, and 18).
The purpose of this chapter is to introduce you to how to evaluate the literature review in a research study and how to critically appraise a group of studies for EBP and quality improvement (QI) projects. This chapter provides you with the tools to (1) locate, search, and retrieve individual research studies, systematic reviews/meta-analyses, and meta-syntheses (see Chapters 6, 9, 10, and 11), and other documents (e.g., CP guidelines); (2) differentiate between a research article and a theoretical/conceptual article or book; and (3) critically appraise a research study or group of research studies. These tools will help you develop the competencies needed to develop EBP and QI projects.
Review of the literature
The literature review: The researcher's perspective
The overall purpose of the literature review in a study is to present a systematic state of the science (i.e., what research exists) on a topic. In Box 3.1, Objectives 1 to 8 and 11 present the main purposes of a literature review found in a research article. In a published study, the literature review generally appears near the beginning of the report and may or may not be labeled. It provides an abbreviated version of the literature review conducted by a researcher and represents the building blocks, or framework, of the study. Keep in mind that researchers are constrained by page limitations, so do not expect to see a comprehensive literature review in an article. The researcher must present in a succinct manner an overview and critical appraisal of the literature on a topic in order to generate research questions or hypotheses. A literature review is essential to all steps of the quantitative and qualitative research process; it is a broad, systematic critical review and evaluation of the literature in an area.
BOX 3.1 Overall Purposes of a Literature Review
Major goal
To develop a strong knowledge base to conduct a research study or implement an evidence-based practice/QI project (1–3, 8–11) and carry out research (1–6, 11).
Objectives
A review of the literature supports the following:
1. Determine what is known and unknown about a subject, concept, or problem.
2. Determine gaps, consistencies, and inconsistencies in the literature about a subject, concept, or problem.
3. Synthesize the strengths and weaknesses of available studies to determine the state of the science on a topic/problem.
4. Describe the theoretical/conceptual frameworks that guide a study.
5. Determine the need for replication or refinement of a study.
6. Generate research questions and hypotheses.
7. Determine an appropriate research design, methodology, and analysis for a study.
8. Provide information to discuss the findings of a study, draw conclusions, and make recommendations for future research, practice, education, and/or policy changes.
9. Uncover a new practice intervention(s) or gain supporting evidence for revising, maintaining current intervention(s), protocols, and policies, or developing new ones.
10. Generate clinical questions that guide development of EBP/QI projects, policies, and protocols.
11. Identify recommendations from the conclusion for future research, practice, education, and/or policy actions.
QI, Quality improvement.
The following overview about use of the literature review in relation to the steps of the quantitative and qualitative research process will help you understand the researcher's focus. In quantitative studies, the literature review is at the beginning of published research articles and may or may not be titled literature review (see Appendices A, B, C, and D). As you read the selected research articles found in the appendices, you will see that none of these reports have a section titled Literature Review, but each has a literature review at the beginning of the article. Example: ➤ van Dijk and colleagues (2016) labeled this beginning section with the title Introduction (see Appendix C). Hawthorne and colleagues (2016), after a brief introduction about their topic, used sublevel headings for two major concepts of their review and then provided a sublevel heading to introduce their Conceptual Framework (see Appendix B). Appendix A's study by Nyamathi and colleagues (2015), after presenting their unlabeled literature review, also provided a sublevel heading labeled "Theoretical Framework."
A review of the relevant literature found in a quantitative study (Fig. 3.1) is valuable, as it provides the following:
• Theoretical or conceptual framework
• Identifies concepts/theories used as a guide or map for developing research questions or hypotheses
• Suggests the presumed relationship between the independent and dependent variables
• Provides a rationale and definition for the variable(s) and concepts studied (see Chapters 1 and 2)
• Primary and secondary sources
• Provides the researcher with a road map for designing the study
• Includes primary sources, which are research articles, theoretical documents, or other documents used by the author(s) who is conducting the study, developing a theory, or writing an autobiography
• Includes secondary sources, which are published articles or books written by persons other than the individual who conducted the research study or developed the theory. Table 3.1 provides definitions and examples of primary and secondary sources.
• Research question and hypothesis
• Helps the researcher identify completed studies about the research topic of interest, including gaps or inconsistencies that suggest potential research questions or hypotheses about a subject, theory, or problem
• Design and method
• Helps the researcher choose the appropriate design, sampling strategy, data collection methods, setting, measurement instruments, and data analysis method. Journal space guidelines limit researchers to include only abbreviated information about these areas
• Data analysis, discussion, conclusions, implications, recommendations
• Helps the researcher interpret, discuss, and explain the study results/findings
• Provides an opportunity for the researcher to return to the literature review and selects relevant studies to inform the discussion of the findings, conclusions, limitations, and recommendations. Example: ➤ Turner-Sack and colleagues’ (2016) discussion section noted several times how their findings were similar to previous studies (Appendix D)
• Useful when considering the implications of the research findings and making recommendations for practice, education, and research
FIG 3.1 Relationship of the review of the literature to the steps of the quantitative research process.
TABLE 3.1 Examples of Primary and Secondary Sources
Primary: Essential
• Publications written by the person(s) who conducted the study or developed the theory/conceptual model.
• Eyewitness accounts of historic events, autobiographies, oral histories, diaries, films, letters, artifacts, periodicals, and Internet communications on e-mail, Listservs, interviews, e-photographs, and audio/video recordings.
• Can be published or unpublished.
• Example: a published research study (e.g., research articles in ).
• Theory example: Dr. Jeffries in collaboration with the National League for Nursing developed and published a monograph entitled, The NLN Jeffries Simulation Theory (2015).
Secondary: Useful
• Publications written by a person(s) other than the person who conducted the study or developed the theory or model. It usually appears as a summary/critique of another author's original work (research study, theory, or model); may appear in a study as the theoretical/conceptual framework, or paraphrased theory of the theorist.
• A biography or clinical article that cites the original author's work.
• Can be published or unpublished.
• Example: an edited textbook (e.g., LoBiondo-Wood, G., & Haber, J. [2018]. Nursing research: Methods and critical appraisal for evidence-based practice [9th ed.], Elsevier).
• Theoretical framework example: Nyamathi and colleagues' 2015 study used the "comprehensive health seeking and coping paradigm" theoretical framework by Nyamathi (1989), which Nyamathi adapted from Lazarus and Folkman's (1984) "coping model" and Schlotfeldt's (1981) "health seeking and coping paradigm" (see study presented in Appendix A).
HINT: Critical appraisal of primary sources is essential to a thorough and relevant literature review.
HINT: Use secondary sources sparingly; however, secondary sources, especially a study’s literature review that presents a critique of studies, are a valuable learning tool from an EBP perspective.
In contrast to the styles of quantitative studies, literature reviews of qualitative studies are usually handled differently (see Chapters 5 to 7). In qualitative studies, often little is known about the topic under study, and thus the literature review may appear more abbreviated than in a quantitative study. However, qualitative researchers use the literature review in the same manner as quantitative researchers to interpret and discuss the study findings, draw conclusions, identify limitations, and suggest recommendations for future study.
Conducting a literature review: The EBP perspective
From an EBP perspective, the purpose of the literature review is the critical appraisal of research studies, systematic reviews, CP guidelines, and other relevant documents. The literature review informs the development and/or refinement of the clinical question that will guide an EBP or QI project. When a clinical problem is identified, nurses and other team members collaborate to identify a clinical question using the PICO format (Yensen, 2013; see Chapter 2).
Once your clinical question is formulated, you will need to conduct a search in electronic database(s) (you may seek the help of a librarian) to gather and critically appraise relevant studies, and synthesize the strengths and weaknesses of the studies to determine if this is the “best available” evidence to answer your clinical question. Objectives 1 to 3 and 7 to 10 in Box 3.1 specifically reflect the purposes of a literature review for these projects.
A clear and precise articulation of a clinical question is critical to finding the best evidence. Clinical questions may sound like research questions, but they are questions used to search the literature for evidence-based answers, not to test research questions or hypotheses (see Chapter 2). The PICO format is as follows:
P Problem/patient population—What is the specifically defined group?
I Intervention—What intervention or event will be used to address the problem or population?
C Comparison—How does the intervention compare to current standards of care or another intervention?
O Outcome—What is the effect of the proposed or comparison intervention?
One group of students was interested in whether regular exercise prevented osteoporosis in postmenopausal women who had osteopenia. The PICO format for the clinical question that guided their search was as follows:
P Postmenopausal women with osteopenia (Age is part of the definition for this population.)
I Regular exercise program (How often is regular? Weekly? Twice a week?)
C No regular exercise program (comparing outcomes of regular exercise [I] and no regular exercise [C])
O Prevention of osteoporosis (How and when was this measured?)
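To translate this PICO question into a database search, the keywords for each element are combined with Boolean operators. A minimal illustrative search statement follows; the exact terms are assumptions for teaching purposes and should be matched to each database’s controlled vocabulary (e.g., MeSH or CINAHL subject headings):

    (postmenopausal OR “post-menopausal”) AND osteopenia AND (exercise OR “physical activity”) AND (osteoporosis OR “bone density”)

Using OR within each concept broadens the search to capture synonyms, while AND across concepts narrows the results to articles that address all of them.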
These students’ assignment to answer the PICO question requires the following:
• Search the literature using electronic databases (e.g., Cumulative Index to Nursing and Allied Health Literature [CINAHL via EBSCO], MEDLINE, and Cochrane Database of Systematic Reviews) for the information to identify the significance of osteopenia and osteoporosis as a women’s health problem.
• Identify systematic reviews, practice guidelines, and research studies that provide the “best available evidence” related to the effectiveness of regular exercise programs for prevention of osteoporosis.
• Critically appraise information gathered using standardized critical appraisal criteria and tools (see Chapters 7, 11, 18, 19, and 20).
• Synthesize the overall strengths and weaknesses of the evidence provided by the literature.
• Draw a conclusion about the strength, quality, and consistency of the evidence.
• Make recommendations about applicability of evidence to CP to guide development of a health promotion project about osteoporosis risk reduction for postmenopausal women with osteopenia.
As a practicing nurse, you may be asked to work with colleagues to develop an EBP/QI project and/or to update current EBP protocols, CP standards/guidelines, or policies in your health care organization using the best available evidence. This requires that you know how to retrieve and critically appraise individual research articles, practice guidelines, and systematic reviews to determine each study’s overall quality, and then to determine whether there is sufficient evidence to change a current practice, policy, or guideline.
HELPFUL HINT Hunting for a quantitative study’s literature review? Don’t expect to find it labeled as Literature Review—many are not. Assume that the beginning paragraphs of the article comprise the literature review; the length and style will vary.
EVIDENCE-BASED PRACTICE TIPS
• Formulating a clinical question using the PICO format provides a focus that will guide an efficient electronic literature search.
• Remember, the findings of one study on a topic do not provide sufficient evidence to support a change in practice.
• The ability to critically appraise and synthesize the literature is essential to acquiring skills for making successful presentations, as well as participating in EBP/QI projects.
Searching for evidence
Students often state, “I know how to do research; why do I need to see the librarian?” Perhaps you have thought the same thing because you too have “researched” a topic for many of your course requirements. However, it would be more accurate to say that you have “searched” the literature to uncover research studies and conceptual information to prepare an academic paper on a topic. During this process, you search for primary sources and secondary sources. It is best to use a primary source when available. Table 3.1 provides definitions and examples of primary and secondary sources, and Table 3.2 identifies the steps and strategies for conducting an efficient literature search. Table 3.3 indicates recommended databases. The top two, CINAHL Plus with Full Text and PubMed (MEDLINE), are always a must. Health science libraries offer multiple databases, and most provide online tutorials on how to use each one. Using the CINAHL Plus and PubMed databases and at least one additional resource database is recommended. Example: ➤ If your topic is about changing a patient’s behavior, such as promoting smoking cessation or increasing weight-bearing exercise, you would use the top two as well as PsycINFO. If your clinical question focuses on interventions, also use the Cochrane Library (http://www.cochranelibrary.com), which has full-text systematic reviews as well as an extensive list of randomized controlled trials (RCTs) and other sources of studies.
TABLE 3.2 Steps and Strategies for Conducting a Literature Search: An EBP Perspective

Step I: Determine clinical question or research topic.
Strategy: Focus on the types of patients (population) of interest. If the goal is to develop an EBP project, start with a PICO question. If the goal is to develop a research study, start with a broad review of the literature to refine the research question or hypothesis (see Chapter 2).

Step II: Identify key variables/terms.
Strategy: Review your library’s online Help and Tutorial modules related to conducting a search, including each database’s vocabulary, before meeting with your librarian. Have your PICO format completed so the librarian can help you limit the search to articles that fit the parameters of your PICO question. If, after reviewing tutorials on the Boolean connectors “AND,” “OR,” and “NOT” that link your search terms in a database, you do not understand their use, clarify with a librarian.

Step III: Conduct an electronic search using at least two (preferably three, if needed for your topic) recognized electronic databases.
Strategy: Conduct the search and decide which databases, in addition to CINAHL Plus with Full Text via EBSCO and MEDLINE via Ovid, you should search; use key MeSH terms and Boolean logic (AND, OR, NOT) to address your clinical question.

Step IV: Review abstracts online and weed out irrelevant articles.
Strategy: Scan your retrieved articles, read the abstracts, and mark only those that fit your topic and are research; select “references,” “search history,” and “full-text articles” (if available) before printing, saving, or e-mailing your search.

Step V: Retrieve relevant sources.
Strategy: Organize by type or study design and year, and reread the abstracts to determine whether the articles chosen are relevant to your topic and worth retrieving.

Step VI: Store or print relevant articles; if unable to print directly from the database, order through interlibrary loan.
Strategy: Download the search to a web-based bibliography and database manager/writing and collaboration tool (e.g., RefWorks, EndNote); most academic institutions offer “free” management tools, such as Zotero. Using such a system ensures that you have the information for each citation (e.g., journal name, year, volume number, pages), and it will format the reference list. Download PDF versions of articles as needed.

Step VII: Conduct preliminary reading; eliminate irrelevant sources.
Strategy: Read each abstract first to assess whether the article is relevant.

Step VIII: Critically read each source (summarize and critique each source).
Strategy: Use critical appraisal strategies (e.g., an evidence table [see Chapter 20] or a standardized critiquing tool) to summarize and critique each article; include references in APA format.

Step IX: Synthesize critical summaries of each article.
Strategy: Decide how to present the synthesis of the overall strengths and weaknesses of the reviewed research articles (e.g., chronologically or by design) so the reader can review the evidence. Compare and contrast the studies in terms of the steps of the research process so you can conclude with the overall similarities and differences between and among studies. Finally, summarize the findings of the review; that is, determine whether the strengths of the group of studies outweigh the limitations, establish confidence in the findings, and draw a conclusion about the state of the science. Include the reference list.

CINAHL, Cumulative Index to Nursing and Allied Health Literature.
TABLE 3.3 Databases for Nursing

CINAHL Plus with Full Text (EBSCO): Full-text database for nursing and allied health, widely used by nursing and health care; a useful starting point. (Source: https://health.ebsco.com/products/cinahl-plus-with-full-text)

PubMed (MEDLINE): Provides free access to MEDLINE, NLM’s database of citations and abstracts in medicine, nursing, dentistry, veterinary medicine, health care systems, and preclinical sciences, including full text. (Source: https://www.nlm.nih.gov/bsd/pmresources.html)

PsycINFO: Centered on psychology and the behavioral and social sciences; interdisciplinary content; one of the most widely used databases. (Source: http://www.apa.org/pubs/databases/psycinfo/)

Education Source with ERIC (EBSCO): The largest and most complete collection of full-text education journals. Provides research and information to meet the needs of students, professionals, and policy makers; covers all levels of education, from early childhood to higher education, as well as educational specialties such as multilingual education, health education, and testing. (Source: https://www.ebscohost.com/academic/education-source)

CINAHL, Cumulative Index to Nursing and Allied Health Literature.
Sources of literature
Preappraised literature
Preappraised literature is a secondary source of evidence, sometimes referred to as preappraised synopses, or simply synopses. Reading an expert’s comment about another author’s research can help develop your critical appraisal and synthesis skills. Some synopses include a commentary about the strength and applicability of the evidence to a patient population. Keep in mind that there are limitations to using preappraised sources. These sources are useful for giving you a preview of the potential relevance of a publication to your clinical question and the strength of the evidence. You can then decide whether to search for and critically appraise the primary source. Preappraised synopses can be found in journals such as Evidence-Based Nursing (http://ebn.bmj.com) and Evidence-Based Medicine (http://ebm.bmj.com) or the Joanna Briggs Institute (JBI) EBP Database (http://joannabriggs.org).
EVIDENCE-BASED PRACTICE TIP If you find a preappraised commentary on an individual study related to your PICO question, read the preappraised commentary first. As a beginner, this strategy will make it easier for you to pick out the strengths and weaknesses in the primary source study.
Primary sources
When searching the literature, primary sources should be a search strategy priority. Review Table 3.1 to identify the differences between primary and secondary sources. Then, as noted in Step VIII of Table 3.2, apply your critiquing skills to determine the quality of the primary source publications. Review Chapters 7, 11, and 18 so you can apply the critical appraisal criteria outlined in these chapters to your retrieved studies. Example: ➤ For your PICO question, you searched for and found two types of primary source publications. One was a rigorous systematic review related to your clinical question that provided strong evidence supporting your PICO comparison intervention, the current standard of care; you also found two poorly designed RCTs that provided weak evidence supporting your proposed intervention. Which primary source would you recommend? You would need to make an evidence-based decision about the applicability of the primary source evidence supporting or not supporting the proposed or comparison intervention. The well-designed systematic review provided the highest level of evidence (Level I on the Fig. 1.1 evidence hierarchy). It also provided strong evidence supporting continuation of the current standard of care, in contrast to the weak evidence for the proposed intervention provided by two poorly designed RCTs (Level II on the Fig. 1.1 evidence hierarchy). Your team would conclude that the primary source systematic review provided the strongest evidence, that the current standard of care should be retained and recommended, and that there was insufficient evidence to recommend a practice change.
HELPFUL HINT
• If possible, consult a librarian before conducting your searches to determine which databases and keywords to use for your PICO question. Save your search history electronically.
• Learn how to use an online search management tool such as RefWorks, EndNote, or Zotero.
EVIDENCE-BASED PRACTICE TIP
• If you do not retrieve any studies from your search, review your PICO question and search strategies with a librarian.
• Every meta-analysis begins with a systematic review; however, not every systematic review results in a meta-analysis. Read Chapter 11 and find out why.
Performing an electronic search
Why use an electronic database?
Perhaps you still are not convinced that electronic database searches are the best way to acquire information for a review of the literature. Maybe you have searched using Google or Yahoo! and found relevant information. This is an understandable temptation. Try to think about it from another perspective and ask yourself, “Is this the most appropriate and efficient way to find the latest and strongest research on a topic that affects patient care?” Yes, Google Scholar might retrieve some studies, but from an EBP perspective, you need to retrieve all the studies available on your topic/clinical question. The “I” and “C” of your PICO question require that you retrieve all types of interventions from your search, not just what you have proposed. Understanding the literature in a specific area requires a review of all relevant studies. A way to decrease your frustration is to take the time to learn how to conduct an efficient database search by reviewing the steps presented in Table 3.2. Following these strategies and reviewing the Helpful Hints and EBP Tips provided in this chapter will help you gain the essential competencies you need for a successful search. The Critical Thinking Decision Path shows a way to locate evidence to support your research or clinical question (Kendall, 2008).
CRITICAL THINKING DECISION PATH
[Decision path diagram: locating evidence to support a research or clinical question.]
Types of resources
Print and electronic indexes: Books and journals
Most college/university libraries have management retrieval systems or databases to retrieve both print and online books, journals, videos and other media items, scripts, monographs, conference proceedings, masters’ theses, doctoral dissertations, archival materials, and Grey literature (e.g., information produced by government, industry, health care organizations, and professional organizations in the form of committee reports and policy documents; dissertations are included in the Grey literature). Print indexes are useful for finding sources that have not been entered into online databases. Print resources such as the Grey literature are still necessary if a search requires materials not entered into a database before a certain year. The citation/reference lists of the articles you retrieve are another source; they often contain studies not captured by your search.
Refereed journals
A major portion of most literature reviews consists of journal articles. Journals are published in print and online. In contrast to textbooks, which take much longer to publish, journals are a ready source of the latest information on almost any subject. Therefore, journals are the preferred mode of communicating the latest theory or study results. You should use refereed or peer-reviewed journals as your first choice when looking for primary sources of theoretical, clinical, or research articles. A refereed or peer-reviewed journal has a panel of internal and external reviewers who review submitted manuscripts for possible publication. The external reviewers are drawn from a pool of nurse scholars and scholars from related disciplines who are experts in various specialties. In most cases, the reviews are “blind”; that is, the manuscript to be reviewed does not include the name of the author(s). The reviewers use a set of criteria to judge whether a manuscript meets the publication journal’s standards. These criteria are similar to those you will use to critically appraise the evidence you obtain to determine the strengths and weaknesses of a study (see Chapters 7 and 18). The credibility of a published research or theoretical/conceptual article is strengthened by the peer-review process.
Electronic: Bibliographic and abstract databases
Electronic databases are used to find research and theoretical/conceptual articles on a variety of topics, including doctoral dissertations. Electronic databases contain bibliographic citation information such as the author name, title, journal, and indexed terms for each record. Libraries have lists of electronic databases, including the ones indicated in Table 3.3 and Table 3.4. Usually these include the abstract, and some have the full text of the article or links to the full text. If the full text is not available, look for other options such as the abstract to learn more about the article before requesting an interlibrary loan of the article. Reading the abstract (see Chapter 1) is a critical step of the process to determine if you need to retrieve the full text article through another mechanism. Use both CINAHL and MEDLINE electronic databases as well as a third database; this will facilitate all steps of critically reviewing the literature, especially identifying the gaps. Your college/university most likely enables you to access such databases electronically whether on campus or not.
TABLE 3.4 Selected Examples of Websites for Evidence-Based Practice

Virginia Henderson International Nursing Library (www.nursinglibrary.org)
Scope: Access to the Registry of Nursing Research database, which contains abstracts and the full text of research studies and conference papers.
Notes: Offered without charge. Supported by Sigma Theta Tau International, Honor Society of Nursing.

National Guideline Clearinghouse (www.guidelines.gov)
Scope: Public resource for evidence-based CP guidelines.
Notes: Offers a useful online feature of side-by-side comparison of guidelines and the ability to browse by disease/condition and treatment/intervention.

JBI (www.joannabriggs.org)
Scope: JBI is an international not-for-profit research and development center.
Notes: Membership required for access. Recommended links are worth reviewing, and descriptions of JBI’s levels of evidence and grading scale are provided.

TRIP (www.tripdatabase.com)
Scope: Content from free online resources, including synopses, guidelines, medical images, e-textbooks, and systematic reviews, organized under the TRIP search engine.
Notes: Offers a wide sampling of available evidence and the ability to filter by publication type (e.g., evidence-based synopses, systematic reviews, guidelines, textbooks, and research).

Agency for Healthcare Research and Quality (www.ahrq.gov)
Scope: Evidence-based reports, statistical briefs, research findings and reports, and policy reports.
Notes: Free source of government documents, searchable via PubMed.

Cochrane Collaboration (www.cochrane.org)
Scope: Access to abstracts from the Cochrane Database of Systematic Reviews; full text of reviews and access to databases that are part of the Cochrane Library. Information is high quality and useful for health care decision making.
Notes: Abstracts are free and can be browsed or searched. Reviews draw on many databases, including CINAHL via EBSCO and MEDLINE; some entries are primary sources (e.g., systematic reviews/meta-analyses), while others (commentaries on single studies) are secondary sources; an important source of clinical evidence.

CP, Clinical practice; CINAHL, Cumulative Index to Nursing and Allied Health Literature; JBI, Joanna Briggs Institute; TRIP, Turning Research into Practice.
Electronic: Secondary or summary databases
Some databases contain more than journal article information. These resources contain summaries or synopses of studies, overviews of diseases or conditions, or a summary of the latest evidence to support a particular treatment. Table 3.4 provides a few examples.
Internet: Search engines
You are probably familiar with accessing a web browser (e.g., Internet Explorer, Mozilla Firefox, Chrome, Safari) to conduct searches, and with using search engines such as Google or Google Scholar to find information. However, “surfing” the web is not a good use of your time when searching for scholarly literature. Table 3.4 indicates sources of online information; all are free except JBI. Most websites are not a primary source for research studies.
HELPFUL HINTS Be sure to check with your instructor about using theoretical/conceptual articles and other Grey literature in your EBP/QI project or review of the literature paper.
Less common and less used sources of scholarly material are audio, video, personal communications (e.g., letters, telephone or in-person interviews), unpublished doctoral dissertations, masters’ theses, and conference proceedings.
EVIDENCE-BASED PRACTICE TIP
Reading systematic reviews, if available, on your clinical question/topic will enhance your ability to implement evidence-based nursing practice because they generally offer the strongest and most consistent level of evidence and can provide helpful search terms. A good first step for any question is to search the Cochrane Database of Systematic Reviews to see if someone has already completed a systematic review addressing your clinical question.
How far back must the search go?
Students often ask questions such as, “How many articles do I need?”; “How much is enough?”; “How far back in the literature do I need to go?” When conducting a search, you should use a rigorous, focused process, or you may end up with hundreds or thousands of citations. Retrieving too many citations is usually a sign that something was wrong with your search technique or that you have not sufficiently narrowed your clinical question.
Each electronic database offers an explanation of its features; take the time to click on each icon and explore the explanations offered, because this will increase your confidence. Also, take advantage of tutorials offered to improve your search techniques. Keep in mind the types of articles you are retrieving. Many electronic resources allow you to limit your search by article type (e.g., systematic reviews/meta-analyses, RCTs). Box 3.2 describes a number of CINAHL Plus with Full Text features that let you choose and/or insert information to target your search.
BOX 3.2 Tips: Using Cumulative Index to Nursing and Allied Health Literature via EBSCO
• Locate CINAHL from your library’s home page. It may be located under databases, online resources, or nursing resources.
• In the Advanced Search, type in your keyword, subject heading, or phrase (e.g., maternal-fetal attachment, health behavior). Do not use complete sentences. (Ask your librarian for any tip sheets or online tutorials, or use the HELP feature in the database.)
• Before choosing “Search,” make sure you mark “Research Articles” to ensure that you have retrieved articles that are actually research.
• In the “Limit Your Results” section, you can limit by year, age group, clinical queries, and so on.
• Using the Boolean connector “AND” between each of the words of your PICO variables narrows your search—that is, it will exclude an article that doesn’t use both terms; using “OR” broadens your search.
• Once the search results appear, save them, review titles and abstracts online, export to your management system (e.g., RefWorks), and/or e-mail the results to yourself.
CINAHL, Cumulative Index to Nursing and Allied Health Literature.
When conducting a literature review for any purpose, there is always a question of how far back to search. There is no fixed time period. If your search uncovers a well-done meta-analysis published 6 years ago, you could continue your search moving forward from that time period. Some research and EBP projects may warrant going back 10 or more years. Extensive published literature reviews on your topic or a concept clarification can help you limit the span of your search.
As you scroll through and mark the citations you wish to include in your downloaded or printed search, make sure you include all relevant fields when you save or print. In addition to indicating which citations you want and choosing which fields, there is an opportunity to indicate if you want the “search history” included. It is always a good idea to include this information. It is especially helpful if you feel that some citations were missed; then you can replicate your search and determine which variable(s) you missed. This is also your opportunity to indicate if you want to e-mail the search to yourself. If you are writing a paper and need to develop a reference list, you can export your citations to citation management software, which formats and stores your citations so that they are available for electronic retrieval when they are needed for a paper. Quite a few of these software programs are available; some are free, such as Zotero, and others your institution has most likely purchased, including EndNote and RefWorks.
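To illustrate what a stored citation looks like, here is a minimal sketch of a record in RIS, a common citation-exchange format that tools such as Zotero, EndNote, and RefWorks can import (the record shown is this chapter’s Hawthorne reference; exact field layout varies by export tool):

    TY  - JOUR
    AU  - Hawthorne, D. M.
    AU  - Youngblut, J. M.
    AU  - Brooten, D.
    TI  - Parent spirituality, grief, and mental health at 1 and 3 months after their infant’s/child’s death in an intensive care unit
    JO  - Journal of Pediatric Nursing
    PY  - 2016
    VL  - 31
    SP  - 73
    EP  - 80
    ER  -

Each two-letter tag stores one citation field (type, authors, title, journal, year, volume, pages), which is how the manager can later format the reference in APA or another style.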
HELPFUL HINT Ask your faculty for guidance if you are uncertain how far back you need to conduct your search. If you come across a systematic review/meta-analysis on your specific topic, check which years it covers; then begin your search from the last year of the studies included in the review and search forward to the present to fill in the gap.
EVIDENCE-BASED PRACTICE TIP You will be tempted to use Google, Google Scholar, or even Wikipedia instead of going through Steps I through III of Table 3.2 and using the databases suggested, but this will most likely result in thousands of citations that aren’t classified as research and are not specific to your PICO question. Instead, use the specific parameters of your electronic database.
What do I need to know?
Each database usually has a specific search guide that provides information on the organization of the entries and the terminology used. Academic and health science libraries continually update their websites to provide tutorials, guides, and tips for those using their databases. The strategies in Table 3.2 incorporate general search strategies, as well as those related to CINAHL and MEDLINE. Finding the right terms to “plug in” as keywords is important for conducting an efficient search. In many electronic databases you can browse the controlled vocabulary terms, see how the terms of your question match up, and add them before you search. If you encounter a problem, ask your librarian for assistance.
HELPFUL HINT One way to discover new terms for searching is to find a systematic review that includes the search strategy. Match your PICO words with the controlled vocabulary terms of each database.
CINAHL Plus with Full Text via EBSCOhost provides the option of conducting a basic or advanced search using the controlled vocabulary of CINAHL headings. This user-friendly feature has a built-in tutorial that reviews how to use this option. You also can click on the “Help” feature at any time during your search. It is recommended that you conduct an Advanced Search with the Guided Style tutorial that outlines the steps for conducting your search. If you wanted to locate articles about maternal-fetal attachment as it relates to the health practices or health behaviors of low-income mothers, you would first construct your PICO:
P Maternal-fetal attachment in low-income mothers (specifically defined group)
I Health behaviors or health practices (event to be studied)
C None (comparison of intervention)
O Neonatal outcomes (outcome)
In this example, the two main concepts are maternal-fetal attachment and health practices, and how these affect neonatal outcomes. Many times when conducting a search, you enter keywords or controlled vocabulary only for the first two elements of your PICO—in this case, maternal-fetal attachment and health practices or behaviors. The other elements can be added if your list of results is overwhelming (review the Critical Thinking Decision Path).
Maternal-fetal attachment should be part of your keyword search; however, when you click the CINAHL heading, it indicates that you should use “prenatal bonding.” To be comprehensive, use the Boolean operator “OR” to link these terms together. The second concept, health practices OR health behavior, is handled in a similar manner. The subject heading or controlled vocabulary assigned by the indexers could be added for completeness. Boolean operators are “AND,” “OR,” and “NOT,” and they dictate the relationship between words and concepts: “AND” requires that both concepts be located within the same article, “OR” allows you to group together like terms or synonyms, and “NOT” eliminates terms from your search. It is suggested that you limit your search to “peer-reviewed” and “research” articles. Refine the publication date range to 10 years, or whatever the requirement is for your search, and save your search. Once these limits were chosen for the PICO search related to maternal-fetal attachment described previously (see the sample search statement below), the search results decreased from an unmanageable 294 articles to 6 research articles. The key to understanding this process is to try the search yourself using the terms just described. Developing search skills takes time, even if you complete the library tutorials and meet with a librarian to refine your PICO question, search terms, and limits. You should search several databases. Library database websites are continually being updated, so it is important to get to know your library’s database site.
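As an illustration, the combined search statement for this example might look like the following sketch (the exact subject headings and limit labels are assumptions; match them to CINAHL’s actual controlled vocabulary and limit menus):

    (“maternal-fetal attachment” OR “prenatal bonding”) AND (“health behavior” OR “health practices”)
    Limits: peer reviewed; research article; published within the last 10 years

Applying the peer-reviewed and research-article limits, along with the date range, is what reduced this example’s results from 294 articles to 6.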
HELPFUL HINT When combining synonyms for a concept, use the Boolean operator “OR”—OR is more!
Review your library’s tutorials on conducting a search in each type of database (e.g., CINAHL and MEDLINE).
Use features in your database such as “limit search to,” and choose peer-reviewed journal, research article, date range, age group, and country (e.g., United States).
How do I complete the search?
Once you are confident about your search strategies for identifying key articles, it is time to critically read what you have found. Critically reading research articles requires several readings and the use of critical appraisal criteria (see Chapters 1, 7, and 18). Do not be discouraged if not all of the retrieved articles are as useful as you first thought, even though you limited your search to “research.” This happens to even the most experienced searcher. If most of the articles you retrieved were not useful, be prepared to do another search, but before you do, discuss the search terms with your librarian and faculty. You may also want to add a fourth database. It is good practice to always save your search history when conducting a search. Providing a printout of the completed search is very helpful when consulting with your librarian or faculty. Most likely your library has a feature that allows you to save your search so it can be retrieved during your meeting. In the example of maternal-fetal bonding and health behaviors in low-income women, the third database of choice may be PsycINFO (see Table 3.3).
HELPFUL HINT Read the abstract carefully to determine if it is a research article; you will usually see the use of headings such as “Methodology” and “Results” in research articles. It is also a good idea to review the reference list of the research articles you retrieved, as this strategy might uncover additional related articles you missed in your database search, and then you can retrieve them.
Literature review format: What to expect
Becoming familiar with the format of a literature review in the various types of review articles and in the literature review section of a research article will help you use critical appraisal criteria to evaluate the review. To decide which style you will use so that your review is presented in a logical and organized manner, you must consider:
• The research or clinical question/topic
• The number of retrieved sources reviewed
• The number and type of research versus theoretical/conceptual materials and/or Grey literature
Some reviews are written according to the variables or concepts being studied and presented chronologically under each variable. Others present the material chronologically, with subcategories or variables discussed within each time period. Still others present the variables and include subcategories related to study types, designs, or related variables.
Hawthorne and colleagues (2016) (see Appendix B) stated that the purpose of their “longitudinal study was to test the relationships between spirituality/religious coping strategies and grief, mental health (depression and post-traumatic stress disorder)” in mothers and fathers at selected time periods after experiencing the death of an infant in the neonatal intensive care unit (NICU) or pediatric intensive care unit (PICU). At the beginning of their article, after some basic overall facts on infant deaths and parents’ grieving, they logically presented the concepts they addressed in their quantitative study (see Appendix B). The researchers did not begin their article with a section labeled Literature Review. However, it is clear that the beginning of their article is a literature review. Example: ➤ After presenting general facts on infant deaths, parents’ grieving, and related research, the authors title one section Use of Spirituality/Religion as a Coping Strategy and the next section Parent Mental Health and Personal Growth. In these sections they discuss studies related to each topic. Then they present a section labeled Conceptual Framework, indicating that they will use a specific grief framework to guide their study.
HIGHLIGHT Each member of your QI committee should be responsible for searching for one research study, using the agreed upon search terms and reviewing the abstract to determine its relevance to your QI project’s clinical question.
HELPFUL HINT The literature review for an EBP/QI project or another type of scholarly paper is different from one found in a research article.
Make an outline that will later become the level headings in your paper (i.e., title the concepts of your literature review). This is a good way to focus your writing; it lets the reader know what to expect and demonstrates your logic and organization.
Include your search strategies so that a reader can re-create your search and come up with the same results. Include information on databases searched, the time frame of the studies chosen, search terms used, and any limits used to narrow the search.
Include any standardized tools used to critically appraise the retrieved literature.
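To illustrate the search-strategy documentation described in the hint above, a minimal sketch of such a methods statement follows (the details are hypothetical, reusing the maternal-fetal attachment example, and are not drawn from any published study):

    CINAHL Plus with Full Text and MEDLINE (PubMed) were searched for peer-reviewed research articles published in English within the last 10 years, using the statement (“maternal-fetal attachment” OR “prenatal bonding”) AND (“health behavior” OR “health practices”). Retrieved articles were summarized in an evidence table and appraised with a standardized critiquing tool.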
Appraisal for evidence-based practice
When writing a literature review for an EBP/QI project, you need to critically appraise all research reports using appropriate criteria. Once you have conducted your search and obtained all your references, you need to evaluate the articles using standardized critical appraisal criteria (see Chapters 7, 11, and 18). Using the criteria, you will be able to identify the strengths and weaknesses of each study.
Critiquing research or theoretical/conceptual reports is a challenging task even for seasoned consumers of research, so do not be surprised if you feel a little intimidated by the prospect of critiquing and synthesizing research. The important issue is to determine the overall value of the literature review, including both research and theoretical/conceptual materials. The purposes of a literature review (see Box 3.1) and the characteristics of a well-written literature review (Box 3.3) provide the framework for evaluating the literature.
BOX 3.3 Characteristics of a Well-Written Review of the Literature—An EBP Perspective
Each reviewed source reflects critical thinking and writing and is relevant to the study/topic/project, and the content meets the following criteria:
• Uses mainly primary sources—that is, a sufficient number of research articles for answering a clinical question with a justification of the literature search dates and search terms used
• Organizes the literature review using a systematic approach
• Uses established critical appraisal criteria for specific study designs to evaluate strengths, weaknesses, conflicts, or gaps related to the PICO question
• Provides a synthesis and critique of the references indicating similarities, differences, strengths, and weaknesses between and among the studies
• Concludes with a summary that provides recommendations for practice and research
• In a table format, summarizes each article succinctly with references
The literature review should be presented in an organized manner. Theoretical/conceptual and research literature can be presented chronologically, from the earliest work of the theorist or the first studies on a topic to the most recent; sometimes the theoretical/conceptual literature that provided the foundation for the existing research is presented first, followed by the research studies that were derived from this theoretical/conceptual base. Other times, the literature can be clustered by concept, pro or con positions, or evidence that highlights differences in the theoretical/conceptual and/or research findings. The overall question to be answered from an EBP perspective is, “Does the review of the literature develop and present a knowledge base that provides sufficient evidence for an EBP/QI project?” Objectives 1 to 3, 5, 8, 10, and 11 in Box 3.1 specifically reflect the purposes of a literature review for an EBP/QI project. Objectives 1 to 8 and 11 reflect the purposes of a literature review when conducting a research study.
Regardless of how the literature review is organized, it should provide a strong knowledge base for a CP or research project. When a literature review ends with insufficient evidence, this reveals a gap in knowledge that requires further research. The more you read published systematic and integrative reviews, as well as individual studies, the more competent you become at differentiating a well-organized literature review from one that has a limited organizing framework.
Another key to developing your competency in this area is to read both quantitative (meta-analyses) and qualitative (meta-syntheses) systematic reviews. A well-done meta-analysis adheres to a rigorous search, appraisal, and synthesis process for a group of like studies to answer a question, and it meets required guidelines, including that it be conducted by a team.
The systematic review found in Appendix E is an example of a well-done quantitative systematic review; it critically appraises and synthesizes the evidence from research studies on the effect of nurse-led clinics on the mortality and morbidity of patients with cardiovascular disease.
The Critical Appraisal Criteria box summarizes general critical appraisal criteria for a review of the literature. Other sets of critical appraisal criteria may phrase these questions differently or more broadly. Example: ➤ “Does the literature search seem adequate?” “Does the report demonstrate scholarly writing?” These may seem to be difficult questions for you to answer; one place to begin, however, is by determining whether the source is a refereed journal. It is reasonable to assume that a refereed journal publishes manuscripts that are adequately searched, use mainly primary sources, and are written in a scholarly manner. This does not mean, however, that every study reported in a refereed journal will meet all of the critical appraisal criteria for a literature review and other components of the study in an equal manner. Because of style differences and space constraints, each citation summarized is often very brief, or related citations may be summarized as a group and lack a critique. You still must answer the critiquing questions. Consultation with a faculty advisor may be necessary to develop skill in answering these questions.
The key to a strong literature review is a careful search of published and unpublished literature. A literature review written for a published research study should reflect a synthesis, or pulling together, of the main points or value of all of the sources reviewed in relation to the research question, hypothesis, or clinical question (see Box 3.1). The relationship between and among these studies must be explained. The summary synthesis of a review of the literature should appear at the end of a paper or article. When reading a research article, the summary of the literature appears before the methodology section and is referred to again when reviewing the results of the study.
CRITICAL APPRAISAL CRITERIA Literature Review
1. Are all of the relevant concepts and variables included in the literature review?
2. Is the literature review presented in an organized format that flows logically (e.g., chronologically, clustered by concept or variables), enhancing the reader’s ability to evaluate the need for the particular research study or evidence-based practice project?
3. Does the search strategy include an appropriate and adequate number of databases and other resources to identify key published and unpublished research and theoretical/conceptual sources?
4. Are both theoretical/conceptual and research sources used?
5. Are primary sources mainly used?
6. What gaps or inconsistencies in knowledge does the literature review uncover?
7. Does the literature review build on earlier studies?
8. Does the summary of each reviewed study reflect the essential components of the study design (e.g., type and size of sample, reliability and validity of instruments, consistency of data collection procedures, appropriate data analysis, identification of limitations)?
9. Does the critique of each reviewed study include strengths, weaknesses, or limitations of the design, conflicts, and gaps in information related to the area of interest?
10. Does the synthesis summary follow a logical sequence that presents the overall strengths and weaknesses of the reviewed studies and arrive at a logical conclusion on its topic?
11. Does the literature review for an evidence-based practice project answer a clinical question?
HELPFUL HINT
• If you are doing an academic assignment, make sure you check with your instructor as to whether or not the following sources may be used: (1) unpublished material, (2) theoretical/conceptual articles, and (3) Grey literature.
• Use standardized critical appraisal criteria appropriate to the study’s design to evaluate research articles.
• Make a table of the studies found (see Chapter 20 for an example of a summary table).
• Synthesize the results of your analysis by comparing and contrasting the similarities and differences between the studies on your topic/clinical question and draw a conclusion.
Key points
• Review of the literature is defined as a broad, comprehensive, in-depth, systematic critique and synthesis of publications, unpublished print and online materials, audiovisual materials, and personal communication.
• Review of the literature is used for the development of EBP/QI clinical projects as well as research studies.
• There are differences between a review of the literature for research and for EBP/QI projects. For an EBP/QI project, your search should focus on the highest level of primary source literature available per the hierarchy of evidence, and it should relate to the specific clinical problem.
• The main objectives for conducting and writing a literature review are to acquire the ability to (1) conduct a comprehensive and efficient electronic and/or print search on a topic; (2) efficiently retrieve a sufficient amount of material for a literature review in relation to the topic and scope of the project; (3) critically appraise (i.e., critique) research and theoretical/conceptual material based on accepted critical appraisal criteria; (4) critically evaluate published reviews of the literature based on accepted standardized critical appraisal criteria; (5) synthesize the findings of the critiqued materials for relevance to the purpose of the selected scholarly project; and (6) determine applicability to answer your clinical question.
• Primary research and theoretical/conceptual resources are essential for literature reviews.
• Review the Grey literature for white papers and theoretical/conceptual materials not published in journals, and conduct “hand searches” of the reference lists of your retrieved research articles; both provide background and can uncover other studies.
• Secondary sources, such as commentaries on research articles from peer-reviewed journals, are part of a learning strategy for developing critical critiquing skills.
• It is more efficient to use electronic databases rather than print resources or general web search engines such as Google for retrieving materials.
• Strategies for efficiently retrieving literature for nursing include consulting the librarian and using at least three online sources (e.g., CINAHL, MEDLINE, and one that relates more specifically to your clinical question or topic).
• Literature reviews are usually organized according to variables, as well as chronologically.
• Critiquing and synthesizing a number of research articles, including systematic reviews, is essential to implementing evidence-based nursing practice.
Critical thinking challenges
• Why is it important for your QI team colleagues to be able to challenge each other about the overall strength and quality of evidence provided by the group of studies retrieved from your search?
• For an EBP project, why is it necessary to critically appraise studies that are published in a refereed journal?
• How does reading preappraised commentaries of a study and systematic reviews/meta-analyses develop your critical appraisal skills?
• A general guideline for a literature search is to use a timeline of 5 years or more. Would your timeline possibly differ if you found a well-done systematic review?
• What is the relationship of the research article’s literature review to the theoretical or conceptual framework?
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
References
1. Hawthorne D.M, Youngblut J.M, Brooten D. Parent spirituality, grief, and mental health at 1 and 3 months after their infant’s/child’s death in an intensive care unit. Journal of Pediatric Nursing 2016;31:73-80.
2. Jeffries P, National League for Nursing (NLN). The NLN Jeffries simulation theory. Philadelphia, PA: Wolters Kluwer; 2015.
3. Kendall S. Evidence-based resources simplified. Canadian Family Physician 2008;54(2):241-243.
4. Nyamathi A, Salem B.E, Zhang S, et al. Nursing care management, peer coaching, and Hepatitis A and B vaccine completion among homeless men recently released on parole. Nursing Research 2015;64(3):177-189.
5. Turner-Sack A.M, Menna R, Setchell S.R, et al. Psychological functioning, posttraumatic growth, and coping in parents and siblings of adolescent cancer survivors. Oncology Nursing Forum 2016;43:48-56.
6. van Dijk J.F, Vervoort S.C, van Wijck A.J, et al. Postoperative patients’ perspectives on rating pain: A qualitative study. International Journal of Nursing Studies 2016;53:260-269.
7. Yensen J. PICO search strategies. Online Journal of Nursing Informatics 2013;17(3). Retrieved from http://ojni.org/issues/?p=2860
CHAPTER 4
Theoretical frameworks for research Melanie McEwen
Learning outcomes
After reading this chapter, you should be able to do the following:
• Describe the relationship among theory, research, and practice.
• Identify the purpose of conceptual and theoretical frameworks for nursing research.
• Differentiate between conceptual and operational definitions.
• Identify the different types of theories used in nursing research.
• Describe how a theory or conceptual framework guides research.
• Explain the points of critical appraisal used to evaluate the appropriateness, cohesiveness, and consistency of a framework guiding research.
KEY TERMS
concept
conceptual definition
conceptual framework
construct
deductive
grand theory
inductive
middle range theory
model
operational definition
situation-specific theory
theoretical framework
theory
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
To introduce the discussion of the use of theoretical frameworks for nursing research, consider the example of Emily, a novice oncology nurse. From this case study, reflect on how nurses can understand the theoretical underpinnings of both nursing research and evidence-based nursing practice, and re-affirm how nurses should integrate research into practice.
Emily graduated with her bachelor of science in nursing (BSN) a little more than a year ago, and she recently changed positions to work on a pediatric oncology unit in a large hospital. She quickly learned that working with very ill and often dying children is tremendously rewarding, even though it is frequently heartbreaking.
One of Emily’s first patients was Benny, a 14-year-old boy admitted with a recurrence of leukemia. When she first cared for Benny, he was extremely ill. Benny’s oncologist implemented the protocols for cases such as his, but the team was careful to explain to Benny and his family that his prognosis was guarded. In the early days of his hospitalization, Emily cried with his mother when they received his daily lab values and there was no apparent improvement. She observed that Benny was growing increasingly fatigued and had little appetite. Despite his worsening condition, however, Benny and his parents were unfailingly positive, making plans for a vacation to the mountains and the upcoming school year.
At the end of her shift one night before several days off, Emily hugged Benny’s parents, as she feared that Benny would die before her next scheduled workday. Several days later, when she listened to the report at the start of her shift, Emily was amazed to learn that Benny had been heartily eating a normal diet. He was ambulatory and had been cruising the halls with his baseball coach and playing video games with two of his cousins. When she entered Benny’s room for her initial assessment, she saw the much-improved teenager dressed in shorts and a T-shirt, sitting up in bed using his iPad. A half-finished chocolate milkshake was on the table in easy reaching distance. He joked with Emily about Angry Birds as she performed her assessment. Benny steadily improved over the ensuing days and eventually went home with his leukemia again in remission.
As Emily became more comfortable in the role of oncology nurse, she continued to notice patterns among the children and adolescents on her unit. Many got better, even though their conditions were often critical. In contrast, some of the children who had better prognoses failed to improve as much, or as quickly, as anticipated. She realized that the kids who did better than expected seemed to have common attributes or characteristics, including positive attitudes, supportive family and friends, and strong determination to “beat” their cancer. Over lunch one day, Emily talked with her mentor, Marie, about her observations, commenting that on a number of occasions she had seen patients rebound when she thought that death was imminent.
Marie smiled. “Fortunately this is a pattern that we see quite frequently. Many of our kids are amazingly resilient.” Marie told Emily about the work of several nurse researchers who studied the phenomenon of resilience and gave her a list of articles reporting on their findings. Emily followed up with Marie’s prompting and learned about “psychosocial resilience in adolescents” (Tusaie et al., 2007) and “adolescent resilience” (Ahern, 2006; Ahern et al., 2008). These works led her to a “middle range theory of resilience” (Polk, 1997). Focusing her literature review even more, Emily was able to discover several recent research studies (Chen et al., 2014; Ishibashi et al., 2015; Wu et al., 2015) that examined aspects of resilience among adolescents with cancer, further piquing her interest in the subject.
From her readings, she gained insight into resilience, learning to recognize it in her patients. She also identified ways she might encourage and even promote resilience in children and teenagers. Eventually, she decided to enroll in a graduate nursing program to learn how to research different phenomena of concern to her patients and discover ways to apply the evidence-based findings to improve nursing care and patient outcomes.
Practice-theory-research links
Several important aspects of how theory is used in nursing research are embedded in Emily’s story. First, it is important to notice the links among practice, theory, and research. Each is intricately connected with the others to create the knowledge base for the discipline of nursing (Fig. 4.1). In her practice, Emily recognized a pattern of characteristics in some patients that appeared to enhance their recovery. Her mentor directed her to research that other nurses had published on the phenomenon of “resilience.” Emily was then able to apply the information on resilience and related research findings as she planned and implemented care. Her goal was to enhance each child’s resilience as much as possible and thereby improve their outcomes.
FIG 4.1 Discipline knowledge: Theory-practice-research connection.
Another key message from the case study is the importance of reflecting on an observed phenomenon and discussing it with colleagues. This promotes questioning and collaboration, as nurses seek ways to improve practice. Finally, Emily was encouraged to go to the literature to search out what had been published related to the phenomenon she had observed. Reviewing the research led her to a middle range theory on resilience as well as current nursing research that examined its importance in caring for adolescents with cancer. This then challenged her to consider how she might ultimately conduct her own research.
Overview of theory
Theory is a set of interrelated concepts that provides a systematic view of a phenomenon. A theory allows relationships to be proposed and predictions made, which in turn can suggest potential actions. Beginning with a theory gives a researcher a logical way of collecting data to describe, explain, and predict nursing practice, making it critical in research.
In nursing, science is the result of the interchange between research and theory. The purpose of research is to build knowledge through the generation or testing of theory that can then be applied in practice. To build knowledge, research should develop within a theoretical structure or blueprint that facilitates analysis and interpretation of findings. The use of theory provides structure and organization to nursing knowledge. It is important that nurses understand that nursing practice is based on the theories that are generated and validated through research (McEwen & Wills, 2014).
In an integrated, reciprocal manner, theory guides research and practice; practice enables testing of theory and generates research questions; and research contributes to theory building and establishing practice guidelines (see Fig. 4.1). Therefore, what is learned through practice, theory, and research interweaves to create the knowledge fabric of nursing. From this perspective, like Emily in the case study, each nurse should be involved in the process of contributing to the knowledge or evidence-based practice of nursing.
Several key terms are often used when discussing theory. It is necessary to understand these terms when considering how to apply theory in practice and research. They include concept, conceptual definition, conceptual/theoretical framework, construct, model, operational definition, and theory. Each term is defined and summarized in Box 4.1. Concepts and constructs are the major components of theories and convey the essential ideas or elements of a theory. When a nurse researcher decides to study a concept/construct, the researcher must precisely and explicitly describe and explain the concept, devise a mechanism to identify and confirm the presence of the concept of interest, and determine a method to measure or quantify it. To illustrate, Table 4.1 shows the key concepts and conceptual and operational definitions provided by Turner-Sack and colleagues (2016) in their study on psychological issues among parents and siblings of adolescent cancer survivors (see Appendix D).
TABLE 4.1 Concepts and Variables: Conceptual and Operational Definitions
BOX 4.1 Definitions
Concept
Image or symbolic representation of an abstract idea; the key identified element of a phenomenon that is necessary to understand it. Concepts can be concrete or abstract. A concrete concept can be easily identified, quantified, and measured, whereas an abstract concept is more difficult to quantify or measure. For example, weight, blood pressure, and body temperature are concrete concepts. Hope, uncertainty, and spiritual pain are more abstract concepts. In a study, resilience is a relatively abstract concept.
Conceptual definition
Much like a dictionary definition, a conceptual definition conveys the general meaning of the concept. However, the conceptual definition goes beyond the general language meaning found in the dictionary by defining or explaining the concept as it is rooted in theoretical literature.
Conceptual framework/theoretical framework
A set of interrelated concepts that represents an image of a phenomenon. These two terms are often used interchangeably. The conceptual/theoretical framework refers to a structure that provides guidance for research or practice. The framework identifies the key concepts and describes their relationships to each other and to the phenomena (variables) of concern to nursing. It serves as the foundation on which a study can be developed or as a
map to aid in the design of the study.
Construct
Complex concept; constructs usually comprise more than one concept and are built or "constructed" to fit a purpose. Health promotion, maternal-infant bonding, health-seeking behaviors, and health-related quality of life are examples of constructs.
Model
A graphic or symbolic representation of a phenomenon. A graphic model is empirical and can be readily represented. A model of an eye or a heart is an example. A symbolic or theoretical model depicts a phenomenon that is not directly observable and is expressed in language or symbols. Written music and Einstein's theory of relativity are examples of symbolic models. Theories used by nurses or developed by nurses frequently include symbolic models. Models are very helpful in allowing the reader to visualize key concepts/constructs and their identified interrelationships.
Operational definition
Specifies how the concept will be measured; that is, the operational definition identifies the instruments that will be used to assess the presence of the concept and to describe the amount or degree to which the concept exists.
Theory
Set of interrelated concepts that provides a systematic view of a phenomenon.
Types of theories used by nurses
As stated previously, a theory is a set of interrelated concepts that provides a systematic view of a phenomenon. Theory provides a foundation and structure that may be used for the purpose of explaining or predicting another phenomenon. In this way, a theory is like a blueprint or a guide for modeling a structure. A blueprint depicts the elements of a structure and the relationships among the elements; similarly, a theory depicts the concepts that compose it and suggests how the concepts are related.
Nurses use a multitude of different theories as the foundation or structure for research and practice. Many have been developed by nurses and are explicitly related to nursing practice; others, however, come from other disciplines. Knowledge that draws upon both nursing and non-nursing theories is extremely important in order to provide excellent, evidence-based care.
Theories from related disciplines used in nursing practice and research
Like engineering, architecture, social work, and teaching, nursing is a practice discipline. That means that nurses use concepts, constructs, models, and theories from many disciplines in addition to nursing-specific theories. This is, to a large extent, the rationale for the "liberal arts" education that is required before entering a BSN program. Exposure to knowledge and theories of basic and natural sciences (e.g., mathematics, chemistry, biology) and social sciences (e.g., psychology, sociology, political science) provides a fundamental understanding of those disciplines and allows for application of key principles, concepts, and theories from each, as appropriate.
Likewise, BSN-prepared nurses use principles of administration and management and learning theories in patient-centered, holistic practices. Table 4.2 lists a few of the many theories and concepts from other disciplines that are commonly used by nurses in practice and research that become part of the foundational framework for nursing.
TABLE 4.2 Theories Used in Nursing Practice and Research
Biomedical sciences: Germ theory (principles of infection), pain theories, immune function, genetics/genomics, pharmacotherapeutics
Sociologic sciences: Systems theory (e.g., von Bertalanffy), family theory (e.g., Bowen), role theory (e.g., Merton), critical social theory (e.g., Habermas), cultural diversity (e.g., Leininger)
Behavioral sciences: Developmental theories (e.g., Erikson), human needs theories (e.g., Maslow), personality theories (e.g., Freud), stress theories (e.g., Lazarus & Folkman), health belief model (e.g., Rosenstock)
Learning theories: Behavioral learning theories (e.g., Pavlov, Skinner), cognitive development/interaction theories (e.g., Piaget), adult learning theories (e.g., Knowles)
Leadership/management: Change theory (e.g., Lewin), conflict management (e.g., Rapaport), quality framework (e.g., Donabedian)
Nursing theories used in practice and research
In addition to the theories and concepts from disciplines other than nursing, the nursing literature presents a number of theories that were developed specifically by and for nurses. Typically, nursing theories reflect concepts, relationships, and processes that contribute to the development of a body of knowledge specific to nursing's concerns. Understanding these interactions and relationships among the concepts and phenomena is essential to evidence-based nursing care. Further, theories unique to nursing help define how nursing differs from other disciplines.
HELPFUL HINT In research and practice, concepts often create descriptions or images that emerge from a conceptual definition. For instance, pain is a concept with different meanings based on the type or aspect of pain being referred to. As such, there are a number of methods and instruments to measure pain. So a nurse researching postoperative pain would conceptually define pain based on the patient’s perceived discomfort associated with surgery, and then select a pain scale/instrument that allows the researcher to operationally define pain as the patient’s score on that scale.
Nursing theories are often described based on their scope or degree of abstraction. Typically, these are reported as “grand,” “middle range,” or “situation specific” (also called “microrange”) nursing theories. Each is described in this section.
Grand nursing theories
Grand nursing theories are sometimes referred to as nursing conceptual models and include the theories/models that were developed to describe the discipline of nursing as a whole. This comprises the works of nurse theorists such as Florence Nightingale, Virginia Henderson, Martha Rogers, Dorothea Orem, and Betty Neuman. Grand nursing theories/models are all-inclusive
conceptual structures that tend to include views on persons, health, and environment to create a perspective of nursing. This most abstract level of theory has established a knowledge base for the discipline. These works are used as the conceptual basis for practice and research, and are tested in research studies.
One grand theory is not better than another with respect to research. Rather, these varying perspectives allow a researcher to select a framework for research that best depicts the concepts and relationships of interest, and decide where and how they can be measured as study variables. What is most important about the use of grand nursing theoretical frameworks for research is the logical connection of the theory to the research question and the study design. Nursing literature contains excellent examples of research studies that examine concepts and constructs from grand nursing theories. See Box 4.2 for an example.
BOX 4.2 Grand Theory Example
Wong and colleagues (2015) used Orem's self-care deficit nursing theory to examine the effects of several factors, such as parental educational level, pain intensity, and self-medication, on self-care behaviors among adolescent girls with dysmenorrhea. The researchers used a correlational study design that surveyed 531 high school–aged girls. Using constructs from Orem's theory, they determined that health care providers should design interventions that promote self-care behaviors among adolescents with dysmenorrhea, specifically targeting those who are younger, those who report higher pain intensity, and those who do not routinely self-medicate for menstrual pain.
Middle range nursing theories
Beginning in the late 1980s, nurses recognized that grand theories were difficult to apply in research, and considerable attention moved to the development and research of "middle range" nursing theories. In contrast to grand theories, middle range nursing theories contain a limited number of concepts and are focused on a limited aspect of reality. As a result, they are more easily tested through research and more readily used as frameworks for research
studies (McEwen & Wills, 2014). A growing number of middle range nursing theories have been
developed, tested through research, and/or used as frameworks for nursing research. Examples are Pender's Health Promotion Model (Pender et al., 2015), the Theory of Uncertainty in Illness (Mishel, 1988, 1990, 2014), the Theory of Unpleasant Symptoms (Lenz, Pugh, et al., 1997; Lenz, Gift, et al., 2017), and the Theory of Holistic Comfort (Kolcaba, 1994, 2017).
Examples of development, use, and testing of middle range theories and models are becoming increasingly common in the nursing literature. The comprehensive health-seeking and coping paradigm (Nyamathi, 1989) is one example. Indeed, Nyamathi's model served as the conceptual framework of a recent research study that examined interventions to improve hepatitis A and B vaccine completion among homeless men (Nyamathi et al., 2015) (see Box 4.3 and Appendix A). In this study, the findings were interpreted according to the model. The researchers identified several predictors of vaccine completion and concluded that providers should work to recognize factors that promote health-seeking and coping behaviors among high-risk populations.
BOX 4.3 Middle Range Theory Exemplars
An integrative research review was undertaken to evaluate the connection between symptom experience and illness-related uncertainty among patients diagnosed with brain tumors. The Theory of Uncertainty in Illness (Mishel, 1988, 1990, 2014) was the conceptual framework for interpretation of the review's findings. The researchers concluded that somatic symptoms are antecedent to uncertainty among brain tumor patients, and that nursing strategies should attempt to understand and manage symptoms to reduce anxiety and distress by mitigating illness-related uncertainty (Cahill et al., 2012).
Bryer and colleagues (2013) conducted a study of health promotion behaviors of undergraduate nursing students. This study was based on Pender's HPM (Pender et al., 2015). Several variables for the study were operationalized and measured using the Health-Promoting Lifestyle Profile II, a survey instrument that
was developed to be used in studies that focus on HPM concepts.
HPM, Health Promotion Model.
Situation-specific nursing theories: Microrange, practice, or prescriptive theories
Situation-specific nursing theories are sometimes referred to as microrange, practice, or prescriptive theories. Situation-specific theories are more specific than middle range theories and are composed of a limited number of concepts. They are narrow in scope, explain a small aspect of the phenomena and processes of interest to nurses, and are usually limited to specific populations or fields of practice (Chinn & Kramer, 2015; Im, 2014; Peterson, 2017). Im and Chang (2012) observed that as nursing research began to require theoretical bases that are easily operationalized into research, situation-specific theories provided closer links to research and practice. The idea and practice of identifying a work as a situation-specific theory is still fairly new. Often what is noted by an author as a middle range theory would more appropriately be termed situation specific. Most commonly, however, a theory is developed from a research study, and no designation (e.g., middle range, situation specific) is attached to it.
Examples of self-designated, situation-specific theories include the theory of men’s healing from childhood maltreatment (Willis et al., 2015) and a situation-specific theory of health-related quality of life among Koreans with type 2 diabetes (Chang & Im, 2014). Increasingly, qualitative studies are being used by nurses to develop and support theories and models that can and should be expressly identified as situation specific. This will become progressively more common as more nurses seek graduate study and are involved in research, and increasing attention is given to the importance of evidence-based practice (Im & Chang, 2012; McEwen & Wills, 2014).
Im and Chang (2012) conducted a comprehensive research review that examined how theory has been described in nursing literature for the last decade. They reported a dramatic increase in the number of grounded theory research studies, along with increases in studies using both middle range and situation-specific
theories. In contrast, the number and percentage of studies directly dealing with grand nursing theories have fluctuated. Table 4.3 provides examples of grand, middle range, and situation-specific nursing theories used in nursing research.
TABLE 4.3 Levels of Nursing Theory: Examples of Grand, Middle Range, and Situation-Specific Nursing Theories
Grand Nursing Theories:
• Florence Nightingale: Notes on Nursing (1860)
• Dorothy Johnson: The Behavioral System Model for Nursing (1990)
• Martha Rogers: Nursing: A Science of Unitary Human Beings (1970, 1990)
• Betty Neuman: The Neuman Systems Model (2009)
• Dorothea Orem: The Self-Care Deficit Nursing Theory (2001)
• Callista Roy: Roy Adaptation Model (2009)

Middle Range Nursing Theories:
• Health promotion model (Pender et al., 2015)
• Uncertainty in illness theory (Mishel, 1988, 1990, 2014)
• Theory of unpleasant symptoms (Lenz, Gift, et al., 2017)
• Theory of holistic comfort/theory of comfort (Kolcaba, 1994, 2017)
• Theory of resilience (Polk, 1997)
• Theory of health promotion in preterm infants (Mefford, 2004)
• Theory of flight nursing expertise (Reimer & Moore, 2010)
• Theory of the peaceful end of life (Ruland & Moore, 1998)
• Theory of chronic sorrow (Eakes, 2017; Eakes et al., 1998)

Situation-Specific (or Micro) Nursing Theories:
• Asian immigrant women's menopausal symptom experience in the United States (Im, 2012)
• Theory of Caucasians' cancer pain experience (Im, 2006)
• Becoming a mother (Mercer, 2004)
How theory is used in nursing research
Nursing research is concerned with the study of individuals in interaction with their environments. The intent is to discover interventions that promote optimal functioning and self-care across the life span; the goal is to foster maximum wellness (McEwen & Wills, 2014). In nursing research, theories are used in the research process in one of three ways:
• Theory is generated as the outcome of a research study (qualitative designs).
• Theory is used as a research framework, as the context for a study (qualitative or quantitative designs).
• Research is undertaken to test a theory (quantitative designs).
Theory-generating nursing research
When research is undertaken to create or generate theory, the idea is to examine a phenomenon within a particular context and identify and describe its major elements or events. Theory-generating research is focused on "What" and "How," but does not usually attempt to explain "Why." Theory-generating research is inductive; that is, it uses a process in which generalizations are developed from specific observations. Research methods used by nurses for theory generation include concept analysis, case studies, phenomenology, grounded theory, ethnography, and historical inquiry. Chapters 5, 6, and 7 describe these research methods. As you review qualitative methods and study examples in the literature, be attuned to the stated purpose(s) or outcomes of the research and note whether a situation-specific (practice or micro) theory or model or middle range theory is presented as a finding or outcome.
Theory as framework for nursing research
In nursing research, theory is most commonly used as the conceptual framework, theoretical framework, or conceptual model for a study. Frequently, correlational research designs attempt to discover and specify relationships between characteristics of individuals, groups, situations, or events. Correlational research studies often focus on one or more concepts, frameworks, or theories to collect data to measure dimensions or characteristics of phenomena and explain why and the extent to which one phenomenon is related to another. Data are typically gathered by observation or self-report instruments (see Chapter 10 for nonexperimental designs).
HELPFUL HINT When researchers use conceptual frameworks to guide their studies, you can expect to find a system of ideas synthesized for the purpose of organizing thinking and providing study direction. Whether the researcher is using a conceptual or a theoretical framework, conceptual and then operational definitions will
emerge from the framework.
Often in correlational (nonexperimental/quantitative) research, one or more theories will be used as the conceptual/theoretical framework for the study. In these cases, a theory is used as the context for the study and the basis for interpretation of the findings. The theory helps guide the study and enhances the value of its findings by setting the findings within the context of the theory and of previous work describing use of the theory in practice or research. When using a theory as a conceptual framework for research, the researcher will:
• Identify an existing theory (or theories) and designate and explain the study’s theoretical framework.
• Develop research questions/hypotheses consistent with the framework.
• Provide conceptual definitions taken from the theory/framework.
• Use data collection instrument(s) (and operational definitions) appropriate to the framework.
• Interpret/explain findings based on the framework.
• Determine support for the theory/framework based on the study findings.
• Discuss implications for nursing and recommendations for future research to address the concepts and relationships designated by the framework.
Theory-testing nursing research
Finally, nurses may use research to test a theory. Theory testing is deductive—that is, hypotheses are derived from theory and tested, employing experimental research methods. In experimental research, the intent is to move beyond explanation to prediction of relationships between characteristics or phenomena among different groups or in various situations. Experimental research
designs require manipulation of one or more phenomena to determine how the manipulation affects or changes the dimension or characteristics of other phenomena. In these cases, theoretical statements are written as research questions or hypotheses. Experimental research requires quantifiable data, and statistical analyses are used to measure differences (see Chapter 9).
In theory-testing research, the researcher (1) chooses a theory of interest and selects a propositional statement to be examined; (2) develops hypotheses that have measurable variables; (3) conducts the study; (4) interprets the findings considering the predictive ability of the theory; and (5) determines if there are implications for further use of the theory in nursing practice and/or whether further research could be beneficial.
EVIDENCE-BASED PRACTICE TIP In practice, you can use observation and analysis to consider the nuances of situations that matter to patient health. This process often generates questions that are pertinent to improving patient care. In turn, following the observations and questions into the literature can lead to published research that can be applied in practice.
HIGHLIGHT When an interprofessional QI team launches a QI project to develop evidence-based behavior change self-management strategies for a targeted patient population, it may be helpful to think about the Transtheoretical Model of Change and health self-efficacy as an appropriate theoretical framework to guide the project.
Application to research and evidence-based practice
To build knowledge that promotes evidence-based practice, research should develop within a theoretical structure that facilitates analysis and interpretation of findings. When a study is placed within a theoretical context, the theory guides the research process, forms the questions, and aids in design, analysis, and
interpretation. In that regard, a theory, conceptual model, or conceptual framework provides parameters for the research and enables the researcher to weave the facts together.
As a consumer of research, you should know how to recognize the theoretical foundation of a study. Whether evaluating a qualitative or a quantitative study, it is essential to understand where and how the research can be integrated within nursing science and applied in evidence-based practice. As a result, it is important to identify whether the intent is to (1) generate a theory, (2) use the theory as the framework that guides the study, or (3) test a theory. This section provides examples that illustrate different types of theory used in nursing research (e.g., non-nursing theories, middle range nursing theories) and examples from the literature highlighting the different ways that nurses can use theory in research (e.g., theory-generating study, theory testing, theory as a conceptual framework).
Application of theory in qualitative research
As discussed, in many instances, a theory, framework, or model is the outcome of nursing research. This is often the case in research employing qualitative methods such as grounded theory. From the study's findings, the researcher builds either an implicit or an explicit structure explaining or describing the findings of the research.
Example: ➤ van Dijk and colleagues (2016) (see Appendix C) reported findings from a study examining how postoperative patients rated their pain experiences. The researchers were interested in understanding potential differences in how postoperative patients interpret numeric pain rating scales. Using a qualitative approach to data collection, the team interviewed 27 patients 1 day after surgery. They discovered three themes (score-related factors, intrapersonal factors, and anticipated consequences of a pain score). The result of the research was a model that may be used by health providers to understand the factors that influence how pain scales may be interpreted by patients. Appropriate questions for clarification were also suggested.
Generally, when the researcher is using qualitative methods and inductive reasoning, you will find the framework or theory at the
end of the manuscript in the discussion section (see Chapters 5 to 7). You should be aware that the framework may be implicitly suggested rather than explicitly diagrammed (Box 4.4).
BOX 4.4 Research
Martz (2015) used grounded theory research methods to examine actions taken by hospice nurses to alleviate the feelings of guilt often experienced by caregivers. In this study, 16 hospice providers (most were nurses) were interviewed to identify interventions they used to reduce feelings of guilt among family caregivers during the transition from caring for their loved one at home to placing their loved one in an assisted living facility. The hospice nurses explained that the family caregivers worked through a five-stage process in their guilt experiences, moving from "feeling guilty" to "resolving their guilt" during the transition period. The actions of the hospice nurses varied based on the stage of the family caregiver's feelings of guilt. These actions included supporting, managing, navigating, negotiating, encouraging, monitoring, and coaching. A situation-specific model was proposed to explain the relationships among these processes and to suggest congruent hospice nursing interventions.
The nursing literature is full of similar examples in which inductive, qualitative research methods were used to develop theory. Example: ➤ Oneal and colleagues (2015) used grounded theory methods to conduct interviews with 10 low-income families who were involved in a program to reduce environmental risks to their children. Their findings were developed into the "theory of re-forming the risk message," which can be used to design nursing interventions that reduce environmental risk. It was concluded that nurses working with low-income families should seek to discover how risk messages are heard and interpreted and develop interventions accordingly. Finally, Taplay and colleagues (2015) used grounded theory methods to develop a model to describe the process of adopting and incorporating simulation into nursing education. The researchers interviewed 27 nursing faculty members from several
schools to learn about their experiences incorporating simulation activities into their nursing programs. From the interviews, the researchers identified a seven-phase process of simulation adoption: securing resources, leaders working in tandem, “getting it out of the box,” learning about simulation and its potential, trialing the equipment, finding a fit, and integrating simulation into the curriculum.
Examples of theory as research framework
When the researcher uses quantitative methods, the framework is typically identified and explained at the beginning of the paper, before the discussion of study methods. Example: ➤ In their study examining the relationships among spirituality, coping strategies, grief, and mental health in bereaved parents, Hawthorne and colleagues (2016) (see Appendix B) indicated that their "conceptual framework" was derived from a Theory of Bereavement developed by Hogan and colleagues (1996). Specifically, Hawthorne's team used tools developed to measure variables from the Theory of Bereavement. In addition to grief, their research examined spiritual coping, mental health, and personal growth—all variables implicit or explicit in the bereavement theory. Their conclusions were interpreted with respect to the theory, suggesting that nurses and other health care providers promote coping strategies, including religious and spiritual activities, as these appear to be helpful for mental health and personal growth in many bereaved parents.
In another example, one of the works read by Emily from the case study dealt with resilience in adolescents (Tusaie et al., 2007). The researchers in this work used Lazarus and Folkman’s (1984) theory of stress and coping as part of the theoretical framework, researching factors such as optimism, family support, age, and life events.
Examples of theory-testing research
Although experimental and quasi-experimental nursing studies (see Chapter 9) are frequently conducted to test interventions, examples of research expressly conducted to test a theory are relatively rare in the nursing literature. One such work is a multisite, multimethods study examining women's perceptions of
cesarean birth (Fawcett et al., 2012). This work tested multiple relationships within the Roy Adaptation Model as applied to the study population.
CRITICAL APPRAISAL CRITERIA
Critiquing Theoretical Framework
1. Is the framework for research clearly identified?
2. Is the framework consistent with a nursing perspective?
3. Is the framework appropriate to guide research on the subject of interest?
4. Are the concepts and variables clearly and appropriately defined?
5. Was sufficient literature presented to support study of the selected concepts?
6. Is there a logical, consistent link between the framework, the concepts being studied, and the methods of measurement?
7. Are the study findings examined in relationship to the framework?
Critiquing the use of theory in nursing research
It is beneficial to seek out and identify the theoretical framework or background of a study. The framework for research provides guidance for the researcher as study questions are fine-tuned, methods for measuring variables are selected, and analyses are planned. Once data are collected and analyzed, the framework is used as a base of comparison. Ideally, the research report should explain: Did the findings coincide with the framework? Did the findings support or refute findings of other researchers who used the framework? If there were discrepancies, is there a way to explain them using the framework? The reader of research needs to know how to critically appraise a framework for research (see the Critical Appraisal Criteria box).
The first question posed is whether a framework is presented. Sometimes a structure may be guiding the research, but a diagrammed model is not included in the manuscript. You must then look for the theoretical framework in the narrative description of the study concepts. When the framework is identified, it is important to consider its relevance for nursing. The framework does not have to be one created by a nurse, but the importance of its content for nursing should be clear. The question of how the framework depicts a structure congruent with nursing should be addressed. For instance, although the Lazarus Transaction Model of Stress and Coping was not created by a nurse, it is clearly related to nursing practice when working with people facing stress. Sometimes frameworks from different disciplines, such as physics or art, may be relevant. It is the responsibility of the author to clearly articulate the meaning of the framework for the study and to link the framework to nursing.
Once the meaning and applicability of the theory (if the objective of the research was theory development) or the theoretical framework to nursing are articulated, you will be able to determine whether the framework is appropriate to guide the research. As you critically appraise a study, you would identify a mismatch, for example, in which a researcher presents a study of students’ responses to the stress of being in the clinical setting for the first time within a framework of stress related to recovery from chronic illness. You should look closely at the framework to determine if it is “on target” and the “best fit” for the research question and proposed study design.
Next, the reader should focus on the concepts being studied. Does the researcher clearly describe and explain the concepts that are being studied and how they are defined and translated into measurable variables? Is there literature to support the choice of concepts? Concepts should clearly reflect the area of study. Example: ➤ Using the concept of "anger" when "incivility" or "hostility" is more appropriate to the research focus creates difficulties in defining variables and determining methods of measurement. These issues have to do with the logical consistency among the framework, the concepts being studied, and the methods of measurement.
Throughout the entire critiquing process, from worldview to operational definitions, the reader is evaluating the fit. Finally, the reader will expect to find a discussion of the findings as they relate to the theory or framework. This final point enables evaluation of the framework for use in further research. It may suggest necessary changes to enhance the relevance of the framework for continuing study, and thus serves to let others know where one will go from here.
Evaluating frameworks for research requires skills that must be acquired through repeated critique and discussion with others who have critiqued the same work. As with other abilities and skills, you must practice and use the skills to develop them further. With continuing education and a broader knowledge of potential frameworks, you will build a repertoire of knowledge to assess the foundation of a research study and the framework for research, and/or to evaluate findings where theory was generated as the outcome of the study.
Key points
• The interaction among theory, practice, and research is central to knowledge development in the discipline of nursing.
• The use of a framework for research is important as a guide to systematically identify concepts and to link appropriate study variables with each concept.
• Conceptual and operational definitions are critical to the evolution of a study.
• In developing or selecting a framework for research, knowledge may be acquired from other disciplines or directly from nursing. In either case, that knowledge is used to answer specific nursing questions.
• Theory is distinguished by its scope. Grand theories are broadest in scope and situation-specific theories are the narrowest in scope and at the lowest level of abstraction; middle range theories are
in the middle.
• In critiquing a framework for research, it is important to examine the logical, consistent link among the framework, the concepts for study, and the methods of measurement.
Critical thinking challenges
• Search recent issues of a prominent nursing journal (e.g., Nursing Research, Research in Nursing & Health) for notations of conceptual frameworks of published studies. How many explicitly discussed the theoretical framework? How many did not mention any theoretical framework? What kinds of theories were mentioned (e.g., grand nursing theories, middle range nursing theories, non-nursing theories)? How many studies were theory generating? How many were theory testing?
• Identify a non-nursing theory that you would like to know more about. How could you find out information on its applicability to nursing research and nursing practice? How could you identify whether and how it has been used in nursing research?
• Select a nursing theory, concept, or phenomenon (e.g., resilience from the case study) that you are interested in and would like to know more about and consider: How could you find studies that have used that theory in research and practice? How could you locate published instruments and tools that reportedly measure concepts and constructs of the theory?
• You have just joined an interprofessional primary care QI Team focused on developing evidence-based self-management strategies to decrease hospital admissions for the practice’s heart failure patients. Which theoretical framework could be used to guide your project?
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
The author would like to acknowledge the contribution of Patricia Liehr, who contributed this chapter in a previous edition.
References
1. Ahern N.R. Adolescent resilience An evolutionary concept
analysis. Journal of Pediatric Nursing 2006;21(3):175-185. 2. Ahern N.R, Ark P, Byers J. Resilience and coping strategies in
adolescents. Pediatric Nursing 2008;20(10):32-36. 3. Bryer J, Cherkis F, Raman J. Health-promotion behaviors of
undergraduate nursing students A survey analysis. Nursing Education Perspectives 2013;34(6):410-415.
4. Cahill J, LoBiondo-Wood G, Bergstrom N, et al. Brain tumor symptoms as antecedents to uncertainty An integrative review. Journal of Nursing Scholarship 2012;44(2):145-155.
5. Chang S.J, Im E. Development of a situation-specific theory for explaining health-related quality of life among older South Korean adults with type 2 diabetes. Research and Theory for Nursing Practice An International Journal 2014;28(2):113-126.
6. Chen C.M, Chen Y.C, Wong T.T. Comparison of resilience in adolescent survivors of brain tumors and healthy adolescents. Cancer Nursing 2014;37(5):373-381.
7. Chinn P.L, Kramer M.K. Integrated theory and knowledge development in nursing. 9th ed. St. Louis, MO: Elsevier;2015.
8. Eakes G. Chronic sorrow. In: Peterson S.J, Bredow T.S. Middle range theories Application to nursing research 4th ed. Philadelphia, PA: Wolters Kluwer;2017.
9. Eakes G, Burke M.L, Hainsworth M.A. Middle range theory of chronic sorrow. Image Journal of Nursing Scholarship 1998;30(2):179-185.
10. Fawcett J, Abner C, Haussler S, et al. Women’s perceptions of cesarean birth A Roy international study. Nursing Science Quarterly 2012;25(4):352-362.
11. Hawthorne D.M, Youngblut J.M, Brooten D. Parent spirituality, grief and mental health at 1 and 3 months after their infant’s/child’s death in an intensive care unit. Journal of Pediatric Nursing 2016;31(1):73-80.
12. Hogan N.S, Morse J.M, Tason M.C. Toward an experiential
theory of bereavement. Omega Journal of Death and Dying 1996;33:43-65.
13. Im E. A situation-specific theory of Caucasian cancer patients’ pain experience. Advances in Nursing Science 2006;28(2):137- 151.
14. Im E. The status quo of situation-specific theories. Research and Theory for Nursing Practice An International Journal 2014;28(4):278-298.
15. Im E, Chang S.J. Current trends in nursing theories. Journal of Nursing Scholarship 2012;44(2):156-164.
16. Ishibashi A, Okamura J, Ueda R, et al. Psychological strength enhancing resilience in adolescents and young adults with cancer. Journal of Pediatric Oncology Nursing 2016;33(1):45-54.
17. Johnson D.E. The behavioral system model for nursing. In: Parker M.E. Nursing theories in practice. New York, NY: National League for Nursing Press 1990;23-32
18. Kolcaba K.Y. A theory of holistic comfort for nursing. Journal of Advanced Nursing 1994;19(6):1178-1184.
19. Kolcaba K.Y. Comfort. In: Peterson S.J, Bredow T.S. Middle range theories Application to nursing research 4th ed. Philadelphia, PA: Wolters Kluwer 2017;254-272.
20. Lazarus R.S, Folkman S. Stress, appraisal and coping. New York, NY: Springer;1984.
21. Lenz E.R, Gift A, Pugh L.C, Milligan R.A. Unpleasant symptoms. In: Peterson S.J, Bredow T.S. Middle range theories Application to nursing research 4th ed. Philadelphia, PA: Wolters Kluwer;2017.
22. Lenz E.R, Pugh L.C, Milligan R.A, et al. The middle range theory of unpleasant symptoms An update. Advances in Nursing Science 1997;19(3):14-27.
23. Martz K. Actions of hospice nurses to alleviate guilt in family caregivers during residential care transitions. Journal of Hospice and Palliative Nursing 2015;17(1):48-55.
24. McEwen M, Wills E. Theoretical basis for nursing. 4th ed. Philadelphia, PA: Lippincott;2014.
25. Mefford L.C. A theory of health promotion for preterm infants based on Levine’s conservation model of nursing. Nursing Science Quarterly 2004;17(3):260-266.
26. Mercer R.T. Becoming a mother versus maternal role attainment. Journal of Nursing Scholarship 2004;36(3):226-232.
27. Mishel M.H. Uncertainty in illness. Journal of Nursing Scholarship 1988;20(4):225-232.
28. Mishel M.H. Reconceptualization of the uncertainty in illness theory. Image Journal of Nursing Scholarship 1990;22(4):256- 262.
29. Mishel M.H. Theories of uncertainty in illness. In: Smith M.J, Liehr P.R. Middle range theory for nursing 3rd ed. New York, NY: Springer Publishing Co 2014;53-86.
30. Neuman B, Fawcett J. The Neuman systems model. 5th ed. Upper Saddle River, NJ: Pearson Education;2009.
31. Nightingale F. Notes on nursing What it is and what it is not. New York, NY: Dover Publications (Original work published 1860);1969.
32. Nyamathi A. Comprehensive health seeking and coping paradigm. Journal of Advanced Nursing 1989;14(4):281-290.
33. Nyamathi A, Salem B.E, Zhang S, et al. Nursing case management, peer coaching, and hepatitis A and B vaccine completion among homeless men recently released on parole. Nursing Research 2015;64(3):177-189.
34. Oneal G.A, Eide P, Hamilton R, et al. Rural families’ process of re-forming environmental health risk messages. Journal of Nursing Scholarship 2015;47(4):354-362.
35. Orem D.E. Nursing Concepts of practice. 6th ed. St Louis, MO: Mosby;2001.
36. Pender N.J, Murdaugh C, Parsons M. Health promotion in nursing practice. 7th ed. Upper Saddle River, NJ: Pearson Education;2015.
37. Peterson S.J. Introduction to the nature of nursing knowledge. In: Peterson S.J, Bredow T.S. Middle range theories Application to nursing research 4th ed. Philadelphia, PA: Wolters Kluwer 2017;3-41.
38. Polk L.V. Toward a middle range theory of resilience. Advances in Nursing Science 1997;19(3):1-13.
39. Reimer A.P, Moore S.M. Flight nursing expertise towards a middle-range theory. Journal of Advanced Nursing 2010;66(5):1183-1192.
40. Rogers M.E. An introduction to the theoretical basis of nursing. Philadelphia, PA: Davis;1970.
41. Rogers M.E. Nursing the science of unitary, irreducible, human beings: Update: 1990. In: Barrett E.A.M. Visions of Rogers’ science-based nursing. New York, NY: National League for Nursing Press 1990;5-11
42. Roy C. The Roy adaptation model. 3rd ed. Upper Saddle River, NJ: Pearson;2009.
43. Ruland C.M, Moore S.M. Theory construction based on standards of care A proposed theory of the peaceful end of life. Nursing Outlook 1998;46(4):169-175.
44. Taplay K, Jack S.M, Baxter P, et al. The process of adopting and incorporating simulation into undergraduate nursing curricula A grounded theory study. Journal of Professional Nursing 2015;31(1):26-36.
45. Turner-Sack A.M, Menna R, Setchell S.R, et al. Psychological functioning, post-traumatic growth and coping in parents and siblings of adolescent cancer survivors. Oncology Nursing Forum 2016;43(1):48-56.
46. Tusaie K, Puskar K, Sereika S.M. A predictive and moderating model of psychosocial resilience in adolescents. Journal of Nursing Scholarship 2007;39(1):54-60.
47. van Dijk J.F.M, Vervoort S.C.J.M, van Wijck A.J.M, et al. Postoperative patients’ perspectives on rating pain A qualitative study. International Journal of Nursing Studies 2016;53:260- 269.
48. Willis D.G, DeSanto-Madeya S, Fawcett J. Moving beyond dwelling in suffering A situation-specific theory of men’s healing from childhood maltreatment. Nursing Science Quarterly 2015;28(1):57-63.
49. Wong C.L, Ip W.Y, Choi K.C, Lam L.W. Examining self-care behaviors and their associated factors among adolescent girls with dysmenorrhea An application of Orem’s Self-care Deficit Nursing Theory. Journal of Nursing Scholarship 2015;47(3):219-227.
50. Wu W.W, Tsai S.Y, Liang S.Y, et al. The mediating role of resilience on quality of life and cancer symptom distress in adolescent patients with cancer. Journal of Pediatric Oncology
Nursing 2015;32(5):304-313.
PART II
Processes and Evidence Related to Qualitative Research
Research Vignette: Gail D’Eramo Melkus
OUTLINE
Introduction
5. Introduction to qualitative research
6. Qualitative approaches to research
7. Appraising qualitative research
Introduction
Research vignette
Type 2 diabetes: Journey from description to biobehavioral intervention
Gail D’Eramo Melkus, EdD, ANP, FAAN
Florence and William Downs Professor in Nursing Research
Director, Muriel and Virginia Pless Center for Nursing Research
Associate Dean for Research
New York University Rory Meyers College of Nursing
My nursing career began at a time when there was an emphasis on health promotion, disease prevention, and active participation of patients and families in health care decision making and interactions. This emphasis was consistent with an ever-increasing incidence and prevalence of chronic conditions, particularly diabetes and cardiovascular disease. It became apparent in time through epidemiological studies that certain populations had a disproportionate burden of these chronic conditions that resulted in premature morbidity and mortality. It also became apparent that the health care workforce was not prepared to deal with the changing paradigm of chronic disease management that necessitated active patient involvement. Thus I came to understand the best way to enhance diabetes care for all persons was to
improve clinical practice through research and professional education.
Diabetes is a prevalent chronic illness affecting approximately 29 million individuals in the United States and 485 million globally. Thus the dissemination and translation of research findings to clinical practice are necessary to decrease the personal and economic burden of disease. In order to contribute to the improvement of diabetes care and outcomes, my role as a direct care provider extended to and encompassed clinical research and education and served as a model for my mentees. My integrated scholarship addresses the quality and effectiveness of diabetes behavioral interventions and care in the context of the patient and culture, primary care, and professional practice while also serving as a training ground for clinical practice and clinical research. This work has extended to collaborations with colleagues nationally and internationally. My research collectively demonstrated the beneficial effects of behavioral self-management interventions combined with diabetes care in primary care.
My program of research has contributed to the body of literature that has demonstrated the effectiveness of behavioral interventions in improving metabolic control (hemoglobin A1c [HbA1c], BP, lipids, and weight) and diabetes-related emotional distress. One of my early studies tested a comprehensive intervention for obese men and women with type 2 diabetes that demonstrated efficacy in significantly improving diabetes control and weight loss compared to a control group that received a customary intervention of diabetes patient education (D’Eramo-Melkus et al., 1992). Post-hoc analysis of study participants with equal weight loss yet disparate HbA1c levels revealed that persons with elevated HbA1c levels had decreased insulin secretion capacity that was associated with a duration of type 2 diabetes of 10 years or greater. This study contributed to clinical practice recommendations that called for assessment of insulin secretion capacity to direct therapeutic interventions such that persons with low insulin secretory reserve should be started on insulin rather than continued weight loss intervention alone. It became apparent during the implementation of the intervention study that the majority of participants received diabetes care in primary care settings where diabetes care and self-management
resources were scarce or nonexistent. In an effort to better understand the delivery of diabetes care within primary care settings so that we could best develop effective patient-centered interventions, my research turned to assessing nurse practitioner (NP) and physician diabetes care practice patterns in a large urban primary care center. This study showed that both primary care NPs and physicians were not providing diabetes care according to the American Diabetes Association clinical care guidelines. In fact, screening for diabetes complications occurred in fewer than 50% of cases, and NPs performed foot exams less often than physicians (Fain & D’Eramo-Melkus, 1994). These findings, along with other studies that found similar results, provided an impetus to develop and implement a model program of advanced practice nursing education and subspecialty training in diabetes care (D’Eramo-Melkus & Fain, 1995). Graduates of this program (over 300 to date) have assumed leadership roles in facilitating diabetes care in generalist and specialty settings throughout the United States, Canada, and various international sites. During this education and training program, many of the students participated in my program of research and contributed to a growing body of literature on diabetes care.
Epidemiological studies in the early 1990s showed that increasingly ethnic minorities suffered a disproportionate burden of type 2 diabetes and related complications. In particular, black women had and continue to have the highest rate of disease and diabetes-related complications, with the poorest health outcomes, and a 40% greater mortality compared to black men and white men and women. Therefore my program of research came to focus on this group, beginning with descriptive studies that described the context of type 2 diabetes for black women. The first study of a small convenience sample of volunteers from an urban center revealed a group of midlife black women, the majority of whom were employed and customary utilizers of primary care. Despite their poor glycemic control (average HbA1c 12.8%), only 68% received diabetes medications, and less than 50% of the time were they screened for diabetes complications. In order to better understand factors contributing to such findings, we conducted focus groups to elicit information on diabetes beliefs and practices
of black women with type 2 diabetes. Key themes that emerged were a need for diabetes education and health care provider rapport, the importance of culturally appropriate diabetes education materials, and the importance of family support (Maillet et al., 1996; Melkus et al., 2002).
Based on an informant survey and focus group data, using social learning theory and cognitive behavioral methods that incorporated the context of culture for black women with type 2 diabetes and input from a community advisory board, we developed and tested a culturally relevant intervention of group diabetes self-management education and skills training (DSME/T), along with nurse practitioner care. This intervention was first tested for feasibility using a one-group repeated measures pretest-posttest design that demonstrated participant acceptability based on high rates of attendance at both group sessions and NP care visits, and feasibility of methods based on formative and summative process and fidelity measures. Further, glycemic control was significantly improved from baseline to 3 months and maintained at 6 months (p =.008), and the psychosocial outcome of diabetes-related emotional distress was also greatly reduced (p =.06) (Melkus et al., 2004). Given these promising results, we went on to test the efficacy of the DSME/T intervention using a two-group repeated measures design with a comparison group (control) that received customary group diabetes education; time and attention were controlled for in both groups. The primary outcome of glycemic control as measured by HbA1c was significantly improved from baseline to 3 and 6 months (p =.01, F = 6.15). The gold standard for glycemic control is HbA1c. HbA1c, when maintained in a normal range (≤7.0%), has been shown to prevent or slow the progression of diabetes-related complications (The Diabetes Control and Complications Trial Research Group, 1993).
One of the salient findings in all of the work up to this point was that the women reported high levels of diabetes-related emotional distress, given the demands of diabetes self-management and complex lives that often included multigenerational family caregiving and work. The majority were grandmothers responsible for some extent of child care, which for many negatively affected their diabetes control (Balukonis et al., 2008). Recognizing the need
to address this concern, we added a coping skills training component that followed DSME/T when we conducted a prospective randomized clinical trial (RCT) to test intervention effectiveness. The control/comparison group received a customary diabetes education program followed by drop-in question and answer sessions equivalent in time so as to control for a potential attention effect. The experimental (n = 52) and control group (n = 57) participants were in active intervention for 12 months, consisting of assigned group sessions and monthly NP visits for the first 2 months and quarterly thereafter; they were followed for a total of 24 months. As with any prospective behavioral intervention, trial attrition occurred, resulting in a sample of 77 study completers. An intention-to-treat analysis that included all participants as randomly assigned showed that the primary outcome of HbA1c was significantly improved over time for both groups (p <.0001) up to 12 months, after which time control group levels showed an increase from 12 to 24 months while the intervention group remained stable (Melkus et al., 2010). This finding demonstrates the importance of active intervention that includes numerous contacts and feedback in order to facilitate optimal diabetes self-management and glycemic control. When data from completers (n = 77) were analyzed, the same significant finding resulted over time for HbA1c at 12 and 24 months. Low-density and high-density lipoprotein cholesterol levels also significantly improved over time for both groups. Quality of life (MOS-36) vitality domain, social support, and diabetes-related emotional distress were all significantly changed in the intervention group at 24 months compared to the control group. The results showed that we reached the intended target group of black women with suboptimal glycemic control, cardiovascular risk factors, poor quality of life, and high levels of emotional distress, in need of social support. Moreover, it is important to note that both groups received intervention beyond standard "real world" primary care. Thus patients with type 2 diabetes cared for in primary care settings, when given the opportunity to participate in DSME/T, may improve in both physiological and psychosocial outcomes. Further evidence is needed to support the need for chronic disease self-management programs and psychosocial care beyond the medical visit that focuses on physiological parameters
and prescribing of therapeutic regimens.
References

1. Balukonis J., Melkus G. D., Chyun D. Grandparenthood status and health outcomes in midlife African American women with type 2 diabetes. Ethnicity and Disease 2008;18(2):141-146.
2. D’Eramo Melkus G., Fain J. A. Diabetes care concentration: A program of study for advanced practice nurses. Clinical Nurse Specialist 1995;9(6):313-316.
3. D’Eramo-Melkus G., Wylie-Rosett J., Hagan J. Metabolic impact of education on NIDDM. Diabetes Care 1992;15(7):864-869.
4. The Diabetes Control and Complications Trial Research Group. The effect of intensive treatment of diabetes on the development and progression of long-term complications in insulin-dependent diabetes mellitus. New England Journal of Medicine 1993;329:977-986.
5. Fain J. A., D’Eramo G. Nurse practitioner practice patterns based on standards of medical care for patients with diabetes. Diabetes Care 1994;17(8):879-881.
6. Maillet N. A., D’Eramo Melkus G., Spollett G. Using focus groups to characterize beliefs and practices of African American women with NIDDM. The Diabetes Educator 1996;22(1):39-45.
7. Melkus G. D., Chyun D., Newlin K., et al. Effectiveness of a diabetes self-management intervention on physiological and psychosocial outcomes. Biological Research in Nursing 2010;12(1):7-19.
8. Melkus G. D., Maillet N., Novak J., et al. Primary care cancer screening and diabetes complications screening for black women with type 2 diabetes. Journal of the American Academy of Nurse Practitioners 2002;4(1):43-48.
9. Melkus G. D., Spollett G., Jefferson V., et al. Feasibility testing of a culturally competent intervention of education and care for black women with type 2 diabetes. Applied Nursing Research 2004;17(1):10-20.
CHAPTER 5
Introduction to qualitative research
Mark Toles, Julie Barroso
Learning outcomes
After reading this chapter, the student should be able to do the following:
• Describe the components of a qualitative research report.
• Describe the beliefs generally held by qualitative researchers.
• Identify four ways qualitative findings can be used in evidence-based practice.
KEY TERMS
context dependent
data saturation
grand tour question
inclusion and exclusion criteria
inductive
naturalistic setting
paradigm
qualitative research
theme
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
Let’s say that you are reading an article that reports findings that HIV-infected men are more adherent to their antiretroviral regimens than HIV-infected women. You wonder, “Why is that? Why would women be less adherent in taking their medications? Certainly, it is not solely due to the fact that they are women.” Or say you are working in a postpartum unit and have just discharged a new mother who has debilitating rheumatoid arthritis. You wonder, “What is the process by which disabled women decide to have children? How do they go about making that decision?” These, like so many other questions we have as nurses, can be best answered through research conducted using qualitative methods. Qualitative research gives us the answers to those difficult “why?” questions. Although qualitative research can be used at many different places in a program of research, you will most often find it answering questions that we have when we understand very little about some phenomenon in nursing.
What is qualitative research?

Qualitative research is a broad term that encompasses several different methodologies that share many similarities. Qualitative studies help us formulate an understanding of a phenomenon. Nurse scholars who are trained in qualitative methods use these methods to best answer discovery-oriented research questions.
Qualitative research is explanatory, descriptive, and inductive in nature. It uses words, as opposed to numbers, to explain a phenomenon. Qualitative research lets us see the world through the eyes of another—the woman who struggles to take her antiretroviral medication, or the woman who has carefully thought
through what it might be like to have a baby despite a debilitating illness. Qualitative researchers assume that we can only understand these things if we consider the context in which they take place, and this is why most qualitative research takes place in naturalistic settings. Qualitative studies make the world of an individual visible to the rest of us. Qualitative research involves an “interpretative, naturalistic approach to the world; meaning that qualitative researchers study things in their natural settings, attempting to make sense of or interpret phenomena in terms of the meaning people bring to them” (Denzin & Lincoln, 2011, p. 3).
What do qualitative researchers believe?

Qualitative researchers believe that there are multiple realities that can be understood by carefully studying what people can tell us or what we can observe as we spend time with them. Example: ➤ The experience of having a baby, while it has some shared characteristics, is not the same for any two women, and it is definitely different for a disabled mother. Thus qualitative researchers believe that reality is socially constructed and context dependent. Even the experience of reading this book is different for any two students; one may be completely engrossed by the content, while another is reading but at the same time worrying about whether or not her financial aid will be approved soon.
Because qualitative researchers believe that the discovery of meaning is the basis for knowledge, their research questions, approaches, and activities are often quite different from those of quantitative researchers (see the Critical Thinking Decision Path). Qualitative researchers seek to understand the “lived experience” of the research participants. They might use interviews or observations to gather new data, and use new data to create narratives about research phenomena. Thus qualitative researchers know that there is a very strong imperative to clearly describe the phenomenon under study. Ideally, the reader of a qualitative research report, if even slightly acquainted with the phenomenon, would have an “aha!” moment in reading a well-written qualitative report.
So, you may now be saying, “Wow! This sounds great!
Qualitative research is for me!” Many nurses feel very comfortable with this approach because we are educated with regard to how to speak with people about the health issues concerning them; we are used to listening, and listening well. But the most important consideration for any research study is whether or not the methodology fits the question. This means that qualitative researchers must select an approach for exploring phenomena that will actually answer their research questions. Thus, as you read studies and are considering them as evidence on which to base your practice, you should ask yourself, “Does the methodology fit with the research question under study?”
HELPFUL HINT
All research is based on a paradigm, but this is seldom specifically stated in a research report.
Does the methodology fit with the research question being asked?

As we said before, qualitative methods are often best for helping us determine the nature of a phenomenon and the meaning of experience. Sometimes authors will state that they are using qualitative methods because little is known about a phenomenon, but that alone is not a good reason for conducting a study. Little may be known about a phenomenon because it does not matter! When researchers ask people to participate in a study, to open themselves and their lives for analysis, they should be asking about things that will help make a difference in people’s lives or help provide more effective nursing care. A researcher should be able to articulate a valid reason for conducting a study, beyond “little is known about this topic.”
Considering the examples at the start of this chapter, we may want to know why HIV-infected women are less adherent to their medication regimens, so we can work to change these barriers and anticipate them when our patients are ready to start taking these pills. Similarly, we need to understand the decision-making processes women use to decide whether or not to have a child when they are disabled, so we can guide or advise the next woman who is
going through this process. To summarize, a qualitative approach “fits” a research question when the researchers seek to understand the nature or experience of phenomena by attending to personal accounts of those with direct experiences related to the phenomena. Keeping in mind the purpose of qualitative research, let’s discuss the parts of a qualitative research study.
CRITICAL THINKING DECISION PATH
Selecting a Research Process
Components of a qualitative research study

The components of a qualitative research study include the review of literature, study design, study setting and sample, approaches for data collection and analysis, study findings, and conclusions with implications for practice and research. As we reflect on these parts of qualitative studies, we will see how nurses use the qualitative research process to develop new knowledge for practice
(Box 5.1).

BOX 5.1 Steps in the Research Process
• Review of the literature
• Study design
• Sample
• Setting: Recruitment and data collection
• Data collection
• Data analysis
• Findings
• Conclusions
Review of the literature

When researchers are clear that a qualitative approach is the best way to answer the research question, their first step is to review the relevant literature and describe what is already known about the phenomena of interest. This may require creativity on the researcher’s part, because there may not be any published research on the phenomenon in question. Usually there are studies on similar subjects, or with the same patient population, or on a closely related concept. Example: ➤ Researchers may want to study how women who have a disabling illness make decisions about becoming pregnant. While there may be no other studies in this particular area, there may be some on decision making in pregnancy when a woman does not have a disabling illness. These studies would be important in the review of the literature because they identify concepts and relationships that can be used to guide the research process. Example: ➤ Findings from the review can show us the precise need for new research, what participants should be in the study sample, and what kinds of questions should
be used to collect the data. Let’s consider an example. Say a group of researchers wanted to
examine HIV-infected women’s adherence to antiretroviral therapy. If there were no research on this exact topic, the researchers might examine studies on adherence to therapy in other illnesses, such as diabetes or hypertension. They might include studies that examine gender differences in medication adherence. Or they might examine the literature on adherence in a stigmatizing illness, or look at appointment adherence for women, to see what facilitates or acts as a barrier to attending health care appointments. The major point is that even though there may be no literature on the phenomenon of interest, the review of the literature will identify existing related studies that are useful for exploring the new questions. At the conclusion of an effective review, you should be able to easily identify the strengths and weaknesses in prior research and gain a clear understanding of the new research questions, as well as the significance of studying them.
Study design

The study design is a description of how the qualitative researcher plans to go about answering the research questions. In qualitative research, there may simply be a descriptive or naturalistic design in which the researchers adhere to the general tenets of qualitative research but do not commit to a particular methodology. There are many different qualitative methods used to answer the research questions. Some of these methods will be discussed in the next chapter. What is important, as you read from this point forward, is that the study design must be congruent with the philosophical beliefs that qualitative researchers hold. You would not expect to see a qualitative researcher use methods common to quantitative studies, such as a random sample, a battery of questionnaires administered in a hospital outpatient clinic, or a multiple regression analysis. Rather, you would expect to see a design that includes participant interviews or observation, strategies for inductive analysis, and plans for using data to develop narrative summaries with rich description of the details from participants’ experiences. You may also read about a pilot study in the description of a study design; this is work the researchers did before undertaking the
main study to make sure that the logistics of the proposed study were reasonable. For example, pilot data may describe whether the investigators were able to recruit participants and whether the research design led them to the information they needed.
Sample

The study sample refers to the group of people that the researcher will interview or observe in the process of collecting data to answer the research questions. In most qualitative studies, the researchers are looking for a purposeful or purposively selected sample (see Chapter 12). This means that they are searching for a particular kind of person who can illuminate the phenomenon they want to study. Example: ➤ The researchers may want to interview women with multiple sclerosis or rheumatoid arthritis. There may be other parameters—called inclusion and exclusion criteria—that the researchers impose as well, such as requiring that participants be older than 18 years, not under the influence of illicit drugs, or experiencing a first pregnancy (as opposed to subsequent pregnancies). When researchers are clear about these criteria, they are able to identify and recruit participants with the experiences needed to shed light on the phenomenon in question. Often the researchers make decisions such as determining who might be a “long-term survivor” of a certain illness. In this case, they must clearly describe why and how they decided who would fit into this category. Is a long-term survivor someone who has had an illness for 5 years or 10 years? What is the median survival time for people with this diagnosis? Thus, as a reader of nursing research, you are looking for evidence of sound scientific reasoning behind the sampling plan.
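Although qualitative textbooks do not present sampling as code, the logic of screening candidates against explicit inclusion and exclusion criteria can be pictured as a simple filter. The following Python sketch is purely illustrative; the candidate records, field names, and criteria are hypothetical, echoing the examples above (older than 18 years, first pregnancy, no illicit drug use).

```python
# Hypothetical screening of candidates against inclusion and
# exclusion criteria; illustrative only.

candidates = [
    {"id": "C01", "age": 34, "first_pregnancy": True, "illicit_drug_use": False},
    {"id": "C02", "age": 17, "first_pregnancy": True, "illicit_drug_use": False},
    {"id": "C03", "age": 29, "first_pregnancy": False, "illicit_drug_use": False},
]

def eligible(candidate):
    # Inclusion: older than 18 years and experiencing a first pregnancy.
    # Exclusion: currently using illicit drugs.
    return (candidate["age"] > 18
            and candidate["first_pregnancy"]
            and not candidate["illicit_drug_use"])

sample_ids = [c["id"] for c in candidates if eligible(c)]
print(sample_ids)  # ['C01']
```

Writing the criteria down this explicitly, whether in a protocol or in a screening tool, is what allows a reader to judge the scientific reasoning behind the sampling plan.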
When the researchers have identified the type of person to include in the research sample, the next step is to develop a strategy for recruiting participants, which means locating and engaging them in the research. Recruitment materials are usually very specific. Example: ➤ If the researchers want to talk to HIV-infected women about adherence to their medication regimen, they may distribute flyers or advertise their interest in recruiting women who consistently take their medication as indicated, as well as those who do not. Or, they may want to talk to women who fit into only one of
those categories. Similarly, the researchers who are examining decision making in pregnancy among women with disabling conditions would develop recruitment strategies that identify subjects with the conditions or characteristics they want to study.
In a research report, the researcher may include a description of the study sample in the findings. (This can also be reported in the section describing the sample.) In any event, besides a demographic description of the study participants, a qualitative researcher should also report on key axes of difference in the sample. Example: ➤ In a sample of HIV-infected women, there should be information about the stage of illness, what kind/how many pills they must take, how many children they have, and so on. This information helps you place the findings into a context.
Setting: Recruitment and data collection

The study setting refers to the places where participants are recruited and the data are collected. Settings for recruitment are usually a point of contact for people of common social, medical, or other individual traits. In the example of HIV-infected women who are having difficulties adhering to their antiretroviral regimens, researchers might distribute flyers describing the study at AIDS service organizations, support groups for HIV-infected women, clinics, online support groups, and other places people with HIV may seek services. The settings for data collection are another critical area of difference between quantitative and qualitative studies. Data collection in a qualitative study is usually done in a naturalistic setting, such as someone’s home, not in a clinic interview room or researcher’s office. This is important in qualitative research because the researcher’s observations can inform the data collection. To be in someone else’s home is a great advantage, as it helps the researcher to understand what that participant values. An entire wall in a participant’s living room might contain many pictures of a loved one, so anyone who enters the home would immediately understand the centrality of that person in the participant’s life. In the home of someone who is ill, many household objects may be clustered around a favorite chair: perhaps an oxygen tank, a glass of water, medications, a telephone, tissues, and so on. A good qualitative researcher will use clues like
these in the study setting to complete the complex, rich drawing that is being rendered in the study.
HIGHLIGHT
Reading and critically appraising qualitative research studies may be the best way for interprofessional teams to understand the experience of living with a chronic illness so they can provide more effective whole person care.
Data collection

The procedures for data collection differ significantly in qualitative and quantitative studies. Where quantitative researchers focus on statistics and numbers, qualitative researchers are usually concerned with words: what people can tell them and the narratives about meaning or experience. Qualitative researchers interview participants; they may interview an individual or a group of people in what is called a focus group. They may observe individuals as they go about daily tasks, such as sorting medications into a pill minder or caring for a child. But in all cases, the data collected are expressed in words. Most qualitative researchers use voice recorders so that they can be sure that they have captured what the participant says. This reduces the need to write things down and frees researchers to listen fully. Interview recordings are usually transcribed verbatim and then listened to for accuracy. In a research report, investigators describe their procedures for collecting the data, such as obtaining informed consent, the steps from initial contact to the end of the study visit, and how long each interview or focus group lasted or how much time the researcher spent “in the field” collecting data.
A very important consideration in qualitative data collection is the researcher’s decision that they have a sufficient sample and that data collection is complete. Researchers generally continue to recruit participants until they have reached redundancy or data saturation, which means that nothing new is emerging from the interviews. There usually is not a predetermined number of participants to be selected as there is in quantitative studies; rather, the researcher keeps recruiting until she or he has all of the data needed. One important exception to this is if the researcher is very
interested in getting different types of people in the study. Example: ➤ In the study of HIV-infected women and medication adherence, the researchers may want some women who were very adherent in the beginning but then became less so over time, or they may want women who were not adherent in the beginning but then became adherent; alternately, they may want to interview women with children and women without children to determine the influence of having children on adherence. Whatever the specific questions may be, sample sizes tend to be fairly small (fewer than 30 participants) because of the enormous amounts of written text that will need to be analyzed by the researcher.
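The stopping rule implied by data saturation can be made concrete with a small, hypothetical Python sketch. Nothing here comes from a published method; it simply treats each interview as the set of codes an analyst assigned to it and flags the point at which an additional interview contributes no code that earlier interviews have not already produced.

```python
# Illustrative only: each interview is reduced to the set of
# analyst-assigned codes; saturation is flagged when a new
# interview adds no previously unseen code.

interviews = [
    {"stigma", "pill burden", "childcare demands"},   # interview 1
    {"stigma", "side effects"},                       # interview 2
    {"pill burden", "side effects"},                  # interview 3: nothing new
]

def check_saturation(interviews):
    seen = set()
    for i, codes in enumerate(interviews, start=1):
        new_codes = codes - seen
        seen |= codes
        if i > 1 and not new_codes:
            print(f"Interview {i} added no new codes; saturation may be reached.")

check_saturation(interviews)
```

In practice, of course, the judgment is interpretive rather than mechanical: researchers weigh the richness of what participants say, not just the count of code labels.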
Investigators use great care to design the interview questions because they must be crafted to help study participants describe their personal experiences and perceptions. Interview questions are different from research questions. Research questions are typically broad, encompassing, and written in scientific language. The interview questions may also be broad, like the overview or grand tour question that seeks the “big picture.” Example: ➤ Researchers might ask, “Tell me about taking your medications—the things that make it easier, and the things that make it harder,” or “Tell me what you were thinking about when you decided to get pregnant.” Along with overview questions, there are usually a series of prompts (additional questions) that were derived from the literature. These are areas that the researcher believes are important to cover (and that the participant will likely cover), but the prompts are there to remind the researcher in case the material is not mentioned. Example: ➤ With regard to medication adherence, the researcher may have read in other studies that motherhood can influence adherence in two very different ways: children can become a reason to live, which would facilitate taking antiretroviral medication; and children can be all-demanding, leaving the mother with little to no time to take care of herself. Thus, a neutrally worded question about the influence of children would be a prompt if the participants do not mention it spontaneously. In a research report, you should expect to find the primary interview questions identified verbatim; without them, it is impossible to know how the data were collected and how the researcher shaped what was discovered in the interviews.
EVIDENCE-BASED PRACTICE TIP
Qualitative researchers use more flexible procedures than quantitative researchers. While collecting data for a project, they consider all of the experiences that may occur.
Data analysis

Next is the description of data analysis. Here, researchers tell you how they handled the raw data, which, in a qualitative study, are usually transcripts of recorded interviews. The goal of qualitative analysis is to find commonalities and differences in the interviews, and then to group these into broader, more abstract, overarching categories of meaning, sometimes called themes, that capture much of the data. In the example we have been using about decision making regarding pregnancy for disabled women, one woman might talk about discussing the need for assistance with her friends if she became pregnant, and finding out that they were willing and able to help her with the baby. Another woman might talk about how she discussed the decision with her parents and siblings, and found them to be a ready source of aid. And yet a third woman may say that she talked about this with her church study group, and they told her that they could arrange to bring meals and help with housework during the pregnancy and afterward. On a more abstract level, these women are all talking about social support. So an effective analysis would be one that identifies this pattern of social support and, perhaps, goes further by also describing how social support influences some other concept in the data, such as women’s decision making about having a baby. In an ideal situation, written reports about the data will give you an example like the one you just read, but the page limitations of most journals limit the level of detail that researchers can present.
Many qualitative researchers use computer-assisted qualitative data analysis programs to find patterns in the interviews and field notes, which, in many studies, can seem overwhelming due to the sheer quantity of data to be dealt with. With a computer-assisted data analysis program, researchers from multiple sites can simultaneously code and analyze data from hundreds of files without using a single piece of paper. The software is a tool for
managing and remembering steps in analysis; however, it does not replace the thoughtful work of the researcher who must apply the program to guide the analysis of the data. In research reports, you should see a description of the way data were managed and analyzed, and whether the researchers used software or other paper-based approaches, such as using index cards with handwritten notes.
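Commercial packages are far richer than anything shown here, but the core bookkeeping they automate can be suggested with a short, hypothetical Python sketch. The transcript segments, code labels, and tally logic below are invented for illustration; they show the kind of pattern summary an analyst might inspect when grouping codes into broader themes.

```python
from collections import Counter

# Hypothetical coded transcript segments: (participant ID, assigned code).
coded_segments = [
    ("P01", "social support"), ("P01", "fear of disclosure"),
    ("P02", "social support"), ("P02", "pill burden"),
    ("P03", "social support"), ("P03", "fear of disclosure"),
]

# Tally how often each code appears and which participants mention it.
code_counts = Counter(code for _, code in coded_segments)
participants_per_code = {}
for pid, code in coded_segments:
    participants_per_code.setdefault(code, set()).add(pid)

for code, n in code_counts.most_common():
    print(f"{code}: {n} segments across {len(participants_per_code[code])} participants")
```

The software only manages and remembers; deciding that “social support” is a theme, and what it means, remains the researcher’s interpretive work.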
Findings

At last, we come to the results. Findings in qualitative reports, as we have suggested already, are words—the findings are patterns of any kind in the data, such as the ways that participants talked, the things that they talked about, and even the behaviors the researcher observed while spending time with them. When researchers describe patterns in the data, they may describe a process (such as the way decision making occurs); they may identify a list of things that are functioning in some way (such as a list of barriers and facilitators to taking medications for HIV-infected women); they may specify a set of conditions that must be present for something to occur (such as what parents state they need to care for a ventilator-dependent child at home); or they may describe what it is like to go through some health-related transition (such as what it is like to become the caregiver for a parent with dementia). This is by no means an all-inclusive list; rather, it is a range of examples to help you recognize what types of findings might be possible. It may help to think of the findings as discoveries. The qualitative researcher has explored a phenomenon, and the findings are a report on what he or she “found”—that is, what was discovered in the interviews and observations.
When researchers describe their results, they usually break the data down into units of meaning that help the data cohere and tell a story. Effective research reports will describe the logic that was used for breaking down the units of data. Example: ➤ Are the themes—a means of describing a large quantity of data in a condensed format—identified from the most prevalent to the least prevalent? Are the researchers describing a process in temporal (time ordered) terms? Are they starting with things that were most important to the subject, then moving to less important items? As a
report on the findings unfolds, the researcher should proceed with a thorough description of the phenomenon, defining each of the themes and fleshing out each of the themes with a thorough explanation of the role that it plays in the question under study. The researcher should also provide quotations that support their themes. Ideally, they will stage the quote, giving you some information about the subject from whom it came. For example, was the subject a newly diagnosed HIV-infected African American woman without children? Or was it a disabled woman who has chosen to become pregnant, but who has suffered two miscarriages? The staging of quotes is important because it allows you to put the information into some social context.
In a well-written report of qualitative research, some of the quotes will give you an “aha!” feeling. You will have a sense that the researcher has done an excellent job of getting to the core of the problem. Quotes are as critical to qualitative reports as numbers are to a quantitative study; you would not have a great deal of confidence in a quantitative or qualitative report in which the author asks you to believe the conclusion without also giving concrete, verifiable findings to back it up.
HELPFUL HINT
Values are involved in all research. It is important, however, that they not influence the results of the research.
Discussion of the results and implications for evidence-based practice

When the researchers are satisfied that their findings answer the research questions, they should summarize the results for you and should compare their findings to the existing literature. Researchers usually explain how these findings are similar to or different from the existing literature. This is one of the great contributions of qualitative research—using findings to open up new venues of discovery that were not anticipated when the study was designed. Example: ➤ The researchers can use findings to develop new concepts or new conceptual models to explain broader phenomena. The conceptual work also identifies implications for how findings
can be used in practice and can direct future research. Another alternative is for researchers to use their findings to extend or refine existing theoretical models. For example, a researcher may learn something new about stigma that has not been described in the literature, and in writing about these findings, the researcher may refer to an existing stigma theory, pointing out how his or her work extends that theory.
Nursing is a practice discipline, and the goal of nursing research is to use research findings to improve patient care. Qualitative methods are the best way to start to answer clinical and research questions that have not been addressed or when a new perspective is needed in practice. The qualitative answers to these questions provide important evidence that offers the first systematic insights into phenomena previously not well understood and often lead to new perspectives in nursing practice and improved patient care outcomes.
Kearney (2001) developed a typology of levels and applications of qualitative research evidence that helps us see how new evidence can be applied to practice (Table 5.1). She described five categories of qualitative findings that are distinguished from one another in their levels of complexity and discovery: those restricted by a priori frameworks, descriptive categories, shared pathway or meaning, depiction of experiential variation, and dense explanatory description. She argued that the greater the complexity and discovery within qualitative findings, the stronger the potential for clinical application.
TABLE 5.1 Kearney’s Categories of Qualitative Findings, from Least to Most Complex
Category: Restricted by a priori frameworks
Definition: Discovery aborted because the researcher has obscured the findings with an existing theory
Example: Use of the theory of “relatedness” to describe women’s relationships without substantiation in the data, or when there may be an alternative explanation to describe how women exist in relationship to others; the data seem to point to an explanation other than “relatedness”

Category: Descriptive categories
Definition: Phenomenon is vividly portrayed from a new perspective; provides a map into previously uncharted territory in the human experience of health and illness
Example: Children’s descriptions of pain, including descriptors, attributed causes, and what constitutes good care during a painful episode

Category: Shared pathway or meaning
Definition: Synthesis of a shared experience or process; integration of concepts that provides a complex picture of a phenomenon
Example: Description of women’s process of recovery from depression; each category was fully described, and the conditions for progression were laid out; able to see the origins of a phase in the previous phase

Category: Depiction of experiential variation
Definition: Describes the main essence of an experience, but also shows how the experience varies, depending on the individual or context
Example: Description of how pregnant women recovering from cocaine addiction might or might not move forward to create a new life, depending on the amount of structure they imposed on their behavior and their desire to give up drugs and change their lives

Category: Dense explanatory description
Definition: Rich, situated understanding of a multifaceted and varied human phenomenon in a unique situation; portrays the full range and depth of complex influences; densely woven structure to findings
Example: Unique cultural conditions and familial breakdown and hopelessness led young people to deliberately expose themselves to HIV infection in order to find meaning and purpose in life; describes loss of social structure and the demands on adolescents caring for their diseased or drugged parents who were unable to function as adults
Findings developed with only a priori frameworks provide little or no evidence for changing practice, because the researchers have prematurely limited what they are able to learn from participants or describe in their analysis. Findings that identify descriptive categories portray a higher level of discovery when a phenomenon is vividly portrayed from a new perspective. For nursing practice, these findings serve as maps of previously uncharted territory in human experience. Findings in Kearney’s third category, shared pathway or meaning, are more complex. In this type of finding, there is an integration of concepts or themes that results in a synthesis of a shared process or experience that leads to a logical, complex portrayal of the phenomenon. The researcher’s ideas at this level reveal how discrete bits of data come together in a meaningful whole. For nursing practice, this allows us to reflect on the bigger picture and what it means for the human experience (Kearney, 2001). Findings that depict experiential variation describe the essence of an experience and how this experience varies, depending on the individual or context. For nursing practice, this type of finding helps us see a variety of viewpoints, realizations of a human experience, and the contextual sources of that variety. In nursing practice, these findings explain how different variables can produce different consequences in different people or settings. Finally, findings that are presented as a dense explanatory description are at the highest level of complexity and discovery.
They provide a rich, situated understanding of a multifaceted and varied human phenomenon in a unique situation. These types of findings portray the full depth and range of complex influences that propel people to make decisions. Physical and social contexts are fully accounted for. There is a densely woven structure of findings in these studies that provide a rich fund of clinically and theoretically useful information for nursing practice. The layers of detail work together in the findings to increase understanding of human choices and responses in particular contexts (Kearney, 2001).
EVIDENCE-BASED PRACTICE TIP
Qualitative research findings can be used in many ways, including improving ways clinicians communicate with patients and with each other.
So how can we further use qualitative evidence in nursing? The evidence provided by qualitative studies is used conceptually by the nurse: qualitative studies let nurses gain access to the experiences of patients and help nurses expand their ability to understand their patients, which should lead to more helpful approaches to care (Table 5.2).
TABLE 5.2 Kearney’s Modes of Clinical Application for Qualitative Research
Mode of clinical application: Insight or empathy (better understanding our patients and offering more sensitive support)
Example: Nurse is better able to understand the behaviors of a woman recovering from depression

Mode of clinical application: Assessment of status or progress (descriptions of trajectories of illness)
Example: Nurse is able to describe the trajectory of recovery from depression and can assess how the patient is moving through this trajectory

Mode of clinical application: Anticipatory guidance (sharing of qualitative findings with the patient)
Example: Nurse is able to explain the phases of recovery from depression to the patient and to reassure her that she is not alone, that others have made it through a similar experience

Mode of clinical application: Coaching (advising patients of steps they can take to reduce distress or improve adjustment to an illness, according to the evidence in the study)
Example: Nurse describes the six stages of recovery from depression to the patient, and in ongoing contact, points out how the patient is moving through the stages, coaching her to recognize signs that she is improving and moving through the stages
Kearney (2001) proposed four modes of clinical application:
insight or empathy, assessment of status or progress, anticipatory guidance, and coaching. The simplest mode, according to Kearney, is to use the information to better understand the experiences of our patients, which in turn helps us to offer more sensitive support. Qualitative findings can also help us assess the patient’s status or progress through descriptions of trajectories of illness or by offering a different perspective on a health condition. They allow us to consider a range of possible responses from patients. We can then determine the fit of a category to a particular client, or try to locate them on an illness trajectory. Anticipatory guidance includes sharing of qualitative findings directly with patients. The patient can learn about others with a similar condition and can learn what to anticipate. This allows them to better garner resources for what might lie ahead or look for markers of improvement. Anticipatory guidance can also be tremendously comforting in that the sharing of research results can help patients realize they are not alone, that there are others who have been through a similar experience with an illness. Finally, coaching is a way of using qualitative findings; in this instance, nurses can advise patients of steps they can take to reduce distress, improve symptoms, or monitor trajectories of illness (Kearney, 2001).
Unfortunately, qualitative research studies do not fare well in the typical systematic reviews upon which evidence-based practice recommendations are based. Randomized clinical trials and other types of intervention studies traditionally have been the major focus of evidence-based practice. Typically, the selection of studies to be included in systematic reviews is guided by levels of evidence models that focus on the effectiveness of interventions according to the strength and consistency of their predictive power. Given that levels of evidence models are hierarchical in nature and perpetuate intervention studies as the “gold standard” of research design, the value of qualitative studies and the evidence offered by their results have remained unclear. Qualitative studies historically have been ranked lower in a hierarchy of evidence, as a “weaker” form of research design.
Remember, however, that qualitative research is not designed to test hypotheses or make predictions about causal effects. As we use qualitative methods, these findings become more and more
valuable as they help us discover unmet patient needs, entire groups of patients that have been neglected, and new processes for delivering care to a population. Though qualitative research uses different methodologies and has different goals, it is important to explore how and when to use the evidence provided by findings of qualitative studies in practice.
Appraisal for evidence-based practice: Foundation of qualitative research

A final example illustrates the differences in the methods discussed in this chapter and provides you with the beginning skills of how to critique qualitative research. The information in this chapter, coupled with information presented in Chapter 7, provides the underpinnings of critical appraisal of qualitative research (see the Critical Appraisal Criteria box, Chapter 7). Consider the question of nursing students learning how to conduct research. The empirical analytical approach (quantitative research) might be used in an experiment to see if one teaching method led to better learning outcomes than another. The students’ knowledge might be tested with a pretest, the teaching conducted, and then a posttest of knowledge obtained. Scores on these tests would be analyzed statistically to see if the different methods produced a difference in the results.
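By way of contrast with the qualitative approach described next, the statistical comparison sketched above can be illustrated in a few lines of Python. The scores and group sizes are fabricated for illustration, and a real analysis would use the actual test data and a design-appropriate model (for example, one accounting for pretest scores).

```python
from scipy import stats

# Hypothetical posttest knowledge scores under two teaching methods.
method_a = [78, 85, 82, 90, 74, 88, 81]
method_b = [70, 76, 73, 79, 68, 75, 72]

# Independent-samples t-test: did the two methods produce
# different mean posttest scores?
t_stat, p_value = stats.ttest_ind(method_a, method_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```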
In contrast, a qualitative researcher may be interested in the process of learning research. The researcher might attend the class to see what occurs and then interview students to ask them to describe how their learning changed over time. They might be asked to describe the experience of becoming researchers or becoming more knowledgeable about research. The goal would be to describe the stages or process of this learning. Alternately, a qualitative researcher might consider the class as a culture and could join the class to observe and interview students. Questions would be directed at the students’ values, behaviors, and beliefs in learning research. The goal would be to understand and describe the group members’ shared meanings. Either of these examples is a way of viewing a question from a qualitative perspective. The specific qualitative methodologies are described in Chapter 6.
Many other research methods exist. Although it is important to be aware of the qualitative research method used, it is most important that the method chosen is the one that will provide the best approach to answering the question being asked. One research method does not rank higher than another; rather, a variety of methods based on different paradigms are essential for the development of a well-informed and comprehensive approach to evidence-based nursing practice.
Key points

• All research is based on philosophical beliefs, a worldview, or a paradigm.
• Qualitative research encompasses different methodologies.
• Qualitative researchers believe that reality is socially constructed and is context dependent.
• Values should be acknowledged and examined as influences on the conduct of research.
• Qualitative research follows a process, but the components of the process vary.
• Qualitative research contributes to evidence-based practice.
Critical thinking challenges

• Discuss how a researcher’s values could influence the results of a study. Include an example in your answer.
• Can the expression, “We do not always get closer to the truth as we slice and homogenize and isolate [it]” be applied to both qualitative and quantitative methods? Justify your answer.
• What is the value of qualitative research in evidence-based practice? Give an example.
• Discuss how your interprofessional team could apply the findings of a qualitative study about coping with a diagnosis of multiple sclerosis.
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
References

1. Denzin N. K., Lincoln Y. S. The SAGE handbook of qualitative research. 4th ed. Thousand Oaks, CA: Sage; 2011.
2. Kearney M. H. Levels and applications of qualitative research evidence. Research in Nursing and Health 2001;24:145-153.
CHAPTER 6
Qualitative approaches to research
Mark Toles, Julie Barroso
Learning outcomes
After reading this chapter, you should be able to do the following:
• Identify the processes of phenomenological, grounded theory, ethnographic, and case study methods.
• Recognize appropriate use of community-based participatory research (CBPR) methods.
• Discuss significant issues that arise in conducting qualitative research in relation to such topics as ethics, criteria for judging scientific rigor, and combination of research methods.
• Apply critical appraisal criteria to evaluate a report of qualitative research.
KEY TERMS
auditability
bracketing
case study method
community-based participatory research
constant comparative method
credibility
culture
data saturation
domains
emic view
ethnographic method
etic view
fittingness
grounded theory method
instrumental case study
intrinsic case study
key informants
lived experience
meta-summary
meta-synthesis
mixed methods
phenomenological method
theoretical sampling
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
Qualitative research combines the science and art of nursing to enhance understanding of the human health experience. This chapter focuses on four commonly used qualitative research methods: phenomenology, grounded theory, ethnography, and case study. Community-based participatory research (CBPR) is also presented. Each of these methods, although distinct from the others, shares characteristics that identify it as a method within the qualitative research tradition.
Traditional hierarchies of research evaluation and how they categorize evidence from strongest to weakest, with emphasis on support for the effectiveness of interventions, are presented in Chapter 1. This perspective is limited because it does not take into account the ways that qualitative research can support practice, as discussed in Chapter 5. There is no doubt about the merit of qualitative studies; the problem is that no one has developed a satisfactory method for including them in current evidence hierarchies. In addition, qualitative studies can answer the critical why questions that emerge in many evidence-based practice summaries. Such summaries may report the answer to a research question, but they do not explain how it occurs in the landscape of caring for people.
As a research consumer, you should know that qualitative methods are the best way to start to answer clinical and research questions when little is known or a new perspective is needed for practice. The very fact that qualitative research studies have increased exponentially in nursing and other social sciences speaks to the urgent need of clinicians to answer these why questions and to deepen our understanding of experiences of illness. Thousands of reports of well-conducted qualitative studies exist on topics such as the following:
• Personal and cultural constructions of disease, prevention, treatment, and risk
• Living with disease and managing the physical, psychological, and social effects of multiple diseases and their treatment
• Decision-making experiences at the beginning and end of life, as well as assistive and life-extending, technological interventions
• Contextual factors favoring and militating against quality care, health promotion, prevention of disease, and reduction of health disparities (Sandelowski, 2004; Sandelowski & Barroso, 2007)
Findings from qualitative studies provide valuable insights about unique phenomena, patient populations, or clinical situations. In doing so, they provide nurses with the data needed to guide and change practice.
In this chapter, you are invited to look through the lens of human experience to learn about phenomenological, grounded theory, ethnographic, CBPR, and case study methods. You are encouraged to put yourself in the researcher’s shoes and imagine how it would be to study an issue of interest from the perspective of each of these methods. No matter which method a researcher uses, there is a focus on the human experience in natural settings.
The researcher using these methods believes that each unique human being attributes meaning to their experience and that experience evolves from one’s social and historical context. Thus one person’s experience of pain is distinct from another’s and can be elucidated by the individual’s subjective description of it. Example: ➤ Researchers interested in studying the lived experience of pain for the adolescent with rheumatoid arthritis will spend time in the adolescents’ natural settings, perhaps in their homes and schools (see Chapter 5). Research efforts will focus on uncovering the meaning of pain as it extends beyond the number of medications taken or a rating on a pain scale. Qualitative methods are grounded in the belief that objective data do not capture the whole of the human experience. Rather, the meaning of the adolescent’s pain emerges within the context of personal history, current relationships, and future plans, as the adolescent lives daily life in dynamic interaction with the environment.
Qualitative approach and nursing science

The evidence provided by qualitative studies that consider the unique perspectives, concerns, preferences, and expectations each
patient brings to a clinical encounter offers in-depth understanding of human experiences and the contexts in which they occur. Thus findings in qualitative research often guide nursing practice, contribute to instrument development (see Chapter 15), and develop nursing theory (Fig. 6.1).
FIG 6.1 Qualitative approach and nursing science.
Qualitative research methods

Thus far you have studied an overview of the qualitative research approach (see Chapter 5). Recognizing how the choice to use a qualitative approach reflects one’s worldview and the nature of some research questions, you have the necessary foundation for exploring selected qualitative methodologies. Now, as you review the Critical Thinking Decision Path and study the remainder of Chapter 6, note how different qualitative methods are appropriate for distinct areas of interest. Also note how unique research questions might be studied with each qualitative research method. In this chapter, we will explore five qualitative research methods in depth, including phenomenological, grounded theory, ethnographic, case study, and CBPR methods.
CRITICAL THINKING DECISION PATH
Selecting a Qualitative Research Method
Phenomenological method

The phenomenological method is a process of learning and constructing the meaning of human experience through intensive dialogue with persons who are living the experience. It rests on the assumption that there is a structure and essence to shared experiences that can be narrated (Marshall & Rossman, 2011). The researcher’s goal is to understand the meaning of the experience as it is lived by the participant. Phenomenological studies usually incorporate data about the lived space, or spatiality; the lived body, or corporeality; lived time, or temporality; and lived human relations, or relationality. Meaning is pursued through a process of dialogue, which extends beyond a simple interview and requires thoughtful presence on the part of the researcher. There are many schools of phenomenological research, and each school of thought uses slight differences in research methods. Example: ➤ Husserl belonged to the group of transcendental phenomenologists, who saw phenomenology as an interpretive, as opposed to an objective, mode of description. Using vivid and detailed attentiveness to
description, researchers in this school explore the ways knowledge comes into being. They seek to understand knowledge that is based on insights rather than objective characteristics (Richards & Morse, 2013). In contrast, Heidegger was an existential phenomenologist who believed that the observer cannot separate him/herself from the lived world. Researchers in this school of thought study how being in the world is a reality that is perceived; they study a reciprocal relationship between observers and the phenomenon of interest (Richards & Morse, 2013). In all forms of phenomenological research, you will find researchers asking a question about the lived experience and using methods that explore phenomena as they are embedded in people’s lives and environments.
Identifying the phenomenon

Because the focus of the phenomenological method is the lived experience, the researcher is likely to choose this method when studying a dimension of day-to-day existence for a particular group of people. An example of this is provided later in this chapter, in which Cook and colleagues (2015) studied the complex issues surrounding the residential status of assisted living residents in terms of fundamental human needs.
Structuring the study

When thinking about methods, we say the methodological approach “structures the study.” This phrase means that the method shapes the way we think about the phenomenon of interest and the way we would go about answering a research question. For the purpose of describing structuring, the following topics are addressed: the research question, the researcher’s perspective, and sample selection.
Research question. The question that guides phenomenological research always asks about some human experience. It guides the researcher to ask the participant about some past or present experience. In most cases, the research question is not exactly the same as the question used to initiate dialogue with study participants. Example: ➤ Cook and colleagues (2015) state that the objective of their study was to
explore the meaning and meaningfulness that older people attribute to their everyday experiences in an assisted living facility and how these experiences define their status as residents. They describe their methodology as hermeneutic phenomenology. Their goal was to provide knowledge that assisted living facility administrators and staff could use so that residents could feel “at home” in the facility.
Researcher’s perspective. When using the phenomenological method, the researcher’s perspective is bracketed. This means that the researcher identifies their own personal biases about the phenomenon of interest to clarify how personal experience and beliefs may color what is heard and reported. Further, the phenomenological researcher is expected to set aside their personal biases—to bracket them—when engaged with the participants. By becoming aware of personal biases, the researcher is more likely to be able to pursue issues of importance as introduced by the participant, rather than leading the participant to issues the researcher deems important (Richards & Morse, 2013).
HIGHLIGHT
Discuss with your interprofessional QI team why searching for qualitative studies might be most appropriate for understanding living with hepatitis C and managing the physical, psychological, and social effects of multiple treatments.
Using phenomenological methods, researchers strive to identify personal biases and hold them in abeyance while querying the participant. Readers of phenomenological articles may find it difficult to identify bracketing strategies because they are seldom explicitly identified in a research manuscript. Sometimes a researcher’s worldview or assumptions provide insight into biases that have been considered and bracketed.
Sample selection. As you read a phenomenological study, you will find that the participants were selected purposively (selecting subjects who are considered typical of the population) and that members of the
sample either are living the experience the researcher studies or have lived the experience in their past. Because phenomenologists believe that each individual’s history is a dimension of the present, a past experience exists in the present moment. For the phenomenologist, it is a matter of asking the right questions and listening. Even when a participant is describing a past experience, remembered information is being gathered in the present at the time of the interview.
HELPFUL HINT
Qualitative studies often use purposive sampling (see Chapter 12).
Data gathering

Written or oral data may be collected when using the phenomenological method. The researcher may pose the query in writing and ask for a written response, or may schedule a time to interview the participant and record the interaction. In either case, the researcher may return to ask for clarification of written or recorded transcripts. To some extent, the particular data collection procedure is guided by the choice of a specific analysis technique. Different analysis techniques require different numbers of interviews. A concept known as data saturation usually guides decisions regarding how many interviews are enough. Data saturation is the situation of obtaining the full range of themes from the participants, so that in interviewing additional participants, no new data emerge (Marshall & Rossman, 2011).
Data analysis

Several techniques are available for data analysis when using the phenomenological method. Although the techniques are slightly different from each other, there is a general pattern of moving from the participant’s description to the researcher’s synthesis of all participants’ descriptions. Colaizzi (1978) suggests a series of seven steps (a brief illustrative sketch follows the list):
1. Read the participants’ narratives to acquire a feeling for their ideas in order to understand them fully.
2. Extract significant statements to identify keywords and sentences
relating to the phenomenon being studied.
3. Formulate meanings for each of these significant statements.
4. Repeat this process across participants’ stories and cluster recurrent meaningful themes. Validate these themes by returning to the informants to check interpretation.
5. Integrate the resulting themes into a rich description of the phenomenon under study.
6. Reduce these themes to an essential structure that offers an explanation of the behavior.
7. Return to the participants to conduct further interviews or elicit their opinions on the analysis in order to cross-check interpretation.
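The movement in steps 2 through 4, from verbatim statements to formulated meanings to clustered themes, can be pictured with a small hypothetical data structure. Everything in this Python sketch (the statements, meanings, and theme label) is invented to show the shape of the work; the interpretive judgment at each layer belongs to the researcher, not to a program.

```python
# Illustrative only: the researcher supplies every meaning and theme
# label; this merely shows how the analytic layers nest.

# Steps 2-3: significant statements paired with formulated meanings.
formulated_meanings = {
    "I only take my pills when my daughter reminds me":
        "medication-taking depends on family involvement",
    "My church group cooks for me when I'm worn out":
        "community members provide practical help",
}

# Step 4: recurrent meanings clustered into a theme.
themes = {
    "social support": list(formulated_meanings.values()),
}

for theme, meanings in themes.items():
    print(f"Theme: {theme}")
    for meaning in meanings:
        print(f"  - {meaning}")
```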
Cook and colleagues (2015) do not cite a reference for data analysis; they describe using narrative analysis to interpret how participants viewed their experiences and environment across a series of up to eight interviews with each resident, conducted over 6 months.
It is important to note that giving verbatim transcripts to participants can have unanticipated consequences. It is not unusual for people to deny that they said something in a certain way, or that they said it at all. Even when the actual recording is played for them, they may have difficulty believing it. This is one of the more challenging aspects of any qualitative method: every time a story is told, it changes for the participant. The participant may sincerely feel that the story as it was recorded is not the story as it is now.
EVIDENCE-BASED PRACTICE TIP Phenomenological research is an important approach for accumulating evidence when studying a new topic about which little is known.
Describing the findings
When using the phenomenological method, the nurse researcher provides you with a path of information leading from the research question, through samples of participants’ words and the researcher’s interpretation, to the final synthesis that elaborates the lived experience as a narrative. When reading the report of a phenomenological study, you should find that detailed descriptive language is used to convey the complex meaning of the lived experience; this language offers the evidence for this qualitative method (Richards & Morse, 2013). Cook and colleagues (2015) described five themes that emerged from the narratives and that collectively demonstrate that residents wanted their residential status to involve “living with care” rather than “existing in care.”
1. Caring for oneself/being cared for
2. Being in control/losing control
3. Relating to others/putting up with others
4. Active choosers and users of space/occupying space
5. Engaging in meaningful activity/lacking meaningful activity
The themes in their phenomenological report describe the need for assisted living facility staff to be more focused on recognizing, acknowledging, and supporting residents’ aspirations regarding their future lives and their status as residents. By using direct participant quotes, researchers enable readers to evaluate the connections between what individual participants said and how the researcher labeled or interpreted what they said.
Grounded theory method
The grounded theory method is an inductive approach involving a systematic set of procedures to arrive at a theory about basic social processes (Silverman & Marvasti, 2008). The emergent theory is based on observations and perceptions of the social scene and evolves during data collection and analysis (Corbin & Strauss, 2015). Grounded theory is an approach for constructing theory where no theory exists, or where existing theory fails to explain a set of circumstances.
Developed originally as a sociologist’s tool to investigate interactions in social settings (Glaser & Strauss, 1967), the grounded theory method is now used in many disciplines; in fact, investigators from different disciplines use grounded theory to study the same phenomenon from their varying perspectives (Corbin & Strauss, 2015; Denzin & Lincoln, 2003; Marshall & Rossman, 2011; Strauss & Corbin, 1994, 1997). Example: ➤ In an area of study such as chronic illness, a nurse might be interested in coping patterns within families, a psychologist might be interested in personal adjustment, and a sociologist might focus on group behavior in health care settings. In grounded theory, the usefulness of the study stems from the transferability of theories; that is, a theory derived from one study is applicable to another. Thus the key objective of grounded theory is the development of theories spanning many disciplines that accurately reflect the cases from which they were derived (Sandelowski, 2004).
Identifying the phenomenon
Researchers typically use the grounded theory method when interested in social processes from the perspective of human interactions or patterns of action and interaction between and among various types of social units (Denzin & Lincoln, 2003). The basic social process is sometimes expressed in the form of a gerund (i.e., the -ing form of a verb functioning as a noun), which is designed to indicate change occurring over time as individuals negotiate social reality. Example: ➤ Hyatt and colleagues (2015) explored the family reintegration experiences of soldiers following a combat-related mild traumatic brain injury, as described by married dyads.
Structuring the study
Research question. Research questions for the grounded theory method are those that address basic social processes that shape human behavior. In a grounded theory study, the research question can be a statement or a broad question that permits in-depth explanation of the phenomenon. For example, Hyatt and colleagues (2015) examined the following research question: “How do soldiers and their spouses identify the special challenges, sources of support, and overall rehabilitation process of post–mild traumatic brain injury family reintegration?”
Researcher’s perspective. In a grounded theory study, the researcher brings some knowledge of the literature to the study, but an exhaustive literature review may not be done. This allows theory to emerge directly from data and to reflect the contextual values that are integral to the social processes being studied. In this way, the new theory that emerges from the research is “grounded in” the data (Richards & Morse, 2013).
Sample selection. Sample selection involves choosing participants who are experiencing the circumstance and selecting events and incidents related to the social process under investigation. Hyatt and colleagues (2015) obtained their purposive sample (see Chapter 12) through self-referral from flyers posted in military health care clinics, through health care provider referrals, and by directly approaching potential participants in the traumatic brain injury clinic of a military health care system. It is important to note that Hyatt was an active military member herself at the time of data collection.
Data gathering
In the grounded theory method, data are collected through interviews and skilled observations of individuals interacting in a social setting. Interviews are recorded and transcribed, and observations are recorded as field notes. Open-ended questions are used initially to identify concepts for further focus. At their first data collection point, Hyatt and colleagues (2015) interviewed couples (soldiers and their spouses) together; then they interviewed each of them separately to probe the themes that emerged in the joint interviews.
Data analysis
A unique and important feature of the grounded theory method is that data collection and analysis occur simultaneously. The process requires systematic data collection and documentation using field notes and transcribed interviews. Hunches about emerging patterns in the data are noted in memos that the researcher uses to direct activities in fieldwork. This technique, called theoretical sampling, is used to select experiences that will help the researcher test hunches and ideas and gather complete information about developing concepts. The researcher begins by noting indicators (actual events, actions, or words) in the data. As data are concurrently collected and analyzed, new concepts, or abstractions, are developed from the indicators (Charmaz, 2003; Strauss, 1987).
The initial analytical process is called open coding (Strauss, 1987). Data are examined carefully line by line, broken down into discrete parts, then compared for similarities and differences (Corbin & Strauss, 2015). Coded data are continuously compared with new data as they are acquired during research; this process is called the constant comparative method. When data collection is complete, codes in the data are clustered to form categories. The categories are expanded and developed, or they are collapsed into one another, and relationships between the categories are used to develop new “grounded” theories. As a result, data collection, analysis, and theory generation have a direct, reciprocal relationship that grounds the new theory in the perspectives of the research participants (Charmaz, 2003; Richards & Morse, 2013; Strauss & Corbin, 1990).
HELPFUL HINT In a report of research using the grounded theory method, you can expect to find a diagrammed model of a theory that synthesizes the researcher’s findings in a systematic way.
Describing the findings
Grounded theory studies are reported in detail, permitting readers to follow the exact steps in the research process. Descriptive language and diagrams of the research process are used as evidence to document the researchers’ procedures for moving from the raw data to the new theory. Hyatt and colleagues (2015) found the basic social process of family reintegration after mild traumatic brain injury to be “finding a new normal.” The couples described this new normal as the phenomenon of finding or adjusting to changes in their new, post-mild traumatic brain injury family roles or routines. The following were the core categories:
1. Facing the unexpected—“Homecoming”—and adjusting to having the soldier back home; noticing changes in the soldier
2. Managing unexpected change—Assuming a caregiver role, managing the post-mild traumatic brain injury changes within the context of the married relationship
3. Experiencing mismatched expectations—Coping with the shifting state of the relationship, losing a career, or a shifting future
4. Adjusting to new expectations—Accepting changes, building a new family life
5. Learning to live with new expectations—Accepting the new normal
EVIDENCE-BASED PRACTICE TIP When thinking about the evidence generated by the grounded theory method, consider whether the theory is useful in explaining, interpreting, or predicting the study phenomenon of interest.
Ethnographic method
Derived from the Greek term ethnos, meaning people, race, or cultural group, the ethnographic method focuses on scientific description and interpretation of cultural or social groups and systems (Creswell, 2013). The goal of the ethnographer is to understand the research participants’ views of their world, or the emic view. The emic (insiders’) view is contrasted with the etic (outsiders’) view, which is obtained when the researcher uses quantitative analyses of behavior. The ethnographic approach requires that the researcher enter the world of the study participants to watch what happens, listen to what is said, ask questions, and collect whatever data are available. It is important to note that the term ethnography is used to mean both the research technique and the product of that technique—that is, the study itself (Creswell, 2013; Richards & Morse, 2013; Tedlock, 2003). Vidich and Lyman (1998) trace the history of ethnography, with roots in the disciplines of sociology and anthropology, as a method born of the need to understand “other” and “self.” Nurses use the method to study cultural variations in health and patient groups as subcultures within larger social contexts.
Identifying the phenomenon
The phenomenon under investigation in an ethnographic study varies in scope from a long-term study of a very complex culture, such as Mead’s (1949) study of Samoan society, to a short-term study of a phenomenon within subunits of cultures. Kleinman (1992) notes the clinical utility of ethnography in describing the “local world” of groups of patients who are experiencing a particular phenomenon, such as suffering. The local worlds of patients have cultural, political, economic, institutional, and social-relational dimensions in much the same way as larger complex societies. An example of ethnography is found in Grassley and colleagues’ (2015) study of nurses’ support of breastfeeding on the night shift. Grassley and colleagues used institutional ethnography, whose goal is to explore how social experiences and processes, in particular those of everyday work, are organized. Institutional ethnography also considers the institutional processes and interactions that mediate the context of nurses’ everyday work (Grassley et al., 2015).
Structuring the study
Research question. In ethnographic studies, questions are asked about “lifeways” or particular patterns of behavior within the social context of a culture or subculture. In this type of research, culture is viewed as the system of knowledge and linguistic expressions used by social groups that allows the researcher to interpret or make sense of the world (Aamodt, 1991; Richards & Morse, 2013). Thus ethnographic nursing studies address questions that concern how cultural knowledge, norms, values, and other contextual variables influence people’s health experiences. Example: ➤ Grassley and colleagues’ (2015) research question is implied in their purpose statement: “To describe nurses’ support of breastfeeding on the night shift and to identify the interpersonal interactions and institutional structures that affect their ability to offer breastfeeding support and to promote exclusive breastfeeding on the night shift.” Remember that ethnographers have a broader definition of culture, where a particular social context is conceptualized as a culture. In this case, nurses who provide care on a mother/baby unit to mother/infant dyads in the immediate postpartum period are seen as a cultural entity that is appropriate for ethnographic study.
Researcher’s perspective. When using the ethnographic method, the researcher’s perspective is that of an interpreter entering an alien world and attempting to make sense of that world from the insider’s point of view (Richards & Morse, 2013). Like phenomenologists and grounded theorists, ethnographers make their own beliefs explicit and bracket, or set aside, their personal biases as they seek to understand the worldview of others.
Sample selection. The ethnographer selects a cultural group that is living the phenomenon under investigation. The researcher gathers information from general informants and from key informants. Key informants are individuals who have special knowledge, status, or communication skills, and who are willing to teach the ethnographer about the phenomenon (Richards & Morse, 2013). Example: ➤ Grassley and colleagues’ (2015) research took place in a tertiary care hospital with 4200 births per year (20% of the state’s total births) and an exclusive breastfeeding rate of 75% on discharge. They described the setting and its employees in detail.
HELPFUL HINT Managing personal bias is an expectation of researchers using all of the methods discussed in this chapter.
Data gathering
Ethnographic data gathering involves immersion in the study setting and the use of participant observation, interviews of informants, and interpretation by the researcher of cultural patterns (Richards & Morse, 2013). Ethnographic research involves face-to-face interviewing, with data collection and analysis taking place in the natural setting. Thus fieldwork is a major focus of the method. Other techniques may include obtaining life histories and collecting material items reflective of the culture. Example: ➤ Photographs and films of the informants in their world can be used as data sources. In their study, Grassley and colleagues (2015) collected data using focus groups, individual and group interviews, and mother/baby unit observations.
Data analysis
As in the grounded theory method, ethnographic data are collected and analyzed simultaneously. Data analysis proceeds through several levels as the researcher looks for the meaning of cultural symbols in the informant’s language. Analysis begins with a search for domains, or symbolic categories that include smaller categories. Language is analyzed for semantic relationships, and structural questions are formulated to expand and verify data. Analysis proceeds through increasing levels of complexity until the data, grounded in the informant’s reality, are synthesized by the researcher (Richards & Morse, 2013). Grassley and colleagues (2015) described their analysis as beginning with content analysis of the interview transcripts, followed by team meetings to discuss findings and agree on categories. The observation notes were used to substantiate the themes.
Describing the findings
Ethnographic studies yield large quantities of data that reflect a wide array of evidence amassed as field notes of observations, interview transcriptions, and sometimes other artifacts such as photographs. The first level of description is the description of the scene, the parameters or boundaries of the research group, and the overt characteristics of group members (Richards & Morse, 2013). Strategies that enhance first-level description include maps and floor plans of the setting, organizational charts, and documents. Researchers may report item-level analysis, followed by pattern- and structure-level analysis. Ethnographic research articles usually provide examples from the data, thorough descriptions of the analytical process, and statements of the hypothetical propositions and their relationship to the ethnographer’s frame of reference, which can be rather detailed and lengthy. Grassley and colleagues (2015) identified three main themes that described nurses’ support of breastfeeding on the night shift: competing priorities, incongruent expectations, and influential institutional structure; these described the interpersonal interactions and institutional structures that affected the nurses. Competing priorities included maternal rest, the newborn night feeding pattern, the presence of visitors, support of the breastfeeding dyad, and other patients’ care needs. Incongruent expectations included the breastfeeding expectations of parents, the newborn’s breastfeeding behaviors, parental night feeding expectations, the newborn’s nocturnal sleep pattern, the nurses’ expectations about support, and challenging breastfeeding dyads. Finally, influential institutional structures included hospital practices, staffing (including the nurse/patient ratio, RN experience, and lactation of RNs), and feeding policies.
EVIDENCE-BASED PRACTICE TIP Evidence generated by ethnographic studies will answer questions about how cultural knowledge, norms, values, and other contextual variables influence the health experience of a particular patient population in a specific setting.
Case study
Case study research, which is rooted in sociology, has a complex history and many definitions (Aita & McIlvain, 1999). As noted by Stake (2000), a case study design is not a methodological choice; rather, it is a choice of what to study. Thus the case study method is about studying the peculiarities and the commonalities of a specific case, irrespective of the actual strategies for data collection and analysis that are used to explore research questions. Case studies include quantitative and/or qualitative data but are defined by their focus on uncovering an individual case and, in some instances, identifying patterns in variables that are consistent across a set of cases. Stake (2000) distinguishes intrinsic from instrumental case studies. An intrinsic case study is undertaken to gain a better understanding of the case itself—for example, one child with chickenpox, as opposed to a group of or all children with chickenpox. The researcher at least temporarily subordinates other curiosities so that the stories of those “living the case” will be teased out (Stake, 2000). An instrumental case study is used when researchers are pursuing insight into an issue or want to challenge some generalization—for example, the qualities of sleep and restfulness in a set of four children with chickenpox. Very often in case studies there is an emphasis on holism, which means that researchers are searching for a global understanding of a case within a spatially or temporally defined context.
Identifying the phenomenon
Although some definitions of case study demand that the focus of research be contemporary, Stake’s (1995, 2000) defining criterion of attention to the single case broadens the scope of phenomena for study. By a single case, Stake designates a focus on an individual, a family, a community, an organization—some complex phenomenon that demands close scrutiny for understanding. Walker and colleagues (2015) used a case study design to examine how older, early-stage breast and prostate cancer patients managed the transition from active treatment of cancer to recovery when treatment was completed. To explore the strategies that cancer patients used, Walker and colleagues used purposive sampling to select 11 patient and caregiver dyads from a larger group of dyads enrolled in a randomized clinical trial of a new cancer treatment.
Structuring the study
Research question. Stake (2000) suggests that research questions be developed around issues that serve as a foundation to uncover complexity and pursue understanding. Although researchers pose questions to begin discussion, the initial questions are never all-inclusive; rather, the researcher uses an iterative process of “growing questions” in the field. That is, as data are collected to address these questions, it is expected that other questions will emerge and serve as guides to the researcher to untangle the complex, context-laden story within the case. Example: ➤ In Walker and colleagues’ (2015) study, data were collected from patients’ daily written journals, patient interview transcripts, and researcher notes from telephone calls with patients and caregivers. By using multiple ways of identifying how patients recovered after treatment, the researchers were able to describe a central theme about cancer recovery—with the return of a sense of “normalcy,” patients experienced less anxiety and greater quality of life. Using rich description in the case study data, the researchers were also able to describe resources, such as conversations with family members and health care workers, that promote a sense of normalcy and well-being after treatment.
Researcher’s perspective. When the researcher begins with questions developed around suspected issues of importance, the study is said to have an “etic” focus, meaning the research is focused on the perspective of the researcher. As case study researchers engage the phenomenon of interest in individual cases, the uniqueness of individual stories unfolds, and the focus shifts from an etic (researcher) orientation to an “emic” (participant) orientation (Stake, 2000). Ideally, the case study researcher will develop an insider view that permits narration of the way things happen in the case. Example: ➤ In the study by Walker and colleagues (2015), the etic focus on the abstract concept of “recovery” shifted to an emic focus on the precise details of the way patients returned to a sense of normalcy after treatment.
Sample selection. This is one of the areas where scholars in the field present differing views, ranging from choosing only the most common cases to choosing only the most unusual cases (Aita & McIlvain, 1999). Stake (2000) advocates selecting cases that offer the best opportunities for learning. In some instances, the convenience of studying the case may even be a factor. For instance, if several patients who have undergone heart transplantation are willing to participate in the study, practical factors may influence which patient offers the best opportunity for learning. Persons who live in the area and can be easily visited at home or in the medical center might be better choices than those living much farther away (where multiple contacts over time might be impossible). Similarly, the researcher may choose to study a case in which a potential participant has an actively involved family, because understanding the family context of transplant patients may shed important new light on their healing. It can safely be said that no choice is perfect when selecting a case; however, selecting cases for their contextual features fosters the strength of data that can be learned at the level of the individual case. Example: ➤ In Walker and colleagues’ (2015) study, the selection of 11 patient and caregiver dyads permitted the detailed data collection necessary to describe the actual process of returning to normalcy and how factors in the environment contributed to this process.
Data gathering
Case study data are gathered using interviews, field observations, document reviews, and any other methods that accumulate evidence for describing or explaining the complexity of the case. Stake (1995) advocates development of a data gathering plan to guide the progress of the study from definition of the case through decisions regarding data collection involving multiple methods, at multiple time points, and sometimes with multiple participants within the case. In Walker and colleagues’ (2015) study, multiple data collection methods were used, including daily written diaries, interview transcripts, and notes from phone calls. Drawing on these multiple sources, the researchers used data from different times and points of view to describe the step-by-step process of returning to normal after cancer treatment.
Data analysis/describing findings
Data analysis is often concurrent with data gathering and description of findings as the narrative of the case develops. Qualitative case study is characterized by researchers spending extended time on site, personally in contact with activities and operations of the case, and reflecting on and revising the meanings of what transpires (Stake, 2000). Reflecting on and revising meanings are the work of the case study researcher, who records data, searches for patterns, links data from multiple sources, and develops preliminary thoughts regarding the meaning of collected data. This reflective and iterative process for writing the case narrative produces a unique form of evidence. Many times, case study research reports do not list all of the research activities. However, reported findings are usually embedded in the following: (1) a chronological development of the case; (2) the researcher’s story of coming to know the case; (3) the one-by-one description of case dimensions; and (4) vignettes that highlight case qualities (Stake, 1995). Example: ➤ As Walker and colleagues (2015) analyzed “cases” of patient recovery after treatment, the diversity of cases in the study permitted the researchers to identify behaviors, such as conversations with trusted health care workers, that patients used to reassess their wellness and realize they were healing after treatment. Analysis consisted of the search for patterns in raw data, variation in the patterns within and between cases, and identification of themes that described common patterns within and between the cases. The researchers ultimately used patterns in the case data to develop a theory about the process of working toward normalcy after cancer treatment; this was significant because the new theory is focused on patient experiences and will be a guide for assisting cancer patients in the future.
EVIDENCE-BASED PRACTICE TIP Case studies are a way of providing in-depth evidence-based discussion of clinical topics that can be used to guide practice.
Community-based participatory research
Community-based participatory research (CBPR) is a research method that systematically accesses the voice of a community to plan context-appropriate action. CBPR provides an alternative to traditional research approaches that assume a phenomenon may be separated from its context for purposes of study. Investigators who use CBPR recognize that engaging members of a study population as active and equal participants, in all phases of the research, is crucial for the research process to be a means of facilitating change (Holkup et al., 2004). Change or action is the intended end product of CBPR, and “action research” is a related term. Many scholars consider CBPR to be a type of action research and group it within the tradition of critical science (Fontana, 2004).
In his book Action Research, Stringer (1999) distilled the research process into three phases: look, think, and act. In the look phase, the researcher “builds the picture” by getting to know stakeholders, so that the problem is defined in their terms and the problem definition reflects the community context. The think phase involves interpretation and analysis of what was learned in the look phase; here the researcher is charged with connecting the stakeholders’ ideas so that they provide evidence that is understandable to the larger community group. Finally, in the act phase, Stringer advocates planning, implementation, and evaluation based on the information collected and interpreted in the other phases of the research.
Bisung and colleagues (2015) used photovoice as a CBPR tool to understand water, sanitation, and hygiene behaviors and to catalyze community-led solutions to change behaviors among women in Western Kenya. Changing these behaviors is essential for reducing waterborne and water-related diseases. Photovoice is a CBPR tool that can be used to foster trust and capacity building for community-led solutions to environment and health issues. Through photography, participants, who take the pictures themselves, are able to identify, represent, discuss, and find solutions to their everyday environment and health problems. In the first part of the study, one-on-one photovoice interviews were used to explore local perceptions and practices around water-health linkages and how the ecological and sociopolitical environment shapes these perceptions and practices. The second component consisted of photovoice group discussions exploring participants’ experiences with and reactions to the photographs and the photovoice project. From the group discussions, three major themes emerged: awareness, immediate reactions, and planned actions. Awareness involved the photos serving as prompts to reflect on certain behaviors and practices in the community and the influence of these practices on health. Immediate reactions involved spontaneous decisions to educate people, to stop children from certain negative practices, and to discuss how to find solutions to common negative behaviors and practices. Planned actions involved working with village leaders and the whole community.
Mixed methods research
Mixed methods research is the use of both qualitative and quantitative methods in one study. Mixed methods research has evolved over the past decade, and there are several types of mixed methods designs (Creswell & Plano Clark, 2011). Researchers choose a mixed methods design on the basis of the research question. (See Chapter 10 for further information.)
Data from different sources can be used to corroborate, elaborate, or illuminate the phenomenon in question. Example: ➤ Bhandari and Kim (2016) conducted a mixed methods study that aimed to develop an exploratory model for self-care in adults with type 2 diabetes and to enhance the model’s interpretation through qualitative input. For the qualitative component, the researchers conducted semistructured interviews with a subset (n = 13) of the total sample (N = 230). For the quantitative component, the subjects responded to several questionnaires related to self-care behaviors. As you read research, you will quickly discover that approaches and methods, such as mixed methods, are being combined to contribute to theory building, guide practice, and facilitate instrument development.
Although certain questions may be answered effectively by combining qualitative and quantitative methods in a single study, this does not necessarily make the findings and related evidence stronger. In fact, if a researcher inappropriately combines methods in a single study, the findings could be weaker and less credible.
Synthesizing qualitative evidence: Meta-synthesis
The depth and breadth of qualitative research has grown over the years, and it has become important to qualitative researchers to synthesize critical masses of qualitative findings. The terms most commonly used to describe this activity are qualitative meta-summary and qualitative meta-synthesis.
Qualitative meta-summary is a quantitatively oriented aggregation of qualitative findings that are topical or thematic summaries or surveys of data. Meta-summaries are integrations that are approximately equal to the sum of their parts, that is, the sum of findings across reports in a target domain of research. They address the manifest content in findings and reflect a quantitative logic: to discern the frequency of each finding, and to find in higher frequencies the evidence of replication that is foundational to validity in most quantitative research. Qualitative meta-summary involves the extraction and further abstraction of findings and the calculation of manifest frequency effect sizes (Sandelowski & Barroso, 2003a).
Qualitative meta-synthesis is an interpretive integration of qualitative findings that are themselves interpretive syntheses of data, including the phenomenologies, ethnographies, grounded theories, and other integrated and coherent descriptions or explanations of phenomena, events, or cases that are the hallmarks of qualitative research. Meta-syntheses are integrations that are more than the sum of their parts in that they offer novel interpretations of findings. These interpretations will not be found in any one research report; rather, they are inferences derived from taking all of the reports in a sample as a whole. Meta-syntheses offer a description or explanation of a target event or experience, instead of a summary view of unlinked features of that event or experience. Such interpretive integrations require researchers to piece together the individual syntheses constituting the findings in individual research reports to craft one or more meta-syntheses. Their validity resides not in a replication logic but in an inclusive logic whereby all findings are accommodated and the cumulative analysis is displayed in the final product. Meta-synthesis methods include constant comparison, taxonomic analysis, the reciprocal translation of in vivo concepts, and the use of imported concepts to frame data (Sandelowski & Barroso, 2003b). Meta-synthesis thus integrates qualitative research findings on a topic through comparative analysis and interpretive synthesis that seeks to retain the essence and unique contribution of each study (Sandelowski & Barroso, 2007).
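To make the quantitative logic of meta-summary concrete, consider a simplified, hypothetical calculation of a manifest frequency effect size (the numbers and the finding are invented for illustration; see Sandelowski & Barroso, 2003a, for the full method):

frequency effect size = number of reports containing the finding ÷ total number of reports in the sample

If the finding “fear of receiving a shock” appeared in 6 of 15 qualitative reports on living with an implantable defibrillator, its frequency effect size would be 6/15, or 40%; a finding appearing in 12 of the 15 reports (80%) would carry stronger evidence of replication across studies.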
Fleming and colleagues (2015) published a meta-synthesis of qualitative studies related to antibiotic prescribing in long-term care facilities. The synthesis of qualitative research was used to explore the determinants of antibiotic prescribing in long-term care settings. This meta-synthesis provided a way to describe findings across a set of qualitative studies and create knowledge that is relevant to clinical practice. Sandelowski (2004) cautions that although qualitative meta-synthesis is laudable and necessary, it requires careful application of meta-synthesis methods. A number of meta-synthesis studies are being conducted by nurse scientists, and it will be interesting for research consumers to follow the progress of researchers who seek to develop criteria for appraising a set of qualitative studies and use those criteria to guide the incorporation of these studies into systematic literature reviews.
EVIDENCE-BASED PRACTICE TIP Although qualitative in its approach to research, community-based participatory research leads to an action component in which a nursing intervention is implemented and evaluated for its effectiveness in a specific patient population.
Issues in qualitative research
Ethics
Protection of human subjects is a critical aspect of all scientific investigation, and this demand exists for both quantitative and qualitative research approaches. Protection of human subjects in quantitative approaches is discussed in Chapter 13, and these basic tenets hold true for the qualitative approach. However, several characteristics of the qualitative methodologies outlined in Table 6.1 generate unique concerns and require an expanded view of protecting human subjects.
TABLE 6.1 Characteristics of Qualitative Research Generating Ethical Concerns

Characteristic: Naturalistic setting
Ethical concern: Some researchers using participant observation methods may believe that consent is not always possible or necessary.

Characteristic: Emergent nature of design
Ethical concern: Planning for questioning and observation emerges over the time of the study. Thus it is difficult to inform the participant precisely of all potential threats before he or she agrees to participate.

Characteristic: Researcher-participant interaction
Ethical concern: Relationships developed between the researcher and participant may blur the focus of the interaction.

Characteristic: Researcher as instrument
Ethical concern: The researcher is the study instrument, collecting data and interpreting the participant’s reality.
Naturalistic setting
The central concern that arises when research is conducted in naturalistic settings is the need to gain informed consent. Obtaining informed consent is a basic researcher responsibility but is not always easy in naturalistic settings. For instance, when research methods include observing groups of people interacting over time, the complexity of gaining consent becomes apparent: Have all parties consented, and for all periods of time? What exactly have the parties consented to? These complexities generate controversy and debate among qualitative researchers, and the balance between respect for human participants and efforts to collect meaningful data must be continuously negotiated. The reader should look for evidence that the researcher has addressed this issue of balance by reporting attention to the protection of human participants.
Emergent nature of design
The emergent nature of the research design in qualitative research underscores the need for ongoing negotiation of consent with participants. In the course of a study, situations change, and what was agreeable at the beginning may become intrusive. Sometimes, as data collection proceeds and new information emerges, the study shifts direction in a way that is not acceptable to participants. For instance, if the researcher were present in a family’s home during a time when marital discord arose, the family might choose to renegotiate the consent. From another perspective, Morse (1998) discussed the increasing involvement of participants in the research process, sometimes resulting in their request to have their names published in the findings or to be included as a coauthor. If the participant originally signed a consent form and then chose an active, identified role, Morse (1998) suggests that the participant then sign a “release for publication” form to address this request. The emergent qualitative research process demands ongoing negotiation of researcher-participant relationships, including the consent relationship. The opportunity to renegotiate consent establishes a relationship of trust and respect characteristic of the ethical conduct of research.
Researcher-participant interaction
The nature of the researcher-participant interaction over time introduces the possibility that the research experience will become a therapeutic one; it is a case of research becoming practice. It is important to recognize that there are basic differences between the intent of nurses when engaging in practice and when conducting research (Smith & Liehr, 2003). In practice, the nurse has caring-healing intentions. In research, the nurse intends to “get the picture” from the perspective of the participant. The process of “getting the picture” may be a therapeutic experience for the participant. When a research participant talks to a caring listener about things that matter, the conversation may promote healing, even though healing was not intended. From an ethical perspective, the qualitative researcher promises only to listen and to encourage the other’s story. If this experience is therapeutic for the participant, it becomes an unplanned benefit of the research. If it becomes harmful, the ethics of continuing the research becomes an issue, and the study design will require revision.
Researcher as instrument
The responsibility to establish rigor in data collection and analysis requires that the researcher acknowledge any personal bias and strive to interpret data in a way that accurately reflects the participant’s point of view. This serious ethical obligation may require that researchers return to the participants at critical interpretive points and ask for clarification or validation.
Credibility, auditability, and fittingness
Quantitative studies are concerned with the reliability and validity of instruments, as well as internal and external validity criteria, as measures of scientific rigor (see the Critical Thinking Decision Path), but these criteria are not appropriate for qualitative work. The rigor of qualitative methodology is judged by unique criteria appropriate to the research approach. Credibility, auditability, and fittingness are scientific criteria proposed for qualitative research studies by Guba and Lincoln (1981). Although these criteria were proposed decades ago, they still capture the rigorous spirit of qualitative inquiry and persist as reasonable criteria for appraising scientific rigor in the research. The meanings of credibility, auditability, and fittingness are briefly explained in Table 6.2.
TABLE 6.2 Criteria for Judging Scientific Rigor: Credibility, Auditability, Fittingness

Criterion: Credibility
Characteristics: Truth of findings as judged by participants and others within the discipline. For instance, you may find the researcher returning to the participants to share interpretation of findings and query accuracy from the perspective of the persons living the experience.

Criterion: Auditability
Characteristics: Accountability as judged by the adequacy of information leading the reader from the research question and raw data through various steps of analysis to the interpretation of findings. For instance, you should be able to follow the reasoning of the researcher step by step through explicit examples of data, interpretations, and syntheses.

Criterion: Fittingness
Characteristics: Faithfulness to participants’ everyday reality, described in enough detail so that others can evaluate importance for practice, research, and theory development. For instance, you will know enough about the human experience being reported that you can decide whether it “rings true” and is useful for guiding your practice.
EVIDENCE-BASED PRACTICE TIPS
• Mixed methods research offers an opportunity for researchers to increase the strength and consistency of evidence provided by the use of both qualitative and quantitative research methods.
• The combination of stories with numbers (qualitative and quantitative research approaches) through use of mixed methods may provide the most complete picture of the phenomenon being studied and, therefore, the best evidence for guiding practice.
Appraisal for evidence-based practice: Qualitative research
General criteria for critiquing qualitative research are proposed in the following Critical Appraisal Criteria box. Each qualitative method has unique characteristics that influence what the research consumer may expect in the published research report, and journals often have page restrictions that can penalize qualitative research reports. The criteria for critiquing are formatted to evaluate the selection of the phenomenon, the structure of the study, data collection, data analysis, and description of the findings. Each question of the criteria focuses on factors discussed throughout the chapter. Appraising qualitative research is a useful activity for learning the nuances of this research approach. You are encouraged to identify a qualitative study of interest and apply the criteria for critiquing. Keep in mind that qualitative methods are the best way to begin answering clinical and/or research questions that previously have not been addressed in research studies or that do not lend themselves to a quantitative approach. The answers provided by qualitative data reflect important evidence that may offer the first insights about a patient population or clinical phenomenon.
CRITICAL APPRAISAL CRITERIA
Qualitative Approaches
Identifying the phenomenon
1. Is the phenomenon focused on human experience within a natural setting?
2. Is the phenomenon relevant to nursing and/or health?
Structuring the study
Research question
3. Does the question specify a distinct process to be studied?
4. Does the question identify the context (participant group/place) of the process that will be studied?
5. Does the choice of a specific qualitative method fit with the research question?
Researcher’s perspective
6. Are the biases of the researcher reported?
7. Do the researchers provide a structure of ideas that reflect their beliefs?
Sample selection
8. Is it clear that the selected sample is living the phenomenon of interest?
Data collection
9. Are data sources and methods for gathering data specified?
10. Is there evidence that participant consent is an integral part of the data-gathering process?
Data analysis
11. Can the dimensions of data analysis be identified and logically followed?
12. Does the researcher paint a clear picture of the participant’s reality?
13. Is there evidence that the researcher’s interpretation captured the participant’s meaning?
14. Have other professionals confirmed the researcher’s interpretation?
Describing the findings
15. Are examples provided to guide the reader from the raw data to the researcher’s synthesis?
16. Does the researcher link the findings to existing theory or literature, or is a new theory generated?
In summary, the term qualitative research is an overarching description of multiple methods with distinct origins and procedures. In spite of their distinctions, each method guides data collection from the perspective of the participants and creates a story that synthesizes disparate pieces of data into a comprehensible whole, providing evidence and promising direction for building nursing knowledge.
Key points
• Qualitative research is the investigation of human experiences in naturalistic settings, pursuing meanings that inform theory, practice, instrument development, and further research.
• Qualitative research studies are guided by research questions.
• Data saturation occurs when the information being shared with the researcher becomes repetitive.
• Qualitative research methods include five basic elements: identifying the phenomenon, structuring the study, gathering the data, analyzing the data, and describing the findings.
• The phenomenological method is a process of learning and constructing the meaning of human experience through intensive dialogue with persons who are living the experience.
• The grounded theory method is an inductive approach that implements a systematic set of procedures to arrive at theory about basic social processes.
• The ethnographic method focuses on scientific descriptions of cultural groups.
• The case study method focuses on a selected phenomenon over a short or long time period to provide an in-depth description of its essential dimensions and processes.
• CBPR is a method that systematically accesses the voice of a community to plan context-appropriate action.
• Ethical issues in qualitative research relate to the naturalistic setting, the emergent nature of the design, researcher-participant interaction, and the researcher as instrument.
• Credibility, auditability, and fittingness are criteria for judging the scientific rigor of a qualitative research study.
• Mixed methods approaches to research are promising.
Critical thinking challenges
• How can mixed methods increase the effectiveness of qualitative research?
• How can a nurse researcher select a qualitative research method when he or she is attempting to accumulate evidence regarding a new topic about which little is known?
• How can the case study approach to research be applied to evidence-based practice?
• Describe characteristics of qualitative research that can generate ethical concerns.
• Your interprofessional team is asked to provide a rationale for why it is searching for a meta-synthesis rather than individual qualitative studies to answer its clinical question.
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
References
1. Aamodt A.A. Ethnography and epistemology: Generating nursing knowledge. In: Morse J.M. Qualitative nursing research: A contemporary dialogue. Newbury Park, CA: Sage;1991.
2. Aita V.A, McIlvain H.E. An armchair adventure in case study research. In: Crabtree B, Miller W.L. Doing qualitative research. Thousand Oaks, CA: Sage;1999.
3. Bhandari P, Kim M. Self-care behaviors of Nepalese adults with Type 2 diabetes. Nursing Research 2016;65(3):202-241.
4. Bisung E, Elliott S.J, Abudho B, et al. Using photovoice as a community-based participatory research tool for changing water, sanitation, and hygiene behaviours in Usoma, Kenya. BioMed Research International 2015;2015:903025 Available at: doi:10.1155/2015/903025.
5. Charmaz K. Grounded theory: Objectivist and constructivist methods. In: Denzin N.K, Lincoln Y.S. Handbook of qualitative research. Thousand Oaks, CA: Sage;2003.
6. Colaizzi P. Psychological research as a phenomenologist views it. In: Valle R.S, King M. Existential phenomenological alternatives for psychology. New York, NY: Oxford University Press;1978.
7. Cook G, Thompson J, Reed J. Reconceptualising the status of residents in a care home: Older people wanting to “live with care.” Ageing and Society 2015;35:1587-1613.
8. Corbin J, Strauss A. Basics of qualitative research. Los Angeles, CA: Sage;2015.
9. Creswell J.W. Qualitative inquiry and research design: Choosing among five traditions. Thousand Oaks, CA: Sage;2013.
10. Creswell J.W, Plano Clark V.L. Designing and conducting mixed methods research. 2nd ed. Thousand Oaks, CA: Sage;2011.
11. Denzin N.K, Lincoln Y.S. The landscape of qualitative research. Thousand Oaks, CA: Sage;2003.
12. Fleming A, Bradley C, Cullinan S, et al. Antibiotic prescribing in long-term care facilities: A meta-synthesis of qualitative research. Drugs & Aging 2015;32(4):295-303 Available at: doi:10.1007/s40266-015-0252-2.
13. Fontana J.S. A methodology for critical science in nursing. Advances in Nursing Science 2004;27(2):93-101.
14. Glaser B.G, Strauss A.L. The discovery of grounded theory: Strategies for qualitative research. Chicago, IL: Aldine;1967.
15. Grassley J.S, Clark M, Schleis J. An institutional ethnography of nurses’ support of breastfeeding on the night shift. Journal of Obstetric, Gynecologic & Neonatal Nursing 2015;44:567-577.
16. Guba E, Lincoln Y. Effective evaluation. San Francisco, CA: Jossey-Bass;1981.
17. Holkup P.A, Tripp-Reimer T, Salois E.M, et al. Community-based participatory research: An approach to intervention research with a Native American community. ANS Advances in Nursing Science 2004;27(3):162-175.
18. Hyatt K.S, Davis L.L, Barroso J. Finding the new normal: Accepting changes after combat-related mild traumatic brain injury. Journal of Nursing Scholarship 2015;47:300-309.
19. Kleinman A. Local worlds of suffering: An interpersonal focus for ethnographies of illness experience. Qualitative Health Research 1992;2(2):127-134.
20. Marshall C, Rossman G.B. Designing qualitative research. 5th ed. Los Angeles, CA: Sage;2011.
21. Mead M. Coming of age in Samoa. New York, NY: New American Library, Mentor Books;1949.
22. Morse J.M. The contracted relationship: Ensuring protection of anonymity and confidentiality. Qualitative Health Research 1998;8(3):301-303.
23. Richards L, Morse J.M. Read me first for a user’s guide to qualitative methods. 2nd ed. Los Angeles, CA: Sage;2013.
24. Sandelowski M. Using qualitative research. Qualitative Health Research 2004;14(10):1366-1386.
25. Sandelowski M, Barroso J. Creating metasummaries of qualitative findings. Nursing Research 2003a;52:226-233.
26. Sandelowski M, Barroso J. Toward a metasynthesis of qualitative findings on motherhood in HIV-positive women. Research in Nursing & Health 2003b;26:153-170.
27. Sandelowski M, Barroso J. The travesty of choosing after positive prenatal diagnosis. Journal of Obstetric, Gynecologic, and Neonatal Nursing 2005;34(4):307-318.
28. Sandelowski M, Barroso J. Handbook for synthesizing qualitative research. Philadelphia, PA: Springer;2007.
29. Silverman D, Marvasti A. Doing qualitative research. Los Angeles, CA: Sage;2008.
30. Smith M.J, Liehr P. The theory of attentively embracing story. In: Smith M.J, Liehr P. Middle range theory for nursing. New York, NY: Springer;2003.
31. Stake R.E. The art of case study research. Thousand Oaks, CA: Sage;1995.
32. Stake R.E. Case studies. In: Denzin N.K, Lincoln Y.S. Handbook of qualitative research 2nd ed. Thousand Oaks, CA: Sage;2000.
33. Strauss A.L. Qualitative analysis for social scientists. New York, NY: Cambridge University Press;1987.
34. Strauss A, Corbin J. Basics of qualitative research: Grounded theory procedures and techniques. Newbury Park, CA: Sage;1990.
35. Strauss A, Corbin J. Grounded theory methodology. In: Denzin N.K, Lincoln Y.S. Handbook of qualitative research. Thousand Oaks, CA: Sage;1994.
36. Strauss A, Corbin J, eds. Grounded theory in practice. Thousand Oaks, CA: Sage;1997.
37. Stringer E.T. Action research. 2nd ed. Thousand Oaks, CA: Sage;1999.
38. Tedlock B. Ethnography and ethnographic representation. In: Denzin N.K, Lincoln Y.S. Handbook of qualitative research. Thousand Oaks, CA: Sage;2003.
39. Vidich A.J, Lyman S.M. Qualitative methods: Their history in sociology and anthropology. In: Denzin N.K, Lincoln Y.S. The landscape of qualitative research: Theories and issues. Thousand Oaks, CA: Sage;1998.
40. Walker R, Szanton S.L, Wenzel J. Working toward normalcy post-treatment: A qualitative study of older adult breast and prostate cancer survivors. Oncology Nursing Forum 2015;42(6):358-367.
CHAPTER 7
Appraising qualitative research
Dona Rinaldi Carpenter
Learning outcomes
After reading this chapter, you should be able to do the following:
• Understand the role of critical appraisal in research and evidence-based practice.
• Identify the criteria for critiquing a qualitative research study.
• Identify the stylistic considerations in a qualitative study.
• Apply critical reading skills to the appraisal of qualitative research.
• Evaluate the strengths and weaknesses of a qualitative study.
• Describe applicability of the findings of a qualitative study.
• Construct a written critique of a qualitative study.
KEY TERMS
bracketing
phenomenology
auditability
credibility
phenomena
saturation
theme
trustworthiness
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
Qualitative and quantitative research methods vary in terms of purpose, approach, analysis, and conclusions. Therefore, the use of each requires an understanding of the traditions on which the methods are based. This chapter aims to provide a set of criteria that can be used to critique qualitative research studies through a process of critical analysis and evaluation.
The critical appraisal of qualitative research continues to be discussed in nursing and related health care professions, providing a framework that includes key concepts for evaluation (Beck, 2009; Bigby, 2015; Flannery, 2016; Horsburgh, 2003; Ingham-Broomfield, 2015; Pearson et al., 2015; Russell & Gregory, 2003; Sandelowski, 2015; Williams, 2015).
Critical appraisal and qualitative research considerations
Qualitative research represents a basic level of inquiry that seeks to discover and understand concepts, phenomena, or cultures. In a qualitative study, you should not expect to find hypotheses; theoretical frameworks; dependent and independent variables; large, random samples; complex statistical procedures; scaled instruments; or definitive conclusions about how to use the findings. A primary reason for conducting a qualitative study is to develop a theory or to discover knowledge about a phenomenon. Sample size is expected to be small. This type of research is not generalizable, nor should it be. Findings are presented in a narrative format, with raw data used to illustrate identified themes. Thick, rich data are essential to document the rigor of the research, which is called trustworthiness in a qualitative research study. Ensuring trustworthiness in qualitative inquiry is critical, as qualitative researchers seek to have their work recognized in an evidence-driven world (Beck, 2009; Bigby, 2015).
Application of qualitative research findings
The purpose of qualitative research is to describe, understand, or explain phenomena important to nursing. Phenomena are those things that are perceived by our senses; pain and losing a loved one, for example, are phenomena. In a qualitative study, the researcher gathers narrative data, using the participants' voices and experiences to describe the phenomenon under investigation. Barbour and Barbour (2003) note that qualitative research can provide the opportunity to give voice to those who have been disenfranchised and have no history. The application of qualitative findings will therefore necessarily be context-bound (Russell & Gregory, 2003).
Qualitative research also contributes to the evidence-based practice literature (Anthony & Jack, 2009; Cesario et al., 2002; Donnelly & Wiechula, 2013; Walsh & Downe, 2005). Describing the lived human experience of patients can contribute to the improvement of care, adding a dimension of understanding to our work as it is described by those who live it on a day-to-day basis. Fundamentally, the principles for evaluating qualitative research are the same across methods. Reviewers are concerned with the plausibility and trustworthiness of the researcher's account of the findings and its potential and/or actual relevance to current or future theory and practice (Horsburgh, 2003; Ingham-Broomfield, 2015; Pearson et al., 2015; Sandelowski, 2015; Williams, 2015). As a framework for understanding how the appraisal of qualitative research can support evidence-based practice, a published research report and critical appraisal criteria follow (Table 7.1). The critical appraisal criteria will be used to demonstrate the process of appraising a qualitative research report. For specific guidelines for appraisal of phenomenology, ethnography, grounded theory, and action research, see Chapters 5 and 6 and Streubert and Carpenter (2011).
TABLE 7.1 Critical Appraisal of Qualitative Research
CRITICAL APPRAISAL CRITERIA
Qualitative Research Study
As evidenced by published works, phenomenology is one approach to qualitative research. From a nursing perspective, qualitative research allows caregivers to understand the life experience of the patients they care for. Excerpts from "A Woman's Experience: Living With an Implantable Cardioverter Defibrillator" by Jaclyn Conelius, published in Applied Nursing Research in 2015, are provided throughout this chapter as examples of phenomenological research. The following sections critique Conelius's study. The primary purpose of this critique is to carefully examine how each step of the research process has been articulated in the study and how the research has contributed to nursing knowledge. The article by Conelius (2015) provides an example of a phenomenological study true to qualitative methods.
Critique of a qualitative research study
The research study
The study "A Woman's Experience: Living With an Implantable Cardioverter Defibrillator" by Jaclyn Conelius, published in Applied Nursing Research, is critiqued. The article is presented in its entirety and followed by the critique.
A woman's experience: Living with an implantable cardioverter defibrillator
Jaclyn Conelius, PhD, FNP-BC
Abstract
Implantable cardioverter defibrillators (ICDs) have decreased mortality rates among those who are at risk for sudden cardiac death or who have survived sudden cardiac death, and have been shown to be superior to antiarrhythmic medications (Greenburg et al., 2004). This advance in technology may improve physical health but can impose challenges on patients, such as depression, anxiety, fear, and unpredictability. Published phenomenological research on how an ICD affects a woman's life experience is limited. Therefore, the purpose of this article is to describe, using Colaizzi's method of phenomenology, the experiences of women since receiving an ICD implant. Analysis of the three interviews resulted in five themes that described the essence of this experience. The results of this study could not only help clinicians understand what their patients are experiencing but also be used as an education tool.
© 2014 Elsevier Inc. All rights reserved.
Introduction
Implantable cardioverter defibrillators (ICDs) have decreased mortality rates among those who are at risk for sudden cardiac death or who have survived sudden cardiac death, and have been shown to be superior to anti-arrhythmic medications (Greenburg et al., 2004). ICDs have been supported by many clinical trials and are now the treatment of choice in primary and secondary prevention for these patients (Bardy et al., 2005; Bristow et al., 2004; Moss et al., 2002). This mainstay of treatment has increased steadily, from 486,025 implants from 2006 to 2009 to 850,068 from 2010 to 2011 (Hammill et al., 2010; Kremers et al., 2013). Of these implants, only approximately 28% were in women.
This advance in technology may improve physical health but can impose challenges on patients. These include adjusting to the device in everyday living, as well as quality-of-life and psychological issues. Quantitative research has reported a fear of physical activity and a fear of shock from the device to prevent the sudden cardiac arrest (Lampert et al., 2002; Wallace et al., 2002; Whang et al., 2005). Other studies have reported anxiety, fear, and depression in these patients. Some specific fears included malfunctioning, unpredictability, and the inability to control events (Dickerson, 2005; Dunbar, 2005; Eckert & Jones, 2002; Kamphuis et al., 2004; Lemon, Edelman, & Kirkness, 2004). The quality-of-life and psychological issues reported in these studies are not reported as gender-specific; therefore, female-specific challenges are not well studied. Furthermore, there have been few qualitative studies based on a patient's experience of living with an ICD. Previous studies reported themes such as a feeling of gratitude, safety, belief in the future, adjustment to the device, lifesaving yet changing, fear of receiving a shock, physical/mental deterioration, confrontation with mortality, and conditional acceptance (Dickerson, 2002; Fridlund et al., 2000; Kamphuis et al., 2004; Morken, Severinsson, & Karlsen, 2009; Tagney, James, & Alberran, 2003).
Based on the available research studies, there is very little reported data specific to females, and specifically to how an ICD affects a woman's lived experience. A lived experience is how a person immediately experiences the world (Husserl, 1970). In order to understand a woman's lived experience of living with an ICD, phenomenology was used. Phenomenology is a philosophy and a research method used to understand everyday lived experiences. Therefore, the purpose of this study was to describe what those experiences were; specifically, to describe the thoughts, feelings, and perceptions the women have experienced since their implant. It is important to gain an understanding and formulate a description of what life is like for a woman who has received an implantable cardioverter defibrillator in order to describe the universal essence of that experience. Descriptive phenomenology emphasizes describing universal essences; viewing the person as one representative of the world in which she lives; an assumption of self-reflection; a belief that consciousness is what people share; and a belief that setting aside previous knowledge (bracketing) helps prevent investigator bias and interpretation bias (Wojnar & Swanson, 2007). Specifically, Colaizzi's (1978) descriptive phenomenological method uses seven steps to analyze data so that, by the end of the study, a description of the lived experience can be reported.
Method
Descriptive phenomenology originated with the philosopher Husserl (1970), who believed that the meaning of a lived experience may be discovered through one-to-one interaction between the researcher and the subject. It assumes that for any human experience there are distinct structures that make up the phenomenon, and that studying individual experiences highlights these essential structures. It is an inductive method that describes a phenomenon as it is experienced by an individual rather than transforming it into an operationally defined behavior. An important aspect of descriptive phenomenology, according to Husserl, is the process of bracketing, which he describes as separating the phenomenon from the world and having the researcher suspend all preconceptions (Wojnar & Swanson, 2007). The goal of descriptive phenomenology is to provide a universal description of the lived experience as described by the participants of the phenomenon. Colaizzi's (1978) method of descriptive phenomenology is the method used for this study. In his method, interviewing is the selected strategy for collecting the data necessary for describing an experience. This method works well with a small sample size.
Sample.
Ten women were asked to participate; of these, three women from a private cardiology office in the United States agreed. This convenience sample of women were all Caucasian, and their ages ranged from 34 to 50 years. All three women had college degrees and had had the device for over one year. None of the women had previously been diagnosed with any psychiatric disease.
Procedure.
After receiving approval from the university's institutional review board (IRB), women were recruited from a private cardiology office in the United States over a 4-month period. The participant population included only women who had an implantable cardioverter defibrillator (ICD). Women needed to be 18 years of age or older and speak English. Women of all ethnic backgrounds were eligible to participate. There was no cost to the participant, and no compensation was provided. Once the informed consent was signed, participants were asked to stay for an interview that day. All women were interviewed privately in the office, and each interview lasted approximately 45 minutes to an hour. They were asked to "describe their experiences after having received an ICD, specifically, to describe their thoughts, feelings, and perceptions that they had experienced since their implant." They were then asked to share as much of those experiences as they could, to the point that they did not have anything else to contribute. The interviews were recorded and then transcribed. The researcher conducted all of the interviews, since she was trained in the method. Interviews were conducted until an accurate description of the phenomenon had emerged, data were repeating, and no new themes were described. This saturation of data occurred after the three interviews. After each interview, follow-up questions were asked in order to clarify any points the participant described. The researcher kept a journal to write down any notes needed during the interview.
In order for the description to be pure, the researcher's prior knowledge was bracketed to capture the essence of the description without bias (Wojnar & Swanson, 2007). Husserl (1970) introduced the term, and it means to set aside one's own assumptions and preunderstandings. In order to be true to the method, the researcher reflected and kept a journal of all assumptions, clinical experiences, understandings, and biases to reference during the entire study.
Significant statements and phrases pertaining to a woman's experience of living with an ICD were extracted from each transcript. These statements were written on separate sheets and coded. Meanings were formulated from the significant statements, and each underlying meaning was coded into a specific category as it reflected an exhaustive description. Then the significant statements with their formulated meanings were grouped into themes.
To ensure confidentiality, the signed informed consent forms were kept separate from the transcripts. The recorded tapes and hard copy were in a locked cabinet. Identifying information was deleted and names were never used in any research reports. Audiotapes were destroyed once the pilot study was completed.
Data analysis.
Each transcript was analyzed using Colaizzi's (1978) method, which consists of the following steps: (1) read all the participants' descriptions of the phenomenon; (2) extract significant statements that pertain directly to the phenomenon; (3) formulate meanings for each significant statement; (4) categorize the formulated meanings into clusters of themes and validate them against the original transcripts; (5) develop an exhaustive description; (6) validate the description by returning to the participants to ask how it compares with their experience; and (7) incorporate any changes offered by the participants into the final description of the essence of the phenomenon.
Rigor.
Efforts were made to limit any potential bias of the researcher. One such effort was to bracket the researcher's prior perspective and knowledge of the subject (Ahern, 1999). To ensure the credibility of the data collected, two of the women in the study reviewed the description of the lived experiences, as suggested by Lincoln and Guba (1985). This was performed as a validity check of the data. To address auditability, a tape recorder was used, and the researcher reviewed the transcripts and cross-referenced the field notes (Beck, 1993).
Additionally, the transcripts were transcribed verbatim by a secretary in order to ensure they were free of bias. Also, the data analysis and description of the lived experience were reviewed by an independent judge with phenomenological experience to ensure intersubjective agreement. All of the themes reported were agreed upon by the judge.
Finally, the researcher validated the description by returning to the participants to ask them how it compared with their experience and incorporated any changes offered by the participants into the final description of the essence of the phenomenon. This final description was reviewed by other women with ICDs who were not a part of the study to ensure fittingness.
Results
At the conclusion of verifying and reviewing the transcripts, 46 significant statements that pertained directly to the phenomenon were extracted. Formulated meanings were created from each significant statement. These statements were then formed into five themes (Table 1) that described the essence of these experiences.
TABLE 1 Selected Examples of Significant Statements and Their Formulated Meanings for Five Themes

Theme 1. Security blanket: If it keeps me alive, it's worth it.
Significant statement: "I do not have anything to worry about anymore. I used to worry that if something happened, how soon I can get to a hospital or what could they do to try to save me."
Formulated meaning: The women did not have to worry anymore about medical emergencies.

Theme 2. A piece of cake: I do more than before.
Significant statement: "Actually, I probably do a little more than before. But I can do everything that I did before. I have not eased up on anything."
Formulated meaning: She felt as if nothing had changed. She does everything she did prior.

Theme 3. A constant reminder: I know it's there.
Significant statement: "The children sometimes bump into that side and I am literally guarding that side all the time."
Formulated meaning: She is aware of the device and guards it when others come in contact with it.

Theme 4. Living on the edge: I do not want it to go off.
Significant statement: "I do have a little fear of that but so far, it hasn't happened."
Formulated meaning: She has an extreme fear of the device shocking her.

Theme 5. Catch 22: I'd rather not have it.
Significant statement: "I would rather not personally have it but I know medically, I need to have it, which is a good thing."
Formulated meaning: She would rather not have to have it, but she knows she needs it.
Theme 1: Security blanket: If it keeps me alive, it's worth it.
Women who had an ICD felt a sense of security with the device; they felt that it acted as a security blanket. Prior to receiving the device they had worried constantly about how soon they could get medical treatment; now that they had the device, that worry was lifted and was no longer apparent for them. One woman said:
Now I just think this will keep me alive long enough for somebody to make a decision, at least it will give me a chance. I do not have anything to worry about anymore. I used to worry that if something happened, how soon I could get to a hospital or what could they do to try to save me.
The women also described how their worry about requiring medical treatment while with their family had decreased. "Now I do not have to worry if I am with my family, I have an ICD in my chest to give me treatment right away."
Another woman felt that the device just being there saved her life; if the device could save her life, it was worth it. The device prevents the heart from having sustained lethal arrhythmias.
She explained: “I feel like it saved my life, I feel like it keeps my heart beating nice and smooth.”
There was an overall feeling that the device improved their lives. Based on their past medical histories, the device was needed as the next step in their medical treatment. All the women were glad they were able to receive the device. One woman explained:
It could be both ways. I mean, I feel knowing what my family history is, yeah, I am glad I have it. I needed it. It made me feel that I can go anywhere and do anything because it acts like my insurance policy.
Theme 2: A piece of cake: I do more than before.
The women did not have a decrease in physical functioning or quality of life. Their quality of life remained stable or improved once the postoperative period was over.
One woman explained:
Actually, I probably do a little more than before. But I can do everything that I did before. I have not eased up on anything. I felt like after the surgery, I was tired for 2 days then I could go on and do everything I used to do; now I do not even think about it. I just go about my day as usual and even do more because I know I have this to protect me.
The women felt that the whole process of receiving an ICD was easy. Nothing much changed in their everyday lives. They live and
do everything that they did before with no restrictions. Another woman shared,
After that, I really have had no change in lifestyle. My life has been as normal as it was before. Physically, I see no change, or even see an improvement.
Theme 3: A constant reminder: I know it's there.
The women felt as if they had a constant reminder of the ICD. Their families were aware of the device in their bodies, since they could see the scar. Some family members would comment on the device if they could feel it when giving a hug, which in turn would remind the women that it was there. The device did affect their body image; it made them more conscious of the device in their chest.
One woman with school aged children explained:
And it is hard when the kids cuddle up to me and I have to say I can’t have you on my left side anymore. With four kids, you know the pile up, at least the two youngest ones, they want to lie next to me while watching TV or when we are praying or reading books or doing anything. I have to remind them that you can’t put your head up there. The children sometimes bump into that side and I am literally guarding that side all the time.
The most pain the women experienced was postoperative; when the pain decreased varied from woman to woman. The actual incision is "hardly noticeable" in all of the women, although the knowledge that the device is in fact in their chest is a "constant reminder." The degree to which it reminds them varies depending on body type.
One woman stated: “I am reminded of this all the time, I can feel it, I know it is there. Everyday activities like opening a jar, it pops and moves. Anytime I use my pectoral muscle, I know it is there, which is a lot of what I do during the day, like laundry.”
Another woman stated: “The only thing that bothers me a little bit sometimes, it feels like it moves in my chest when I am in bed. When I lay a certain way it sometimes feels like it is popping out or something.”
Yeah, I mean just being that it is there and it should not be there and it shows itself all the time. I especially know it's there in the summer when you wear fewer clothes, especially bathing suits. To me it is a constant reminder that I may feel fine, but I am technically sick.
Theme 4: Living on the edge: I do not want it to go off.
All of the women had a common fear that was constantly in their thoughts. They feared that the device would have to do its job; it would have to "fire." They did not want this fear to become a reality. They feared that they would be somewhere in public and the device would have to administer therapy, or shock them. The women stated such things as:
I do have a little fear of that but so far, it hasn’t happened. Oh! I don’t want it to go off! I am completely scared it will go off and no one will know what the heck happened.
The fear of the device firing has a significant impact on these women. The most concerning part is wondering what it will actually feel like: the uncertainty. These women could not possibly know how it would feel, since none of them had ever received a shock. They had been told that it feels like an "animal kicking you in the chest." To date, none of them has experienced it. To them it is unimaginable until it becomes a reality.
I am scared. I am afraid it is going to kick off and I was told it would feel like a pair of boots kicking you in the chest. And I am afraid, but it has never gone off. You know, I am wondering what it would feel
like. The doctor explained it almost like getting kicked in the chest by a horse. Well, that would be a jolt, I guess? I am afraid that I will be doing something, not feel anything, then all of a sudden boom!
Theme 5: Catch 22: I'd rather not have it.
The women received their ICDs because it was medically necessary based on the current guidelines; they had various cardiac conditions that required implantation of a defibrillator. The women understood that the device was essential, and yet they would rather not have had to go through it. They would rather not have the heart disease that comes with needing the device.
I would rather not personally have it but I know medically, I need to have it, which is a good thing that I have it. Mentally it bothers me; I know I cannot avoid it.
The women felt that the experience was depressing. They were most depressed immediately preceding the implantation. Although the depression decreased over time, the device remained a constant reminder. They needed to adjust to the device, which was hard for them. They felt as if they had no choice but to adjust to this new situation.
One woman explained:
Well, I have adjusted to it, I had no choice. But in another aspect, no, I would rather not be going through this. Interestingly, no one has ever asked me how I feel about having one before. I just got it and the doctor does not even ask me about it. I mean it comes and goes, because a lot of things I know are happening are like, it could get depressing. I do feel anxious at times, then I feel depressed at times, then I am fine at times. So, I guess it depends on what is going on.
Discussion
Aspects of the five themes that describe the essence of a woman's experience of living with an ICD have been reported in previous studies, but no study is an exact comparison to this one. For instance, theme 1 (security blanket: if it keeps me alive it's worth it) is similar to the concepts of a feeling of gratitude and a feeling of safety in Fridlund et al. (2000). The women in this study expressed a feeling of safety and appreciation since receiving their ICDs. This sense of safety and trust in the device is consistent with other studies (Bilge et al., 2006; Dickerson, 2002; Morken et al., 2009).
Contrary to what is found in the literature, the women in this study reported having more energy than before and noticed an actual increase in physical functioning. Previous studies have identified decreased physical functioning (Dickerson, 2005; Kamphuis et al., 2004; Williams, Young, Nikoletti, & McRae, 2007) and a decrease in activity levels in day-to-day life (Bolse, Hamilton, Flanangan, Caroll, & Fridlund, 2005; Eckert & Jones, 2002). This contradiction may be related to the types of studies conducted: previous studies used questionnaires, while this study focused on descriptions from participants who had undergone device implantation.
Theme 3 (a constant reminder: I know it's there) described the women "knowing that the device was in their chest" as a reminder of their condition. They also described how it affected their body image. Two other studies mentioned this as a concern for women. One study by Walker et al. (2004) reported body image concerns of women; the women in that study were more concerned about how the device appeared in their chest (i.e., the scar) than any other aspect. A second study, by Tagney et al. (2003), also reported body image concerns in women, since the device can be seen in the chest, which makes them aware of it. The similarities were with respect to body image only; those women were not concerned with the device as a constant reminder of their cardiac disease, only as a constant reminder of their mortality (Dickerson, 2002).
The common concern described in theme 4 (living on the edge: I do not want my device going off) was the fear of the device having to shock them, as well as the uncertainty of when, where, and who would be around for support. This was foremost in their thoughts. Fear of the device going off or shocking them has been a common theme in the literature reviewed. Dickerson (2002, 2005) reported that uncertainty of when and where shocks could be triggered was a prevailing concern of male and female participants. Participants in Albarran, Tagney, and James's (2004) study also reported a feeling of uncertainty regarding the device firing.
The prevailing concern in theme 5 (catch 22: I'd rather not have it) is the conflict women experience after receiving a device. These women knew that they medically needed the device yet would rather not have gone through with it. Dickerson (2005) reported the theme of conditional acceptance, which touches on the same concept. A greater acceptance of the new situation was also reported in previous studies (Carroll & Hamilton, 2005; Kamphuis et al., 2004).
The women in this study offered specific experiences of living with an ICD that are not completely seen in any previous study, as stated previously. There were some similar aspects identified in other studies, such as receiving a shock and a feeling of safety, but most were not specific to women (Bilge et al., 2006; Dickerson, 2002, 2005; Morken et al., 2009). This study was able to describe the essence of women's experience of living with an ICD. As stated previously, the majority of patients who receive ICDs are male, and the samples in previous studies have been predominantly male. This study is specific to women and allows special insight into women who are living with cardiac disease, and more specifically cardiac disease requiring a medical device.
Clinical implications and future research
This study can have an impact on clinical practice as a whole by helping clinicians understand what their patients are experiencing. The women in this study stated that they experienced a great deal of uncertainty regarding the need for the device and its functionality. This uncertainty can be reduced or eliminated by educating patients about how the device operates. Increased education preoperatively and postoperatively on device functionality would benefit patients by relieving some of that uncertainty. These concerns are not being addressed properly in the healthcare system. This study can help clinicians gain an understanding of the experience these women are having and perhaps pay closer attention to these issues when they are seen in outpatient settings.
Furthermore, this study can advocate for support groups for women. Support groups would allow these patients to converse with other women with the same health condition. There are multiple studies in the literature regarding the use of support groups in heart failure patients; however, there are very few studies involving patients with ICDs. Support groups can expose women to different types of resources in order to cope better, decrease anxiety, and answer any questions that arise (Myers & James, 2008). Support groups would also give them a sense of security, knowing that they have each other as a support system.
The women in this study were similar in that they were Caucasian and from affluent areas with numerous resources available to them (Smeulders et al., 2010). An additional study involving women of various ethnic backgrounds and ages would capture a wider range of experiences. Also, since the women have an outstanding fear of the device firing/shocking them, a noteworthy follow-up study would be to describe their experience post firing/shock. These studies would help clinicians understand what their patients are experiencing. It would allow them to be more empathetic and identify gaps in knowledge. The results would become a valuable teaching tool to help educate patients regarding their device function.
The critique
This is a critical appraisal of the article "A Woman's Experience: Living With an Implantable Cardioverter Defibrillator" (Conelius, 2015) to determine its usefulness and applicability for nursing practice.
Abstract
The purpose of the abstract is to provide a clear overview of the study and summarize the main features of the findings and recommendations. The abstract should accurately represent the remainder of the article. Conelius (2015) summarized the research in the following narrative:
Implantable cardioverter defibrillators (ICDs) have decreased mortality rates among those who are at risk for sudden cardiac death or who have survived sudden cardiac death, and have been shown to be superior to antiarrhythmic medications (Greenburg et al., 2004). This advance in technology may improve physical health but can impose challenges on patients, such as depression, anxiety, fear, and unpredictability. Published phenomenological research on how an ICD affects a woman's life experience is limited. Therefore, the purpose of this article is to describe, using Colaizzi's method of phenomenology, the experiences of women since receiving an ICD implant. Analysis of the three interviews resulted in five themes that described the essence of this experience. The results of this study could not only help clinicians understand what their patients are experiencing but also be used as an education tool.
Introduction/review of literature
All research requires the investigator to review the literature. This is the point at which gaps are identified with regard to what is known about a particular topic and what is not known.
In qualitative research, the literature review is generally brief, because not a great deal is known about the topic, nor is there an existing body of research studies. This essentially means that the researcher needs to have an understanding of the substantive body of knowledge on the topic and a clear perspective on what areas still need to be explored. A clear rationale for why the research is needed should be established. The researcher must be clear that a gap in nursing knowledge was identified, that there is a clear need for the study, and that the selected research method is appropriate. Bracketing what is known about the phenomenon is one way to prevent bias and keep what is known about the topic separate prior to data collection and analysis (see Chapter 6). Conelius (2015) discusses bracketing in the data collection section of her research on women and implantable cardioverter defibrillators. The background information provided in her introduction establishes a need for a qualitative study. Conelius (2015) emphasizes the fact that to date much of the research has been quantitative. She further notes that qualitative studies to date have not been gender-specific, emphasizing the need for a study related to women's experiences.
Implantable cardioverter defibrillators (ICDs) have decreased mortality rates among those who are at risk for sudden cardiac death or who have survived sudden cardiac death, and have been shown to be superior to anti-arrhythmic medications (Greenburg et al., 2004). ICDs have been supported by many clinical trials and are now the treatment of choice in primary and secondary prevention for these patients (Bardy et al., 2005; Bristow et al., 2004; Moss et al., 2002). This mainstay of treatment has increased steadily, from 486,025 implants from 2006 to 2009 to 850,068 from 2010 to 2011 (Hammill et al., 2010; Kremers et al., 2013). Of these implants, only approximately 28% were in women. (Conelius, 2015)
This advance in technology may improve physical health but can impose challenges on patients. These include adjusting to the device in everyday living, as well as quality-of-life and psychological issues. Quantitative research has reported a fear of physical activity and a fear of shock from the device to prevent the sudden cardiac arrest (Lampert et al., 2002; Wallace et al., 2002; Whang et al., 2005). Other studies have reported anxiety, fear, and depression in these patients. Some specific fears included malfunctioning, unpredictability, and the inability to control events (Dickerson, 2005; Dunbar, 2005; Eckert & Jones, 2002; Kamphuis et al., 2004; Lemon, Edelman, & Kirkness, 2004). The quality-of-life and psychological issues reported in these studies are not reported as gender-specific; therefore, female-specific challenges are not well studied. Furthermore, there have been few qualitative studies based on a patient's experience of living with an ICD. Previous studies reported themes such as a feeling of gratitude, safety, belief in the future, adjustment to the device, lifesaving yet changing, fear of receiving a shock, physical/mental deterioration, confrontation with mortality, and conditional acceptance (Dickerson, 2002; Fridlund et al., 2000; Kamphuis et al., 2004; Morken, Severinsson, & Karlsen, 2009; Tagney, James, & Alberran, 2003). Based on the available research studies, there is very little reported data specific to females, and specifically to how an ICD affects a woman's lived experience. (Conelius, 2015)
Phenomenology is a philosophy and a research method used to understand everyday lived experiences, and it is an appropriate methodology for the phenomena of interest. The subjective experience of women with an ICD is central to the study and key to developing interventions to help these women cope. Conelius (2015) clearly articulates the focus of the study and makes a clear case for why a qualitative design is appropriate.
When critiquing the literature review of a qualitative study, it is important to remember that this component of the study must be critiqued within the context of the qualitative methodology selected. In phenomenological studies, the literature review may be delayed until the data analysis is complete in order to minimize bias. Conelius (2015) does not indicate that the review was delayed.
Philosophical underpinnings
In addition to making a case for the study and the qualitative approach, it is also important to give the reader perspective on the philosophical traditions of the method selected. Conelius (2015) describes the philosophical underpinnings of phenomenology and then relates those traditions to the method used in the study. In most published studies, the author is most concerned with sharing the findings of the study, which limits the space for in-depth literature reviews or discussion of the method used. Conelius (2015) discusses the work of Husserl (1970) as an integral component of her philosophical grounding of phenomenology as method. She then connects this fundamental work to the method developed by Colaizzi (1978).
A lived experience is how a person immediately experiences the world (Husserl, 1970). In order to understand a woman's lived experience of living with an ICD, phenomenology was used. Phenomenology is a philosophy and a research method used to understand everyday lived experiences. Descriptive phenomenology emphasizes describing universal essences; viewing the person as one representative of the world in which she lives; an assumption of self-reflection; a belief that consciousness is what people share; and a belief that setting aside previous knowledge (bracketing) helps prevent investigator bias and interpretation bias (Wojnar & Swanson, 2007). Specifically, Colaizzi's (1978) descriptive phenomenological method uses seven steps to analyze data so that, by the end of the study, a description of the lived experience can be reported. (Conelius, 2015)
The specific qualitative research approach selected helps determine the focus of the research and the manner in which sampling, data collection, and analysis are undertaken. The qualitative research example provided here used phenomenology as method. Research studies using a qualitative approach other than phenomenology should be critiqued relative to the philosophical underpinnings of the method.
Purpose
The author explained why the study was important and the significant contribution the study would make to nursing's body of knowledge. The background information justified the use of a qualitative approach as well as why phenomenology was used.
The researcher states that “The purpose of this study was to describe a woman’s experience living with an ICD. More specifically to describe their thoughts, feelings and perceptions that they have experienced since their implant” (Conelius, 2015). The purpose is clearly articulated, first in the abstract and then in the
introduction of the study. Conelius (2015) makes it clear that there is a gap in nursing knowledge related to ICDs and the experience of women living with an ICD.
Ethical considerations
Addressing the ethical aspects of a research report involves being able to know whether participants were told what the research entailed, how their autonomy and confidentiality were protected, and what arrangements were made to avoid harm. In qualitative research the data collection tools generally include interview and participant observation, making anonymity impossible. Because the interviews are open-ended, participants may disclose personal information or uncomfortable experiences related to the topic. Consent must be a process of continuous negotiation (Oye et al., 2016).
The study by Conelius (2015) was approved by the Institutional Review Board. The author clearly states how the participants were protected. “To ensure confidentiality the signed informed consent forms were kept separate from the transcripts. The recorded tapes and hard copy were in a locked cabinet. Identifying information was deleted and names were never used in any research reports. Audiotapes were destroyed once the pilot study was completed” (Conelius, 2015). Participants were fully informed about the nature of the research and were protected from harm; their autonomy and confidentiality were protected.
Conelius (2015) also made clear to the participants that they had the right to withdraw from the research at any time. This is true for any research; however, in a qualitative investigation, ethical issues may arise at any point in the study (Hegney & Chan, 2010). Conelius (2015) clearly articulated the ethical rigor of this study.
Sample
In qualitative research, participants are recruited because of their life experience with the phenomena of interest. This is referred to as purposive sampling. The goal is to ensure rich, thick data about the phenomenon of interest. Data are generally collected until no new material is emerging and data saturation has been reached. Cleary and colleagues (2014) discuss sampling in qualitative research in relation to sample size. Qualitative studies generally have a small sample. Following the steps for sampling in qualitative research, Conelius (2015) offers the following information related to participant selection:
After receiving approval from the university's institutional review board (IRB), women were recruited from a private cardiology office in the United States over a 4-month period. The participant population included only women who had an implantable cardioverter defibrillator (ICD). Women needed to be 18 years of age or older and speak English. Women of all ethnic backgrounds were eligible to participate. There was no cost to the participant, and no compensation was provided. Once the informed consent was signed, participants were asked to stay for an interview that day. (Conelius, 2015)
In qualitative research, purposive sampling is the approach of choice. Participants must have experience with the phenomenon of interest and be appropriate to inform the research. In this case, Conelius (2015) needed women with an ICD. Her selection process supports a qualitative sampling paradigm that is appropriate for phenomenology.
Data generation
The data generation approach should be sufficiently described so that it is clear to the reader why a particular strategy was selected.
Conelius (2015) clearly articulates that the data generation method supports a qualitative paradigm and allows for discovery, description, and understanding of the participants' lived experience. The researcher uses open-ended questioning and asks each individual to exhaust her ideas and describe her experiences. She completes in-depth interviews with each of the three participants, with follow-up questions allowing for clarification of responses as well as an opportunity for the participants to add experiences that may have been omitted initially. Recording and transcribing the interviews verbatim helps maintain the authenticity of the data. The following excerpts from the article illustrate these points:
All women were interviewed privately in the office, and each interview lasted approximately 45 minutes to an hour. They were asked to "describe their experiences after having received an ICD, specifically, to describe their thoughts, feelings, and perceptions that they had experienced since their implant." They were then asked to share as much of those experiences as they could, to the point that they did not have anything else to contribute. The interviews were recorded and then transcribed. The researcher conducted all of the interviews, since she was trained in the method. Interviews were conducted until an accurate description of the phenomenon had emerged, data were repeating, and no new themes were described. This saturation of data occurred after the three interviews. (Conelius, 2015)
The researcher kept a journal to write down any notes needed during the interview. "In order for the description to be pure, the researcher's prior knowledge was bracketed to capture the essence of the description without bias (Wojnar & Swanson, 2007). Husserl (1970) introduced the term, and it means to set aside one's own assumptions and preunderstandings. In order to be true to the method, the researcher reflected and kept a journal of all assumptions, clinical experiences, understandings, and biases to reference during the entire study. Significant statements and phrases pertaining to a woman's experience of living with an ICD were extracted from each transcript. These statements were written on separate sheets and coded. Meanings were formulated from the significant statements, and each underlying meaning was coded into a specific category as it reflected an exhaustive description. Then the significant statements with their formulated meanings were grouped into themes." (Conelius, 2015)
Data generation was appropriate for this study and followed the steps described by Colaizzi (1978).
Data analysis
The process of data analysis is fundamental to determining the credibility of qualitative research findings. Data analysis involves the transformation of raw data into a final description or narrative, identifying common thematic elements found in the raw data. The description should enable a reviewer to confirm the processes of concurrent data collection and analysis as well as the steps in coding and identifying themes.
Data analysis followed the method described by Colaizzi (1978). The author developed a table to allow the reader to follow the line of thinking and establish thematic elements. The reader can clearly follow the researcher’s stated processes. Further, Conelius (2015) followed clear processes to establish authenticity and trustworthiness of the data. The findings reported demonstrate the participants’ realities. During data analysis the researcher made every effort to eliminate potential bias. Bracketing, verbatim transcription of taped interviews, and an independent reviewer were used to establish intersubjective agreement.
Authenticity and trustworthiness
Critical to the meaning of the findings is the researcher's ability to demonstrate that the data were authentic and trustworthy, or valid. Rigor ensures there is congruence between the steps of the research process and the actual study. Procedural rigor relates to accuracy of data collection and analysis. Rigor, or trustworthiness, is a means of demonstrating the credibility and integrity of the qualitative research process (Cope, 2014). A study's rigor may be established if the reviewer is able to audit the actions and development of the researcher. It is at this point that the review of literature becomes critical; it should be systematically related to the findings. This was addressed by the author, and every effort was clearly employed to reduce any bias or misinterpretation of findings.
Conelius (2015) was able to demonstrate rigor with regard to data analysis in multiple ways. She stated:
There were efforts made to limit any potential bias of the researcher. One such effort was to bracket any of the researcher's prior perspective and knowledge of the subject (Ahern, 1999). To ensure the credibility of the data collected, two of the women in the study reviewed the description of the lived experiences as suggested by Lincoln and Guba (1985). This was performed as a validity check of the data. To address auditability, a tape recorder was used, and the researcher reviewed the transcripts and cross-referenced the field notes (Beck, 1993). Additionally, the transcripts were transcribed verbatim by a secretary in order to ensure they were free of bias. The data analysis and description of the lived experience were reviewed by an independent judge with phenomenological experience to ensure intersubjective agreement. All of the themes reported were agreed upon by the judge. Finally, the researcher validated the description by returning to the participants to ask them how it compared with their experience and incorporated any changes offered by the participants into the final description of the essence of the phenomenon. (Conelius, 2015)
Conelius (2015) provided clear evidence of rigor for the reader. Bracketing, having participants read the final description and thematic elements, taping and transcribing interviews verbatim, and using an independent judge to establish intersubjective agreement are key elements in a well-done qualitative study. The author also left an audit trail, illustrated in table format, that establishes the researcher's line of thinking. Examples of how raw data led to the identification of thematic elements were provided and further establish rigor for this study.
Findings, conclusions, implications, and recommendations
Findings from a qualitative study generally are discussed in a narrative format that tells the story of the experience through an exhaustive description and thematic elements. Conelius (2015) summarized conclusions, implications, and recommendations from the study. The findings were also compared with prior research studies. In qualitative research, this is the area that must include a comprehensive incorporation of current research on the topic. According to Conelius (2015):
Aspects of the five themes that describe the essence of a woman's experience of living with an ICD have been reported in previous studies, but no study is an exact comparison to this one. For instance, theme 1 (security blanket: if it keeps me alive it's worth it) is similar to the concepts of a feeling of gratitude and a feeling of safety in Fridlund et al. (2000). The women in this study expressed a feeling of safety and appreciation since receiving their ICDs. This sense of safety and trust in the device is consistent with other studies. (Bilge et al., 2006; Dickerson, 2002; Morken et al., 2009)
Contrary to what is found in the literature, the women in this study reported having more energy than before and noticed an actual increase in physical functioning. Previous studies have identified decreased physical functioning (Dickerson, 2005; Kamphuis et al., 2004; Williams, Young, Nikoletti, & McRae, 2007) and a decrease in activity levels in day-to-day life (Bolse, Hamilton, Flanangan, Caroll, & Fridlund, 2005; Eckert & Jones, 2002). This contradiction may be related to the types of studies conducted: previous studies used questionnaires, while this study focused on descriptions from participants who had undergone device implantation. Theme 3 (a constant reminder: I know it's there) described the women "knowing that the device was in their chest" as a reminder of their condition. They also described how it affected their body image. Two other studies mentioned this as a concern for women. One study by Walker et al. (2004) reported body image concerns of women; the women in that study were more concerned about how the device appeared in their chest (i.e., the scar) than any other aspect. A second study, by Tagney et al. (2003), also reported body image concerns in women, since the device can be seen in the chest, which makes them aware of it. The similarities were with respect to body image only; those women were not concerned with the device as a constant reminder of their cardiac disease, only as a constant reminder of their mortality. (Dickerson, 2002)
The common concern described in theme 4 (living on the edge: I do not want my device going off) was the fear of the device having to shock them, as well as the uncertainty of when, where, and who would be around for support. This was foremost in their thoughts. Fear of the device going off or shocking them has been a common theme in the literature reviewed. Dickerson (2002, 2005) reported that uncertainty of when and where shocks could be triggered was a prevailing concern of male and female participants. Participants in Albarran, Tagney, and James's (2004) study also reported a feeling of uncertainty regarding the device firing. The prevailing concern in theme 5 (catch 22: I'd rather not have it) is the conflict women experience after receiving a device. These women knew that they medically needed the device yet would rather not have gone through with it. Dickerson (2005) reported the theme of conditional acceptance, which touches on the same concept. A greater acceptance of the new situation was also reported in previous studies. (Carroll & Hamilton, 2005; Kamphuis et al., 2004)
The women in this study offered specific experiences of living with an ICD that are not completely seen in any previous study. There were some similar aspects identified in other studies, such as receiving a shock and a feeling of safety, but most were not specific to women. (Bilge et al., 2006; Dickerson, 2002, 2005; Morken et al., 2009)
This study was able to describe the essence of women's experience of living with an ICD. The study remained true to qualitative research design. The focus on women was important, as there had been no gender-specific studies to date. Capturing the fear and uncertainty of women with an ICD can have an impact on clinical practice and patient education. The author emphasized that these concerns are not being addressed properly in the healthcare system. This study can help clinicians gain an understanding of the experience these women are having and perhaps pay closer attention to these issues when they are seen in outpatient settings (Conelius, 2015).
The research may also be helpful in the establishment of support groups for women with ICDs. “Support groups can expose women to different types of resources in order to cope better, decrease anxiety, and answer any questions that arise” (Myers & James, 2008). “Since the women have an outstanding fear of the device firing/shocking them, a noteworthy follow-up study would be to describe their experience post firing/shock” (Conelius, 2015). By capturing the experiences of women with ICDs, the potential for better sensitivity toward the patient experience exists. This may be critical to overall quality of life and extends beyond the actual purpose and operation of the device. Conelius (2015) has made an important contribution to the understanding of women’s experiences with an ICD.
The critical appraisal of a qualitative study involves an in-depth review of each step of the research process. The example of a qualitative critique in this chapter provides a foundation for the development of critiquing skills in qualitative research.
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
References

1. Ahern K. Pearls, pith, and provocation: Ten tips for reflexive bracketing. Qualitative Health Research 1999;9:407-411.
2. Albarran J.W, Tagney J, James J. Partners of ICD patients—An exploratory study of their experiences. European Journal of Cardiovascular Nursing 2004;3(3):201-210.
3. Bardy G.H, Lee K.L, Mark D.B, Poole J.E, Packer D.L, Boineau R, et al. Sudden Cardiac Death in Heart Failure (SCD-HeFT). The New England Journal of Medicine 2005;352(3):225-237.
4. Beck C.T. Qualitative research: Evaluation of its credibility, auditability, and fittingness. Western Journal of Nursing Research 1993;15:263-266.
5. Bilge A.K, Ozben B, Demircan S, Cinar M, Yilmaz E, Adalet K. Depression and anxiety status of patients with implantable cardioverter defibrillators and precipitating factors. Pacing and Clinical Electrophysiology 2006;29:619-626.
6. Bolse K, Hamilton G, Flanagan J, Carroll D.L, Fridlund B. Ways of experiencing the life situation among United States patients with an implantable cardioverter defibrillator: A qualitative study. Progress in Cardiovascular Nursing 2005;20:4-10.
7. Bristow M.R, Saxon L.A, Boehmer J, Krueger S, Kass D.A, DeMarco T, et al. Cardiac-resynchronization therapy with or without an implantable cardioverter in advanced chronic heart failure. The New England Journal of Medicine 2004;350:2140-2150.
8. Carroll D.L, Hamilton G.A. Long-term effects of implanted cardioverter-defibrillators on health status, quality of life, and psychological state. American Journal of Critical Care 2005;17:222-230.
9. Colaizzi P. Psychological research as the phenomenologist views it. In: Valle R, King M, eds. Existential phenomenological alternatives for psychology. New York: Oxford University Press; 1978:48-71.
10. Dickerson S. Redefining life while forestalling death: Living with an implantable cardioverter defibrillator after a sudden cardiac death experience. Qualitative Health Research 2002;12:360-372.
11. Dickerson S. Technology-patient interactions: Internet use for gaining a healthy context for living with an implantable cardioverter defibrillator. Heart & Lung 2005;34(3):157-168.
12. Dunbar S.B. Psychosocial issues of patients with implantable cardioverter defibrillators. American Journal of Critical Care 2005;14(4):294-303.
13. Eckert M, Jones T. How does an implantable cardioverter defibrillator (ICD) affect the lives of patients and their families? International Journal of Nursing Practice 2002;8(3):152-157.
14. Fridlund B, Lindgren E, Ivarson A, Jinhage B.B, Bolse K, Flemme I, et al. Patients with implantable cardioverter defibrillators and their conceptions of the life situation: A qualitative analysis. Journal of Clinical Nursing 2000;9:37-45.
15. Greenberg H, Case R.B, Moss A.J, Brown N.M, Carroll E.R, Andrews M.L, et al. Analysis of mortality events in the Multicenter Automatic Defibrillator Implantation Trial (MADIT-II). Journal of the American College of Cardiology 2004;43:1459-1465.
16. Hammill S.C, Kremers M.S, Stevenson L.W, Heidenreich P.A, Lang C.M, Curtis J.P, et al. Review of the Registry’s fourth year, incorporating lead data and pediatric ICD procedures, and use as a national performance measure. Heart Rhythm 2010;7(9):1340-1345.
17. Husserl E. Crisis of European sciences and transcendental phenomenology. Evanston: Northwestern University Press; 1970.
18. Kamphuis H, Verhoeven N, Leeuw R, Derksen R, Hauer R, Winnubst J.A. ICD: A qualitative study of patient experience the first year after implantation. Journal of Clinical Nursing 2004;13(8):1008-1031.
19. Kremers M.S, Hammill S.C, Berul C.I, Koutras C, Curtis J.S, Wang Y, et al. The National ICD Registry Report: Version 2.1 including leads and pediatrics for years 2010 and 2011. Heart Rhythm 2013;10(4):e59-e65.
20. Lampert R, Joska T, Burg M.M, Batsford W.P, McPherson C.A, Jain D. Emotional and physical precipitants of ventricular arrhythmias. Circulation 2002;106:1800-1805.
21. Lemon J, Edelman S, Kirkness A. Avoidance behaviors in patients with implantable cardioverter defibrillators. Heart & Lung 2004;33:176-182.
22. Lincoln Y, Guba E. Naturalistic inquiry. Newbury Park: Sage Publications; 1985.
23. Morken I, Severinsson E, Karlsen B. Reconstructing unpredictability: Experiences of living with an implantable cardioverter defibrillator over time. Journal of Clinical Nursing 2009;19:537-546.
24. Moss A, Zareba W, Hall J, Klein H, Wilber D, Cannom D, et al. Prophylactic implantation of a defibrillator in patients with myocardial infarction and reduced ejection fraction. The New England Journal of Medicine 2002;346:877-883.
25. Myers G.M, James G.D. Social support, anxiety, and support group participation in patients with an implantable cardioverter defibrillator. Progress in Cardiovascular Nursing 2008;23:160-167.
26. Smeulders E.S, van Haastregt J.C, Ambergen T, Uszko-Lencer N.H, Janssen-Boyne J.J, Gorgels A.P, et al. Nurse-led self-management group programme for patients with congestive heart failure: Randomized controlled trial. Journal of Advanced Nursing 2010;66:1487-1499.
27. Tagney J, James J, Albarran J. Exploring the patient’s experiences of learning to live with an implantable cardioverter defibrillator (ICD) from one UK centre: A qualitative study. European Journal of Cardiovascular Nursing 2003;2:195-203.
28. Walker R.L, Campbell K, Sears S, Glenn B, Sotile R, Curtis A, et al. Women and the implantable cardioverter defibrillator: A lifespan perspective on key psychosocial issues. Clinical Cardiology 2004;27:543-546.
29. Wallace B, Sears S, Lewis T, Griffis J, Curtis A, Conti J. Predictors of quality of life in long-term recipients of implantable cardioverter defibrillators. Journal of Cardiopulmonary Rehabilitation 2002;22:278-281.
30. Whang W, Albert C.M, Sears S.F, Lampert R, Conti J.B, Wang P.J, et al. Depression as a predictor for appropriate shocks among patients with implantable cardioverter defibrillators: Results from the Triggers of Ventricular Arrhythmias (TOVA) study. Journal of the American College of Cardiology 2005;45:1090-1095.
31. Williams A.M, Young J, Nikoletti S, McRae S. Getting on with life: Accepting the permanency of an implantable cardioverter defibrillator. International Journal of Nursing Practice 2007;13:166-172.
32. Wojnar D.M, Swanson K.M. Phenomenology: An exploration. Journal of Holistic Nursing 2007;25:172-180.
PART III
Processes and Evidence Related to Quantitative Research

Research Vignette: Elaine Larson
OUTLINE
Introduction
8. Introduction to quantitative research
9. Experimental and quasi-experimental designs
10. Nonexperimental designs
11. Systematic reviews and clinical practice guidelines
12. Sampling
13. Legal and ethical issues
14. Data collection methods
15. Reliability and validity
16. Data analysis: Descriptive and inferential statistics
17. Understanding research findings
18. Appraising quantitative research
Introduction
Research vignette
Sometimes the simplest things are the most complicated
Elaine Larson, PhD, RN, FAAN, CIC
Professor of Epidemiology
Associate Dean of Scholarship & Research
Columbia University
School of Nursing
New York, New York
Every nurse researcher has a story, which usually emanates from clinical experiences. I had several such experiences that instilled in me a passion for research. In the year following my graduation decades ago from a BSN program, I was working on a medical unit and a young patient of mine with mitral valve disease called me to her bedside to tell me that she did not feel well, was having trouble breathing, and that something was terribly wrong. I took her vital signs, did not detect anything serious, and set her up with a pillow on the bedside stand so she could breathe more easily. Within a few minutes she was in acute pulmonary edema, and within the hour she was dead. Of course, this would not happen today because of more sophisticated monitoring, but as a novice nurse I was
devastated and promised myself that I would do everything in my power to keep this from happening again. So I learned what I could about acute pulmonary edema and submitted a paper to the American Journal of Nursing about the case (Larson, 1968). The paper would never be published now, as it was primarily a summary of information from medical textbooks, but putting my thoughts down on paper was a helpful way for me to deal with my feelings of failure and wanting to be a better nurse. The editor of the journal wrote me a letter to say that she hoped more clinical nurses would submit articles addressing relevant practice issues. So I was hooked on publishing!
The second clinical experience that cinched my passion for research and dissemination of findings happened when I was a clinical nurse specialist in a surgical intensive care unit. At that time, the unit was designed with a central nursing station surrounded by five patient beds in a semicircle so that they could all be observed. The ICU had several sinks adjacent to the patient beds, but at least one of them was usually unavailable because it was hooked up to a dialysis machine. When plans were made for a new, updated unit with many more beds in separate rooms (for the stated purposes of improving patients’ privacy and ability to sleep and preventing transmission of infections), a colleague and I decided to formally evaluate the impact of this architectural change on rates of infection. We wrote a protocol, collected data before and after the ICU design change, did air sampling, monitored numbers of interactions between staff and patients and hand hygiene, and obtained cultures from patients for six surveillance organisms every 4 days. Rates of infection did not change after the ICU was redesigned, nor did staff infection prevention or hand hygiene practices, despite the fact that there was a sink available at the entrance to every patient room (Preston et al., 1981). It was clear from that project that just changing the physical environs of the ICU was insufficient to reduce the risk of infections; in fact, the problem seemed to be more behavioral than structural.
As a result of the ICU project, completed while I was working full-time as a clinician, I returned to school for a PhD. With a small grant from the American Nurses Foundation (http://www.anfonline.org/), I studied the hand flora of patient care
staff and found that 21% of nosocomial infections over a 7-month period were caused by species found on personnel hands and that such organisms were much more prevalent on normal skin than generally thought (Larson, 1981). Ironically, I had to provide a strong rationale for choosing to study such a simple topic as hand hygiene, because the doctoral faculty of epidemiology at the time felt that there was really little to study about the issue that was not already known. Since that time, however, hand hygiene has become a major topic of interdisciplinary research and has resulted in the publication in this decade of two international evidence-based guidelines citing hundreds of publications (Boyce & Pittet, 2002; Pittet et al., 2009). The point is that our research must go full circle, from clinical observation, to scholarly and rigorous data collection, and then back to evidence-based practice. Sometimes nurse researchers stop at the second step, but evidence-based practice is the raison d’être for pursuing a research career in nursing.
Conducting a well-designed, rigorous study is a primary responsibility of the nurse researcher, but only one responsibility among many. Evidence-based practice and the increasing adoption of practice guidelines (similar to what was previously referred to as research utilization) help ensure that important research findings are translated into clinical practice and public policy (Melnyk & Gallagher-Ford, 2014; Melnyk et al., 2014). It is often at the translational gap between publishing findings, even in influential, peer-reviewed journals, and communicating these findings in meaningful ways that the potential impact of nursing research is lost. In reality, research matters only to the extent that it is communicated and that it results in improved practice and policy— in the work environment, in the quality of life of our individual patients, and in the general health of the public. For that reason, the dissemination of research is essential in all appropriate media and to all appropriate audiences, not just to other researchers.
For me, the simple research related to hand hygiene and infections has become increasingly complex over the years. Despite multiple, intensive interventions, international dissemination of practice guidelines, and changes in national policy and mandates from The Joint Commission and the Centers for Disease Control and Prevention, hand hygiene remains stubbornly resistant to change,
requiring more sophisticated interventions and conceptual underpinnings (Carter et al., 2016; Haas & Larson, 2007; Srigley et al., 2015). It is clearer now than it has been for several decades that new, emerging, and re-emerging infectious diseases will be a constant. While my research contributions have been primarily in one small field—that of the prevention and control of infectious diseases—the cumulative contributions of each of us to the broader scholarly community in our respective areas of concentration together make up the building blocks of a healthier world. That’s my fundamental belief and commitment—nursing research as part of a global collective to improve health. Sounds simple, but it’s not!
References

1. Boyce J.M, Pittet D. Guideline for hand hygiene in health-care settings: Recommendations of the Healthcare Infection Control Practices Advisory Committee and the HICPAC/SHEA/APIC/IDSA Hand Hygiene Task Force. American Journal of Infection Control 2002;30(8):S1-S46.
2. Carter E.J, Wyer P, Giglio J, et al. Environmental factors and their association with emergency department hand hygiene compliance: An observational study. BMJ Quality and Safety 2016;25(5):372-378.
3. Haas J.P, Larson E.L. Measurement of compliance with hand hygiene. Journal of Hospital Infection 2007;66(1):6-14.
4. Larson E. The patient with acute pulmonary edema. American Journal of Nursing 1968;68:1019-1022.
5. Larson E.L. Persistent carriage of gram-negative bacteria on hands. American Journal of Infection Control 1981;9(4):112-119.
6. Melnyk B.M, Gallagher-Ford L. Evidence-based practice as mission critical for healthcare quality and safety: A disconnect for many nurse executives. Worldviews on Evidence-Based Nursing 2014;11(3):145-146.
7. Melnyk B.M, Gallagher-Ford L, Long L.E, Fineout-Overholt E. The establishment of evidence-based practice competencies for practicing registered nurses and advanced practice nurses in real-world clinical settings: Proficiencies to improve healthcare quality, reliability, patient outcomes, and costs. Worldviews on Evidence-Based Nursing 2014;11(1):5-15.
8. Pittet D, Allegranzi B, Boyce J, World Health Organization World Alliance for Patient Safety First Global Patient Safety Challenge Core Group of Experts. The World Health Organization Guidelines on Hand Hygiene in Health Care and their consensus recommendations. Infection Control and Hospital Epidemiology 2009;30(7):611-622.
9. Preston G.A, Larson E.L, Stamm W.E. The effect of private isolation rooms on patient care practices, colonization and infection in an intensive care unit. American Journal of Medicine 1981;70(3):641-645.
10. Srigley J.A, Corace K, Hargadon D.P, et al. Applying psychological frameworks of behaviour change to improve healthcare worker hand hygiene: A systematic review. Journal of Hospital Infection 2015;91(3):202-210.
CHAPTER 8
Introduction to quantitative research

Geri LoBiondo-Wood
Learning outcomes
After reading this chapter, you should be able to do the following:
• Define research design.
• Identify the purpose of a research design.
• Define control and fidelity as they affect research design and the outcomes of a study.
• Compare and contrast the elements that affect fidelity and control.
• Begin to evaluate what degree of control should be exercised in a study.
• Define internal validity.
• Identify the threats to internal validity.
• Define external validity.
• Identify the conditions that affect external validity.
• Identify the links between study design and evidence-based practice.
• Evaluate research design using critiquing questions.
KEY TERMS
bias
constancy
control
control group
dependent variable
experimental group
external validity
extraneous or mediating variable
generalizability
history
homogeneity
independent variable
instrumentation
internal validity
intervening variable
intervention fidelity
maturation
measurement effects
mortality
pilot study
randomization
reactivity
selection
selection bias
testing
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
The word design implies the organization of elements into a masterful work of art. In the world of art and fashion, design conjures up images that are used to express a total concept. When an individual creates a structure such as a dress pattern or blueprints for a house, the type of structure depends on the aims of the creator. The same can be said of the research process. The framework that the researcher creates is the design. When reading a study, you should be able to recognize that the literature review, theoretical framework, and research question or hypothesis all interrelate with, complement, and assist in the operationalization of the design (Fig. 8.1). The degree to which there is a fit between these elements and the steps of the research process strengthens the study and also your confidence in the evidence’s potential for applicability to practice.
FIG 8.1 Interrelationships of design, problem statement, literature review, theoretical framework, and hypothesis.
How a researcher structures, implements, or designs a study affects the results of a study and ultimately its application to practice. For you to understand the implications and usefulness of a study for evidence-based practice, the key issues of research design must be understood. This chapter provides an overview of the meaning, purpose, and issues related to quantitative research design, and Chapters 9 and 10 present specific types of quantitative designs.
Research design—purpose

Researchers choose from different design types, but the design choice must be consistent with the research question or hypotheses. A quantitative research design provides:

• A plan or blueprint
• A vehicle for systematically testing research questions and hypotheses
• A structure for maintaining control in the study
The design, coupled with the methods and analysis, provides control for the study. Control is defined as the measures that the researcher uses to hold the conditions of the study consistent and to avoid possible sources of bias or error in the measurement of the dependent (outcome) variable. Control measures help minimize threats to the validity of the study.
An example of how a design can aid in answering a research question while maintaining control is found in the study by Nyamathi and colleagues (2015; Appendix A), whose aim was to evaluate the effectiveness of peer coaching on hepatitis A and B vaccine completion. Subjects who met the study’s inclusion criteria were randomly assigned to one of three groups. The interventions were clearly defined. The authors also discuss how they maintained intervention fidelity through constancy of interventionists, data-collector training and supervision, and follow-up throughout the study. By establishing the sample criteria and subject eligibility (inclusion criteria; see Chapter 12) and by clearly describing and designing the experimental intervention, the researchers demonstrated that they had a well-developed plan and were able to consistently maintain the study’s conditions. A variety of considerations, including the type of design chosen, affect a study’s successful completion and utility for evidence-based practice. These considerations include the following:
• Objectivity in conceptualizing the research question or hypothesis
• Accuracy
• Feasibility (Table 8.1)
• Control and intervention fidelity
• Validity—internal
• Validity—external
TABLE 8.1
Pragmatic Considerations in Determining the Feasibility of a Research Question

Time: A question must be one that can be studied within a realistic time period.

Subject availability: A researcher must determine if a sufficient number of subjects will be available and willing to participate. If one has a captive audience (e.g., students in a classroom), it may be relatively easy to enlist subjects. If a study involves subjects’ independent time and effort, they may be unwilling to participate when there is no apparent reward. Potential subjects may have fears about harm and confidentiality and be suspicious of research. Subjects with unusual characteristics may be difficult to locate. Depending on the design, a researcher may consider enlisting more subjects than needed to prepare for subject attrition. At times, a research report may note how the inclusion criteria were liberalized or the number of subjects altered as a result of some unforeseen recruitment or attrition consideration.

Facility and equipment availability: All research requires equipment such as questionnaires or computers. Most research requires availability of a facility for data collection (e.g., a hospital unit or laboratory space).

Money: Research requires expenditure of money. Before starting a study, the researcher itemizes expenses and develops a budget. Study costs can include postage, printing, equipment, computer charges, and salaries. Expenses can range from about $1000 for a small study to hundreds of thousands of dollars for a large federally funded project.

Ethics: Research that places unethical demands on subjects is not feasible for study. Ethical considerations affect the design and methodology choice.
There are statistical principles associated with the mechanisms of control, but it is more important that you have a clear conceptual understanding of these mechanisms.
The next two chapters present experimental, quasi-experimental, and nonexperimental designs. As you will recall from Chapter 1, a study’s type of design is linked to the level of evidence. As you appraise the design, you must also take into account other aspects of a study’s design and conduct. These aspects are reviewed in this chapter. How they are applied depends on the type of design (see Chapters 9 and 10).
Objectivity in the research question conceptualization

Objectivity in the conceptualization of the research question is derived from a review of the literature and development of a theoretical framework (see Fig. 8.1). Using the literature, the researcher assesses the depth and breadth of available knowledge on the question (see Chapters 3 and 4), which in turn affects the design chosen. Example: ➤ A research question about the length of a breastfeeding teaching program in relation to adherence to breastfeeding may suggest either a correlational or an experimental design (see Chapters 9 and 10), whereas a question related to coping of parents and siblings of adolescent cancer survivors may suggest a survey or correlational study (see Chapter 10).
HIGHLIGHT There is usually more than one threat to internal and external validity in a research study. It is helpful to have a team discussion to summarize specific threats that affect the overall strength and quality of evidence provided by the studies your team is critically appraising.
Accuracy

Accuracy in determining the appropriate design is aided by a thoughtful theoretical framework and literature review (see Chapters 3 and 4). Accuracy means that all aspects of a study systematically and logically follow from the research question or hypothesis. The simplicity of a research study does not render it useless or of less value. You should feel that the researcher chose a design that was consistent with the research question or hypothesis and offered the maximum amount of control. Issues of control are discussed later in this chapter.
Many research questions have not yet been studied; in such cases, a preliminary or pilot study is a wise approach. A pilot study can be thought of as a small-scale beginning study conducted to test and refine a study’s data collection methods, and it helps to determine the sample size needed for a larger study. Example: ➤ Patterson (2016) published a report of a pilot study that tested the effect of an emotional freedom technique on stress and anxiety in nursing students. The key is the accuracy, validity, and objectivity used by the researcher in attempting to answer the question. Accordingly, when consulting research, you should read various types of studies and assess how and whether the criteria for each step of the research process were followed.
Control and intervention fidelity

A researcher chooses a design to maximize the degree of control, fidelity, or uniformity of the study methods. Control is maximized by a well-planned study that considers each step of the research process and the potential threats to internal and external validity. In a study that tests interventions (a randomized controlled trial; see Chapter 9), intervention fidelity (also referred to as treatment fidelity) is a key concept. Fidelity means trustworthiness or faithfulness. In a study, intervention fidelity means that the researcher standardized the intervention and planned how to administer the intervention to each subject in the same manner under the same conditions. A study designed to address issues related to fidelity maximizes results, decreases bias, and controls preexisting conditions that may affect outcomes. The elements of control and fidelity differ based on the design type. Thus, when various research designs are critiqued, the issue of control is always raised but with varying levels of flexibility. The issues discussed here will become clearer as you review the various design types discussed in later chapters (see Chapters 9 and 10).
Control is accomplished by ruling out mediating or intervening variables that compete with the independent variable as an explanation for a study’s outcome. An extraneous, mediating, or intervening variable is one that occurs between the independent and dependent variables and interferes with interpretation of the dependent variable. An example would be the effect of the stage of cancer on depression during different phases of cancer treatment. Means of controlling mediating variables include the following:
• Use of a homogeneous sample
• Use of consistent data-collection procedures
• Training and supervision of data collectors and interventionists
• Manipulation of the independent variable
• Randomization
EVIDENCE-BASED PRACTICE TIP As you read studies, assess whether the study includes an intervention and whether there is a clear description of the intervention and how it was controlled. If the details are not clear, you should consider that the intervention may have been administered differently among the subjects, thereby introducing bias and affecting the interpretation of the results.
Homogeneous sampling

In a smoking cessation study, extraneous variables may affect the dependent variable. The characteristics of a study’s subjects are common extraneous variables. Age, gender, length of time smoked, amount smoked, and even workplace smoking rules may all affect the outcome of a smoking cessation study. As a control for these and other similar problems, the researcher’s subjects should demonstrate homogeneity, or similarity, with respect to the extraneous variables relevant to the particular study (see Chapter 12). Extraneous variables are not fixed but must be reviewed and decided on based on the study’s purpose and theoretical base. By using a sample of homogeneous subjects, based on inclusion and exclusion criteria, the researcher has implemented a straightforward method of control.
Example: ➤ In the study by Nyamathi and colleagues (2015; see Appendix A), the researchers ensured homogeneity of the sample based on age, history of drug use, homelessness, and participation in a drug treatment unit. This step limits the generalizability or application of the findings to similar populations when discussing the outcomes (see Chapter 17). As you read studies, you will often see the researchers limit the generalizability of the findings to similar samples.
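To make the mechanics of inclusion criteria concrete, the short Python sketch below screens hypothetical candidates against a set of criteria to produce a homogeneous sample. The field names, age cutoffs, and data are invented for illustration; they are not drawn from Nyamathi and colleagues (2015).

    # Illustrative sketch: screening candidate subjects against inclusion
    # criteria to produce a homogeneous sample. The field names, cutoffs,
    # and data below are hypothetical, not taken from the study.
    candidates = [
        {"id": 1, "age": 24, "drug_use_history": True, "homeless": True},
        {"id": 2, "age": 52, "drug_use_history": True, "homeless": False},
        {"id": 3, "age": 31, "drug_use_history": False, "homeless": True},
        {"id": 4, "age": 29, "drug_use_history": True, "homeless": True},
    ]

    def meets_inclusion_criteria(subject):
        """Return True only if the subject satisfies every criterion."""
        return (
            18 <= subject["age"] <= 46       # hypothetical age range
            and subject["drug_use_history"]  # must report a drug-use history
            and subject["homeless"]          # must be currently homeless
        )

    sample = [s for s in candidates if meets_inclusion_criteria(s)]
    print([s["id"] for s in sample])  # prints [1, 4]: the homogeneous sample

Only candidates 1 and 4 satisfy every criterion, so the resulting sample is homogeneous with respect to the extraneous variables the criteria were designed to control.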
HELPFUL HINT When critiquing studies, it is better to have a “clean” study with clearly identified controls that enhance generalizability from the sample to the specific population than a “messy” one from which you can generalize little or nothing.
If the researcher feels that an extraneous variable is important, it
may be included in the design. In the smoking example, if individuals are working in an area where smoking is not allowed and this is considered to be important, the researcher could establish a control for it. This can be done by comparing two different work areas: one where smoking is allowed and one where it is not. The important idea to keep in mind is that before data are collected, the researcher should have identified, planned for, or controlled the important extraneous variables.
Constancy in data collection

A critical component of control is constancy in data collection. Constancy refers to the notion that the data-collection procedures should reflect a cookbook-like recipe of how the researcher controlled the study’s conditions. This means that environmental conditions, timing of data collection, data-collection instruments, and data-collection procedures are the same for each subject (see Chapter 14). Constancy in data collection is also referred to as intervention fidelity. The elements of intervention fidelity (Breitstein et al., 2012; Gearing et al., 2011; Preyde & Burnham, 2011) are as follows (a brief illustrative sketch follows the list):
• Design: The study is designed to allow an adequate testing of the hypothesis (or hypotheses) in relation to the underlying theory and clinical processes
• Training: Training and supervision of the data collectors and/or interventionists to ensure that the intervention is being delivered as planned and in a similar manner with all the subjects
• Delivery: Assessing that the intervention is delivered as intended, including that the “dose” (as measured by the number, frequency, and length of contact) is well described for all subjects and that the dose is the same in each group, and that there is a plan for possible problems
• Receipt: Ensuring that the treatment has been received and understood by the subject
• Enactment: Assessing that the intervention skills of the subject are
enlisted as intended
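One way to picture these five elements is as a per-subject audit checklist, as in the minimal Python sketch below. The record fields mirror the list above, but the structure and values are illustrative assumptions, not part of any published fidelity protocol.

    # Minimal sketch: auditing the five elements of intervention fidelity
    # for one subject. The data are invented for illustration.
    FIDELITY_ELEMENTS = ["design", "training", "delivery", "receipt", "enactment"]

    def fidelity_gaps(record):
        """Return the fidelity elements not yet documented for a subject."""
        return [e for e in FIDELITY_ELEMENTS if not record.get(e, False)]

    subject_record = {
        "design": True,      # protocol permits adequate testing of the hypothesis
        "training": True,    # interventionist trained and supervised
        "delivery": True,    # intended "dose" delivered as planned
        "receipt": False,    # subject's understanding not yet verified
        "enactment": False,  # subject's use of the skills not yet assessed
    }

    print(fidelity_gaps(subject_record))  # prints ['receipt', 'enactment']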
The study by Nyamathi and colleagues (Appendix A; see the “Interventions” section) is an example of how intervention fidelity was maintained. A review of this study shows that data were collected from each subject in the same manner and under the same conditions by trained data collectors. This type of control aided the investigators’ ability to draw conclusions, discuss limitations, and cite the need for further research. When interventions are implemented, researchers will often describe the training and supervision of interventionists and/or data collectors that took place to ensure constancy. All study designs should demonstrate constancy (fidelity) of data collection, but studies that test an intervention require the highest level of intervention fidelity.
Manipulation of the independent variable

A third means of control is manipulation of the independent variable. This refers to the administration of a program, treatment, or intervention to one group within the study and not to the other subjects in the study. The first group is known as the experimental group or intervention group, and the other group is known as the control group. In a control group, the variables under study are held at a constant or comparison level. Example: ➤ Nyamathi and colleagues (2015; see Appendix A) manipulated the provision of three levels of peer coaching and nurse-delivered interventions.
Experimental and quasi-experimental designs are used to test whether a treatment or intervention affects patient outcomes. Nonexperimental designs do not manipulate the independent variable and thus do not have a control group. The use of a control group in an experimental or quasi-experimental design is related to the aim of the study (see Chapter 9).
HELPFUL HINT The lack of manipulation of the independent variable does not mean a weaker study. The type of question, the amount of theoretical development, and the research that has preceded the study affect the researcher’s design choice. If the question is amenable to a design that manipulates the independent variable, it increases the
power of a researcher to draw conclusions—that is, if all of the considerations of control are equally addressed.
Randomization

Researchers may also choose other forms of control, such as randomization. Randomization of subjects is used when the required number and type of subjects from the population are obtained in such a manner that each potential subject has an equal chance of being assigned to a treatment group. Randomization eliminates bias, aids in the attainment of a representative sample, and can be used in various designs (see Chapter 12). Nyamathi and colleagues (2015; see Appendix A) randomized subjects to intervention and control groups.
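As a minimal illustration of the principle, the Python sketch below randomly assigns twelve hypothetical subjects to three balanced study arms. The arm labels are invented; the actual arms and the stratified procedure used by Nyamathi and colleagues (2015) are described in their article.

    import random

    # Minimal sketch of randomization with hypothetical arm labels.
    subjects = ["subject_%03d" % i for i in range(1, 13)]

    # Build a balanced list of assignments (four per arm for 12 subjects),
    # then shuffle it so each subject's arm is determined by chance alone.
    arms = ["arm_A", "arm_B", "control"] * (len(subjects) // 3)
    random.seed(7)  # a fixed seed only so the example is reproducible
    random.shuffle(arms)

    for subject, arm in zip(subjects, arms):
        print(subject, "->", arm)

Shuffling a balanced list, rather than drawing each assignment independently, keeps the groups equal in size while still giving every subject an equal chance of each assignment.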
Randomization can also be accomplished with questionnaires. By randomly ordering items on the questionnaires, the investigator can assess if there is a difference in responses that can be related to the order of the items. This may be especially important in longitudinal studies where bias from giving the same questionnaire to the same subjects on a number of occasions can be a problem.
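A similar sketch, with invented item wording, shows how item order can be randomized for each respondent so that order effects can later be examined:

    import random

    # Sketch: presenting questionnaire items in a random order for each
    # respondent. The item wording is invented for illustration.
    items = [
        "I worry about my device firing.",
        "I feel able to return to my usual activities.",
        "I avoid situations I believe could trigger a shock.",
    ]

    def ordered_items_for(respondent_id):
        """Return the items in a reproducible random order per respondent."""
        rng = random.Random(respondent_id)  # per-respondent seed for auditability
        shuffled = list(items)              # copy so the master list is unchanged
        rng.shuffle(shuffled)
        return shuffled

    print(ordered_items_for(101))
    print(ordered_items_for(102))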
Quantitative control and flexibility

The same level of control or elimination of bias cannot be exercised equally in all design types. When a researcher wants to explore an area in which little or no literature and/or research on the concept exists, the researcher may use a qualitative method or a nonexperimental design (see Chapters 5 through 7 and 10). In these types of studies, the researcher is interested in describing a phenomenon in a group of individuals.
Control must be exercised as strictly as possible in quantitative research. All studies should be evaluated for potential variables that may affect the outcomes; however, all studies, based on their design, exercise different levels of control. You should be able to locate in the research report how the researcher maintained control in accordance with its design.
EVIDENCE-BASED PRACTICE TIP Remember that establishing evidence for practice is determined by
assessing the validity of each step of the study, assessing if the evidence assists in planning patient care, and assessing if patients respond to the evidence-based care.
Internal and external validity

When reading research, you must be convinced that the results of a study are valid, are obtained with precision, and remain faithful to what the researcher wanted to measure. For the findings of a study to be applicable to practice and provide the foundation for further research, the study should indicate how the researcher avoided bias. Bias can occur at any step of the research process. Bias can result from which research questions are asked (see Chapter 2), which hypotheses are tested (see Chapter 2), how data are collected or observations made (see Chapter 14), the number of subjects and how subjects are recruited and included (see Chapter 12), how subjects are randomly assigned in an experimental study (see Chapter 9), and how data are analyzed (see Chapter 16). There are two important criteria for evaluating the credibility and dependability of the results and the potential for bias: internal validity and external validity. An understanding of the threats to internal validity and external validity is necessary for critiquing research and considering its applicability to practice. Threats to validity are listed in Box 8.1, and discussion follows.

BOX 8.1 Threats to Validity

Internal validity
• History
• Maturation
• Testing
• Instrumentation
• Mortality
• Selection bias
External validity
• Selection effects
• Reactive effects
• Measurement effects
Internal validity

Internal validity asks whether the independent variable really made the difference or the change in the dependent variable. To establish internal validity, the researcher rules out other factors or threats as rival explanations of the relationship between the variables, essentially sources of bias. There are a number of threats to internal validity. These are considered by researchers in planning a study and by clinicians before implementing the results in practice (Campbell & Stanley, 1966). You should note that threats to internal validity can compromise outcomes for all studies, and thereby the overall strength and quality of evidence of a study’s findings; they should be considered to some degree in all quantitative designs. How these threats may affect specific designs is addressed in Chapters 9 and 10. Threats to internal validity include history, maturation, testing, instrumentation, mortality, and selection bias. Table 8.2 provides examples of the threats to internal validity. Generally, researchers will note the threats to validity that they encountered in the discussion and/or limitations section of a research article.
TABLE 8.2
Examples of Internal Validity Threats

History: A study tested an exercise program intervention at one cardiac care rehabilitation center and compared outcomes to those of another center in which usual care was given. During the final months of data collection, the control hospital implemented an e-health physical activity intervention; as a result, data from the control hospital (cohort) were not included in the analysis.

Maturation: Hernandez-Martinez and colleagues (2016) evaluated the effects of prenatal nicotine exposure on infants’ cognitive development at 6, 12, and 30 months. They noted that cognitive development and intelligence are clearly influenced by environment and genetics and not just by nicotine exposure.

Testing: Nyamathi and colleagues (2015) discussed the lack of treatment differences found in terms of vaccine completion rates, possibly due to the bundled nature of the program (see Appendix A).

Instrumentation: Lee and colleagues (in press) acknowledged in a study of obesity and disability in young adults that “our measures of disability are not directly comparable to more traditional measures of disability used in studies of older adults.”

Mortality: Nyamathi and colleagues (2015) noted that more than one-quarter of subjects (27%) did not complete the vaccine series, despite being informed of their risk for HBV infection (see Appendix A).

Selection bias: Nyamathi and colleagues (2015) controlled for selection bias by establishing inclusion and exclusion criteria for participation. Subjects were also stratified using a specific procedure that ensured balance across the three groups (see Nyamathi et al., 2015, Appendix A, Fig. 1).
History

In addition to the independent variable, another specific event that may have an effect on the dependent variable may occur either inside or outside the experimental setting; this is referred to as history. An example may be that of an investigator testing the effects of an educational program aimed at young adults to increase bone marrow donations in the community. During the course of the educational program, an ad featuring a known television figure is released on television and Facebook about the importance of bone marrow donation. The release of this information on social media engenders a great deal of media and press attention. In the course of the media attention, medical experts are interviewed widely, and awareness is raised regarding the importance of bone marrow donation. If the researcher finds an increase in the number of young adults who donate bone marrow in their area, the researcher may not be able to conclude that the change in behavior is the result of the teaching program, as the change may have been influenced by the social media information and the resultant media coverage. See Table 8.2 for another example.
Maturation

Maturation refers to the developmental, biological, or psychological processes that operate within an individual as a
function of time and are external to the events of the study. Example: ➤ Suppose one wishes to evaluate the effect of a teaching method on baccalaureate students’ achievement on a skills test. The investigator would record the students’ abilities before and after the teaching method. Between the pretest and posttest, the students have grown older and wiser. The growth or change is unrelated to the study and may explain the differences between the two testing periods rather than the experimental treatment. It is important to remember that maturation is more than change resulting from an age-related developmental process, but could be related to physical changes as well. Example: ➤ In a study of new products to stimulate wound healing, one might ask whether the healing that occurred was related to the product or to the natural occurrence of wound healing. See Table 8.2 for another example.
Testing

Taking the same test repeatedly could influence subjects’ responses the next time the test is completed. Example: ➤ The effect of taking a pretest on the subject’s posttest score is known as testing. The effect of taking a pretest may sensitize an individual and improve the score of the posttest. Individuals generally score higher when they take a test a second time, regardless of the treatment. The differences between posttest and pretest scores may not be a result of the independent variable but rather of the experience gained through the testing. Table 8.2 provides an example.
Instrumentation

Instrumentation threats are changes in the measurement of the variables or observational techniques that may account for changes in the obtained measurement. Example: ➤ A researcher may wish to study types of thermometers (e.g., tympanic, oral, infrared) to compare the accuracy of a digital thermometer to other temperature-taking methods. To prevent an instrumentation threat, the researcher must check the calibration of the thermometers according to the manufacturer’s specifications before and after data collection.
Another example that fits into this area is related to techniques of observation or data collection. If a researcher has several raters
collecting observational data, all must be trained in a similar manner so that they collect data using a standardized approach, thereby ensuring interrater reliability (see Chapter 15) and intervention fidelity (see Table 8.2). At times, even though the researcher takes steps to prevent instrumentation problems, this threat may still occur and should be evaluated within the total context of the study.
Mortality

Mortality is the loss of study subjects from the first data-collection point (pretest) to the second data-collection point (posttest). If the subjects who remain in the study are not similar to those who dropped out, the results could be affected. The loss of subjects may be from the sample as a whole, or, in a study that has both an experimental and a control group, there may be differential loss of subjects. A differential loss of subjects means that more of the subjects in one group dropped out than in the other group. See Table 8.2 for an example.
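A small worked example, with invented numbers, shows why differential loss matters:

    # Invented numbers illustrating differential attrition: the two groups
    # started out equal in size, but more control subjects were lost
    # between pretest and posttest.
    enrolled = {"experimental": 100, "control": 100}
    completed = {"experimental": 90, "control": 70}

    for group in enrolled:
        attrition = 1 - completed[group] / enrolled[group]
        print("%s: %.0f%% attrition" % (group, attrition * 100))
    # experimental: 10% attrition; control: 30% attrition. The control
    # subjects who remain may no longer resemble those who dropped out,
    # which is the mortality threat to internal validity.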
Selection bias

If precautions are not taken to gain a representative sample, selection bias can result from how the subjects were chosen. Suppose an investigator wishes to assess whether a new exercise program contributes to weight reduction. If the new program is offered to all, chances are that only individuals who are more motivated to exercise will take part in the program. Assessment of the effectiveness of the program is problematic, because the investigator cannot be sure whether the new program encouraged exercise behaviors or whether only highly motivated individuals joined the program. To avoid selection bias, the researcher could randomly assign subjects to groups. In a nonexperimental study, even with clearly defined inclusion and exclusion criteria, selection bias is difficult to avoid completely. See Table 8.2 for an example.
HELPFUL HINT More than one threat can be found in a study, depending on the type of study design. Finding a threat to internal validity in a study does not invalidate the results and is usually acknowledged by the
investigator in the “Results” or “Discussion” or “Limitations” section of the study.
EVIDENCE-BASED PRACTICE TIP Avoiding threats to internal validity can be quite difficult at times. Yet this reality does not render studies that have threats useless. Take them into consideration and weigh the total evidence of a study for not only its statistical meaningfulness but also its clinical meaningfulness.
External validity

External validity concerns the generalizability of the findings of one study to additional populations and other environmental conditions. External validity questions under what conditions and with what types of subjects the same results can be expected to occur.
The factors that may affect external validity are related to the selection of subjects, study conditions, and type of observations. These factors are termed selection effects, reactive effects, and measurement effects. You will notice the similarity in the names of the factors of selection and testing to those of the threats to internal validity. When considering these factors as internal threats, you should assess them as they relate to the independent and dependent variables within the study. When assessing them as external validity threats, you should consider them in terms of generalizability, or use outside of the study with other populations and settings. The internal validity threats ask whether the independent variable changed or was related to the dependent variable, or whether the dependent variable was affected by something else. The Critical Thinking Decision Path for threats to validity displays the way threats to internal and external validity can interact with each other. It is important to remember that this decision path is not exhaustive of the types of threats and their interactions. Problems of internal validity are generally easier to control. Generalizability issues are more difficult to deal with, because they imply that the researcher is assuming that other populations are similar to the one being tested.
CRITICAL THINKING DECISION PATH Potential Threats to a Study’s Validity
EVIDENCE-BASED PRACTICE TIP Generalizability depends on who actually participates in a study. Not everyone who is approached actually participates, and not everyone who agrees to participate completes a study. As you review studies, think about how well the subjects represent the population of interest.
Selection effects

Selection refers to the generalizability of the results to other populations. An example of selection effects occurs when the researcher cannot attain the ideal sample. At times, the number of available subjects may be low, or subjects may not be accessible (see Chapter 12).
Therefore, the type of sampling method used and how subjects are assigned to research conditions affect the generalizability to other groups, the external validity.
Examples of selection effects are reported when researchers note any of the following:
• “There are several limitations to the study. At 1 and 3 months’ post-death, parents were in early stages of grieving. Thus these findings may not be applicable to parents who are later in their grieving process” (Hawthorne et al., 2016, Appendix B).
• “The sample size was small, which could have limited the power and obscured significant effects that may have been revealed with a larger sample” (Turner-Sack et al., 2016, Appendix D).
These remarks caution you about potentially generalizing beyond the type of sample in a study, but also point out the usefulness of the findings for practice and future research aimed at building the research in these areas.
Reactive effects

Reactivity is defined as the subjects’ responses to being studied. Subjects may respond to the investigator not because of the study procedures but merely as an independent response to being studied. This is also known as the Hawthorne effect, named after Western Electric Corporation’s Hawthorne plant, where a study of working conditions was conducted. The researchers created several different working conditions (e.g., turning up the lights, piping in music loudly or softly, and changing work hours). They found that no matter what was done, the workers’ productivity increased. They concluded that production increased as a result of the workers’ realization that they were being studied rather than because of the experimental conditions.
In another study that compared daytime physical activity levels in children with and without asthma and the relationships among asthma, physical activity and body mass index, and child report of symptoms, the researchers noted, “Children may change their
behaviors due to the Hawthorne effect” (Tsai et al., 2012, p. 258). The researchers made recommendations for future studies to avoid such threats.
Measurement effects

Administration of a pretest in a study affects the generalizability of the findings to other populations and is known as measurement effects. Pretesting can affect the posttest responses within a study (internal validity) as well as the generalizability outside the study (external validity). Example: ➤ Suppose a researcher wants to conduct a study with the aim of changing attitudes toward breast cancer screening behaviors. To accomplish this, an education program on the risk factors for breast cancer is incorporated. To test whether the education program changes attitudes toward screening behaviors, tests are given before and after the teaching intervention. The pretest on attitudes allows the subjects to examine their attitudes regarding cancer screening. The subjects’ responses on follow-up testing may differ from those of individuals who were given the education program but did not see the pretest. Therefore, when a study is conducted and a pretest is given, it may “prime” the subjects and affect the researcher’s ability to generalize to other situations.
HELPFUL HINT When reviewing a study, be aware of the internal and external validity threats. These threats do not make a study useless to you; in fact, recognizing them makes a study more useful. Recognition of the threats allows researchers to build on data and allows you to think through which parts of the study can be applied to practice. Specific threats to validity depend on the design type.
There are other threats to external validity that depend on the type of design and methods of sampling used by the researcher, but these are beyond the scope of this text. Campbell and Stanley (1966) offer detailed coverage of the issues related to internal and external validity.
Appraisal for evidence-based practice: Quantitative research

Critiquing a study’s design requires you first to have knowledge of the overall implications that the choice of a design may have for the study as a whole (see the Critical Appraisal Criteria box). When researchers ask a question, they design a study, decide how the data will be collected, which instruments will be used, what the sample’s inclusion and exclusion criteria will be, and how large the sample will be, all with the aim of diminishing threats to the study’s validity. These choices are based on the nature of the research question or hypothesis. Minimizing threats to the internal and external validity of a study enhances the strength of evidence. In this chapter, the meaning, purpose, and important factors of design choice, as well as the vocabulary that accompanies these factors, have been introduced.
Several criteria for evaluating the design related to maximizing control and minimizing threats to internal/external validity and, as a result, sources of bias can be drawn from this chapter. Remember that the criteria are applied differently with various designs (see Chapters 9 and 10). The following discussion pertains to the overall appraisal of a quantitative design.
The research design should reflect that an objective review of the literature and establishment of a theoretical framework guided the development of the hypothesis and the design choice. When reading a study, there may be no explicit statement regarding how the design was chosen, but the literature reviewed will provide clues as to why the researcher chose the study’s design. You can evaluate this by critiquing the study’s theoretical framework and literature review (see Chapters 3 and 4). Is the question new and not extensively researched? Has a great deal of research been done on the question, or is it a new or different way of looking at an old question? Depending on the level of the question, the investigators make certain choices. Example: ➤ In the study by Nyamathi and colleagues (2015), the researchers wanted to test a controlled intervention; thus they developed a randomized controlled trial (Level II design). However, the purpose of the study by Turner-Sack and colleagues (2016) was much different. The Turner-Sack study examined the relationships between and among variables. The study did not test an intervention but explored how variables related to each other in a specific population (Level IV design).
CRITICAL APPRAISAL CRITERIA Quantitative Research
1. Is the type of design used appropriate?
2. Are the various concepts of control consistent with the type of design chosen?
3. Does the design used seem to reflect consideration of feasibility issues?
4. Does the design used seem to flow from the proposed research question, theoretical framework, literature review, and hypothesis?
5. What are the threats to internal validity or sources of bias?
6. What are the controls for the threats to internal validity?
7. What are the threats to external validity or generalizability?
8. What are the controls for the threats to external validity?
9. Is the design appropriately linked to the evidence hierarchy?
You should be alert for the means investigators use to maintain control (i.e., homogeneity in the sample, consistent data-collection procedures, how or if the independent variable was manipulated, and whether randomization was used). Once it has been established whether the necessary control or uniformity of conditions has been maintained, you must determine whether the findings are valid. To assess this aspect, the threats to internal validity should be reviewed. If the investigator’s study was systematic, well grounded in theory, and followed the criteria for each step of the research process, you will probably conclude that the study is internally valid. No study is perfect; there is always the potential for bias or threats to validity. This is not because the research was poorly conducted or the researcher did not think through the process completely; rather, it is that when conducting research with human
subjects there is always some potential for error. Subjects can drop out of studies, and data collectors can make errors and be inconsistent. Sometimes errors cannot be controlled by the researcher. If there are policy changes during a study, an intervention can be affected. As you read studies, note how each facet of the study was conducted, what potential errors could have arisen, and how the researcher addressed the sources of bias in the limitations section of the study.
Additionally, you must know whether a study has external validity or generalizability to other populations or environmental conditions. External validity can be claimed only after internal validity has been established. If the credibility of a study (internal validity) has not been established, a study cannot be generalized (external validity) to other populations. Determination of external validity of the findings goes hand in hand with sampling issues (see Chapter 12). If the study is not representative of any one group or one event of interest, external validity may be limited or not present at all. The issues of internal and external validity and applications for specific designs (see Chapters 9 and 10) provide the remaining knowledge to fully critique the aspects of a study’s design.
Key points
• The purpose of the design is to provide the master plan for a study.
• There are many types of designs.
• You should be able to locate within the study the question that the researcher wished to answer. The question should be accompanied by a plan for carrying out the study. Depending on the question, you should be able to recognize the steps taken by the investigator to ensure control, eliminate bias, and increase generalizability.
• The choice of a design depends on the question. The research question and design chosen should reflect the investigator’s attempts to maintain objectivity, accuracy, and, most important, control.
• Control affects not only the outcome of a study but also its future use. The design should reflect how the investigator attempted to control both internal and external validity threats.
• Internal validity must be established before external validity can be established.
• The design, literature review, theoretical framework, and hypothesis should all interrelate.
• The choice of the design is affected by pragmatic issues. At times, two different designs may be equally valid for the same question.
• The choice of design affects the study’s level of evidence.
Critical thinking challenges
• How do the three criteria for an experimental design (manipulation, randomization, and control) minimize bias and decrease threats to internal validity?
• Argue your case for supporting or not supporting the following claim: “A study that does not use an experimental design does not decrease the value of the study even though it may influence the applicability of the findings in practice.” Include examples to support your rationale.
• Have your interprofessional team provide a rationale for why selection bias and mortality are important sources of bias in research studies.
• As you critically appraise a study that uses an experimental or quasi-experimental design, why is it important for you to look for evidence of intervention fidelity? How does intervention fidelity increase the strength and quality of the evidence provided by the findings of a study using these types of designs?
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
References
1. Breitstein S, Robbins L, Cowell M. Attention to fidelity: Why is it important? Journal of School Nursing 2012;28(6):407-408. Available at: doi:10.1186/1748-5908-1-1.
2. Campbell D, Stanley J. Experimental and quasi-experimental designs for research. Chicago, IL: Rand-McNally;1966.
3. Gearing R.E, El-Bassel N, Ghesquiere A, et al. Major ingredients of fidelity: A review and scientific guide to improving quality of intervention research implementation. Clinical Psychology Review 2011;31:79-88. Available at: doi:10.1016/j.cpr.2010.09.007.
4. Hawthorne D.M, Youngblut J.M, Brooten D. Parent spirituality, grief, and mental health at 1 and 3 months after their infant/child’s death in an intensive care unit. Journal of Pediatric Nursing 2016;31:73-80.
5. Hernandez-Martinez Moreso N.V, Serra B.R, Val V.A, et al. Maternal Child Health Journal. Epub ahead of print;2016.
6. Lee H, Pantazis A, Cheng P, et al. The association between adolescent obesity and disability incidence in young adulthood. Journal of Adolescent Health 2016;59(4):472-478.
7. Nyamathi A, Salem B, Zhang S, et al. Nursing case management, peer coaching, and hepatitis A and B vaccine completion among homeless men recently released on parole: A randomized trial. Nursing Research 2015;64(3):177-189.
8. Patterson S.L. The effect of emotional freedom technique on stress and anxiety in nursing students. Nurse Education Today 2016;5(40):104-111.
9. Preyde M, Burnham P.V. Intervention fidelity in psychosocial oncology. Journal of Evidence-Based Social Work 2011;8:379-396 Available at: doi:10.1080/15433714.2011.54234.
10. Tsai S.Y, Ward T, Lentz M, Kieckhefer G.M. Daytime physical activity levels in school-age children with and without asthma. Nursing Research 2012;61(4):159-252.
11. Turner-Sack A.M, Menna R, Setchell S.R, et al. Psychological functioning, post-traumatic growth, and coping in parents and siblings of adolescent cancer survivors. Oncology Nursing Forum 2016;43(1):48-56.
CHAPTER 9
Experimental and quasi-experimental designs
Susan Sullivan-Bolyai, Carol Bova
Learning outcomes
After reading this chapter, you should be able to do the following:
• Describe the purpose of experimental and quasi-experimental research.
• Describe the characteristics of experimental and quasi-experimental designs.
• Distinguish between experimental and quasi-experimental designs.
• List the strengths and weaknesses of experimental and quasi-experimental designs.
• Identify the types of experimental and quasi-experimental designs.
• Identify potential internal and external validity issues associated with experimental and quasi-experimental designs.
• Critically evaluate the findings of experimental and quasi-experimental studies.
• Identify the contribution of experimental and quasi-experimental designs to evidence-based practice.
KEY TERMS
after-only design
after-only nonequivalent control group design
antecedent variable
classic experiment
control
dependent variable
design
effect size
experimental design
extraneous variable
independent variable
intervening variable
intervention fidelity
manipulation
mortality
nonequivalent control group design
one-group (pretest-posttest) design
power analysis
quasi-experimental design
randomization (random assignment)
randomized controlled trial
Solomon four-group design
testing
time series design
treatment effect
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
Research process
One purpose of research is to determine cause-and-effect relationships. In nursing practice, we are concerned with identifying interventions to maintain or improve patient outcomes and with basing practice on evidence. We test the effectiveness of nursing interventions by using experimental and quasi-experimental designs. These designs differ from nonexperimental designs in one important way: the researcher does not merely observe behaviors and actions, but actively intervenes by manipulating study variables to bring about a desired effect. By manipulating an independent variable, the researcher can measure a change in behavior(s) or action(s), which is the dependent variable. Experimental and quasi-experimental studies provide the two highest levels of evidence, Level II and Level III, for a single study (see Chapter 1).
Experimental designs are particularly suitable for testing cause- and-effect relationships because they are structured to minimize potential threats to internal validity (see Chapter 8). To infer causality requires that these three criteria be met:
• The causal (independent) and effect (dependent) variables must be associated with each other.
• The cause must precede the effect.
• The relationship must not be explainable by another variable.
When critiquing experimental and/or quasi-experimental designs, the primary focus is the extent to which the experimental treatment, or independent variable, caused the desired effect on the outcome, the dependent variable. The strength of the conclusion depends on how well the researcher ruled out other extraneous study variables that may have influenced or contributed to the findings.
The purpose of this chapter is to acquaint you with the issues involved in interpreting and applying to practice the findings of studies that use experimental and quasi-experimental designs (Box 9.1). The Critical Thinking Decision Path shows an algorithm that influences a researcher’s choice of experimental or quasi-experimental design. In the literature, these types of studies are often referred to as therapy or intervention articles.
BOX 9.1 Summary of Experimental and Quasi-Experimental Research Designs
Experimental designs
• True experiment (pretest-posttest control group) design
• Solomon four-group design
• After-only design
Quasi-experimental designs
• Nonequivalent control group design
• After-only nonequivalent control group design
• One group (pretest-posttest) design
• Time series design
CRITICAL THINKING DECISION PATH Experimental and Quasi-Experimental Designs
Experimental design
An experimental design has three identifying properties:
• Randomization
• Control
• Manipulation
A study using an experimental design is commonly called a randomized controlled trial (RCT). In clinical settings, it may be referred to as a clinical trial and is commonly used in drug trials. An RCT is considered the “gold standard” for providing information about cause-and-effect relationships. An RCT generates Level II evidence (see Chapter 1) because randomization, control, and manipulation minimize bias or error. A well-controlled RCT using these properties provides more confidence that the intervention is effective and will produce the same results over time (see Chapters 1 and 8). Box 9.2 shows examples of how these properties were used in the study in Appendix A.
BOX 9.2 Experimental Design Exemplar: Nursing Case Management, Peer Coaching, and Hepatitis A and B Vaccine Completion among Homeless Men Recently Released on Parole: A Randomized Clinical Trial
• This study reports specifically on whether seronegative parolees randomized in the original education and support intervention study were more likely to complete the hepatitis A and B vaccination series, and on the variables that predicted completion. The study randomized parolee participants to one of three groups:
• Peer coaching–nurse case management over 6 months, combining a peer coach who provided weekly (∼45-minute) interactions focused on using coping and communication skills, self-management, and access to community resources with a nurse case manager whose interactions (∼20 minutes over 8 consecutive weeks) focused on health promotion, completion of drug treatment, vaccination adherence, and reduction of risky drug and sexual behaviors
• Peer coaching alone as described in group 1 along with a one-time nurse interaction (20 minutes) focused on hepatitis and HIV risk reduction
• Usual care that consisted of encouragement by a nurse to complete the three-dose HAV/HBV vaccine series and a one-time 20-minute peer counselor session on health promotion. A detailed power analysis for sample size was reported.
• Fig. 2 in Appendix A: The CONSORT diagram illustrates how N = 669 study participants were approached, 69 of whom were excluded (and why), followed by the 600 participants who were randomized to one of the three study arms to control for confounding variables and to ensure balance across groups: n = 195 in the peer coaching–nurse case manager group; n = 120 in the peer coaching group; and n = 209 in the usual care group.
• The researchers also statistically assessed whether random assignment produced similar groups; Table 1 illustrates that, except for personal health status, there were no differences in baseline characteristics across the three intervention arms for demographic, social, situational, coping, and personal characteristics (each group had a similar distribution of study participants). Fair/poor health was more commonly reported in the usual care group (37.2%). Thus, we would want to consider the fact that randomization did not work for that variable. Subanalyses controlling for that variable might be necessary to determine whether perception of health affected that group’s completion of the vaccination series.
• The article does not report attention control (all groups receiving the same amount of attention time), so we do not know the average amount of contact time each study arm received. Thus time alone (spending more time teaching and interacting with group members) could explain improvement in adherence.
• Of the 345 study participants, the vaccination completion rate for three or more doses was 73% across all three groups with no differences across groups. In other words, there was not a higher rate of vaccination completion for the study participants who were in arm 1 or 2 compared to usual care.
• The authors identify several limitations that could have contributed to the findings, such as the fact that self-report has the potential for bias.
Randomization
Randomization, or random assignment, involves the distribution of subjects to either the experimental or the control group on a random basis and is required for a study to be considered an experimental design. As shown in Box 9.2, each subject has an equal chance of being assigned to any one of the three groups. This ensures that other variables that could affect change in the dependent variable will be equally distributed among the groups, reducing systematic bias. It also decreases selection bias (see Chapter 8). Randomization may be done individually or by groups. Several procedures are used to randomize subjects to groups, such as a table of random numbers or computer-generated number sequences (Suresh, 2011). Note that random assignment to groups is different from random sampling as discussed in Chapter 12.
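To make the mechanics concrete, the following is a minimal sketch (not taken from any study cited here) of computer-generated random assignment; the subject IDs and arm labels are hypothetical.

```python
import random

random.seed(42)  # a fixed seed makes the allocation sequence reproducible and auditable

# Hypothetical subject IDs and arm labels (illustrative only)
subjects = [f"S{i:03d}" for i in range(1, 31)]
groups = ["intervention A", "intervention B", "usual care"]

# Simple randomization: each subject has an equal chance of any arm,
# so group sizes may differ slightly by chance.
simple = {s: random.choice(groups) for s in subjects}

# A balanced alternative: repeat the arm labels equally, then shuffle,
# which guarantees 10 subjects per arm here.
labels = groups * (len(subjects) // len(groups))
random.shuffle(labels)
balanced = dict(zip(subjects, labels))

print(balanced["S001"])  # e.g., "usual care"
```

Either approach removes the investigator’s judgment from group assignment; the balanced version additionally guarantees equal group sizes.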
Control
Control refers to the process by which the investigator holds conditions constant to limit bias that could influence the dependent variable(s). Control is acquired by manipulating the independent variable, randomly assigning subjects to a group, using a control group, and preparing intervention and data collection protocols that are consistent for all study participants (intervention fidelity) (see Chapters 8 and 14). Box 9.2 illustrates how a control group was used by Nyamathi and colleagues (2015; see Appendix A). In an experimental study, the control group (referred to in Nyamathi’s study as the usual care group) receives the usual treatment or a placebo (an inert pill in drug trials).
Manipulation
Manipulation is the process of “doing something,” administering a different dose of “something,” or comparing different types of treatment by manipulating the independent variable for at least some of the involved subjects (typically those randomly assigned to the experimental group). The independent variable might be a treatment, a teaching plan, or a medication. The effect of this manipulation is measured to determine the result of the experimental treatment on the dependent variable compared with those who did not receive the treatment.
Box 9.2 provides an illustration of how the properties of experimental designs (randomization, control, and manipulation) are used in an intervention study and how the researchers ruled out other potential explanations or bias (threats to internal validity) influencing the results. The description in Box 9.2 is also an example of how the researchers used control to minimize bias and its effect on the intervention (Nyamathi et al., 2015). This control helped rule out the following potential internal validity threats:
• Selection effect: Bias in the sample contributed to the results versus the intervention.
• History: External events may have contributed to the results versus the intervention.
• Maturation: Developmental processes that occur and potentially alter the results versus the intervention.
Researchers also tested statistically for differences among the groups and found that there were none, reassuring the reader that the randomization process worked. We have briefly discussed RCTs and how they precisely use control, manipulation, and randomization to test the effectiveness of an intervention.
• RCTs use an experimental and control group, sometimes referred to as experimental and control arms.
• Have a specific sampling plan, using clear-cut inclusion and exclusion criteria (see Chapter 12).
• Administer the intervention in a consistent way, called intervention fidelity.
• Perform statistical comparisons to determine any baseline and/or postintervention differences between groups.
• Calculate the sample size needed to detect a treatment effect.
It is important that researchers establish a large enough sample size to ensure that there are enough subjects in each study group to statistically detect differences among those who receive the intervention and those who do not. This is called the ability to statistically detect the treatment effect or effect size—that is, the impact of the independent variable/intervention on the dependent variable (see Chapter 12). The mathematical procedure to determine the number for each arm (group) needed to test the study’s variables is called a power analysis (see Chapter 12). You will usually find power analysis information in the sample section of the research article. Example: ➤ You will know there was an appropriate plan for an adequate sample size when a statement like the following is included: “With at least 114 men in each intervention condition there was 80% power to detect differences of 15-20 percentage points (e.g., 50% vs. 70%, 75% vs. 90%) for vaccine completion between either of the two intervention conditions and the UC intervention condition at p =.05” (Nyamathi et al., 2015). This information demonstrates that the researchers sought an adequate sample size. This information is critical to assess because with a small sample size, differences may not be statistically evident, thus creating the potential for a type II error—that is, acceptance of the null hypothesis when it is false (see Chapter 16). Carefully read the intervention and control group section of an article to see exactly what each group received and what the differences were between groups either at baseline or following the intervention.
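As a rough illustration of what such a power analysis involves, here is a minimal sketch of the standard two-proportion sample-size formula. It is not the study’s actual calculation and will not reproduce the published figure of 114 exactly, because that figure depends on the investigators’ specific assumptions.

```python
from math import ceil, sqrt

from scipy.stats import norm

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per arm to detect a difference between
    two independent proportions with a two-sided z test."""
    z_alpha = norm.ppf(1 - alpha / 2)  # ~1.96 for alpha = .05
    z_beta = norm.ppf(power)           # ~0.84 for 80% power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

# Larger differences are easier to detect, so they need fewer subjects:
print(n_per_group(0.50, 0.70))  # ~93 per arm for a 20-point difference
print(n_per_group(0.50, 0.65))  # ~170 per arm for a 15-point difference
```

Notice how shrinking the detectable difference from 20 to 15 percentage points nearly doubles the required sample; this is why underpowered studies risk a type II error.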
In Appendix A, Nyamathi and colleagues (2015) provide a detailed description and illustration of the intervention. The discussion section reports that reliance on patients’ self-report (they may report doing better than they really did) may have introduced accuracy bias into the reported health behaviors. When reviewing RCTs, you also want to assess how well the study incorporates intervention fidelity measures. Fidelity covers several elements of an experimental study (Gearing et al., 2011; Preyde & Burnham, 2011; Wickersham et al., 2011) that must be evaluated and that can enhance a study’s internal validity. These elements are as follows:
1. Well-defined intervention, sampling strategy, and data collection procedures
2. Well-described characteristics of study participants and environment
3. Clearly described protocol for delivering the intervention systematically to all subjects in the intervention group
4. Discussion of threats to internal and external validity
Types of experimental designs
There are numerous experimental designs (Campbell & Stanley, 1966). Each is based on the classic experimental design called the RCT (Fig. 9.1A). The classic RCT is conducted as follows:
1. The researcher recruits a sample from the accessible population.
2. Baseline measurements are taken of preintervention demographics and personal characteristics.
3. Baseline measurement is taken of the dependent variable(s).
4. Subjects are randomized to either the intervention or the control group.
5. Each group receives the experimental intervention or comparison/control intervention (usual care or standard treatment, or placebo).
6. Both groups complete postintervention measures to see which, if any, changes have occurred in the dependent variables (determining the differential effects of the treatment).
7. Reliability and validity data are clearly described for measurement instruments.
FIG 9.1 Experimental Designs. A, Classic randomized clinical trial. B, Solomon four-group design. C, After-only experimental design.
EVIDENCE-BASED PRACTICE TIP The term RCT is often used to refer to an experimental design in health care research and is frequently used in nursing research as the gold standard design because it minimizes bias or threats to study validity. Because of ethical issues, rarely is “no treatment” acceptable. Typically, either “standard treatment” or another version or dose of “something” is provided to the control group. Only when there is no standard or comparable treatment available is a no-treatment control group appropriate.
The degree of difference between the groups at the end of the study indicates the confidence the researcher can have in a causal link (i.e., the intervention caused the difference) between the independent and dependent variables. Because random assignment and control minimize the effects of many threats to internal validity or bias (see Chapter 8), this is a strong design for testing cause-and-effect relationships. However, the design is not perfect. Some threats to internal validity cannot be controlled in RCTs, including but not limited to:
• Mortality: People tend to drop out of studies, especially those that require participation over an extended period of time. When reading RCTs, examine the sample and the results carefully to see whether excessive dropouts or deaths occurred, or whether one group had more dropouts than the other, because either can affect the study findings.
• Testing: When the same measurement is given twice, subjects tend to score better the second time just by remembering the test items. Researchers can avoid this problem in one of two ways: They might use different or equivalent forms of the same test for the two measurements (see Chapter 15), or they might use a more complex experimental design called the Solomon four-group design.
Solomon four-group design. The Solomon four-group design, shown in Fig. 9.1B, has two groups that are identical to those used in the classic experimental design, plus two additional groups: an experimental after-group and a control after-group. As the diagram shows, subjects are randomly assigned to one of four groups before baseline data are collected. This design results in two groups that receive only a posttest (rather than pretest and posttest), which provides an opportunity to rule out testing biases that may have occurred because of exposure to the pretest (also called pretest sensitization). In other words, pretest sensitization suggests that those who take the pretest learn what to concentrate on during the study and may
score higher after the intervention is completed. Although this design helps evaluate the effects of testing, mortality (dropout) remains a potential threat to internal validity.
Example: ➤ Ishola and Chipps (2015) used the Solomon four-group design to test a mobile phone intervention based on the theory of psychological flexibility to improve pregnant women’s mental health outcomes in Nigeria. They hypothesized that those who received the mobile phone intervention would have greater psychological flexibility (the ability to be present and act when necessary).
• The subjects were randomly assigned to one of four groups:
1. Pretest, mobile phone intervention, immediate posttest
2. Pretest, no mobile phone intervention, immediate posttest
3. No pretest, mobile phone intervention, posttest only
4. No pretest, no mobile phone intervention, posttest only
• The study found that although psychological flexibility was improved in the mobile phone intervention groups, this effect was influenced by a significant interaction between the pretests and the intervention; thus, pretest sensitization was present in this study.
After-only design. A less frequently used experimental design is the after-only design (see Fig. 9.1C). This design, which is sometimes called the posttest-only control group design, is composed of two randomly assigned groups, but unlike the classic experimental design, neither group is pretested. The independent variable is introduced to the experimental group and not to the control group. The process of randomly assigning the subjects to groups is assumed to be sufficient to ensure lack of bias, so the researcher can still determine whether the intervention created significant differences between the two groups. This design is particularly useful when testing effects are expected to be a major problem or when the outcome cannot be measured beforehand (e.g., postoperative pain management).
When critiquing research using experimental designs, to help inform your evidence-based decisions, consider what design type was used; how the groups were formed (i.e., if the researchers used randomization); whether the groups were equivalent at baseline; if they were not equivalent, what the possible threats to internal validity were; what kind of manipulation (i.e., intervention) was administered to the experimental group; and what the control group received.
HELPFUL HINT Look for evidence of pre-established inclusion and exclusion criteria for the study participants.
Strengths and weaknesses of the experimental design
Experimental designs are the most powerful for testing cause-and-effect relationships because of their control, manipulation, and randomization components. Therefore, the design offers a better chance of determining whether the intervention caused the change or difference in the two groups. Example: ➤ Nyamathi and colleagues (2015) tested several types of interventions (peer coaching with nurse case management, peer coaching alone, and usual care) with paroled men to examine hepatitis A and B vaccine completion rates and found no significant differences between the groups. If you were working in a clinic caring for this population, you would consider this evidence as a starting point for putting research findings into clinical practice.
Experimental designs have weaknesses as well. They are complicated to design and can be costly to implement. Example: ➤ There may not be an adequate number of potential study
participants in the accessible population. These studies may be difficult or impractical to carry out in a clinical setting. An example might be trying to randomly assign patients from one hospital unit to different groups when nurses might talk to each other about the different treatments. Experimental procedures also may be disruptive to the setting’s usual routine. If several nurses are involved in administering the experimental program, it may be impossible to ensure that the program is administered in the same way with each subject. Another problem is that many important variables that are related to patient care outcomes are not amenable to manipulation for ethical reasons. Example: ➤ Cigarette smoking is known to be related to lung cancer, but you cannot randomly assign people to smoking or nonsmoking groups. Health status varies with age and socioeconomic status. No matter how careful a researcher is, no one can assign subjects randomly by age or by a certain income level. Because of these problems in carrying out experimental designs, researchers frequently turn to another type of research design to evaluate cause-and-effect relationships. Such designs, which look like experiments but lack some of the control of the true experimental design, are called quasi-experimental designs.
Quasi-experimental designs
Quasi-experimental designs also test cause-and-effect relationships. However, in quasi-experimental designs, random assignment or the presence of a control group is lacking. It may not be possible to include all the characteristics of an experimental study because of the nature of the independent variable or the available subjects.
Without all the characteristics associated with an experimental study, internal validity may be compromised. Therefore, the basic problem with the quasi-experimental approach is a weakened confidence in making causal assertions that the results occurred because of the intervention. Instead, the findings may be a result of other extraneous variables. As a result, quasi-experimental studies provide Level III evidence. Example: ➤ Letourneau and colleagues (2015) used a quasi-experimental design to evaluate the effect of telephone peer support on maternal depression and social support
with mothers diagnosed with postpartum depression. This one-group pretest-posttest design, in which trained peer volunteers delivered social support by phone, resulted in promising improvements: lower depression scores and higher perceived social support among the participants. However, a small group (11%) of mothers had a “relapse” of depressive symptoms despite peer phone support. In this study there was no comparison group, so it is not possible to determine whether the peer support was more effective than no intervention or whether the 11% relapse rate is a common occurrence among women with postpartum depression.
HELPFUL HINT Remember that researchers often make trade-offs and sometimes use a quasi-experimental design instead of an experimental design because it may be impossible to randomly assign subjects to groups. Not using the “purest” design does not decrease the value of the study even though it may decrease the strength of the findings.
Types of quasi-experimental designs
There are many different quasi-experimental designs, but we will limit the discussion to only those most commonly used in nursing research. Refer back to the experimental design shown in Fig. 9.1A, and compare it with the nonequivalent control group design shown in Fig. 9.2A. Note that this design looks exactly like the true experiment, except that subjects are not randomly assigned to groups. Suppose a researcher is interested in the effects of a new diabetes education program on the physical and psychosocial outcomes of patients newly diagnosed with diabetes. Under certain conditions, the researcher might be able to randomly assign subjects to either the group receiving the new program or the group receiving the usual program, but for any number of reasons, that might not be possible.
• For example, nurses on the unit where patients are admitted might be so excited about the new program that they cannot help but include the new information for all patients.
• The researcher has two choices: to abandon the study or to conduct a quasi-experiment.
• To conduct a quasi-experiment, the researcher might use one unit as the intervention group for the new program, find a similar unit that has not been introduced to the new program, and study the newly diagnosed patients with diabetes who are admitted to that unit as a comparison group. The study would then involve a quasi-experimental design.
FIG 9.2 Quasi-experimental designs. A, Nonequivalent control group design. B, After-only nonequivalent control group design. C, One-group (pretest-posttest) design. D, Time series design.
Nonequivalent control group. The nonequivalent control group design is commonly used in nursing studies conducted in clinical settings. The basic problem with this design is the weakened confidence the researcher can have in assuming that the experimental and comparison groups are similar at the beginning of the study. Threats to internal validity,
such as selection effect, maturation, testing, and mortality, are possible. However, the design is relatively strong because by gathering pretest data, the researcher can compare the equivalence of the two groups on important antecedent variables before the independent variable is introduced. Antecedent variables are variables that occur within the subjects prior to the study, such as in the previous example, where the patients’ motivation to learn about their medical condition might be important in determining the effect of the diabetes education program. At the outset of the study, the researcher could include a measure of motivation to learn. Thus, differences between the two groups on this variable could be tested, and if significant differences existed, they could be controlled statistically in the analysis.
After-only nonequivalent control group. Sometimes the outcomes simply cannot be measured before the intervention, as with prenatal interventions that are expected to affect birth outcomes. The study that could be conducted would look like the after-only nonequivalent control group design shown in Fig. 9.2B. This design is similar to the after-only experimental design, except that randomization is not used to assign subjects to groups; it rests on the assumption that the two groups are equivalent and comparable before the introduction of the independent variable. The soundness of the design and the confidence that we can put in the findings depend on the soundness of this assumption of preintervention comparability. Often it is difficult to support the assertion that the two nonrandomly assigned groups are comparable at the outset of the study, because there is no way of assessing the assumption’s validity.
One-group (pretest-posttest). Another quasi-experimental design is the one-group (pretest-posttest) design (see Fig. 9.2C), such as the Letourneau and colleagues (2015) example described earlier. This design is used when only one group is available for study. Data are collected before and after an experimental treatment on one group of subjects. In this design, there is neither a control group nor randomization, both of which are important characteristics for enhancing internal validity. Therefore,
it becomes important that the evidence generated by the findings of this type of quasi-experimental design is interpreted with careful consideration of the design limitations.
Time series. Another quasi-experimental approach used by researchers when only one group is available for study over a longer period of time is the time series design (see Fig. 9.2D). Time series designs are useful for determining trends over time. Data are collected multiple times before the introduction of the treatment to establish a baseline point of reference on outcomes. The experimental treatment is introduced, and data are collected on multiple occasions afterward to determine a change from baseline. The broad range and number of data collection points help rule out alternative explanations, such as history effects. However, the internal validity threat of testing is always present because of the multiple data collection points. Without a control group, the threats of selection and maturation cannot be ruled out (see Chapter 8).
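As an illustration of why multiple data collection points matter, here is a minimal sketch (with simulated, hypothetical data) of how a time series analysis can separate a level change at the intervention from the underlying trend.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Simulated monthly fall counts on one unit: 12 baseline months,
# then 12 months after a hypothetical intervention at month 12.
months = np.arange(24)
post = (months >= 12).astype(float)            # 0 = baseline, 1 = post-intervention
falls = 8 - 2 * post + rng.normal(0, 0.8, 24)  # built-in drop of 2 falls/month

# Interrupted time series (segmented) regression: intercept, underlying
# time trend, and a level change when the intervention begins.
X = sm.add_constant(np.column_stack([months, post]))
fit = sm.OLS(falls, X).fit()
print(fit.params)   # [intercept, trend per month, level change at intervention]
print(fit.pvalues)  # a small p value for the level change suggests an effect
```

With only one pre and one post measurement, the trend and level-change terms could not be estimated separately; the repeated measurements are what make the design informative.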
HIGHLIGHT When your team is critically appraising studies that use experimental and quasi-experimental designs, it is important to make sure that your team members understand the difference between random selection and random assignment (randomization).
Strengths and weaknesses of quasi-experimental designs
Quasi-experimental designs are used frequently because they are practical, less costly, and feasible, with potentially generalizable findings. These designs are more adaptable to the real-world practice setting than controlled experimental designs. For some research questions and hypotheses, these designs may be the only way to evaluate the effect of the independent variable.
The weaknesses of the quasi-experimental approach involve the inability to make clear cause-and-effect statements.
EVIDENCE-BASED PRACTICE TIP Experimental designs provide Level II evidence, and quasi-experimental designs provide Level III evidence. Quasi-experimental designs are lower on the evidence hierarchy because of lack of control, which limits the ability to make confident cause-and-effect statements that influence applicability to practice and clinical decision making.
Evidence-based practice
As nursing science expands, and accountability for cost-effective quality clinical outcomes increases, nurses must become more cognizant of what constitutes best practice for their patient population. An understanding of the value of intervention studies that use an experimental or quasi-experimental design is critical for improving clinical outcomes. These study designs provide the strongest evidence for making informed clinical decisions. These designs are those most commonly included in systematic reviews (see Chapter 11).
One cannot assume that because an intervention study has been published, the findings apply to your practice population. When conducting an evidence-based practice project, the clinical question provides a guide for you and your team to collect the strongest, most relevant evidence related to your problem. If your search of the literature reveals experimental and quasi-experimental studies, you will need to evaluate them to determine which studies provide the best available evidence. The likelihood of changing practice based on one study is low, unless it is a large clinical RCT based on prior research evidence.
Key points for evaluating the evidence and whether bias has been minimized in experimental and quasi-experimental designs include:
• Random group assignment (experimental or intervention and control or comparison)
• Inclusion and exclusion criteria that are relevant to the clinical problem studied
• Equivalence of groups at baseline on key demographic variables
• Adequate sample size and recruitment of a homogeneous sample
• Intervention fidelity and consistent data collection procedures
• Control of antecedent, intervening, or extraneous variables
CRITICAL APPRAISAL CRITERIA Experimental and Quasi-Experimental Designs
1. Is the design used appropriate to the research question or hypothesis?
2. Is there a detailed description of the intervention?
3. Is there a clear description of the intervention group treatment in comparison to the control group? How is intervention fidelity maintained?
4. Is power analysis used to calculate the appropriate sample size for the study?
Experimental designs
1. What experimental design is used in the study?
2. How are randomization, control, and manipulation implemented?
3. Are the findings generalizable to the larger population of interest?
Quasi-experimental designs
1. What quasi-experimental design is used in the study, and is it appropriate?
2. What are the most common threats to internal and external validity of the findings of this design?
3. What does the author say about the limitations of the study?
4. To what extent are the study findings generalizable?
Appraisal for evidence-based practice: Experimental and quasi-experimental designs
Research designs differ in the amount of control the researcher has over the antecedent and intervening variables that may affect the study’s results. Experimental designs, which provide Level II evidence, offer the greatest possibility for control. Quasi-experimental designs, which provide Level III evidence, offer less control. When conducting an evidence-based practice or quality improvement project, you must always look for studies that provide the highest level of evidence (see Chapter 1). For some PICO questions (see Chapter 2), you will find both Level II and Level III evidence. You will want to determine if the choice of design, experimental or quasi-experimental, is appropriate to the purpose of the study and can answer the research question or hypotheses.
HELPFUL HINT When reviewing the experimental and quasi-experimental literature, do not limit your search only to your patient population. For example, it is possible that if you are working with adult caregivers, related parent caregiver intervention studies may provide you with strategies as well. Many times, with some adaptation, interventions used with one sample may be applicable for other populations.
Questions that you should pose when reading studies that test cause-and-effect relationships are listed in the Critical Appraisal Criteria box. These questions should help you judge whether a causal relationship exists.
For studies in which either experimental or quasi-experimental designs are used, first try to determine the type of design that was used. Often a statement describing the design of the study appears
in the abstract and in the methods section of the article. If such a statement is not present, you should examine the article for evidence of control, randomization, and manipulation. If all are discussed, the design is probably experimental. On the other hand, if the study involves the administration of an experimental treatment but does not involve the random assignment of subjects to groups, the design is quasi-experimental. Next, try to identify which of the experimental and quasi-experimental designs was used. Determining the answer to these questions gives you a head start, because each design has its inherent threats to internal and external validity. This step makes it a bit easier to critically evaluate the study. It is important that the author provide adequate accounts of how the procedures for randomization, control, and manipulation were carried out. The report should include a description of the procedures for random assignment to such a degree that the reader could determine just how likely it was for any one subject to be assigned to a particular group. The description of the intervention that each group received provides important information about what intervention fidelity strategies were implemented.
The inclusion of this information helps determine if the intervention group and control group received different treatments that were consistently carried out by trained interventionists and data collectors. The question of threats to internal validity, such as testing and mortality, is even more important to consider when critically evaluating a quasi-experimental study, because quasi-experimental designs cannot possibly feature as much control; there may be a lack of randomization or a control group. A well-written report of a quasi-experimental study systematically reviews potential threats to the internal and external validity of the findings. Your work is to decide if the author’s explanations make sense. For either experimental or quasi-experimental studies, you should also check for a reported power analysis that assures you that an appropriate sample size for detecting a treatment effect was planned.
Key points
• Experimental designs or RCTs provide the strongest evidence (Level II) for a single study that tests whether an intervention or treatment affects patient outcomes.
• Experimental designs are characterized by the ability of the researcher to control extraneous variation, to manipulate the independent variable, and to randomly assign subjects to intervention groups.
• Experimental studies conducted either in clinical settings or in the laboratory provide the best evidence in support of a causal relationship because the following three criteria can be met: (1) the independent and dependent variables are related to each other; (2) the independent variable chronologically precedes the dependent variable; and (3) the relationship cannot be explained by the presence of a third variable.
• Researchers turn to quasi-experimental designs to test cause-and-effect relationships because experimental designs may be impractical or unethical.
• Quasi-experiments may lack the randomization and/or the comparison group characteristics of true experiments. The usefulness of quasi-experiments for studying causal relationships depends on the ability of the researcher to rule out plausible threats to the validity of the findings, such as history, selection, maturation, and testing effects.
Critical thinking challenges
• Describe the ethical issues involved in a true experimental research design used by a nurse researcher.
• Describe how a true experimental design could be used in a hospital setting with patients.
• How should a nurse go about critiquing experimental research articles in the research literature so that his or her evidence-based practice is enhanced?
• Discuss whether your QI team would use an experimental or quasi-experimental design for a quality improvement project.
• Identify a clinical quality indicator that is a problem on your unit (e.g., falls, ventilator-acquired pneumonia, catheter-acquired urinary tract infection), and consider how a search for studies using experimental or quasi-experimental designs could provide the foundation for a quality improvement project.
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
References
1. Campbell D, Stanley J. Experimental and quasi-experimental designs for research. Chicago, IL: Rand-McNally;1966.
2. Gearing R.E, El-Bassel N, Ghesquiere A, et al. Major ingredients of fidelity: A review and scientific guide to improving quality of intervention research implementation. Clinical Psychology Review 2011;31:79-88. Available at: doi:10.1016/j.cpr.2010.09.007.
3. Ishola A.G, Chipps J. The use of mobile phones to deliver acceptance and commitment therapy in the prevention of mother-child HIV transmission in Nigeria. Journal of Telemedicine and Telecare 2015;21:423-426. Available at: doi:10.1177/1357633X15605408.
4. Letourneau N, Secco L, Colpitts J, et al. Quasi-experimental evaluation of a telephone-based peer support intervention for maternal depression. Journal of Advanced Nursing 2015;71:1587-1599. Available at: doi:10.1111/jan.12622.
5. Nyamathi A, Salem B.E, Zhang S, et al. Nursing case management, peer coaching, and hepatitis A and B vaccine completion among homeless men recently released on parole: A randomized trial. Nursing Research 2015;64(3):177-189.
6. Preyde M, Burnham P.V. Intervention fidelity in psychosocial oncology. Journal of Evidence-Based Social Work 2011;8:379-396 Available at: doi:10.1080/15433714.2011.54234.
7. Suresh K.P. An overview of randomization techniques: An unbiased assessment of outcome in clinical research. Journal of Human Reproductive Sciences 2011;4:8-11.
8. Wickersham K, Colbert A, Caruthers D, et al. Assessing fidelity to an intervention in a randomized controlled trial to improve medication adherence. Nursing Research 2011;60:264-269.
CHAPTER 10
Nonexperimental designs
Geri LoBiondo-Wood, Judith Haber
Learning outcomes
After reading this chapter, you should be able to do the following:
• Describe the purpose of nonexperimental designs.
• Describe the characteristics of nonexperimental designs.
• Define the differences between nonexperimental designs.
• List the advantages and disadvantages of nonexperimental designs.
• Identify the purpose and methods of methodological, secondary analysis, and mixed methods designs.
• Identify the critical appraisal criteria used to critique nonexperimental research designs.
• Evaluate the strength and quality of evidence provided by nonexperimental designs.
KEY TERMS
case control study
cohort study
correlational study
cross-sectional study
developmental study
ex post facto study
longitudinal study
methodological research
mixed methods
prospective study
psychometrics
repeated measures studies
retrospective study
secondary analysis
survey studies
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
Many phenomena relevant to nursing do not lend themselves to an experimental design. For example, nurses studying cancer-related fatigue may be interested in the amount of fatigue, variations in fatigue, and patient fatigue in response to chemotherapy. The investigator would not design an experimental study and implement an intervention that would potentially intensify an aspect of a patient’s fatigue just to study the fatigue experience. Instead, the researcher would examine the factors that contribute to the variability in a patient’s cancer-related fatigue experience using a nonexperimental design. Nonexperimental designs are used when a researcher wishes to explore events, people, or situations as they occur, or to test relationships and
differences among variables. Nonexperimental designs construct a picture of variables at one point in time or over a period of time.
In nonexperimental research the independent variables have naturally occurred, so to speak, and the investigator cannot directly control them by manipulation. Because the researcher does not actively manipulate the variables, the concepts of control and potential sources of bias (see Chapter 8) should be considered. Nonexperimental designs provide Level IV evidence. The information yielded by these types of designs is critical to developing an evidence base for practice and may represent the best evidence available to answer research or clinical questions.
Researchers are not in agreement on how to classify nonexperimental studies. A continuum of quantitative research design is presented in Fig. 10.1. Nonexperimental studies explore the relationships or the differences between variables. This chapter divides nonexperimental designs into survey studies and relationship/difference studies as illustrated in Box 10.1. These categories are somewhat flexible, and other sources may classify nonexperimental studies differently. Some studies fall exclusively within one of these categories, whereas other studies have characteristics of more than one category (Table 10.1). As you read the research literature, you will often find that researchers use several design classifications for one study. This chapter introduces the types of nonexperimental designs and discusses their advantages and disadvantages, the use of nonexperimental research, the issues of causality, and the critiquing process as it relates to nonexperimental research. The Critical Thinking Decision Path outlines the path to the choice of a nonexperimental design.
FIG 10.1 Continuum of quantitative research design.
TABLE 10.1 Examples of Studies With More Than One Design Label
Design Type: Retrospective, predictive correlational
Study’s Purpose: To identify predictors of initial and repeated unplanned hospitalizations and potential financial impact among Medicare patients with early stage (Stages I–III) colorectal cancer receiving outpatient chemotherapy, using the SEER Medicare database (Fessele et al., 2016)
Design Type: Descriptive, exploratory, secondary analysis of a randomized controlled trial
Study’s Purpose: To describe drug use and sexual behavior (sex with multiple partners) prior to incarceration and 6 and 12 months after study enrollment, using data obtained as part of a randomized controlled trial designed to study the effects of intensive peer coaching and nurse case management, intensive peer coaching, and brief nurse counseling on hepatitis A and B vaccination adherence (Nyamathi et al., 2015, 2016; see Appendix A)
Design Type: Longitudinal, descriptive
Study’s Purpose: This longitudinal, single-group study was conducted to determine whether empirically selected and social cognitive theory-based factors, including baseline characteristics and modifiable behavioral and psychosocial factors, were determinants of PA maintenance in breast cancer survivors after a physical activity intervention (Lee, Von, et al., 2016)
PA, Physical activity; SEER, Surveillance, Epidemiology, and End Results.
BOX 10.1 Summary of Nonexperimental Research Designs
I. Survey studies
A. Descriptive
B. Exploratory
C. Comparative
II. Relationship/difference studies
A. Correlational studies
B. Developmental studies
1. Cross-sectional
2. Cohort, longitudinal, and prospective
3. Case control, retrospective, and ex post facto
EVIDENCE-BASED PRACTICE TIPS When critically appraising nonexperimental studies, you need to be aware of possible sources of bias that can be introduced at any point in the study.
CRITICAL THINKING DECISION PATH Nonexperimental Design Choice
Survey studies
The broadest category of nonexperimental designs is the survey study. Survey studies are further classified as descriptive, exploratory, or comparative. Surveys collect detailed descriptions of variables and use the data to justify and assess conditions and practices or to make plans for improving health care practices. You will find that the terms exploratory, descriptive, comparative, and survey are used either alone, interchangeably, or together to
describe this type of study’s design (see Table 10.1).
• A survey is used to search for information about the characteristics of particular subjects, groups, institutions, or situations, or about the frequency of a variable’s occurrence, particularly when little is known about the variable. Box 10.2 provides examples of survey studies.
• Variables can be classified as opinions, attitudes, or facts.
• Fact variables include gender, income level, political and religious affiliations, ethnicity, occupation, and educational level.
• Surveys provide the basis for the development of intervention studies.
• Surveys are described as comparative when used to determine differences between variables.
• Survey data can be collected with a questionnaire or an interview (see Chapter 14).
• Surveys have small or large samples of subjects drawn from defined populations, can be either broad or narrow, and can be made up of people or institutions.
• Surveys relate one variable to another or assess differences between variables, but do not determine causation.
BOX 10.2 Survey Design Examples
• Bender and colleagues (2016) developed and administered a survey to a nationwide sample (n = 585) of certified clinical nurse leaders (CNLs) and managers, leaders, educators, clinicians, and change agents involved in planning and integrating CNLs into a health system’s nursing care delivery model. Items addressed organizational and implementation characteristics and perceived level of CNL initiative success.
• Lee, Fan, and colleagues (2016) conducted a survey to investigate the differences between the perceptions of injured patients and those of their caregivers. Participants completed the Chinese Illness Perception Questionnaire Revised–Trauma. Exploring the differences in illness perceptions between injured patients and their caregivers can help clinicians provide individualized care and design interventions that meet patients’ and caregivers’ needs.
The advantages of surveys are that a great deal of information can be obtained from a large population in a fairly economical manner, and that survey research information can be surprisingly accurate. If a sample is representative of the population (see Chapter 12), even a relatively small number of subjects can provide an accurate picture of the population.
Survey studies do have disadvantages. The information obtained in a survey tends to be superficial. The breadth rather than the depth of the information is emphasized.
EVIDENCE-BASED PRACTICE TIPS Evidence gained from a survey may be coupled with clinical expertise and applied to a similar population to develop an educational program, to enhance knowledge and skills in a particular clinical area (e.g., a survey designed to measure the nursing staff’s knowledge and attitudes about evidence-based practice where the data are used to develop an evidence-based practice staff development course).
HELPFUL HINT You should recognize that a well-constructed survey can provide a wealth of data about a particular phenomenon of interest, even though causation is not being examined.
Relationship and difference studies
Investigators also try to assess the relationships or differences between variables that can provide insight into a phenomenon. These studies can be classified as relationship or difference studies. The following types of relationship/difference studies are
discussed: correlational studies and developmental studies.
Correlational studies
In a correlational study the relationship between two or more variables is examined. The researcher does not test whether one variable causes another variable. Rather, the researcher is:
• Testing whether the variables co-vary (i.e., As one variable changes, does a related change occur in another variable?)
• Interested in quantifying the strength of the relationship between variables, or in testing a hypothesis or research question about a specific relationship
The direction of the relationship is important (see Chapter 16 for an explanation of the correlation coefficient). For example, in their correlational study, Turner-Sack and colleagues (2016) examined psychological functioning, post-traumatic growth (PTG), coping, and cancer-related characteristics of adolescent cancer survivors’ parents and siblings (see Appendix D). This study tested multiple variables to assess relationships and differences within the sample. One finding was that parents’ psychological distress was negatively correlated with their survivor child’s active coping (r = −0.53, p < .001). The study findings revealed that younger age, higher life satisfaction, and less avoidant coping were strong predictors of lower psychological distress in parents of adolescent cancer survivors. Thus the variables were related to, but not causes of, the outcomes. Each step of this study was consistent with the aims of exploring relationships among variables.
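For readers who want to see what computing such a coefficient looks like, here is a minimal sketch using simulated, hypothetical scores (not the Turner-Sack data).

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)

# Hypothetical paired scores: a survivor's active-coping score and the
# parent's psychological distress score (simulated, n = 40 dyads).
active_coping = rng.normal(50, 10, 40)
distress = 80 - 0.6 * active_coping + rng.normal(0, 8, 40)

r, p = pearsonr(active_coping, distress)
print(f"r = {r:.2f}, p = {p:.3f}")  # a negative r: as coping rises, distress falls
```

The sign of r gives the direction of the relationship and its magnitude gives the strength; neither says anything about which variable, if either, is the cause.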
When reviewing a correlational study, remember what relationship the researcher tested and notice whether the researcher implied a relationship that is consistent with the theoretical framework and research question(s) or hypotheses being tested. Correlational studies offer the following advantages:
• An efficient and effective method of collecting a large amount of data about a problem
• A potential for evidence-based application in clinical settings
• A potential foundation for future experimental research studies
• A framework for exploring the relationship between variables that cannot be manipulated
The following are disadvantages of correlational studies:
• Variables are not manipulated.
• Randomization is not used because the groups are preexisting, and therefore generalizability is decreased.
• The researcher is unable to determine a causal relationship between the variables because of the lack of manipulation, control, and randomization.
• The strength and quality of evidence is limited by the associative nature of the relationship between the variables.
Correlational studies may be further labeled as descriptive correlational or predictive correlational. In terms of evidence for practice, the researchers, based on the literature review and findings, frame the utility of the results in light of previous research, thereby helping to establish the “best available” evidence that, combined with clinical expertise, informs clinical decisions regarding the study’s applicability to a specific patient population. A correlational design is very useful for clinical studies because many of the phenomena of clinical interest are beyond the researcher’s ability to manipulate, control, and randomize.
HIGHLIGHT When your QI team’s search of the literature for intervention studies reporting evidence-based strategies for preventing ventilator-associated pneumonia (VAP) yields only studies using nonexperimental designs, your team members should debate whether the evidence is of sufficient quality to be applied to answering your clinical question.
Developmental studies

There are also classifications of nonexperimental designs that use a time perspective. Investigators who use developmental studies are concerned not only with the existing status and the relationships and differences among phenomena at one point in time, but also with changes that result from elapsed time. The following types of designs are discussed: cross-sectional, cohort/longitudinal/prospective, and case control/retrospective/ex post facto. In the literature, studies may be designated by more than one design name. This practice is accepted because many studies have elements of several designs. Table 10.1 provides examples of studies classified with more than one design label.
EVIDENCE-BASED PRACTICE TIPS Replication of significant findings in nonexperimental studies with similar and/or different populations increases your confidence in the conclusions offered by the researcher and the strength of evidence generated by consistent findings from more than one study.
Cross-sectional studies

A cross-sectional study examines data at one point in time; that is, data are collected on only one occasion from the same subjects, rather than from the same subjects at several time points. For example, a cross-sectional study was conducted by Koc and Cinarli (2015) to determine knowledge, awareness, and practices of Turkish hospital nurses in relation to cervical cancer, human papillomavirus (HPV), and HPV vaccination. The researchers aimed to answer several research questions:
• What is the level of knowledge about cancer risk factors?
• What is the level of knowledge about early diagnosis?
• What are the awareness, knowledge, information sources, and practices regarding HPV infection and HPV vaccine administration?
• What are the relationships between the sociodemographic characteristics (age, willingness to receive HPV vaccination, willingness for their children to receive HPV vaccination) and professional characteristics (education, belief that cervical cancer can be prevented by HPV vaccination), and the overall level of knowledge about cervical cancer, HPV, and HPV vaccines?
As you can see, the variables were related to, not causal of, outcomes. Each step of this study was consistent with the aims of exploring the relationship and differences among variables in a cross-sectional design.
In this study the sample subjects participated on one occasion; that is, data were collected on only one occasion from each subject and represented a cross section of 464 Turkish nurses working in hospital settings, rather than the researchers following a group of nurses over time. The purpose of this study was not to test causality, but to explore the potential relationships between and among variables that can be related to knowledge about HPV, belief in the effectiveness of early cervical cancer screening, and HPV vaccination. The authors concluded that higher levels of knowledge among nurses may increase their willingness to recommend the HPV vaccine to patients. Cross-sectional studies can explore relationships and correlations, or differences and comparisons, or both. Advantages and disadvantages of cross-sectional studies are as follows:
• Cross-sectional studies, when compared to longitudinal/cohort/prospective studies, are less time-consuming, less expensive, and thus more manageable.
• Large amounts of data can be collected at one point, making the results more readily available.
• The confounding variable of maturation, resulting from the elapsed time, is not present.
• The investigator’s ability to establish an in-depth developmental assessment of the relationships of the variables being studied is lessened. The researcher is unable to determine whether the change that occurred is related to the change that was predicted because the same subjects were not followed over a period of time. In other words, the subjects are unable to serve as their own controls (see Chapter 8).
Cohort/prospective/longitudinal/repeated measures studies

In contrast to the cross-sectional design, cohort studies collect data from the same group at different points in time. Cohort studies are also referred to as longitudinal, prospective, and repeated measures studies; these terms are interchangeable. Like cross-sectional studies, cohort studies explore differences and relationships among variables. An example of a longitudinal (cohort) study is found in the study by Hawthorne and colleagues (2016; see Appendix B). This study tested the relationships between spiritual/religious coping strategies and grief, mental health (depression and post-traumatic stress disorder), and personal growth for mothers and fathers at 1 and 3 months after their infant/child’s death in the NICU/PICU, with and without control for race/ethnicity and religion. They concluded that spiritual strategies and activities were associated with lower symptoms of grief and depression in parents and post-traumatic stress in mothers, but not post-traumatic stress in fathers.
Cohort designs have advantages and disadvantages. When assessing the appropriateness of a cross-sectional study versus a cohort study, first assess the nature of the research question or hypothesis: Cohort studies allow clinicians to assess the incidence of a problem over time and potential reasons for changes in the study’s variables. However, the disadvantages inherent in a cohort study also must be considered. Data collection may be of long duration; therefore, subject loss or mortality can be high due to the time it takes for subjects to progress to each data collection point. The internal validity threat of testing is also present and may be unavoidable in a cohort study. Subject loss to follow-up, or attrition, may lead to unintended sample bias affecting both the internal validity and external validity of the study.
These realities make a cohort study costly in terms of time, effort, and money. There is also a chance of confounding variables that could affect the interpretation of the results. Subjects in such a study may respond in a socially desirable way that they believe is congruent with the investigator’s expectations (Hawthorne effect).
Advantages of a cohort study are as follows:
• Each subject is followed separately and thereby serves as his or her own control.
• Increased depth of responses can be obtained and early trends in the data can be analyzed.
• The researcher can assess changes in the variables of interest over time, and both relationships and differences can be explored between variables.
In summary, cohort studies begin in the present and end in the future, and cross-sectional studies look at a broader perspective of a population at a specific point in time.
Case control/retrospective/ex post facto studies

A case control study is essentially the same as an ex post facto study and a retrospective study. In these studies, the dependent variable already has been affected by the independent variable, and the investigator attempts to link present events to events that occurred in the past. When researchers wish to explain causality or the factors that determine the occurrence of events or conditions, they prefer to use an experimental design. However, they cannot always manipulate the independent variable or use random assignment. When experimental designs that test the effect of an intervention or condition cannot be employed, case control (ex post facto or retrospective) studies may be used. Ex post facto literally means “from after the fact.” Case control, ex post facto, or retrospective studies are also known as causal-comparative or comparative studies. As we discuss this design further, you will see that many elements of this category are similar to quasi-experimental designs because they explore differences between variables (Campbell & Stanley, 1963).
In case control studies, a researcher hypothesizes, for instance:
• That X (cigarette smoking) is related to and a determinant of Y (lung cancer).
• But X, the presumed cause, is not manipulated and subjects are not randomly assigned to groups.
• Rather, a group of subjects who have experienced X (cigarette smoking) in a normal situation is located and a control group of subjects who have not experienced X is chosen.
• The behavior, performance, or condition (lung tissue) of the two groups is compared to determine whether the exposure to X had the effect predicted by the hypothesis.
Table 10.2 illustrates this example. Examination of Table 10.2 reveals that although cigarette smoking appears to be a determinant of lung cancer, the researcher is still not able to conclude that a causal relationship exists between the variables, because the independent variable has not been manipulated and subjects were not randomly assigned to groups.
TABLE 10.2 Paradigm for the Ex Post Facto Design

Groups (Not Randomly Assigned) | Independent Variable (Not Manipulated by Investigator) | Dependent Variable
Exposed group: Cigarette smokers | X (cigarette smoking) | Ye (lung cancer)
Control group: Nonsmokers | — (no exposure) | Yc (no lung cancer)
Kousha and Castner (2016) conducted a case control study to explore novel multipollutant exposure assessments using the Air Quality Health Index in relation to emergency department (ED) visits over a 6-year period for otitis media (OM). They used information from ED visits for OM (n = 4815 children aged 3 years and younger), air pollution, and weather databases. The findings indicate that there was an increase in ED visits with OM diagnoses 6 to 7 days after exposure to increased ozone and 3 to 4 days after exposure to particulate matter. These findings confirm that there is an association between changes in the Air Quality Health Index and ED visits for OM, and they can be used to inform risk communication, patient education, and policy.
EVIDENCE-BASED PRACTICE TIPS The quality of evidence provided by a cohort/longitudinal/prospective study is stronger than that from other nonexperimental designs because the researcher can determine the incidence of a problem and its possible causes.
The advantages of the case control/retrospective/ex post facto design are similar to those of the correlational design. The additional benefit is that it offers a higher level of control than a correlational study, thereby increasing the confidence the research consumer would have in the evidence provided by the findings. For example, in the cigarette smoking study, a group of nonsmokers’ lung tissue samples is compared with samples of smokers’ lung tissue. This comparison enables the researcher to establish the existence of a differential effect of cigarette smoking on lung tissue. However, the researcher remains unable to draw a causal linkage between the two variables, and this inability is the major disadvantage of the case control/retrospective/ex post facto design.
Another disadvantage is the problem of an alternative hypothesis being the reason for the documented relationship. If the researcher obtains data from two existing groups of subjects, such as one that has been exposed to X and one that has not, and the data support the hypothesis that X is related to Y, the researcher cannot be sure whether X or some extraneous variable is the real cause of the occurrence of Y. As such, the impact or effect of the relationship cannot be estimated accurately. Finding naturally occurring groups of subjects who are similar in all respects except for their exposure to the variable of interest is very difficult. There is always the possibility that the groups differ in some other way, such as exposure to other lung irritants (e.g., asbestos), that can affect the findings of the study and produce spurious or unreliable results. Consequently, you need to cautiously evaluate the conclusions drawn by the investigator.
HELPFUL HINT When reading research reports, you will note that at times researchers classify a study’s design with more than one design type label. This is correct because research studies often reflect aspects of more than one design label.
Cohort/longitudinal/prospective studies are considered to be stronger than case control/retrospective studies because of the degree of control that can be imposed on extraneous variables that might confound the data and lead to bias.
HELPFUL HINT Remember that nonexperimental designs can test relationships, differences, comparisons, or predictions, depending on the purpose of the study.
Prediction and causality in nonexperimental research

Prediction and causality are concerns of both researchers and research consumers. Researchers are interested in explaining cause-and-effect relationships, that is, estimating the effect of one phenomenon on another without bias. Historically, researchers thought that only experimental research could support the concept of causality. For example, nurses are interested in discovering what causes anxiety in many settings. If we can uncover the causes, we could develop interventions to prevent or decrease the anxiety. Causality makes it necessary to order events chronologically; that is, if we find in a randomly assigned experiment that event 1 (stress) occurs before event 2 (anxiety) and that those in the stressed group were anxious whereas those in the unstressed group were not, we can say that the hypothesis of stress causing anxiety is supported by these empirical observations. If these results were found in a nonexperimental study in which some subjects underwent the stress of surgery and were anxious and others did not have surgery and were not anxious, we would say that there is an association or relationship between stress (surgery) and anxiety. But on the basis of the results of a nonexperimental study, we could not say that the stress of surgery caused the anxiety.
EVIDENCE-BASED PRACTICE TIPS Studies that use nonexperimental designs often precede and provide the foundation for building a program of research that leads to experimental designs that test the effectiveness of nursing interventions.
Many variables (e.g., anxiety) that nurse researchers wish to study cannot be manipulated, nor would it be wise or ethical to manipulate them. Yet there is a need to have studies that can assert a predictive or causal sequence. In light of this need, many nurse researchers are using several analytical techniques that can explain the relationships among variables to establish predictive or causal links. These analytical techniques are called causal modeling, model testing, and associated causal analysis techniques (Kline, 2011; Plichta & Kelvin, 2013).
When reading studies, you also will find the terms path analysis, LISREL, analysis of covariance structures, structural equation modeling (SEM), and hierarchical linear modeling (HLM) used to describe the statistical techniques (see Chapter 16) in these studies. These terms do not designate the design of a study; they are statistical tests used in many nonexperimental designs to assess how precisely a dependent variable can be predicted from one or more independent variables. For example, SEM was used to understand risk and promotive factors for youth violence and bullying in a sample of US seventh grade students who completed a survey containing items about future expectations, attitudes toward violence, past 30-day bullying experiences, and violent behavior. SEM was used to establish a model of how the variables related to one another. The findings supported the hypothesis that more positive future expectations would be related to lower levels of both physical and relational bullying and that the effect on relational bullying would be mediated by attitudes toward violence (Stoddard et al., 2015). This sophisticated design aids understanding of bullying behavior and of the positive aspects of early adolescents’ lives that may help them avoid such behavior, and it provides useful direction for professionals such as school nurses and other school-based mental health professionals when developing interventions focused on decreasing bullying. Sometimes researchers want to make a forecast or prediction about how patients will respond to an intervention or a disease process, or how successful individuals will be in a particular setting or field of specialty. In this case, a model may be tested to assess which variables significantly predict the outcome of interest (e.g., which factors predict maintenance of physical activity).
Many nursing studies test models. The statistics used in model-testing studies are advanced, but you should be able to read the article, understand the purpose of the study, and determine whether the model generated was logical and developed with a solid basis from the literature and past research. This section cites several studies that conducted sound tests of theoretical models.
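Although full model testing is usually done with dedicated SEM software, the logic of a simple path model can be illustrated with ordinary regressions. The Python sketch below is a minimal, hypothetical illustration (the data, variable names, and coefficients are all invented; this is not the Stoddard and colleagues analysis): path a regresses the mediator on the predictor, and paths b and c′ regress the outcome on both, so the mediated effect is estimated as a × b.

```python
# Minimal sketch of a simple path (mediation) model estimated with two
# ordinary least squares regressions. Data and names are hypothetical.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 200
future_expectations = rng.normal(0, 1, n)                               # predictor (X)
attitudes_violence = -0.5 * future_expectations + rng.normal(0, 1, n)   # mediator (M)
bullying = 0.6 * attitudes_violence - 0.2 * future_expectations + rng.normal(0, 1, n)  # outcome (Y)

# Path a: X -> M
m_model = sm.OLS(attitudes_violence, sm.add_constant(future_expectations)).fit()
# Paths b and c': M and X -> Y (columns are [constant, M, X])
X = sm.add_constant(np.column_stack([attitudes_violence, future_expectations]))
y_model = sm.OLS(bullying, X).fit()

a = m_model.params[1]   # effect of X on M
b = y_model.params[1]   # effect of M on Y, controlling for X
print(f"indirect (mediated) effect a*b = {a * b:.2f}")
print(f"direct effect c' = {y_model.params[2]:.2f}")
```

The point of the sketch is only the logic of decomposing a relationship into direct and mediated paths; full SEM programs additionally estimate all paths simultaneously and provide model fit statistics.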
HELPFUL HINT Nonexperimental research studies have progressed to the point where prediction models are often used to explore or test relationships between independent and dependent variables.
Additional types of quantitative methods

Other types of quantitative studies complement the science of research. These additional research methods provide a means of viewing and interpreting phenomena that give further breadth and knowledge to nursing science and practice. The additional types include methodological research, secondary analysis, and mixed methods.
Methodological research

Methodological research is the development and evaluation of data collection instruments, scales, or techniques. As you will find in Chapters 14 and 15, methodology greatly influences research and the evidence produced.
The most significant and critically important aspect of methodological research is psychometrics, which focuses on the theory and development of measurement instruments (such as questionnaires) or measurement techniques (such as observational techniques) through the research process. Nurse researchers have used the principles of psychometrics to develop and test measurement instruments that focus on nursing phenomena. Many of the phenomena of interest to practice and research are intangible, such as interpersonal conflict, resilience, quality of life, coping, and symptom experience. The intangible nature of these phenomena, coupled with the recognition of the need to measure them, places methodological research in an important position. Methodological research differs from other research designs in two ways. First, it does not include all of the research process steps discussed in Chapter 1. Second, to implement its techniques, the researcher must have a sound knowledge of psychometrics or must consult with a researcher knowledgeable in psychometric techniques. The methodological researcher is not interested in the relationship of the independent and dependent variables or in the effect of an independent variable on a dependent variable; rather, the methodological researcher is interested in identifying an intangible construct (concept) and making it tangible with a paper-and-pencil instrument or an observation protocol.
A methodological study basically includes the following steps:
• Defining the concept or behavior to be measured
• Formulating the instrument’s items
• Developing instructions for users and respondents
• Testing the instrument’s reliability and validity
These steps require a sound, specific, and exhaustive literature review to identify the theories underlying the concept. The literature review provides the basis of item formulation. Once the items have been developed, the researcher assesses the tool’s reliability and validity (see Chapter 15). As an example of methodological research, Rini (2016) identified that the concept of a woman’s experience of childbirth had not been adequately measured. Rini’s (2016) review of the literature and an earlier concept analysis provided the basis for developing the instrument, the Women’s Experience in Childbirth Survey (WECS). The instrument was developed in order to “provide a comprehensive measure of a women’s perception of the childbirth experience and its effects on maternal and neonatal outcomes” (Rini, 2016, p. 269). Having developed a conceptual definition, Rini followed through by testing the instrument for reliability and validity (see Chapter 15). Common considerations that researchers incorporate into methodological research are outlined in Table 10.3. Many more examples of methodological research can be found in the research literature. The specific procedures of methodological research are beyond the scope of this book, but you are urged to closely review the instruments used in studies.
TABLE 10.3 Common Considerations in the Development of Measurement Tools

Consideration: A well-constructed scale, test, or interview schedule should consist of an objective, standardized measure of a behavior that has been clearly defined.
Example: Rini (2016) provided a comprehensive literature review and definitions of the concepts that she operationalized for the WECS.

Consideration: Observations should be made on a small but carefully chosen sampling of the behavior of interest, thus permitting the reader to feel confident that the samples are representative.
Example: Rini (2016) piloted the instrument with 11 mothers to determine the clarity and sufficiency of the items as well as a preferred scaling method (Likert or semantic differential scale).

Consideration: An instrument should be standardized. It should be a set of uniform items and response possibilities, uniformly administered and scored.
Example: Based on the initial pilot test of the instrument, the 49-item scale was developed using a 5-point Likert scale. Thirteen of the items are reverse scored and the answers summed. A higher score indicates a more positive birth experience. Potential scores range from 49 to 245 (see the scoring sketch following this table).

Consideration: The items should be unambiguous; clear-cut, concise, exact statements with only one idea per item.
Example: A pilot study was conducted to evaluate the WECS items and the administration procedures. The pilot data indicated that several items needed to be dropped.

Consideration: The item types should be limited in the type of variations. Subjects who are expected to shift from one type of item to another may fail to provide a true response as a result of the distraction of making such a change.
Example: Mixing true-or-false items with questions that require a yes-or-no response and items that provide a response format of five possible answers is conducive to a high level of measurement error. The WECS contained only a 5-point Likert scale.

Consideration: Items should not provide irrelevant clues. Unless carefully constructed, an item may furnish an indication of the expected response or answer. Furthermore, the correct answer or expected response to one item should not be given by another item.
Example: An item that provides a clue to the expected answer may contain value words that convey cultural expectations, such as the following: “A good wife enjoys caring for her home and family.”

Consideration: Instruments should not be made difficult by requiring unnecessarily complex or exact operations. Furthermore, the difficulty of an instrument should be appropriate to the level of the subjects being assessed. Limiting each item to one concept or idea helps accomplish this objective.
Example: A test constructed to evaluate learning in an introductory course in research methods may contain an item that is inappropriate for the designated group, such as the following: “A nonlinear transformation of data to linear data is a useful procedure before testing a hypothesis of curvilinearity.”

Consideration: An instrument’s diagnostic, predictive, or measurement value depends on the degree to which it serves as an indicator of a relatively broad and significant behavior area, known as the universe of content for the behavior. A behavior must be clearly defined before it can be measured. The extent to which test items appear to accomplish this objective is an indication of the instrument’s content and/or construct validity.
Example: The WECS development included establishment of acceptable content validity. The WECS items were submitted to a panel of experts consisting of a nurse midwife, two maternal-infant nursing instructors, and a nurse with instrument development experience. The Content Validity Index ranged from 0.75 to 1.0, which means that the items are deemed to reflect the universe of content for the concept being measured. Construct validity was established using factor analysis.

Consideration: An instrument should adequately cover the defined behavior. A primary consideration is whether the number and nature of items are adequate. If there are too few items, the accuracy or reliability of the measure must be questioned.
Example: Rini (2016) presented a complete overview of the validity and reliability testing for the scale and provided a detailed discussion of the findings and needed future testing.

Consideration: The measure must prove its worth empirically through tests of reliability and validity.
Example: A researcher should demonstrate that a scale is accurate and measures what it purports to measure (see Chapter 15). Rini (2016) provided the data on the reliability and validity testing of the WECS scale.

WECS, Women’s Experience in Childbirth Survey.
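The scoring rules summarized in Table 10.3 (49 items rated on a 5-point Likert scale, 13 reverse-scored items, answers summed, possible range 49 to 245) and the item-level Content Validity Index lend themselves to a short worked sketch. The Python code below is purely illustrative: the reverse-scored item numbers are invented, and the CVI function uses the common convention of counting expert ratings of 3 or 4 on a 4-point relevance scale, which may differ from Rini’s exact procedure.

```python
# Minimal sketch of Likert-scale scoring as described in Table 10.3.
# 49 items rated 1-5; 13 items are reverse scored; answers are summed.
# Which items are reversed is invented here for illustration.

REVERSED = {2, 5, 9, 11, 14, 18, 21, 27, 30, 33, 38, 42, 47}  # hypothetical item numbers

def score_wecs(responses):
    """responses: dict mapping item number (1-49) to a rating of 1-5."""
    assert len(responses) == 49, "expects a complete 49-item response set"
    total = 0
    for item, rating in responses.items():
        # On a 5-point scale, reverse scoring maps 1<->5 and 2<->4 (6 - rating).
        total += (6 - rating) if item in REVERSED else rating
    return total  # possible range: 49 (all minimum) to 245 (all maximum)

def item_cvi(expert_ratings):
    """Item-level Content Validity Index: proportion of experts rating the
    item relevant (3 or 4 on a 4-point relevance scale)."""
    return sum(1 for r in expert_ratings if r >= 3) / len(expert_ratings)

# A respondent who answers 3 to every item scores 49 * 3 = 147.
print(score_wecs({i: 3 for i in range(1, 50)}))  # -> 147
# Three of four experts rating an item relevant gives CVI = 0.75.
print(item_cvi([4, 4, 3, 2]))                    # -> 0.75
```

The 0.75 in the example corresponds to the lower bound of the CVI range reported in Table 10.3 when a four-expert panel is assumed.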
Secondary analysis

Secondary analysis is also not a design but rather a research method in which the researcher takes previously collected and analyzed data from one study and reanalyzes the data, or a subset of the data, for a secondary purpose. The original study may be either an experimental or a nonexperimental design. As large data sets have become more available, secondary analysis has become a more prominent and useful methodology for answering questions related to population health issues. Data for secondary analysis may be derived from a large clinical trial or from data available through large health care organizations and databases. For example, Knight and colleagues (2016) conducted a secondary analysis of data from a larger observational prospective study (DeVon et al., 2014). The aim of the secondary analysis was to identify common trajectories of symptom severity in the 6 months following an emergency department (ED) visit for potential acute coronary syndrome (ACS). In the parent study, a convenience sample of participants was recruited from the EDs of four academic medical centers and one community hospital. Data from a total of 1005 male (62.6%) and female (37.4%) participants with a mean age of 60.2 years (SD = 14.17 years) were analyzed for common trajectories of symptom severity using the validated 13-item ACS Symptom Checklist. Findings from this secondary analysis identified seven types of trajectories across eight symptoms, labeled “tapering off,” “mild/persistent,” “moderate/worsening,” “moderate/improving,” “late onset,” and “severe/improving.” Trajectories differed by age, gender, and diagnosis. The data from this study allowed further in-depth exploration of distinct symptom trajectories in the 6 months after an ED visit for potential ACS, which has the potential to improve clinical assessment of ongoing symptoms and patient education. Identification of at-risk patients can target specific subpopulations for individualized education, post-ED discharge support, and evidence-based symptom management plans.
Mixed methods

Over the years, mixed methods have been defined in various ways. Historically, mixed methods research was described as multimethod research, which means including in one study a variety of data sources, different investigators, multiple theories, or multiple methods (Denzin, 1978). Over the years these terms and methods have been refined and clarified (Johnson et al., 2007). The definition and core characteristics that integrate the diverse meanings of mixed methods research are as follows:
In mixed methods, the researcher:
• “Based on the research question collects and analyzes rigorously both qualitative and quantitative data
• Mixes the two forms of data concurrently by combining the data
• Gives priority to one or both forms of data in terms of emphasis
• Uses the procedures of both in one study or in multiple phases of a program of study
• Frames the procedures within philosophical worldviews and theoretical lenses
• Combines the procedures into specific research designs that direct the plan for conducting the study” (Creswell & Plano Clark, 2011, pp. 5–6).
The order of data collection in a mixed methods study varies depending on the question that a researcher wishes to answer. In a mixed methods study the quantitative data may be collected simultaneously with the qualitative data, or one may follow the other. Studying a question using both methods can contribute to a better understanding of an area of research. An example of a mixed methods study was completed by Christian and colleagues (2016). The aim of the study was to assess the feasibility of overcoming barriers to physical activity in a group of teenagers over a period of 1 year using a voucher system of rewards. The qualitative portion of the study included focus groups conducted with students and teachers on three occasions: at baseline, at 6 months, and post-intervention at 1 year. The purpose of the focus groups was to understand the effects of the program on physical activity, fitness, and motivation, as well as barriers to using the vouchers during the study. The quantitative portion included an aerobic fitness test, a self-reported activity scale, and a physical activity measure using an accelerometer. The measurement instruments and interviews were administered on three occasions over the year. The design of this study allowed the research team to assess how well the voucher program supported physical activity, aerobic fitness, and motivation using multiple methods in a group of adolescents. The findings indicated that the vouchers provided access to more physical activity, increased socialization, and improved fitness in the adolescents during the year.
There is a diversity of opinion on how to evaluate mixed methods studies. Evaluation can include analyzing the quantitative and qualitative components of the study separately or, as proposed by Creswell and Plano Clark (2011), applying a separate set of criteria for mixed methods studies depending on the designs and methods used.
HELPFUL HINT As you read the literature, you will find labels such as outcomes research, needs assessment, evaluation research, and quality assurance. These labels are not designs per se; studies with these labels use either experimental or nonexperimental designs and are designed to test the effectiveness of health care techniques, programs, or interventions. When reading such a research study, assess which design was used and whether the principles of the design, sampling strategy, and analysis are consistent with the study’s purpose.
Appraisal for evidence-based practice: Nonexperimental designs

Criteria for appraising nonexperimental designs are presented in the Critical Appraisal Criteria box. When appraising nonexperimental research designs, you should keep in mind that such designs offer the researcher a lower level of control and an increased risk of bias. The level of evidence provided by nonexperimental designs is not as strong as evidence generated by experimental designs; however, there are other important clinical research questions that need to be answered beyond the testing of interventions with experimental or quasi-experimental designs.
The first step in critiquing nonexperimental designs is to determine which type of design was used in the study. Often a statement describing the design appears in the abstract and in the methods section of the report. If such a statement is not present, you should closely examine the paper for evidence of which type of design was employed. You should be able to discern that either a survey or a relationship design was used. For example, you would expect an investigation of self-concept development in children from birth to 5 years of age to be a relationship study using a cohort/prospective/longitudinal design. If a cohort/prospective/longitudinal design was used, you should assess for possible threats to internal validity or bias, such as mortality, testing, and instrumentation. Potential threats to internal or external validity should be acknowledged by the researchers at the end of the report, in particular in the limitations section.
Next, evaluate the literature review of the study to determine if a nonexperimental design was the most appropriate approach to the research question or hypothesis. For example, many studies on pain (e.g., intensity, severity, perception) are suggestive of a relationship between pain and any of the independent variables (diagnosis, coping style, and ethnicity) under consideration where the independent variable cannot be manipulated. As such, these studies suggest a nonexperimental correlational, longitudinal/prospective/cohort, retrospective/ex post facto/case control, or cross-sectional design. Investigators will use one of these designs to examine the relationship between the variables in naturally occurring groups. Sometimes you may think that it would have been more appropriate if the investigators had used an experimental or a quasi-experimental design. However, you must recognize that pragmatic or ethical considerations also may have guided the researchers in their choice of design (see Chapters 8 through 18).
CRITICAL APPRAISAL CRITERIA Nonexperimental Designs
1. Based on the theoretical framework, is the rationale for the type of design appropriate?
2. How is the design congruent with the purpose of the study?
3. Is the design appropriate for the research question or hypothesis?
4. Is the design suited to the data collection methods?
5. Does the researcher present the findings in a manner congruent with the design used?
6. Does the research go beyond the relational parameters of the findings and erroneously infer cause-and-effect relationships between the variables?
7. Where appropriate, how does the researcher discuss the threats to internal validity (bias) and external validity (generalizability)?
8. How does the author identify the limitations of the study?
9. Does the researcher make appropriate recommendations about the applicability based on the strength and quality of evidence provided by the nonexperimental design and the findings?
Finally, the factor or factors that actually influence changes in the dependent variable can be ambiguous in nonexperimental designs. As with all complex phenomena, multiple factors can contribute to variability in the subjects’ responses. When an experimental design is not used for controlling some of these extraneous variables that can influence results, the researcher must strive to provide as much control as possible within the context of a nonexperimental design, to decrease bias. For example, when it has not been possible to randomly assign subjects to treatment groups as an approach to controlling an independent variable, the researchers will use strict inclusion and exclusion criteria and calculate an adequate sample size using power analysis that will support a valid testing of the research question or hypothesis (see Chapter 12). Threats to internal and external validity or potential sources of bias represent a major influence when interpreting the findings of a nonexperimental study because they impose limitations to the generalizability of the results. It is also important to remember that prediction of patient clinical outcomes is of critical value for clinical researchers. Nonexperimental designs can be used to make predictions if the study is designed with an adequate sample size (see Chapter 12), collects data consistently, and uses reliable and valid instruments (see Chapter 15).
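To illustrate the sample-size point above, the following minimal Python sketch uses an a priori power analysis to estimate the group size needed to detect a medium-sized difference. The effect size, alpha, and power values are hypothetical choices, and statsmodels is just one common tool for this calculation.

```python
# Minimal sketch: a priori power analysis for a two-group comparison.
# The effect size, alpha, and power values below are hypothetical choices.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(
    effect_size=0.5,  # medium standardized difference (Cohen's d)
    alpha=0.05,       # two-sided significance level
    power=0.80,       # desired probability of detecting the effect
)
print(f"about {n_per_group:.0f} subjects per group")  # roughly 64
```

Reading a study, you can reverse the logic: given the reported sample size and a plausible effect size, ask whether the study had adequate power to support a valid test of its hypothesis.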
If you are appraising methodological research, you need to apply the principles of reliability and validity (see Chapter 15). A secondary analysis needs to be reviewed from several perspectives. First, you need to understand whether the researcher followed sound scientific logic in the secondary analysis. Second, you need to review the original study from which the data were extracted to assess its reliability and validity. Even though the format and methods vary, it is important to remember that all research has a central goal: to answer questions scientifically and provide the strongest, most consistent evidence possible, while controlling for potential bias.
Key points

• Nonexperimental designs are used in studies that construct a picture or make an account of events as they naturally occur. Nonexperimental designs can be classified as either survey studies or relationship/difference studies.
• Survey studies and relationship/difference studies are both descriptive and exploratory in nature.
• Survey research collects detailed descriptions of existing phenomena and uses the data either to justify current conditions and practices or to make more intelligent plans for improving them.
• Correlational studies examine relationships.
• Developmental studies are further broken down into categories of cross-sectional studies, cohort/longitudinal/prospective studies, and case control/retrospective/ex post facto studies.
• Methodological research, secondary analysis, and mixed methods are examples of other means of adding to the body of nursing research. Both the researcher and the reader must consider the advantages and disadvantages of each design.
• Nonexperimental research designs do not enable the investigator to establish cause-and-effect relationships between the variables. Consumers must be wary of nonexperimental studies that make causal claims about the findings unless a causal modeling technique is used.
• Nonexperimental designs also offer the researcher the least amount of control. Threats to validity impose limitations on the generalizability of the results and as such should be fully assessed by the critical reader.
• The critiquing process is directed toward evaluating the appropriateness of the selected nonexperimental design in relation to factors, such as the research problem, theoretical framework, hypothesis, methodology, and data analysis and interpretation.
• Though nonexperimental designs do not provide the highest level of evidence (Level I), they do provide a wealth of data that become useful pieces for formulating both Level I and Level II studies that are aimed at developing and testing nursing interventions.
Critical thinking challenges

• The mid-term assignment for your interprofessional research course is to critically appraise an assigned study on the relationship of perception of pain severity and quality of life in advanced cancer patients. You and your nursing student colleagues think it is a cross-sectional design, but your medical student colleagues think it is a quasi-experimental design because it has several specific hypotheses. How would each group of students support their argument, and how would they collaborate to resolve their differences?
• You are completing your senior practicum on a surgical unit, and for preconference your student group has just completed a search for studies related to the effectiveness of handwashing in decreasing the incidence of nosocomial infections, but the studies all use an ex post facto/case control design. You want to approach the nurse manager on the unit to present the evidence you have collected and critically appraised, but you are concerned about the strength of the evidence because the studies all use a nonexperimental design. How would you justify that this is the “best available evidence”?
• You are a member of a journal club at your hospital. Your group is interested in the effectiveness of smoking cessation interventions provided by nurses. An electronic search indicates that 12 individual research studies and one meta-analysis meet your inclusion criteria. Would your group begin with critically appraising the 12 individual studies or the one meta-analysis? Provide rationale for your choice, including consideration of the strength and quality of evidence provided by individual studies versus a meta-analysis.
• A patient in a primary care practice who had a history of a “heart murmur” called his nurse practitioner for a prescription for an antibiotic before having a periodontal (gum) procedure. When she responded that according to the new American Heart Association (AHA) clinical practice guideline, antibiotic prophylaxis is no longer considered appropriate for his heart murmur, the patient got upset, stating, “But I always take antibiotics! I want you to tell me why I should believe this guideline. How do I know my heart will not be damaged by listening to you?” What is the purpose of a clinical practice guideline, and how would you as a nurse practitioner respond to this patient?
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
References

1. Bender M, Williams M, Su W, et al. Clinical nurse leader integrated care delivery to improve care quality: Factors influencing perceived success. Journal of Nursing Scholarship 2016;48(4):414-422.
2. Campbell D.T, Stanley J.C. Experimental and quasi-experimental designs for research. Chicago, IL: Rand-McNally; 1963.
3. Christian D, Todd C, Hill R, et al. Active children through incentive vouchers-evaluation (ACTIVE): A mixed methods feasibility study. BMC Public Health 2016;16:890.
4. Creswell J.W, Plano Clark V.L. Designing and conducting mixed methods research. Thousand Oaks, CA: Sage Publications; 2011.
5. DeVon H.A, Burke L.A, Nelson H, et al. Disparities in patients presenting to the emergency department with potential acute coronary syndrome: It matters if you are black or white. Heart & Lung 2014;43:270-277.
6. Denzin N.K. The research act: A theoretical introduction to sociological methods. New York: McGraw Hill; 1978.
7. Fessele K.L, Hayat M.J, Mayer D.K, Atkins R.L. Factors associated with unplanned hospitalizations among patients with nonmetastatic colorectal cancers intended for treatment in ambulatory settings. Nursing Research 2016;65(1):24-33.
8. Hawthorne D.M, Youngblut J.M, Brooten D. Parent spirituality, grief, and mental health at 1 and 3 months after their infant’s/child’s death in an intensive care unit. Journal of Pediatric Nursing 2016;31:73-80.
9. Johnson R.B, Onwuegbuzie A.J, Turner L.A. Toward a definition of mixed methods research. Journal of Mixed Methods Research 2007;1(2):112-133.
10. Kline R. Principles and practice of structural equation modeling. 3rd ed. New York, NY: Guilford Press; 2011.
11. Knight E.P, Shea K, Rosenfeld A.G, et al. Symptom trajectories after an emergency department visit for potential acute coronary syndrome. Nursing Research 2016;65(4):268-289.
12. Koc Z, Cinarli T. Cervical cancer, human papillomavirus, and vaccination. Nursing Research 2015;64(6):452-465.
13. Kousha T, Castner J. The air quality health index and emergency department visits for otitis media. Journal of Nursing Scholarship 2016;48(2):163-171.
14. Lee B, Fan J, Hung C, et al. Illness representations of injury: A comparison of patients and their caregivers. Journal of Nursing Scholarship 2016;48(3):254-264.
15. Lee C.E, Von Ah D, Szuck B, Lau Y.J. Determinants of physical activity maintenance in breast cancer survivors after a community-based intervention. Oncology Nursing Forum 2016;43(1):93-102.
16. Nyamathi A, Salem B.E, Zhang S, et al. Nursing case management, peer coaching, and hepatitis A and B vaccine completion among homeless men recently released on parole: Randomized clinical trial. Nursing Research 2015;64(3):177-189.
17. Nyamathi A.M, Zhang S.X, Wall S, et al. Drug use and multiple sex partners among homeless ex-offenders: Secondary findings from an experimental study. Nursing Research 2016;65(3):179-190.
18. Plichta S, Kelvin E. Munro’s statistical methods for health care research. 6th ed. Philadelphia: Lippincott Williams & Wilkins; 2013.
19. Rini E.V. The development and psychometric analysis of the women’s experience in childbirth survey. Journal of Nursing Measurement 2016;24(2):268-280.
20. Stoddard S.A, Varela J.J, Zimmerman M.A. Future expectations, attitudes toward violence, and bullying perpetration during early adolescence. Nursing Research 2015;64(6):422-433.
21. Turner-Sack A.M, Menna R, Setchell S.R, et al. Psychological functioning, post-traumatic growth, and coping in parents and siblings of adolescent cancer survivors. Oncology Nursing Forum 2016;43(1):48-56.
CHAPTER 11
Systematic reviews and clinical practice guidelines

Geri LoBiondo-Wood
Learning outcomes
After reading this chapter, you should be able to do the following:
• Describe the types of research reviews.
• Describe the components of a systematic review.
• Differentiate between a systematic review, meta-analysis, and integrative review.
• Describe the purpose of clinical guidelines.
• Differentiate between an expert-based and an evidence-based clinical guideline.
• Critically appraise systematic reviews and clinical practice guidelines.
KEY TERMS
AGREE II
clinical practice guidelines
effect size
evidence-based practice guidelines
expert-based practice guidelines
forest plot
integrative review
meta-analysis
systematic review
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
The breadth and depth of clinical research has grown. As the number of studies focused on a similar area conducted by multiple research teams has increased, it has become important to have a means of organizing and assessing the quality, quantity, and consistency among the findings of a group of like studies. The previous chapters have introduced the types of qualitative and quantitative designs and how to critique these studies for quality and applicability to practice. The purpose of this chapter is to acquaint you with systematic reviews and clinical guidelines that assess multiple studies focused on the same clinical question, and how these reviews and guidelines can support evidence-based practice. Terminology used to define systematic reviews and clinical guidelines has changed as this area of research and literature assessment has grown. The definitions used in this textbook are consistent with the definitions from the Cochrane Collaboration and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Group (Higgins & Green, 2011; Moher et al., 2009; Stroup et al., 2000). Systematic reviews and clinical guidelines are critical and meaningful for the development of quality improvement practices.
Systematic review types

A systematic review is a summation and assessment of research studies found in the literature based on a clearly focused question that uses systematic and explicit criteria and methods to identify, select, critically appraise, and analyze relevant data from the selected studies to summarize the findings in a focused area (Liberati et al., 2009; Moher et al., 2009; Moher, Shamseer, et al., 2015). Statistical methods may or may not be used to analyze the studies reviewed. Multiple terms and methods are used to systematically review the literature, depending on the review’s purpose, and some terms are used interchangeably; see Box 11.1 for the components of a systematic review. The terms systematic review and meta-analysis are often used interchangeably or together, but the only review type that can be labeled a meta-analysis is one that analyzed the reviewed studies using statistical methods. When evaluating a systematic review, it is important to assess how well each of the studies in the review minimized bias or maintained the elements of control (see Chapters 8 and 9).

BOX 11.1 Systematic Review Components With or Without Meta-Analysis

Introduction: Review of rationale and a clear clinical question (PICO)

Methods: Information sources, databases used, and search strategy identified; how studies were selected and data extracted, as well as the variables extracted and defined; description of methods used to assess risk of bias; summary measures identified (e.g., risk ratio); identification of how data are combined; if studies are graded, what quality appraisal system was used (see Chapters 1, 17, and 18)

Results: Number of studies screened and their characteristics; risk of bias within studies; if a meta-analysis, a synthesis of results including confidence intervals, risk of bias for each study, and all outcomes considered

Discussion: Summary of findings, including the strength, quality, quantity, and consistency of the evidence for each outcome; any limitations of the studies; conclusions and recommendations of findings for practice

Funding: Sources of funding for the systematic review
You will also find reviews of an area of research or theory synthesis termed integrative reviews. Integrative reviews critically appraise the literature in an area but without a statistical analysis and are the broadest category of review (Whittemore, 2005; Whittemore & Knafl, 2005). Recently, new types of reviews have been developed, including rapid reviews, scoping reviews, and realist reviews (Moher, Stewart, et al., 2015). Systematic, integrative, and these additional types of reviews are not designs per se, but methods for searching and integrating the literature related to a specific clinical issue. These methods take the results of many studies in a specific area; assess the studies critically for reliability and validity (quality, quantity, and consistency) (see Chapters 1, 7, 17, and 18); and synthesize findings to inform practice. No matter what type of review you are reading, it is important that the authors have clearly detailed the methods that were used and that those methods can be replicated (Moher, Stewart, et al., 2015). A meta-analysis provides Level I evidence because the studies in the review are statistically analyzed, integrating the results of many studies. Systematic reviews and meta-analyses also grade the level of design or evidence of the studies reviewed. The Critical Thinking Decision Path outlines the path for completing a systematic review.
CRITICAL THINKING DECISION PATH Completing a Systematic Review
Systematic review

A systematic review is a summary of a search of quantitative studies that use similar designs based on a focused clinical question (PICO). The goal is to assess the strength and quality of the evidence found in the literature on a clinical subject. The review uses rigorous inclusion and exclusion criteria, an explicit reproducible methodology to identify all studies that meet the eligibility criteria, and an assessment of the validity of the findings from the included studies (Moher et al., 2009). Specifically, the review brings together all of the studies related to a focused clinical question in order to assess the strength and quality of the evidence provided by the chosen studies in relation to:
• Sampling issues
• Internal validity (bias) threats
• External validity
• Data analysis
• Applicability of findings to practice
The purpose is to report, in a consolidated fashion, the most current and valid research on intervention effectiveness and clinical knowledge, which will ultimately inform evidence-based decision making about the applicability of findings to practice.
Once the studies in a systematic review are gathered from a comprehensive literature search (see Chapter 3), assessed for quality, and synthesized according to quality or focus, then practice recommendations are made and presented in an article. More than one person independently evaluates the studies to be included or excluded in the review. The articles critically appraised are discussed and presented in a table format within the article, which helps you to easily identify the studies gathered for the review and their quality (Moher et al., 2009). The most important principle to assess when reading a systematic review is how the author(s) identified the studies evaluated and how they systematically reviewed and appraised the literature that led to the reviewers’ conclusions.
The components of a systematic review are the same as those of a meta-analysis (see Box 11.1) except for the analysis of the studies. An example of a systematic review was completed by Conley and Redeker (2016) on self-management interventions for inflammatory bowel disease. In this review, the authors:
• Synthesized studies from the literature on self-management interventions for inflammatory bowel disease
• Included a clear clinical question; all of the sections of a systematic review were presented, except that there was no statistical meta-analysis (statistical combination of study data) because the interventions and outcomes varied across the studies reviewed
• Summarized studies according to the health-related outcomes and assessed for quality
Each study in this review was considered individually, not analyzed collectively, for its sample size, effect size, and its contribution to knowledge in the area based on a set of criteria. Although systematic reviews are highly useful, they also have to be reviewed for potential bias and carefully critiqued for scientific rigor.
Meta-analysis

A meta-analysis is a systematic summary using statistical techniques to assess and combine studies of the same design to obtain a precise estimate of effect (the impact of an intervention on the dependent variable/outcomes or the association between variables). The terms meta-analysis and systematic review are often used interchangeably; the main difference is that only a meta-analysis includes a statistical assessment of the studies reviewed. A meta-analysis treats all the studies reviewed as one large data set in order to obtain a precise estimate of the effect (impact) of the results (outcomes) of the studies in the review.
Meta-analysis uses a rigorous process of summary and determines the impact of a number of studies rather than the impact derived from a single study alone (see Chapter 10). After the clinical question is identified and the search of published and unpublished literature is completed, a meta-analysis is conducted in two phases:
Phase I: The data are extracted (i.e., outcome data, sample sizes, quality of the studies, and measures of variability from the identified studies).
Phase II: The decision is made as to whether it is appropriate to calculate what is known as a pooled average result (effect) of the studies reviewed.
Effect sizes are calculated using the difference in the average scores between the intervention and control groups from each study (Cochrane Handbook for Systematic Reviews, 2016). Each study is considered a unit of analysis. A meta-analysis takes the effect size (see Chapter 12) from each of the studies reviewed to obtain an overall estimate for the population (the whole), creating a single effect size across all the studies. Thus the effect size is an estimate of how large a difference there is between intervention and control groups in the summarized studies. Example: ➤ The meta-analysis in Appendix E studied the question “What is the impact of nurse-led clinics (NLCs) on the mortality and morbidity of patients with cardiovascular disease (CVD)?” In this review, the authors synthesized the literature from studies on the effectiveness of NLCs in terms of mortality and morbidity outcomes (Al-Mallah et al., 2015). The studies that assessed this question were reviewed and each weighted for its impact or effect on improving mortality and morbidity. This estimate helps health care providers decide which intervention, if any, was more useful for improving well-being. Detailed components of a systematic review with or without meta-analysis (Moher et al., 2009) are listed in Box 11.1.
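The pooling step can be made concrete with a short computational sketch. The following Python snippet is a minimal illustration, under simplified assumptions, of a fixed-effect meta-analysis: a standardized mean difference (Cohen's d) is computed for each study, and the studies are combined with inverse-variance weights. All of the summary statistics are hypothetical and are not drawn from Al-Mallah et al. (2015) or any other study discussed in this chapter.

```python
import math

# Hypothetical summary data from three studies: mean, SD, and n for
# intervention (i) and control (c) groups. Illustrative numbers only.
studies = [
    {"mi": 12.1, "sdi": 4.0, "ni": 60, "mc": 14.3, "sdc": 4.2, "nc": 58},
    {"mi": 11.4, "sdi": 3.8, "ni": 45, "mc": 13.0, "sdc": 4.1, "nc": 47},
    {"mi": 12.8, "sdi": 4.5, "ni": 80, "mc": 13.9, "sdc": 4.4, "nc": 75},
]

def cohens_d(s):
    """Standardized mean difference (Cohen's d) for one study."""
    pooled_sd = math.sqrt(((s["ni"] - 1) * s["sdi"] ** 2 +
                           (s["nc"] - 1) * s["sdc"] ** 2) /
                          (s["ni"] + s["nc"] - 2))
    return (s["mi"] - s["mc"]) / pooled_sd

def variance_of_d(d, ni, nc):
    """Approximate sampling variance of d (see Borenstein et al., 2009)."""
    return (ni + nc) / (ni * nc) + d ** 2 / (2 * (ni + nc))

# Fixed-effect pooling: each study is weighted by the inverse of its
# variance, so larger, more precise studies contribute more.
weights, weighted_ds = [], []
for s in studies:
    d = cohens_d(s)
    w = 1 / variance_of_d(d, s["ni"], s["nc"])
    weights.append(w)
    weighted_ds.append(w * d)

pooled_d = sum(weighted_ds) / sum(weights)
print(f"Pooled effect size (fixed effect): {pooled_d:.2f}")
```

Because larger studies receive more weight, the pooled estimate is more precise than any single study's estimate, which is why the pooled diamond in a forest plot is narrower than the individual study intervals.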
In addition to calculating effect sizes, meta-analyses use multiple statistical methods to present and depict the data from studies reviewed (see Chapters 19 and 20). One of these methods is a forest plot, sometimes called a blobbogram. A forest plot graphically depicts the results of analyzing a number of studies. Fig. 11.1 is an example of a forest plot from Al-Mallah and colleagues’ meta-analysis (Al-Mallah et al., 2015; Appendix E, Fig. 2, Box A). This review identified that the available evidence suggests a favorable effect of NLCs on all-cause mortality, rate of major adverse cardiac events, and adherence to medications in patients with CVD.
FIG 11.1 An example of a forest plot. Source: (Adapted from Al-Mallah, M. H., Farah, I., Al-Madani, W., et al. [2015]. The impact of nurse-led clinics on the mortality and morbidity of patients with cardiovascular diseases: A systematic review and meta-analysis. Journal of Cardiovascular Nursing, 31[1], 89–95.)
EVIDENCE-BASED PRACTICE TIP Evidence-based practice methods such as meta-analysis increase your ability to manage the ever-increasing volume of information produced to develop the best evidence-based practices.
Fig. 11.1 displays nine studies that compared all-cause mortality in nurse-led groups versus the control groups (usual care). Each study analyzed is listed. To the right of each listed study is a horizontal line that represents the confidence interval around that study’s effect size estimate. The box on each horizontal line represents the effect size of that study, and the diamond represents the combined effect of all the studies. Boxes to the left of the vertical line of no effect indicate studies in which NLC care was favored; boxes to the right of the line indicate studies in which NLC care was not favored. The diamond is a more precise estimate of the intervention’s effect because it combines the data from all the studies. The exemplar provided is basic, as meta-analysis is a sophisticated methodology. For a fuller understanding, several references are provided (Borenstein et al., 2009; da Costa & Juni, 2014; Higgins & Green, 2011); see also Chapters 19 and 20.
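To illustrate how such a display is constructed, the following is a minimal matplotlib sketch of a forest plot. The study labels, effect estimates, and confidence intervals are hypothetical and do not reproduce Fig. 11.1; real forest plots of ratio measures typically use a log scale.

```python
import matplotlib.pyplot as plt

# Hypothetical effect estimates (risk ratios) and 95% CIs for five
# studies plus a pooled estimate; not the Al-Mallah et al. (2015) data.
labels = ["Study A", "Study B", "Study C", "Study D", "Study E", "Pooled"]
estimates = [0.80, 0.95, 0.70, 1.10, 0.85, 0.86]
lower = [0.55, 0.70, 0.45, 0.80, 0.60, 0.74]
upper = [1.15, 1.30, 1.05, 1.50, 1.20, 0.99]

fig, ax = plt.subplots()
for i, (est, lo, hi) in enumerate(zip(estimates, lower, upper)):
    y = len(labels) - i                               # plot top to bottom
    ax.plot([lo, hi], [y, y], color="black")          # CI as horizontal line
    marker = "D" if labels[i] == "Pooled" else "s"    # diamond for pooled
    ax.plot(est, y, marker, color="black")

ax.axvline(x=1.0, linestyle="--", color="gray")  # vertical line of no effect
ax.set_yticks(range(1, len(labels) + 1))
ax.set_yticklabels(labels[::-1])
ax.set_xlabel("Risk ratio (linear scale here for simplicity)")
plt.show()
```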
A well-done meta-analysis assesses for bias in studies and provides clinicians a means of evaluating the merit of a body of clinical research. The Cochrane Library, published by the Cochrane Collaboration, provides a repository of sound meta-analyses. Example: ➤ Martineau and colleagues (2016) completed a meta-analysis to assess the use of vitamin D to prevent asthma exacerbation and improve asthma control in children and adults. The report presents an introduction, details of the methods used to search the literature (databases, search terms, and years), data extraction, and analysis. The report also includes an evidence table of the studies reviewed, a description of how the data were summarized, results of the meta-analysis, a forest plot of the reviewed studies (see Chapter 19), conclusions, and implications for practice and research.
Cochrane collaboration
The largest repository of meta-analyses is the Cochrane Collaboration/Review. The Cochrane Collaboration prepares and maintains a body of systematic reviews that focus on health care interventions (Box 11.2). The reviews are found in the Cochrane Database of Systematic Reviews. The Cochrane Collaboration works with a wide range of health care individuals with different skills and backgrounds to develop reviews. These partnerships assist with developing reviews that minimize bias while keeping current with assessment of health care interventions, promoting access to the database, and ensuring the quality of the reviews (Cochrane Handbook for Systematic Reviews, 2016). The steps of a Cochrane Report mirror those of a meta-analysis except for the inclusion of a plain language summary. This useful feature is a straightforward summary of the meta-analysis. The Cochrane Library also publishes several other useful databases (Box 11.3).
BOX 11.2 Cochrane Review Sections
Review information: Authors and contact person
Abstract
Plain language summary
The review
Background of the question
Objectives of the search
Methods for selecting studies for review
Type of studies reviewed
Types of participants, types of intervention, types of outcomes in the studies
Search methods for finding studies
Data collection
Analysis of the located studies, including effect sizes
Results including description of studies, risk of bias, intervention effects
Discussion
Implications for research and practice
References and tables to display the data
Supplementary information (e.g., appendices, data analysis)
BOX 11.3 Cochrane Library Databases
• Cochrane Database of Systematic Reviews: Full-text Cochrane reviews
• DARE: Critical assessments and abstracts of other systematic reviews that conform to quality criteria
• CENTRAL: Information on studies published in conference proceedings and other sources not available in other databases
• CMR: Bibliographic information on articles and books on reviewing research and methodological studies
CENTRAL, Cochrane Central Register of Controlled Trials; CMR, Cochrane Methodology Register; DARE, Database of Abstracts of Review of Effects.
Integrative review
You will also find critical reviews of an area of research that do not include a statistical analysis or a theory synthesis; these are termed integrative reviews. An integrative review is the broadest category of review (Whittemore, 2005; Whittemore & Knafl, 2005). It can include theoretical literature, research literature, or both. An integrative review may include methodology studies, a theory review, or the results of differing research studies with wide-ranging clinical implications (Whittemore, 2005). An integrative review can include quantitative or qualitative research, or both. Statistics are not used to summarize and generate conclusions about the studies. Several examples of integrative reviews are found in Box 11.4. Recommendations for future research are suggested in each review.
BOX 11.4 Integrative Review Examples
• Brady and colleagues (2014) published an integrative review on the management and effects of steroid-induced hyperglycemia in hospitalized patients with cancer with or without preexisting diabetes. This review included a purpose, description of the methods used (databases searched, years included), key terms used, and parameters of the search. These components allow others to evaluate and replicate the search. Eighteen studies that assessed steroid-induced hyperglycemia in hospitalized patients with cancer were reviewed in the text and via a table format.
• Kestler and LoBiondo-Wood (2012) published an integrative review of symptom experience in children and adolescents with cancer. The review was a follow-up of a 2003 review published by Docherty (2003) and was completed to assess the progress that had been made since the 2003 publication on the symptoms of pediatric oncology patients. The review included a description of the search strategy used, including databases, years searched, terms used, and the results of the search. Literature on each symptom was described, and a table of the 52 studies reviewed was included.
Reporting guidelines: Systematic reviews and meta-analysis
Systematic review and meta-analysis publications are found widely in the research literature. Because these resources present an accumulation of potentially clinically relevant knowledge, there was also a need to develop a standard for what information should be included in these reviews. Several guidelines are available for reporting systematic reviews. These include PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses; Moher et al., 2009) and MOOSE (Meta-analysis of Observational Studies in Epidemiology; Stroup et al., 2000). A review of these guidelines will help you critically read meta-analyses and judge whether there is any bias in the review.
Tools for evaluating individual studies
As the importance of practicing from a base of evidence has grown, so has the need for tools or instruments that can assist practitioners in evaluating studies of various types. When evaluating studies for clinical evidence, it is first important to assess whether the study is valid. At the end of each chapter of this text are critiquing questions that will aid you in assessing whether studies are valid and whether the results are applicable to your practice. In addition to these questions, there are standardized appraisal tools that can assist with appraising the evidence. The Centre for Evidence-Based Medicine (CEBM), whose focus is on teaching critical appraisal, developed tools known as Critical Appraisal Tools that provide an evidence-based approach for assessing the quality, quantity, and consistency of specific study designs (CEBM, 2016). These instruments are part of an international network that provides consumers with specific questions to help assess study quality. Each checklist has a number of general questions as well as design-specific questions. The tools center on assessing a study’s methodology, validity, and reliability. The questions focus on the following:
1. Does this study address a clearly focused question?
2. Did the study use valid methods to address the question?
3. Are the valid results of the study important?
4. Are these valid, important results applicable to my patient or population?
There are four critical appraisal worksheets with targeted questions relevant to a specific design. The checklists with instructions can be found at http://www.cebm.net/critical-appraisal. The design-specific CEBM tools with critical evaluative information for each design are available online and include:
• Systematic reviews
• Randomized controlled studies
• Diagnostic studies
• Prognosis
Clinical practice guidelines
Clinical practice guidelines are systematically developed statements or recommendations that link research and practice and serve as a guide for practitioners. Guidelines are developed by professional organizations, government agencies, institutions, or convened expert panels. Guidelines provide clinicians with an algorithm for clinical management or decision making for specific diseases (e.g., colon cancer) or treatments (e.g., pain management). Not all guidelines are well developed, and, like research, they must be assessed before implementation (see Chapter 9). Guidelines should present the scope and purpose of the practice, detail who the development group included, demonstrate scientific rigor, be clear in their presentation, demonstrate clinical applicability, and demonstrate editorial independence. An example is the National Comprehensive Cancer Network, an interdisciplinary consortium of 21 cancer centers around the world. Interdisciplinary groups develop practice guidelines for practitioners and education guidelines for patients. These guidelines are accessible at www.nccn.org. Practice guidelines can be either expert-based or evidence-based.
Evidence-based practice guidelines are those developed using a scientific process. This process includes first assembling a multidisciplinary group of experts in a specific field. This group is charged with completing a rigorous search of the literature and completing an evidence table that summarizes the quality and strength of the evidence from which the practice guideline is derived (see Chapters 19 and 20). For various reasons, not all areas of clinical practice have a sufficient research base; therefore, expert-based practice guidelines are developed. Expert-based guidelines depend on a group of nationally known experts in the field who meet and use expert opinion along with whatever research evidence is available to date. If limited research is available for such a guideline, a rationale should be presented for the practice recommendations.
Many national organizations develop clinical practice guidelines. It is important to know which one to apply to your patient population. Example: ➤ There are numerous evidence-based practice guidelines developed for the management of pain. These guidelines are available from organizations such as the Oncology Nursing Society, American Academy of Pediatrics, National Comprehensive Cancer Network, National Cancer Institute, American College of Physicians, and American Academy of Pain Medicine. You need to be able to evaluate each of the guidelines and decide which is the most appropriate for your patient population.
The Agency for Healthcare Research and Quality supports the National Guideline Clearinghouse (NGC). The NGC’s mission is to provide health care professionals from all disciplines with objective, detailed information on clinical practice guidelines that are issued, disseminated, and implemented. The NGC encourages groups to develop guidelines for implementation via their site; it is a very useful site for finding well-developed clinical guidelines on a wide range of health- and illness-related topics. Specific guidelines can be found on the AHRQ Effective Health Care Program website.
HIGHLIGHT When evaluating a Clinical Practice Guideline (CPG), it is important for an interprofessional team to use an evidence-based critical appraisal tool like AGREE II to determine the strength and quality of the CPG for applicability to practice.
Evaluating clinical practice guidelines
As evidence-based practice guidelines proliferate, it becomes increasingly important that you critique these guidelines with regard to the methods used for guideline formulation and consider how they might be used in practice. Critical areas that should be assessed when critiquing evidence-based practice guidelines include the following:
• Date of publication or release and authors
• Endorsement of the guideline
• Clear purpose of what the guideline covers and patient groups for which it was designed
• Types of evidence (research, theoretical) used in guideline formulation
• Types of research included in formulating the guideline (e.g., “We considered only randomized and other prospective controlled trials in determining efficacy of therapeutic interventions.”)
• Description of the methods used in grading the evidence
• Search terms and retrieval methods used to acquire evidence used in the guideline
• Well-referenced statements regarding practice
• Comprehensive reference list
• Review of the guideline by experts
• Whether the guideline has been used or tested in practice and, if so, with what types of patients and in which types of settings
Evidence-based practice guidelines that are formulated using rigorous methods provide a useful starting point for understanding the evidence base of practice. However, more research may be available since the publication of the guideline, and refinements may be needed. Although the information in well-developed, national, evidence-based practice guidelines is a helpful reference, it is usually necessary to localize the guideline using institution-specific evidence-based policies, procedures, or standards before application within a specific setting.
There are several tools for appraising the quality of clinical practice guidelines. The Appraisal of Guidelines Research and Evaluation II (AGREE II) instrument is one of the most widely used to evaluate the applicability of a guideline to practice (Brouwers et al., 2010). The AGREE II was developed to assist in evaluating guideline quality, provide a methodological strategy for guideline development, and inform practitioners about what information should be reported in guidelines and how it should be reported. The AGREE II is available online. The instrument focuses on six domains, with a total of 23 questions rated on a seven-point scale and two final assessment items that require the appraiser to make overall judgments of the guideline based on how the 23 items were rated. Along with the instrument itself, the AGREE Enterprise website offers guidance on tool usage and development. The AGREE II has been tested for reliability and validity. The instrument assesses the following components of a practice guideline (a minimal scoring sketch follows the list):
1. Scope and purpose of the guideline
2. Stakeholder involvement
3. Rigor of the guideline development
4. Clarity and presentation of the guideline
5. Applicability of the guideline to practice
6. Demonstrated editorial independence of the developers
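As a rough illustration of how AGREE II domain ratings are summarized, the sketch below computes a scaled domain score, assuming the standard scaling formula from the AGREE II user's manual (obtained score minus the minimum possible, divided by the range of possible scores). The appraiser ratings are hypothetical.

```python
# Rows = appraisers, columns = items in one domain (e.g., "Rigor of
# development"); each item is rated 1 (strongly disagree) to 7
# (strongly agree). Hypothetical ratings for illustration only.
ratings = [
    [5, 6, 6, 4, 5, 6, 5, 4],  # appraiser 1
    [6, 6, 5, 5, 6, 7, 6, 5],  # appraiser 2
    [4, 5, 5, 4, 5, 6, 5, 4],  # appraiser 3
]

n_appraisers = len(ratings)
n_items = len(ratings[0])

obtained = sum(sum(row) for row in ratings)
max_possible = 7 * n_items * n_appraisers
min_possible = 1 * n_items * n_appraisers

# Scaled domain score expressed as a percentage of the possible range.
scaled = (obtained - min_possible) / (max_possible - min_possible) * 100
print(f"Scaled domain score: {scaled:.1f}%")
```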
CRITICAL APPRAISAL CRITERIA Systematic reviews
1. Does the PICO question match the studies included in the review?
2. Are the review methods clearly stated and comprehensive?
3. Are the dates covered by the review clear and relevant to the area reviewed?
4. Are the inclusion and exclusion criteria for studies in the review clear and comprehensive?
5. What criteria were used to assess each of the studies in the review for quality and scientific merit?
6. If studies were analyzed individually, were the data clear?
7. Were the methods of study combination clear and appropriate?
8. If the studies were reviewed collectively, how large was the effect?
9. Are the clinical conclusions drawn from the studies relevant and supported by the review?
Clinical practice guidelines, although they are systematically developed and make explicit recommendations for practice, may be formatted differently. Practice guidelines should reflect the components listed. Guidelines can be located on an organization’s website, on the AHRQ website (www.ahrq.gov), on the NGC website, or on MEDLINE (see Chapters 3 and 20). Well-developed guidelines are constructed using the principles of a systematic review.
Appraisal for evidence-based practice: Systematic reviews and clinical guidelines
For each of the review methods described—systematic review, meta-analysis, integrative review, and clinical guidelines—think about each method as one that progressively sifts and sorts research studies and the data until the highest quality of evidence is used to arrive at the conclusions. First the researcher gathers the results of all the studies based on a focused, specific question. The studies that do not meet the inclusion criteria are then excluded, and the data from the remaining studies are assessed for quality. This process is repeated sequentially, excluding studies until only the studies of highest quality available are included in the analysis. An alteration in the overall results as an outcome of this sorting and separating process suggests how sensitive the conclusions are to the quality of studies included (Whittemore, 2005). No matter which type of review is completed, it is important to understand that the research studies reviewed still must be examined through your evidence-based practice lens. This means that evidence that you have derived through your critical appraisal and synthesis or derived through other researchers’ reviews must be integrated with an individual clinician’s expertise and patients’ wishes.
CRITICAL APPRAISAL CRITERIA Critiquing clinical guidelines
1. Is the date of publication or release current?
2. Are the authors of the guideline clear and appropriate to the guideline?
3. Are the clinical problem and purpose clear in terms of what the guideline covers and the patient groups for which it was designed?
4. What types of evidence were used in formulating the guideline, and are they appropriate to the topic?
5. Is there a description of the methods used to grade the evidence?
6. Were the search terms and retrieval methods used to acquire research and theoretical evidence used in the guideline clear and relevant?
7. Is the guideline well-referenced and comprehensive?
8. Are the recommendations in the guideline referenced according to the level of evidence on which they are based?
9. Has the guideline been reviewed by experts in the appropriate field of discipline?
10. Who funded the guideline development?
You should note that a researcher who uses any of the systematic review methods of combining evidence does not conduct the original studies or analyze the data from each study, but rather takes the data from all the published studies and synthesizes the information by following a set of systematic steps. Systematic methods for combining evidence are used to synthesize both nonexperimental and experimental research studies.
Finally, evidence-based practice requires that you determine—based on the strength and quality of the evidence provided by the systematic review coupled with your clinical expertise and patient values—whether or not you would consider a change in practice. For example, the meta-analysis by Al-Mallah and colleagues (2015) in Appendix E details the important findings from the literature, some of which could be used in nursing practice and some that need further research.
Systematic reviews that use multiple randomized controlled trials (RCTs) to combine study results offer stronger evidence (Level I) in estimating the magnitude of an effect for an intervention (see Chapter 2, Table 2.3). The strength of evidence provided by systematic reviews is a key component for developing a practice based on evidence. The qualitative counterpart to systematic reviews is meta-synthesis, which uses qualitative principles to assess qualitative research and is described in Chapter 6.
Key points
• A systematic review is a summary of a search of quantitative studies that use similar designs based on a PICO question.
• A meta-analysis is a systematic summary of studies using statistical techniques to assess and combine studies of the same design to obtain a precise estimate of the impact of an intervention.
• The terms systematic review and meta-analysis are used interchangeably, but only a meta-analysis includes a statistical assessment of the studies reviewed.
• An integrative review is the broadest category of reviews and can include a theoretical literature review, or a review of both quantitative and qualitative research literature.
• The Cochrane Collaboration prepares and maintains a body of up-to-date systematic reviews focused on health care interventions.
• There are standardized tools available for evaluating individual studies. Examples of such tools are available from the Centre for Evidence-Based Medicine.
• Clinical practice guidelines are systematically developed statements or recommendations that link research and practice. There are two types of clinical practice guidelines: evidence-based practice guidelines and expert-based practice guidelines.
• Evidence-based guidelines are practice guidelines developed by experts who assess the research literature for the quality and strength of the evidence for an area of practice.
• Expert-based guidelines are developed typically by a nationally known group of experts in an area using opinions of experts along with whatever research evidence is available to date.
• The Appraisal of Guidelines Research and Evaluation II (AGREE II) is a tool for appraising the quality of clinical practice guidelines.
Critical thinking challenges
• An assignment for your research class is to critically appraise the systematic review in Appendix E by Al-Mallah and colleagues using the Systematic Review Critical Appraisal Tool from the Centre for Evidence-Based Medicine (CEBM), available at www.cebm.net, to determine whether the effect size reveals a significant difference between the intervention and control groups in the summarized studies. How does the effect size pertain to the applicability of findings to practice?
• Your interprofessional primary care team is asked to write an evidence-based policy that will introduce depression screening as a required part of the admission protocol in your practice. Debate the pros and cons of informing your protocol with the evidence provided by a meta-analysis of 10 RCTs with a combined sample size of n = 859, in comparison to 10 individual RCTs, only 2 of which have a sample size of n = 100.
• Explain why it is important to have an interprofessional team conducting a systematic review.
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
References
1. Al-Mallah M. H., Farah I., Al-Madani W., et al. The impact of nurse-led clinics on the mortality and morbidity of patients with cardiovascular diseases: A systematic review and meta-analysis. Journal of Cardiovascular Nursing 2015;31(1):89-95.
2. Borenstein M., Hedges L. V., Higgins J. P. T., Rothstein H. R. Introduction to meta-analysis. United Kingdom: Wiley; 2009.
3. Brady V. J., Grimes D., Armstrong T., LoBiondo-Wood G. Management of steroid-induced hyperglycemia in hospitalized patients with cancer: A review. Oncology Nursing Forum 2014;41:E355-E365.
4. Brouwers M., Kho M. E., Browman G. P., for the AGREE Next Steps Consortium, et al. AGREE II: Advancing guideline development, reporting and evaluation in healthcare. Canadian Medical Association Journal 2010;182:E839-E842. doi:10.1503/cmaj.090449.
5. Centre for Evidence-Based Medicine. Critical appraisal tools. 2016. Available at: www.cebm.net/critical-appraisal.
6. Cochrane Handbook for Systematic Reviews. 2016. Available at: http://www.cochrane-handbook.org.
7. Conley S., Redeker N. A systematic review of self-management interventions for inflammatory bowel disease. Journal of Nursing Scholarship 2016;48(2):118-127.
8. da Costa B. R., Juni P. Systematic reviews and meta-analyses of randomized trials: Principles and pitfalls. European Heart Journal 2014;35:3336-3345.
9. Docherty S. L. Symptom experiences of children and adolescents with cancer. Annual Review of Nursing Research 2003;21(2):123-149.
10. Higgins J. P. T., Green S. Cochrane handbook for systematic reviews of interventions, version 5.1.0. 2011. Available at: http://www.cochrane-handbook.org.
11. Kestler S. A., LoBiondo-Wood G. Review of symptom experiences in children and adolescents with cancer. Cancer Nursing 2012;35(2):E31-E49. doi:10.1097/NCC.0b013e3182207a2a.
12. Liberati A., Altman D. G., Tetzlaff J., et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: Explanation and elaboration. Annals of Internal Medicine 2009;151(4):W65-W94.
13. Martineau A. R., Cates C. J., Urashima M., et al. Vitamin D for the management of asthma. Cochrane Database of Systematic Reviews 2016;9:CD011511. doi:10.1002/14651858.CD011511.pub2.
14. Moher D., Liberati A., Tetzlaff J., Altman D. G. Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. Journal of Clinical Epidemiology 2009;62(10):1006-1012. doi:10.1016/j.jclinepi.2009.06.005.
15. Moher D., Shamseer L., Clarke M., et al. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Systematic Reviews 2015;4(1):1.
16. Moher D., Stewart L., Shekelle P. All in the family: Systematic reviews, rapid reviews, scoping reviews, realist reviews and more. Systematic Reviews 2015;4:183.
17. Stroup D. F., Berlin J. A., Morton S. C., et al. Meta-analysis of observational studies in epidemiology: A proposal for reporting. Meta-analysis of Observational Studies in Epidemiology (MOOSE) group. The Journal of the American Medical Association 2000;283:2008-2012.
18. Whittemore R. Combining evidence in nursing research: Methods and implications. Nursing Research 2005;54(1):56-62.
19. Whittemore R., Knafl K. The integrative review: Updated methodology. Journal of Advanced Nursing 2005;52(5):546-553.
CHAPTER 12
Sampling
Judith Haber
Learning outcomes
After reading this chapter, you should be able to do the following:
• Identify the purpose of sampling.
• Define population, sample, and sampling.
• Compare a population and a sample.
• Discuss the importance of inclusion and exclusion criteria.
• Define nonprobability and probability sampling.
• Identify the types of nonprobability and probability sampling strategies.
• Compare the advantages and disadvantages of nonprobability and probability sampling strategies.
• Discuss the contribution of nonprobability and probability sampling strategies to strength of evidence provided by study findings.
• Discuss the factors that influence sample size.
• Discuss potential threats to internal and external validity as sources of sampling bias.
• Use the critical appraisal criteria to evaluate the “Sample” section of a research report.
KEY TERMS
accessible population
convenience sampling
data saturation
delimitations
element
eligibility criteria
exclusion criteria
inclusion
multistage (cluster) sampling
network (snowball) sampling
nonprobability sampling
pilot study
population
probability sampling
purposive sampling
quota sampling
random selection
representative sample
sample
sampling
sampling frame
sampling unit
simple random sampling
snowballing
stratified random sampling
target population
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
The sampling section of a study is usually found in the “Methods” section of a research article. You will find it important to understand the sampling process and the elements that contribute to a researcher using the most appropriate sampling strategy for the type of research being conducted. Equally important is knowing how to critically appraise the sampling section of a study to identify how the strengths and weaknesses of the sampling process contributed to the overall strength and quality of evidence provided by the findings of a study.
When you are critically appraising the sampling section of a study, the threats to internal and external validity as sources of bias need to be considered (see Chapter 8). Your evaluation of the sampling section is very important in your overall critical appraisal of a study’s findings and applicability to practice.
Sampling is the process of selecting representative units of a population in a study. Many problems in research cannot be solved without employing rigorous sampling procedures. Example: ➤ When testing the effectiveness of a medication for patients with type 2 diabetes, the drug is administered to a sample of the population for whom the drug is potentially appropriate. The researcher must come to conclusions without giving the drug to every patient with diabetes or to every laboratory animal. Because human lives are at stake, the researcher cannot afford to arrive casually at conclusions that are based on the first dozen patients available for study.
The impact of arriving at conclusions that are not accurate or making generalizations from a small nonrepresentative sample is much more severe in research than in everyday life. Essentially, researchers sample representative segments of the population because it is rarely feasible or necessary to sample the entire population of interest to obtain relevant information.
This chapter will familiarize you with the basic concepts of sampling as they primarily pertain to the principles of quantitative research design, nonprobability and probability sampling, sample size, and the related critical appraisal process. Sampling issues that relate to qualitative research designs are discussed in Chapters 5, 6, and 7.
Sampling concepts
Population
A population is a well-defined set with specified properties. A population can be composed of people, animals, objects, or events. Examples of populations might be all of the female patients older than 65 years admitted to a specific hospital for congestive heart failure (CHF) during the year 2017, all of the children with asthma in the state of New York, or all of the men and women with a diagnosis of clinical depression in the United States. These examples illustrate that a population may be broadly defined and potentially involve millions of people or narrowly specified to include only several hundred people.
The population criteria establish the target population—that is, the entire set of cases about which the researcher would like to make generalizations. A target population might include all undergraduate nursing students enrolled in accelerated baccalaureate programs in the United States. Because of time, money, and personnel, however, it is often not feasible to pursue a study using a target population.
An accessible population, one that meets the target population criteria and that is available, is used instead. Example: ➤ An accessible population might include all full-time accelerated baccalaureate students attending school in Oregon. Pragmatic factors must also be considered when identifying a potential population of interest.
It is important to know that a population is not restricted to humans. It may consist of hospital records; blood, urine, or other specimens taken from patients at a clinic; historical documents; or laboratory animals. Example: ➤ A population might consist of all the HgbA1C blood test specimens collected from patients in the City Hospital diabetes clinic or all of the charts of patients who had been screened during pregnancy for HIV infection. A population can be defined in a variety of ways. The basic unit of the population must be clearly defined because the generalizability of the findings will be a function of the population criteria.
Inclusion and exclusion criteria
When reading a research report, you should consider whether the researcher has identified the population characteristics that form the basis for the inclusion (eligibility) or exclusion (delimitations) criteria used to select the sample—whether people, objects, or events. The terms inclusion or eligibility criteria and exclusion criteria or delimitations define characteristics that limit the population to a homogeneous group of subjects. The population characteristics that provide the basis for inclusion (eligibility) criteria should be evident in the sample—that is, the characteristics of the population and the sample should be congruent in order to assess the representativeness of the sample. Examples of inclusion or eligibility criteria and exclusion criteria or delimitations include the following:
• gender
• age
• marital status
• socioeconomic status
• religion
• ethnicity
• level of education
• age of children
• health status
• diagnosis
Think about the concept of inclusion or eligibility criteria applied to a study where the subjects are patients. Example: ➤ Consider a study investigating the effectiveness of a nurse practitioner (NP)-delivered symptom management intervention for patients initiating chemotherapy for nonmetastatic cancer, compared with standard oncology care. The aim was to reduce patient-reported symptom burden by facilitating patient-NP collaboration and early management of symptoms (Traeger et al., 2015). Participants had to meet the following inclusion (eligibility) criteria:
1. Age: At least 18 years
2. Newly diagnosed with Stage I to Stage III breast cancer (BC), lung cancer (LC), or colorectal cancer (CRC)
3. Scheduled to initiate chemotherapy for nonmetastatic disease
4. Able to respond to questionnaires in English
Inclusion and exclusion criteria are established to control for extraneous variability or bias that would limit the strength of evidence contributed by the sampling plan in relation to the study’s design. Each inclusion or exclusion criterion should have a rationale, presumably related to a potential contaminating effect on the dependent variable. Example: ➤ Subjects were excluded from this study if they had:
• A concurrent cognitive or psychiatric condition or substance abuse problem that would prevent adherence to the protocol
• Evidence of metastatic cancer
• Had already received chemotherapy for their malignancy
The careful establishment of sample inclusion or exclusion criteria will increase a study’s precision and strength of evidence, thereby contributing to the accuracy and generalizability of the findings (see Chapter 8). Fig. 12.1 provides an example of a flow chart that illustrates how potential study participants were screened using the above inclusion (eligibility) and exclusion criteria for enrollment in the NP-delivered symptom management intervention study by Traeger and colleagues (2015).
FIG 12.1 Subject selection using a proportional stratified random sampling strategy.
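The screening logic that such a flow chart depicts can also be expressed programmatically. The sketch below applies the inclusion and exclusion criteria listed above to hypothetical candidate records; the field names and data are illustrative assumptions, not the actual Traeger et al. (2015) procedures.

```python
# Minimal eligibility-screening sketch using the criteria listed above.
# Record fields and candidates are hypothetical; a real study would
# screen health record data.

def is_eligible(patient):
    meets_inclusion = (
        patient["age"] >= 18
        and patient["cancer_type"] in {"breast", "lung", "colorectal"}
        and patient["stage"] in {"I", "II", "III"}
        and patient["initiating_chemo"]
        and patient["english_questionnaires"]
    )
    has_exclusion = (
        patient["condition_preventing_adherence"]
        or patient["metastatic"]
        or patient["prior_chemo"]
    )
    return meets_inclusion and not has_exclusion

candidates = [
    {"age": 54, "cancer_type": "breast", "stage": "II",
     "initiating_chemo": True, "english_questionnaires": True,
     "condition_preventing_adherence": False, "metastatic": False,
     "prior_chemo": False},
    {"age": 61, "cancer_type": "lung", "stage": "IV",
     "initiating_chemo": True, "english_questionnaires": True,
     "condition_preventing_adherence": False, "metastatic": True,
     "prior_chemo": False},
]

enrolled = [p for p in candidates if is_eligible(p)]
print(f"{len(enrolled)} of {len(candidates)} candidates met the criteria")
```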
HELPFUL HINT Researchers may not clearly identify the population under study, or the population is not clarified until the “Discussion” section when the effort is made to discuss the group (population) to which the study findings can be generalized.
Samples and sampling
Sampling is the selection of a portion or subset of the designated population that represents the entire population. A sample is a set of elements selected from the population; an element is the most basic unit about which information is collected. The most common element in nursing research is individuals, but other elements (e.g., places, objects) can form the basis of a sample or population. Example: ➤ A researcher was planning a study that investigated barriers that may underlie the decline in girls’ physical activity (PA), beginning at the onset of adolescence. Eight midwestern US schools were randomly assigned to either receive a multicomponent PA intervention called “Girls on the Move” or serve as a control. The schools, rather than the individual girls, were identified as the sampling units (Vermeesch et al., 2015). The purpose of sampling is to increase a study’s efficiency. If you think about it, you will realize that it is not feasible to examine every element in the population. When sampling is done properly, the researcher can draw inferences and make generalizations about the population without examining each element in the population. Sampling procedures identify specific selection criteria to ensure that the characteristics of the phenomena of interest will be, or are likely to be, present in all of the units being studied. The researcher’s efforts to ensure that the sample is representative of the target population strengthen the evidence generated by the sample, which allows the researcher to draw conclusions that are generalizable to the population and applicable to practice (see Chapter 8).
After having reviewed a number of research studies, you will recognize that samples and sampling procedures vary in terms of merit. The foremost criterion in appraising a sample is its representativeness. A representative sample is one whose key characteristics closely match those of the population. If 70% of the population in a study of child-rearing practices consisted of women and 40% were full-time employees, a representative sample should reflect these characteristics in the same proportions.
EVIDENCE-BASED PRACTICE TIP Consider whether the choice of participants was biased, thereby influencing the strength of evidence provided by the outcomes of the study.
Types of samples
Sampling strategies are generally grouped into two categories: nonprobability sampling and probability sampling. In nonprobability sampling, elements are chosen by nonrandom methods. The drawback of this strategy is that there is no way of estimating each element’s probability of being included in a particular sample. Essentially, there is no way of ensuring that every element has a chance for inclusion in a nonprobability sample.
Probability sampling uses some form of random selection when the sample is chosen. This type of sample enables the researcher to estimate the probability that each element of the population will be included in the sample. Probability sampling is the more rigorous type of sampling strategy and is more likely to result in a representative sample. A summary of sampling strategies appears in Table 12.1 and is discussed in the following sections.
TABLE 12.1 Summary of Sampling Strategies
EVIDENCE-BASED PRACTICE TIP Determining whether the sample is representative of the population being studied will influence your interpretation of the evidence provided by the findings and decision making about their relevance to the patient population and practice setting.
HELPFUL HINT A research article may not be explicit about the sampling strategy used. If the sampling strategy is not specified, assume that a convenience sample was used for a quantitative study and a purposive sample was used for a qualitative study.
Nonprobability sampling
Because of the lack of random selection, the findings of studies using nonprobability sampling are less generalizable than those using a probability sampling strategy, and they tend to produce less representative samples. When a nonprobability sample is carefully chosen to reflect the target population through the careful use of inclusion and exclusion criteria and adequate sample size, you can have more confidence in the sample’s representativeness and the external validity of the findings (see Chapter 8). The three major types of nonprobability sampling are convenience, quota, and purposive sampling strategies.
Convenience sampling
Convenience sampling is the use of the most readily accessible persons or objects as subjects. The subjects may include volunteers, the first 100 patients admitted to hospital X with a particular diagnosis, all of the people enrolled in program Y during the month of September, or all of the students enrolled in course Z at a particular university during 2014. The subjects are convenient and accessible to the researcher and are thus called a convenience sample. Example: ➤ A study evaluating an NP-led intensive behavioral treatment program for obesity implemented in an adult primary care practice used a convenience sample of obese adults (18 years and older) who were primary care patients of a patient-centered medical home (PCMH) practice, met the eligibility criteria, and volunteered to participate in the study (Thabault et al., 2016).
The advantage of a convenience sample is that generally it is easier to obtain subjects. The researcher will still have to be concerned with obtaining a sufficient number of subjects who meet the inclusion criteria. The major disadvantage of a convenience sample is that the risk of bias is greater than in any other type of sample (see Table 12.1). The fact that convenience samples use voluntary participation increases the probability of researchers recruiting those people who feel strongly about the issue being studied, which may favor certain outcomes. In this case, ask yourself the following as you think about the strength and quality of evidence contributed by the sampling component of a study:
• What motivated some people to participate and others not to participate (self-selection)?
• What kind of data would have been obtained if nonparticipants had also responded?
• How representative are the people who did participate in relation to the population?
• What kind of confidence can you have in the evidence provided by the findings?
Researchers may recruit subjects in clinic settings, stop people on a street corner to ask their opinion on some issue, place advertisements in the newspaper, or place signs in local churches, community centers, or supermarkets, indicating that volunteers are needed for a particular study. To assess the degree to which a convenience sample approximates a random sample, the researcher checks for the representativeness of the convenience sample by comparing the sample to population percentages and, in that way, assesses the extent to which bias is or is not evident (Sousa et al., 2004).
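One way to operationalize this comparison of sample and population percentages is a chi-square goodness-of-fit test, as sketched below. The population mix and sample counts are hypothetical, and this is only one of several ways such a representativeness check might be run.

```python
from scipy.stats import chisquare

# Hypothetical check of a convenience sample against known population
# percentages. Suppose the population is 60% female / 40% male, and a
# convenience sample of 200 contains 140 women and 60 men.
observed = [140, 60]
expected = [0.60 * 200, 0.40 * 200]  # counts expected under population mix

stat, p = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
# A small p-value suggests the sample's makeup differs from the
# population's, flagging possible selection bias.
```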
Because acquiring research subjects is a problem that confronts many researchers, innovative recruitment strategies may be used. A unique method of accessing and recruiting subjects is the use of online computer networks (e.g., disease-specific chat rooms, blogs, and bulletin boards). Example: ➤ In the study by Traeger et al. (2015) that implemented a nursing intervention to enhance outpatient chemotherapy symptom management, trained staff screened chemotherapy schedules and electronic health record data to identify all potential participants. When you appraise a study, you should recognize that the convenience sampling strategy, although most common, is the weakest sampling strategy with regard to strength of evidence and generalizability (external validity) unless it is followed by random assignment to groups, as you will find in studies that are randomized clinical trials (RCTs) (see Chapter 9). When a convenience sample is used, caution should be exercised in interpreting the data and assessing the researcher’s comments about the external validity and applicability of the findings (see Chapter 8).
Quota sampling
Quota sampling refers to a form of nonprobability sampling in which subjects who meet the inclusion criteria are recruited and consecutively enrolled until the target sample size is reached. The study by Traeger and colleagues provides an example of quota sampling when trained study coordinators approached eligible chemotherapy patients during their first chemotherapy visit to introduce the study, obtained informed consent, and enrolled interested and eligible consecutive patients until the target enrollment was reached.
Sometimes knowledge about the population of interest is used to build some representativeness into the sample (see Table 12.1). A quota sample can identify the strata of the population and proportionally represents the strata in the sample. Example: ➤ The data in Table 12.2 reveal that 40% of the 5000 nurses in city X are associate degree graduates, 20% are 4-year baccalaureate degree graduates, and 40% are accelerated second-degree baccalaureate graduates. Each stratum of the population should be proportionately represented in the sample. In this case, the researcher used a proportional quota sampling strategy and decided to sample 10% of a population of 5000 (i.e., 500 nurses). Based on the proportion of each stratum in the population, 200 associate degree graduates, 100 4-year baccalaureate graduates, and 200 accelerated baccalaureate graduates were the quotas established for the three strata. The researcher recruited subjects who met the study’s eligibility criteria until the quota for each stratum was filled. In other words, once the researcher obtained the necessary 200 associate degree graduates, 100 4-year baccalaureate degree graduates, and 200 accelerated baccalaureate degree graduates, the sample was complete.
TABLE 12.2 Numbers and Percentages of Students in Strata of a Quota Sample of 5000 Graduates of Nursing Programs in City X

Strata                                Population n (%)    Strata quota (10% sample)
Associate degree graduates            2000 (40%)          200
4-year baccalaureate graduates        1000 (20%)          100
Accelerated baccalaureate graduates   2000 (40%)          200
Total                                 5000 (100%)         500
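The proportional quota arithmetic described above is simple enough to sketch in a few lines of Python; this reproduces the 200/100/200 quotas from the example.

```python
# Recomputing the proportional quotas described above: 10% of a
# population of 5000 nurses, with strata of 40% associate degree,
# 20% 4-year baccalaureate, and 40% accelerated baccalaureate graduates.
population = 5000
sampling_fraction = 0.10
strata = {
    "associate degree": 0.40,
    "4-year baccalaureate": 0.20,
    "accelerated baccalaureate": 0.40,
}

sample_size = int(population * sampling_fraction)  # 500
quotas = {name: int(sample_size * prop) for name, prop in strata.items()}
print(quotas)
# {'associate degree': 200, '4-year baccalaureate': 100,
#  'accelerated baccalaureate': 200}
```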
The characteristics chosen to form the strata are selected according to a researcher’s knowledge of the population and the literature review. The criterion for selection should be a variable that reflects important differences in the dependent variables under investigation. Age, gender, religion, ethnicity, medical diagnosis, socioeconomic status, level of completed education, and occupational rank are among the variables that are likely to be important stratifying variables in nursing research studies.
The researcher systematically ensures that proportional segments of the population are included in the sample. The quota sample is not randomly selected (i.e., once the proportional strata have been identified, the researcher recruits and enrolls subjects until the quota for each stratum has been filled) but does increase the sample’s representativeness. This sampling strategy addresses the problem of overrepresentation or underrepresentation of certain segments of a population in a sample.
As you critically appraise a study, your aim is to determine whether the sample strata appropriately reflect the population under consideration and whether the stratifying variables are homogeneous enough to ensure a meaningful comparison of differences among strata. Establishment of strict inclusion and exclusion criteria and using power analysis to determine appropriate sample size increase the rigor of a quota sampling strategy by creating homogeneous subject categories that facilitate making meaningful comparisons across strata.
Purposive sampling
Purposive sampling is a common strategy. The researcher selects subjects who are considered to be typical of the population. Purposive sampling can be found in both quantitative and qualitative studies. When a researcher is considering the sampling strategy for a randomized clinical trial focusing on a specific diagnosis or patient population, the sampling strategy is often purposive in nature. In such studies the researcher first purposively selects subjects who are then randomized to groups.
Purposive sampling is commonly used in qualitative research studies. Example: ➤ The objective of the qualitative study by van Dijk et al. (2015) was to examine how patients assign a number to their currently experienced postoperative pain. They selected a purposive sample of patients who had surgery the day before and were experiencing postoperative pain with a score of at least 4 on the Numeric Rating Scale (NRS). Subjects were selected until the new information obtained did not provide further insight into the themes or no new themes emerged (data saturation; see Chapters 5, 6, and 14). A purposive sample is used also when a highly unusual group is being studied, such as a population with a rare genetic disease (e.g., Huntington chorea). In this case, the researcher would describe the sample characteristics precisely to ensure that the reader will have an accurate picture of the subjects in the sample.
Today, computer networks (e.g., online services) can be a valuable resource in helping researchers access and recruit subjects for purposive samples. Online support group bulletin boards that facilitate recruitment of subjects for purposive samples exist for people with cancer, rheumatoid arthritis, multiple sclerosis, human immunodeficiency virus/acquired immunodeficiency syndrome (HIV/AIDS), postpartum depression, Lyme disease, and many others.
The researcher who uses a purposive sample assumes that errors of judgment in overrepresenting or underrepresenting elements of the population in the sample will tend to balance out. As indicated in Table 12.1, there may be conscious bias in the selection of subjects; the ability to generalize from the evidence provided by the findings is very limited. Box 12.1 lists examples of when a purposive sample may be appropriate.
BOX 12.1 Criteria for Use of a Purposive Sampling Strategy
• Effective pretesting of newly developed instruments with a purposive sample of divergent types of people
• Validation of a scale or test with a known-group technique
• Collection of exploratory data in relation to an unusual or highly specific population, particularly when the total target population remains an unknown to the researcher
• Collection of descriptive data (e.g., as in qualitative studies) that seek to describe the lived experience of a particular phenomenon (e.g., postpartum depression, caring, hope, surviving childhood sexual abuse)
• Focus of the study population relates to a specific diagnosis (e.g., type 1 diabetes, ovarian cancer) or condition (e.g., legal blindness, terminal illness) or demographic characteristic (e.g., same-sex twin pairs)
Network sampling
Network sampling, sometimes referred to as snowballing, is used for locating samples that are difficult or impossible to locate in other ways. This strategy takes advantage of social networks and the fact that friends tend to have characteristics in common. When a few subjects with the necessary eligibility criteria are found, the researcher asks for their assistance in getting in touch with others with similar criteria. Example: ➤ Online computer networks, as described in the section on purposive sampling and in this last example, can be used to assist researchers in acquiring otherwise difficult-to-locate subjects, thereby taking advantage of the networking or snowball effect. In a study that aimed to gain consensus from experts on the priorities for clinical nursing and midwifery research in southern and eastern African countries, the researchers used contacts with networks of regional nursing colleagues and leaders (snowball sampling) to compile a list of potential research experts who met the inclusion criteria and agreed to participate by responding to the Delphi research priority survey. To expand their network of experts, they asked survey respondents for referrals to others who met the criteria and might be willing to participate. Surveys were sent to the new potential participants who were identified (Sun et al., 2015).
HELPFUL HINT When convenience or purposive sampling is used as the first step in recruiting a sample for a randomized clinical trial, as illustrated in Fig. 12.1, it is followed by random assignment of subjects to an intervention or control group, which increases the generalizability of the findings.
Probability sampling
The primary characteristic of probability sampling is the random selection of elements from the population. Random selection occurs when each element of the population has an equal and independent chance of being included in the sample. When probability sampling is used, you have greater confidence that the sample is representative of the population being studied rather than biased. Three commonly used probability sampling strategies are simple random, stratified random, and cluster.
Random selection of sample subjects should not be confused with randomization or random assignment of subjects. The latter, discussed earlier in this chapter and in Chapter 8, refers to the assignment of subjects to either an experimental or a control group on a random basis. Random assignment is most closely associated with RCTs.
Simple random sampling
Simple random sampling is a carefully controlled process. The researcher defines the population (a set), lists all of the units of the population (a sampling frame), and selects a sample of units (a subset) from the sampling frame. Example: ➤ If American hospitals specializing in the treatment of cancer were the sampling unit, a list of all such hospitals would be the sampling frame. If certified school nurses constituted the accessible population, a list of those nurses would be the sampling frame.
Once a list of the population elements has been developed, the best method of selecting a random sample is to use a computer program that generates the order in which the random selection of subjects is to be carried out.
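A minimal sketch of such computer-generated selection, using Python's standard random module on a hypothetical sampling frame, follows:

```python
import random

# Simple random sampling: every element of the sampling frame has an
# equal, independent chance of selection. The frame here is a
# hypothetical list of IDs standing in for a real enumerated population.
sampling_frame = [f"nurse_{i:04d}" for i in range(1, 2001)]  # 2000 nurses

random.seed(42)  # fixed seed so the draw is reproducible/auditable
sample = random.sample(sampling_frame, k=150)  # draw 150 without replacement

print(sample[:5])
```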
The advantages of simple random sampling are as follows:
• Sample selection is not subject to the conscious biases of the researcher.
• Representativeness of the sample in relation to the population characteristics is maximized.
• Differences in the characteristics of the sample and the population are purely a function of chance.
• Probability of choosing a nonrepresentative sample decreases as the size of the sample increases.
Example: ➤ Simple random sampling was used in a study testing the feasibility of collecting hair for cortisol measurement from a probability sample of 516 racially and socioeconomically diverse urban adolescents aged 11 to 17 years participating in a larger prospective study on adolescent health and well-being (Ford et al., 2016). The sampling frame was based on a combination of eligible households and public school data from the study area. The addresses were sorted by zip code, and random replicates of 500 participants were drawn. The randomly selected households were contacted to solicit participation in the study.
The major disadvantage of simple random sampling is that it can be a time-consuming and inefficient method of obtaining a random sample. Example: ➤ Consider the task of listing all of the baccalaureate nursing students in the United States. With random sampling, it may also be impossible to obtain an accurate or complete listing of every element in the population. Example: ➤ Imagine trying to obtain a list of all suicides in New York City for the year 2016. It often is the case that although suicide may have been the cause of death, another cause (e.g., cardiac failure) appears on the death certificate. It would be difficult to estimate how many elements of the target population would be eliminated from consideration, and the issue of bias would enter the picture despite the researcher's best efforts. In the final analysis, you, as the evaluator of a research article, must be cautious about generalizing from findings, even when random sampling is the stated strategy, particularly if the target population has been difficult or impossible to list completely.
EVIDENCE-BASED PRACTICE TIP When thinking about applying study findings to your clinical practice, consider whether the participants making up the sample are similar to your own patients.
Stratified random sampling
Stratified random sampling requires that the population be divided into strata or subgroups as illustrated in Fig. 12.1. The subgroups or subsets that the population is divided into are homogeneous. An appropriate number of elements from each subset are randomly selected on the basis of their proportion in the population. The goal of this strategy is to achieve a greater degree of representativeness. Stratified random sampling is similar to the proportional stratified quota sampling strategy discussed earlier in the chapter. The major difference is that stratified random sampling uses a random selection procedure for obtaining sample subjects.
The population is stratified according to any number of attributes, such as age, gender, ethnicity, religion, socioeconomic status, or level of education completed. The variables selected to form the strata should be adaptable to homogeneous subsets with regard to the attributes being studied. Example: ➤ A study by Wong et al. (2016) examined whether high-comorbidity patients had larger increases in primary care provider (PCP) visits attributable to patient-centered medical home (PCMH) implementation in a large integrated health system in comparison to other patients enrolled in primary care. The data were obtained from the Veterans Health Administration (VHA) Corporate Data Warehouse (CDW), which contains comprehensive administrative data tracking patient utilization, demographics, and clinical measures including ICD-9 diagnostic codes. For each quarter of the study, they identified a 1% random sample of all VHA primary care patients in the database that quarter. The final sample consisted of 8.4 million patient-quarter observations. All analyses were stratified by age group (under 65 and age 65+), comorbidity burden score, and outpatient visits.

As illustrated in Table 12.1, the advantages of a stratified random sampling strategy include the following: (1) the representativeness of the sample is enhanced; (2) the researcher has a valid basis for making comparisons among subsets; and (3) the researcher is able to oversample a disproportionately small stratum, statistically weight the data to adjust for its underrepresentation, and still make legitimate comparisons. The obstacles encountered by a researcher using this strategy include (1) the difficulty of obtaining a population list containing complete critical variable information, (2) the time-consuming effort of obtaining multiple enumerated lists, (3) the challenge of enrolling proportional strata, and (4) the time and money involved in carrying out a large-scale study using a stratified sampling strategy.
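A minimal sketch of proportional stratified random sampling, again in Python with a hypothetical frame, shows the logic: group the frame into strata, then draw a simple random sample from each stratum in proportion to its share of the population.

import random
from collections import defaultdict

random.seed(7)

# Hypothetical frame of (subject, stratum) pairs; the stratum is an
# attribute such as age group.
frame = [(f"subject_{i}", random.choice(["18-34", "35-64", "65+"]))
         for i in range(5000)]

strata = defaultdict(list)
for subject, stratum in frame:
    strata[stratum].append(subject)

total_n = 300  # desired overall sample size
sample = []
for members in strata.values():
    # Allocate slots in proportion to the stratum's share of the frame,
    # then sample randomly within the stratum. (Rounding can make the
    # total differ from total_n by a subject or two.)
    n_stratum = round(total_n * len(members) / len(frame))
    sample.extend(random.sample(members, n_stratum))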
Multistage sampling (cluster sampling)
Multistage (cluster) sampling involves a successive random sampling of units (clusters) that progress from large to small and meet sample eligibility criteria. The first-stage sampling unit consists of large units or clusters. The second-stage sampling unit consists of smaller units or clusters. Third-stage sampling units are even smaller. Example: ➤ If a sample of critical care nurses is desired, the first sampling unit would be a random sample of hospitals, obtained from an American Hospital Association list, that meet the eligibility criteria (e.g., size, type). The second-stage sampling unit would consist of a list of critical care nurses practicing at each hospital selected in the first stage (i.e., the list obtained from the vice president for nursing at each hospital). The criteria for inclusion in the list of critical care nurses would be as follows:
1. Certified as a Certified Critical Care Registered Nurse (CCRN) with at least 3 years’ experience as a critical care nurse
2. At least 75% of the CCRN’s time spent in providing direct patient care in a critical care unit
3. Full-time employment at the hospital
The second stage would then consist of a random selection of 10 CCRNs from each hospital who met the previously mentioned eligibility criteria.
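To make the successive stages concrete, here is a minimal Python sketch of this two-stage selection; the number of hospitals and the roster sizes are hypothetical.

import random

random.seed(11)

# Stage 1 frame: hospitals meeting the eligibility criteria (hypothetical).
hospitals = [f"hospital_{i:03d}" for i in range(200)]

# Stage 2 frames: a roster of eligible CCRNs at each hospital (hypothetical).
rosters = {h: [f"{h}_ccrn_{j:02d}" for j in range(40)] for h in hospitals}

# First stage: a random sample of clusters (hospitals).
stage1 = random.sample(hospitals, k=20)

# Second stage: 10 randomly selected eligible CCRNs per selected hospital,
# yielding 200 nurses drawn from 20 hospitals.
stage2 = [nurse for h in stage1 for nurse in random.sample(rosters[h], k=10)]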
When multistage sampling is used in relation to large national surveys, states are used as the first-stage sampling unit, followed by successively smaller units such as counties, cities, districts, and blocks as the second-stage sampling unit, and finally households as the third-stage sampling unit. Sampling units or clusters can be selected by simple random or stratified random sampling methods. Example: ➤ Sun et al. (2015) conducted a survey using the Delphi method to gain consensus about regional clinical nursing and midwifery research priorities from experts in participating eastern and southern African countries. Clinical nursing and midwifery experts from 13 countries participated in the first round of the survey by completing the electronic survey, and experts from 14 countries participated in the second round. This approach was chosen because the electronic format facilitates obtaining consensus from a large panel of experts in a wide geographic region by providing anonymity, eliminating the potential for leaders to dominate the process, and giving participants a chance in Round 2 to change their minds after considering the group opinion.

The main advantage of cluster sampling, as illustrated in Table 12.1, is that it can be more economical in terms of time and money than other types of probability sampling. There are two major disadvantages: (1) more sampling errors tend to occur than with simple random or stratified random sampling, and (2) appropriate handling of the statistical data from cluster samples is very complex. When you are critically appraising a study, you will need to consider whether the use of cluster sampling is justified in light of the research design, as well as other pragmatic matters, such as economy.
EVIDENCE-BASED PRACTICE TIP The sampling strategy, whether probability or nonprobability, must be appropriate to the design and evaluated in relation to the level of evidence provided by the design.
CRITICAL THINKING DECISION PATH Assessing the Relationship Between the Type of Sampling Strategy and the Appropriate Generalizability
The Critical Thinking Decision Path illustrates the relationship between the type of sampling strategy and the appropriate generalizability.
Sample size
There is no single rule that can be applied to the determination of a sample's size. When arriving at an estimate of sample size, many factors, such as the following, must be considered:
• Type of design
• Type of sampling procedure
• Type of formula used for estimating optimum sample size
• Degree of precision required
• Heterogeneity of the attributes under investigation
• Relative frequency that the phenomenon of interest occurs in the population (i.e., a common versus a rare health problem)
• Projected cost of using a particular sampling strategy
HELPFUL HINT Look for a brief discussion of a study's sampling strategy in the "Methods" section of a research article. Sometimes there is a separate subsection with the heading "Sample," "Subjects," or "Study Participants." A statistical description of the characteristics of the actual sample often does not appear until the "Results" section of a research article. You may also find a table in the Results section that summarizes the sample characteristics using descriptive statistics (see Chapter 16).
The sample size should be determined before a study is conducted. A general rule is always to use the largest sample possible. The larger the sample, the more representative of the population it is likely to be; smaller samples produce less accurate results.
One exception to this principle occurs when using qualitative designs. In this case, sample size is not predetermined. Sample sizes in qualitative research tend to be small because of the large volume of verbal data that must be analyzed and because this type of design tends to emphasize intensive and prolonged contact with subjects (Speziale & Carpenter, 2011). Subjects are added to the sample until data saturation is reached (i.e., new data no longer emerge during the data-collection process). Fittingness of the data is a more important concern than representativeness of subjects (see Chapters 5, 6, and 7).
Another exception is in the case of a pilot study, which is defined as a small sample study conducted as a prelude to a larger-scale study that is often called the "parent study." The pilot study is typically a smaller version of the parent study, with similar methods and procedures that yield preliminary data to determine the feasibility of conducting a larger-scale study and establish that sufficient scientific evidence exists to justify subsequent, more extensive research.
The principle of "larger is better" holds true for both probability and nonprobability samples. Results based on small samples (under 10) tend to be unstable; the values fluctuate from one sample to the next, and it is difficult to apply statistics meaningfully. Small samples tend to increase the probability of obtaining a markedly nonrepresentative sample. As the sample size increases, the mean more closely approximates the population values, thus reducing sampling error.
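The instability of small samples is easy to demonstrate by simulation. The following Python sketch uses a hypothetical simulated population; it draws repeated samples of increasing size and shows that the sample means fluctuate less around the population mean as n grows.

import random
import statistics

random.seed(3)

# Hypothetical population of 10,000 values with a known mean.
population = [random.gauss(50, 15) for _ in range(10000)]
true_mean = statistics.mean(population)

for n in (10, 50, 200, 1000):
    # Draw 500 samples of size n and record each sample mean.
    means = [statistics.mean(random.sample(population, n)) for _ in range(500)]
    print(f"n={n:4d}: SD of sample means = {statistics.stdev(means):.2f} "
          f"(population mean = {true_mean:.1f})")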
HIGHLIGHT Remember to have your interprofessional Journal Club evaluate the appropriateness of the generalizations made about the studies you critically appraise in light of the sampling procedure and any sources of bias that affect applicability of the findings to your patient population.
It is possible to estimate the sample size needed with the use of a statistical procedure known as power analysis (Cohen, 1988). Power analysis is an advanced statistical technique that is commonly used by researchers and is a requirement for external funding. When it is not used, you will have less confidence in the findings because the study may be based on a sample that is too small. If the sample is too small, a researcher risks committing a type II error, accepting a null hypothesis that should have been rejected (see Chapter 16). No matter how high a research design is located on the evidence hierarchy (e.g., Level II—experimental design consisting of a randomized clinical trial), the findings of a study and their generalizability are weakened when power analysis is not used to ensure an adequate sample size for detecting the effect of the intervention.
It is beyond the scope of this chapter to describe this complex procedure in great detail, but a simple example will illustrate its use. Nyamathi and colleagues (2015) wanted to assess the impact of three interventions (peer coaching with nurse case management [PC-NCM], peer coaching [PC], and usual care [UC]) on completion of the hepatitis A and B vaccination series. How would a research team such as Nyamathi and colleagues know the appropriate number of subjects that should be used in the study? When using power analysis, the researcher must estimate how large an impact (effect) will be observed between the three intervention groups (i.e., to test differences among the PC-NCM, PC, and UC groups in terms of vaccination completion rates). If a moderate difference is expected, a conventional effect size of .20 is assumed. With a significance level of .05, a total of 114 participants would be needed for each intervention group to detect a statistically significant difference between the groups with a power of .80. The total sample in this study (n = 600) exceeded the minimum number of 114 per intervention group.
HELPFUL HINT Remember to evaluate the appropriateness of the generalizations made about the study findings in light of the target population, the accessible population, the type of sampling strategy, and the sample size.
When calculating sample size using power analysis, the total sample size needs to allow for attrition, or dropouts, by building in approximately 15% extra subjects so that the ability to detect differences between groups, or the effect of an intervention, remains intact. For example, if 114 subjects per group are required, enrolling about 131 per group (114 × 1.15) protects against a 15% dropout rate. When expected differences are large, it does not take a very large sample to ensure that differences will be revealed through statistical analysis.
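As a rough illustration of how such a calculation is run in practice, here is a sketch that assumes the Python statsmodels library and a generic two-group t-test. Because the required n depends on the statistical test and outcome assumed, this generic calculation will not reproduce the 114 per group reported for the three-group vaccination trial; it only shows the mechanics.

import math
from statsmodels.stats.power import TTestIndPower

# Generic two-group comparison: effect size (Cohen's d) of .20,
# significance level .05, desired power .80 -- illustration only.
n_per_group = TTestIndPower().solve_power(effect_size=0.20,
                                          alpha=0.05,
                                          power=0.80,
                                          alternative="two-sided")
print(math.ceil(n_per_group))         # about 394 per group under these assumptions

# Inflate by roughly 15% so expected attrition does not erode power.
print(math.ceil(n_per_group * 1.15))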
When critically appraising a study, you should evaluate the sample size in terms of the following: (1) how representative the sample is relative to the target population, and (2) to whom the researcher wishes to generalize the study’s results. The goal is to have a sample as representative as possible with as little sampling error as possible. Unless representativeness is ensured, all the data in the world become inconsequential. When an appropriate sample size, including power analysis for calculation of sample size, and sampling strategy have been used, you can feel more confident that the sample is representative of the accessible population rather than biased (Fig. 12.2) and the potential for generalizability of findings is greater (see Chapter 8).
FIG 12.2 Summary of general sampling procedure.
EVIDENCE-BASED PRACTICE TIP Research designs and types of samples are often linked. When a nonprobability purposive sampling strategy is used to recruit participants to a study using an experimental design, you would expect random assignment of subjects to an intervention or control group to follow.
Appraisal for evidence-based practice: Sampling
The criteria for critical appraisal of a study's sample are presented in the Critical Appraisal Criteria box. As you evaluate the sample section of a study, you must raise two questions:
1. If this study were to be replicated, is enough information presented about the nature of the population, the sample, the sampling strategy, and the sample size for another investigator to carry out the study?
2. What are the sampling threats to internal and external validity that are sources of bias?
The answers to these questions highlight the important link of the sample to the findings and the strength of the evidence used to make clinical decisions about the applicability of the findings to clinical practice (see Chapter 8).
In Chapter 8, we talked about how selection effect as a threat to internal validity could occur in studies where a convenience, quota, or purposive sampling strategy was used. In these studies, individuals themselves decide whether or not to participate. Subject mortality, or attrition, is another threat to internal validity related to sampling (see Chapter 8). Mortality is the loss of subjects from the study, usually from the first data-collection point to the second. If the subjects who remain in the study are different from those who drop out, the results can be affected. When more of the subjects in one group drop out than in the other group, the results can also be influenced. It is common for journals to require authors reporting on research results to include a flow chart that diagrams the screening, recruitment, enrollment, random assignment, and attrition process and results.

Threats to external validity related to sampling are concerned with the generalizability of the results to other populations. Generalizability depends on who actually participates in a study. Not everyone who is approached meets the inclusion criteria, agrees to enroll, or completes the study. Bias in sample representativeness and generalizability of findings are important sampling issues that have generated national concern because these factors decrease confidence in the evidence provided by the findings and limit applicability. Historically, many of the landmark adult health studies (e.g., the Framingham heart study, the Baltimore longitudinal study on aging) excluded women as subjects. Nevertheless, the findings of these studies were generalized from males to all adults, despite the lack of female representation in the samples. Similarly, the use of largely European-American subjects in clinical trials limits the identification of variant responses to interventions or drugs in ethnic or racially distinct groups (Ward, 2003). Findings based on European-American data cannot be generalized to African Americans, Asians, Hispanics, or any other cultural group.
CRITICAL APPRAISAL CRITERIA Sampling
1. Have the sample characteristics been completely described?
2. Can the parameters of the study population be inferred from the description of the sample?
3. To what extent is the sample representative of the population as defined?
4. Are the eligibility/inclusion criteria for the sample clearly identified?
5. Have sample exclusion criteria/delimitations for the sample been established?
6. Would it be possible to replicate the study population?
7. How was the sample selected? Is the method of sample selection appropriate?
8. What kind of bias, if any, is introduced by this sampling method?
9. Is the sample size appropriate? How is it substantiated?
10. Are there indications that rights of subjects have been ensured?
11. Does the researcher identify limitations in generalizability of the findings from the sample to the population? Are they appropriate?
12. Is the sampling strategy appropriate for the design of the study and level of evidence provided by the design?
13. Does the researcher indicate how replication of the study with other samples would provide increased support for the findings?
When appraising the sample of a study, you must remember that despite the use of a carefully controlled sampling procedure that minimizes error, there is no guarantee that the sample will be representative. Factors such as sample heterogeneity and subject dropout may jeopardize the representativeness of the sample despite the most stringent random sampling procedure.
When a purposive sample is used in experimental and quasi-experimental studies, you should determine whether or how the subjects were randomly assigned to groups. If criteria for random assignment have not been followed, you have a valid basis for being cautious about the strength of evidence provided by the proposed conclusions of the study.
Although random selection may be the ideal in establishing the representativeness of a study population, more often realistic barriers (e.g., institutional policy, inaccessibility of subjects, lack of time or money, and the current state of knowledge in the field) necessitate the use of nonprobability sampling strategies. Many important research questions that are of interest to nursing do not lend themselves to probability sampling. A well-designed, carefully controlled study using a nonprobability sampling strategy can yield accurate and meaningful evidence that makes a significant contribution to nursing's scientific body of knowledge.
The greatest difficulty in nonprobability sampling stems from the fact that not every element in the population has an equal chance of being represented. Therefore it is likely that some segment of the population will be systematically underrepresented. If the population is homogeneous on critical characteristics, such as age, gender, socioeconomic status, and diagnosis, systematic bias will not be very important. Few of the attributes that researchers are interested in, however, are sufficiently homogeneous to make sampling bias an irrelevant consideration.
Basically, you will decide whether the sample size for a quantitative study is appropriate and justifiable. You want to make sure that the researcher indicated how the sample size was determined. The method of arriving at the sample size and the rationale should be briefly mentioned. In a study designed to examine the role of resilience in the relationships of hallucination- and delusion-like experiences to psychological distress in a nonclinical population, the sample was selected through a stratified cluster sampling procedure (Barahmand & Ahmad, 2016). The sampling frame consisted of 11,000 students. The sample size calculation indicated that, based on a 5% margin of error, a 95% confidence level, and an expected sample proportion of 50%, a sample of at least 372 individuals (3.38% of the population) was needed to detect a significant difference. To allow for data loss through mortality or incomplete answers, a sample of 440 individuals (4% of the population) was enrolled in the study.
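The reported figures are consistent with the standard formula for estimating a proportion with a finite population correction; the Python sketch below assumes that this is the formula the authors used.

import math

z = 1.96     # z-value for a 95% confidence level
p = 0.50     # expected sample proportion (the most conservative choice)
e = 0.05     # 5% margin of error
N = 11000    # size of the sampling frame (students)

n0 = (z**2) * p * (1 - p) / e**2   # infinite-population estimate, about 384
n = n0 / (1 + (n0 - 1) / N)        # finite population correction
print(math.ceil(n))                # 372, matching the reported minimum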
When appraising qualitative research designs, you also apply criteria related to sampling strategies that are relevant for the particular type of qualitative study. In general, sampling strategies for qualitative studies are purposive, because the study of specific phenomena in their natural setting is emphasized; any subject belonging to a specified group is considered to represent that group. Keep in mind that qualitative studies will not discuss predetermining sample size or a method of arriving at sample size. Rather, sample size will tend to be small and a function of data saturation. Finally, evidence that the rights of human subjects have been protected should appear in the "Sample" section of the research report, usually consisting of no more than one sentence. Remember to evaluate whether permission was obtained from an institutional review board that reviewed the study relative to the maintenance of ethical research standards (see Chapter 13).
Key points
• Sampling is a process that selects representative units of a population for study. Researchers sample representative segments of the population because it is rarely feasible or necessary to sample entire populations of interest to obtain accurate and meaningful information.
• Researchers establish eligibility criteria; these are descriptors of the population and provide the basis for selection of a sample. Eligibility criteria, which are also referred to as delimitations, include the following: age, gender, socioeconomic status, level of education, religion, and ethnicity.
• The researcher must identify the target population (i.e., the entire set of cases about which the researcher would like to make generalizations). Because of the pragmatic constraints, however, the researcher usually uses an accessible population (i.e., one that meets the population criteria and is available).
• A sample is a set of elements selected from the population.
• A sampling unit is the element or set of elements used for selecting the sample. The foremost criterion in appraising a sample is the representativeness or congruence of characteristics with the population.
• Sampling strategies consist of nonprobability and probability sampling.
• In nonprobability sampling, the elements are chosen by nonrandom methods. Types of nonprobability sampling include convenience, quota, and purposive sampling.
• Probability sampling is characterized by the random selection of elements from the population. In random selection, each element in the population has an equal and independent chance of being included in the sample. Types of probability sampling include simple random, stratified random, and multistage sampling.
• Sample size is a function of the type of sampling procedure being used, the degree of precision required, the type of sample estimation formula being used, the heterogeneity of the study attributes, the relative frequency of occurrence of the phenomena under consideration, and cost.
• Criteria for drawing a sample vary according to the sampling strategy. Systematic organization of the sampling procedure minimizes bias. The target population is identified, the accessible portion of the target population is delineated, permission to conduct the research study is obtained, and a sampling plan is formulated.
• When critically appraising a research report, the sampling plan needs to be evaluated for its appropriateness in relation to the particular research design and level of evidence generated by the design.
• Completeness of the sampling plan is examined in light of potential replicability of the study. The critiquer appraises whether the sampling strategy is the strongest plan for the particular study under consideration.
• An appropriate systematic sampling plan will maximize the efficiency of a research study. It will increase the strength, accuracy, and meaningfulness of the evidence provided by the findings and enhance the generalizability of the findings from the sample to the population.
Critical thinking challenges
• How do inclusion and exclusion criteria contribute to increasing the strength of evidence provided by the sampling strategy of a research study?
• Why is it important for a researcher to use power analysis to calculate sample size? How does adequate sample size affect subject mortality, representativeness of the sample, the researcher’s ability to detect a treatment effect, and your ability to generalize from the study findings to your patient population?
• How does a flow chart such as the one in Fig. 12.1 of the Thomas article in Appendix A contribute to the strength and quality of evidence provided by the findings of a research study and their potential applicability to practice?
• Your interprofessional team member argues that a random sample is always better, even if it is small and represents ONLY one site. Another team member counters that a very large convenience sample with random assignment to groups representing multiple sites can be very significant. Which colleague would you defend and why? How would each scenario affect the strength and quality of evidence provided by the findings?
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
References
1. Barahmand U., Ahmad R. H. S. Psychotic-like experiences and psychological distress: The role of resilience. Journal of the American Psychiatric Nurses Association 2016;22(4):312-319.
2. Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. New York, NY: Academic Press; 1988.
3. Ford J. L., Boch S. J., McCarthy D. O. Feasibility of hair collection for cortisol measurement in population research on adolescent health. Nursing Research 2016;65(3):249-255.
4. Nyamathi A., Salem B. E., Zhang S., et al. Nursing case management, peer coaching, and hepatitis A and B vaccine completion among homeless men recently released on parole: Randomized clinical trial. Nursing Research 2015;64(3):177-189.
5. Sousa V. D., Zauszniewski J. A., Musil C. M. How to determine whether a convenience sample represents the population. Applied Nursing Research 2004;17(2):130-133.
6. Speziale S., Carpenter D. R. Qualitative research in nursing. 4th ed. Philadelphia, PA: Lippincott; 2011.
7. Sun C., Dohrn J., Klopper H., et al. Clinical nursing and midwifery research priorities in eastern and southern African countries: Results from a Delphi survey. Nursing Research 2015;64(6):466-475.
8. Thabault P. J., Burke P. J., Ades P. A. Intensive behavioral treatment weight loss program in an adult primary care practice. Journal of the American Association of Nurse Practitioners 2016;28:249-257.
9. Traeger L., McDonnell T. M., McCarty C. E., et al. Nursing intervention to enhance outpatient chemotherapy symptom management: Patient-reported outcomes of a randomized controlled trial. Cancer 2015;121:3905-3913.
10. Van Dijk J. F. M., Vervoort S. C. J. M., van Wijck A. J. M., et al. Postoperative patients' perspectives on rating pain: A qualitative study. International Journal of Nursing Studies 2016;53:260-269.
11. Vermeesch A. L., Ling J., Voskuil V. R., et al. Biological and sociocultural differences in perceived barriers to physical activity among fifth- to seventh-grade urban girls. Nursing Research 2015;64(5):342-350.
12. Ward L. S. Race as a variable in cross-cultural research. Nursing Outlook 2003;51(3):120-125.
13. Wong E. S., Rosland A. M., Fihn S. D., Nelson K. M. Patient-centered medical home implementation in the Veterans Health Administration and primary care use: Differences by patient comorbidity burden. Journal of General Internal Medicine 2016;31(12):1467-1474.
CHAPTER 13
Legal and ethical issues Judith Haber, Geri LoBiondo-Wood
Learning outcomes
After reading this chapter, you should be able to do the following:
• Describe the historical background that led to the development of ethical guidelines for the use of human subjects in research.
• Identify the essential elements of an informed consent form.
• Evaluate the adequacy of an informed consent form.
• Describe the institutional review board's role in the research review process.
• Identify populations of subjects who require special legal and ethical research considerations.
• Describe the nurse's role as patient advocate in research situations.
• Critique the ethical aspects of a research study.
KEY TERMS
anonymity
assent
beneficence
confidentiality
consent
ethics
informed consent
institutional review boards
justice
respect for persons
risk/benefit ratio
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
The focus of this chapter is the legal and ethical considerations that must be addressed before, during, and after the conduct of research. Informed consent, institutional review boards (IRBs), and research involving vulnerable populations—elderly people, pregnant women, children, and prisoners—are discussed. The nurse’s role as patient advocate, whether functioning as researcher, caregiver, or research consumer, is addressed.
Ethical and legal considerations in research: A historical perspective Ethical and legal considerations with regard to research first received attention after World War II, when the US Secretary of State and Secretary of War learned that the trials for war criminals would focus on justifying the atrocities committed by Nazi physicians as “medical research.” The American Medical Association appointed a group to develop a code of ethics for research that would serve as a standard for judging the medical atrocities committed on concentration camp prisoners.
The resultant Nuremberg Code and its definitions of the terms voluntary, legal capacity, sufficient understanding, and enlightened decision have been the subject of numerous court cases and presidential commissions involved in setting ethical standards in research (Amdur & Bankert, 2011). The code requires informed consent in all cases but makes no provisions for any special treatment of children, the elderly, or the mentally incompetent. In the United States, federal guidelines for the ethical conduct of research were developed in the 1970s. Despite the safeguards provided by the federal guidelines, some of the most atrocious, and hence memorable, examples of unethical research took place in the United States as recently as the 1990s. These examples are highlighted in Table 13.1. They are sad reminders of our own tarnished research heritage and illustrate the human consequences of not adhering to ethical research standards.
TABLE 13.1 Highlights of Unethical Research Studies Conducted in the United States
In 1973 the first set of proposed regulations on the protection of human subjects was published. The most important provision was a regulation mandating that an institutional review board must review and approve all studies. In 1974, the National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research was created. A major charge brought forth by the commission was to identify the basic principles that should underlie the conduct of biomedical and behavioral research involving human subjects and to develop guidelines to ensure that research is conducted in accordance with those principles (Amdur & Bankert, 2011). Three ethical principles were identified as relevant to the conduct of research involving human subjects: respect for persons, beneficence, and justice (Box 13.1). These principles, set out in what is known as the Belmont Report, provided the basis for regulations affecting research (National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research, 1978).
BOX 13.1 Basic Ethical Principles Relevant to the Conduct of Research Respect for persons People have the right to self-determination and to treatment as autonomous agents. Thus they have the freedom to participate or not participate in research. Persons with diminished autonomy are entitled to protection.
Beneficence Beneficence is an obligation to do no harm and maximize possible benefits. Persons are treated in an ethical manner, decisions are respected, they are protected from harm, and efforts are made to secure their well-being.
Justice Human subjects should be treated fairly. An injustice occurs when a benefit to which a person is entitled is denied without good reason or when a burden is imposed unduly.
The US Department of Health and Human Services (USDHHS) also developed a set of regulations which have been revised several times (USDHHS, 2009). They include:
• General requirements for informed consent
• Documentation of informed consent
• IRB review of research proposals
• Exempt and expedited review procedures for certain kinds of research
• Criteria for IRB approval of research
Protection of human rights
Human rights are the claims and demands that have been justified in the eyes of an individual or a group of individuals. The term refers to the rights outlined in the American Nurses Association (ANA, 2001) guidelines:
1. Right to self-determination
2. Right to privacy and dignity
3. Right to anonymity and confidentiality
4. Right to fair treatment
5. Right to protection from discomfort and harm
These rights apply to all involved in research, including research team members who may be involved in data collection, practicing nurses involved in the research setting, and subjects participating in the study. As you read a research article, you must realize that any issues highlighted in Table 13.2 should have been addressed and resolved before a research study is approved for implementation.
TABLE 13.2 Protection of Human Rights
Procedures for protecting basic human rights
Informed consent
Elements of informed consent, which illustrate the ethical principle of respect for persons and its related right to self-determination, are outlined in Box 13.2 and Table 13.2. It is critical to note that informed consent is not just giving a potential subject a consent form; it is a process that the researcher completes with each subject. Informed consent is documented by a consent form that is given to prospective subjects and contains standard elements.

BOX 13.2 Elements of Informed Consent
1. Title of protocol
2. Invitation to participate
3. Basis for subject selection
4. Overall purpose of study
5. Explanation of procedures
6. Description of risks and discomforts
7. Potential benefits
8. Alternatives to participation
9. Financial obligations
10. Assurance of confidentiality
11. Compensation in case of injury
12. HIPAA disclosure
13. Subject withdrawal
14. Offer to answer questions
15. Concluding consent statement
16. Identification of investigators
Informed consent is a legal principle that means that potential subjects understand the implications of participating in research and they knowingly agree to participate (Amdur & Bankert, 2011). Informed consent (USDHHS, 2009; Food and Drug Administration [FDA], 2012a) is defined as follows:
The knowing consent of an individual or his/her legally authorized representative, under circumstances that provide the prospective subject or representative sufficient opportunity to consider whether or not to participate without undue inducement or any element of force, fraud, deceit, duress, or other forms of constraint or coercion.
No investigator may involve a person as a research subject before obtaining the legally effective informed consent of the subject or a legally authorized representative. The study must be explained to all potential subjects, including the study's purpose; procedures; risks, discomforts, and benefits; and expected duration of participation (i.e., when the study's procedures will be implemented, how many times, and in what setting). Potential subjects must also be informed about any appropriate alternative procedures or treatments, if any, that might be advantageous to them. For example, in the Tuskegee Syphilis Study, the researchers should have disclosed that penicillin was an effective treatment for syphilis. When there is more than minimal risk, any compensation for subjects' participation must be delineated, along with disclosure about medical treatments and/or compensation available if injury occurs.
HIGHLIGHT It is important for your team to remember that the right to personal privacy may be more difficult to protect in qualitative studies because of the small sample size and because subjects' verbatim quotes are often used in the findings/results section of the research article to highlight the findings.
Prospective subjects must have time to decide whether to participate in a study. The researcher must not coerce the subject into participating, nor may researchers collect data on subjects who have explicitly refused to participate in a study. An ethical violation of this principle is illustrated by the halting of eight experiments by the US Food and Drug Administration (FDA) at the University of Pennsylvania’s Institute for Human Gene Therapy 4 months after the death of an 18-year-old man, Jesse Gelsinger, who received experimental treatment as part of the institute’s research. The institute could not document that all patients had been informed of the risks and benefits of the procedures. Furthermore, some patients who received the therapy should have been considered ineligible because their illnesses were more severe than allowed by the clinical protocols. Mr. Gelsinger had a non-life-threatening genetic disorder that permits toxic amounts of ammonia to build up in the liver. Nevertheless, he volunteered for an experimental treatment in which normal genes were implanted directly into his liver, and he subsequently died of multiple organ failure. The institute failed to report to the FDA that two patients in the same trial as Mr. Gelsinger had suffered severe side effects, including inflammation of the liver as a result of the treatment. This should have triggered a halt to the trial (Brainard & Miller, 2000). Of course, subjects may discontinue participation or withdraw from a study at any time without penalty or loss of benefits.
HELPFUL HINT Research reports rarely provide readers with detailed information regarding the degree to which the researcher adhered to ethical principles, such as informed consent, because space limitations in journals make it impossible to describe all aspects of a study. Failure to mention procedures to safeguard subjects' rights does not necessarily mean that such precautions were not taken.
The language of the consent form must be understandable. For adults, the reading level should be no higher than eighth grade, the form should be written in lay language, and technical terms should be avoided (USDHHS, 2009). Subjects should not be asked to waive their rights or release the investigator from liability for negligence. The elements of an informed consent form are listed in Box 13.2.
Investigators obtain consent through personal discussion with potential subjects. This process allows the person to obtain immediate answers to questions. However, consent forms, which are written in narrative or outline form, highlight elements that both inform and remind subjects of the nature of the study and their participation (Amdur & Bankert, 2011).
Assurance of anonymity and confidentiality (defined in Table 13.2) is conveyed in writing and describes how confidentiality of the subjects’ records will be maintained. The right to privacy is also protected through protection of individually identifiable health information (IIHI). The USDHHS developed the following guidelines to help researchers, health care organizations, health care providers, and academic institutions determine when they can use and disclose IIHI:
• IIHI has to be “de-identified” under the HIPAA Privacy Rule.
• Data are part of a limited data set, and a data use agreement with the researcher is in place.
• A potential subject provides authorization for the researcher to use and disclose protected health information (PHI).
• A waiver or alteration of the authorization requirement is obtained from the IRB.
• The consent form must be signed and dated by the subject. The presence of witnesses is not always necessary but does constitute evidence that the subject actually signed the form. If the subject is a minor or is physically or mentally incapable of signing the consent, the legal guardian or representative must sign. The investigator also signs the form to indicate commitment to the agreement.
A copy of the signed informed consent is given to the subject. The researcher maintains the original for their records. Some research, such as a retrospective chart audit, may not require informed consent—only institutional approval. In some cases, when minimal risk is involved, the investigator may have to provide the subject only with an information sheet and verbal explanation. In other cases, such as a volunteer convenience sample, completion and return of research instruments provide evidence of consent. The IRB will help advise on exceptions to these guidelines, and there are cases in which the IRB might grant waivers or amend its guidelines in other ways. The IRB makes the final determination regarding the most appropriate documentation format. You should note whether and what kind of evidence of informed consent has been provided in a research article.
HELPFUL HINT Researchers may not obtain written, informed consent when the major means of data collection is through self-administered questionnaires. The researcher usually assumes implied consent in such cases—that is, the return of the completed questionnaire reflects the respondent’s voluntary consent to participate.
Institutional review boards
IRBs are boards that review studies to ensure that ethical standards are met in relation to the protection of the rights of human subjects. The National Research Act (1974) requires that agencies such as universities, hospitals, and other health care organizations (e.g., managed care companies) where biomedical or behavioral research involving human subjects is conducted must submit an application with assurances that they have an IRB, sometimes called a human subjects' committee, that reviews the research projects and protects the rights of the human subjects (Food and Drug Administration [FDA], 2012b). At agencies where no federal grants or contracts are awarded, there is usually a review mechanism similar to an IRB process, such as a research advisory committee. The National Research Act requires that IRBs have at least five members of various research backgrounds to promote complete and adequate study reviews. The members must be qualified by virtue of their expertise and experience and reflect professional, gender, racial, and cultural diversity. Membership must include one member whose concerns are primarily nonscientific (e.g., a lawyer, clergy member, or ethicist) and at least one member from outside the agency. IRB members have mandatory training in scientific integrity and prevention of scientific misconduct, as do the principal investigator of a study and the research team members. In an effort to protect research subjects, the HIPAA Privacy Rule has made IRB requirements much more stringent for researchers (Code of Federal Regulations, Part 46, 2009).
The IRB is responsible for protecting subjects from undue risk and loss of personal rights and dignity. The risk/benefit ratio, the extent to which a study's benefits are maximized and the risks are minimized such that the subjects are protected from harm, is always a major consideration. For a research proposal to be eligible for consideration by an IRB, it must already have been approved by a departmental review group, such as a nursing research committee that attests to the proposal's scientific merit and congruence with institutional policies, procedures, and mission. The IRB reviews the study's protocol to ensure that it meets the requirements of ethical research that appear in Box 13.3.

BOX 13.3 Code of Federal Regulations for IRB Approval of Research Studies
To approve research, the IRB must determine that the following has been satisfied:
1. Risks to subjects are minimized.
2. Risks to subjects are reasonable in relation to anticipated benefits.
3. Selection of the subjects is equitable.
4. Informed consent must be and will be sought from each prospective subject or the subject’s legally authorized representative.
5. Informed consent form must be properly documented.
6. Where appropriate, the research plan makes adequate provision for monitoring the data collected to ensure subject safety.
7. There are adequate provisions to protect subjects’ privacy and the confidentiality of data.
8. Where some or all of the subjects are likely to be vulnerable to coercion or undue influence, additional safeguards are included.
IRBs provide guidelines that include steps to be taken to receive IRB approval. For example, guidelines for writing a standard consent form or criteria for qualifying for an expedited rather than a full IRB review may be made available. The IRB has the authority to approve research, require modifications, or disapprove a research study. A researcher must receive IRB approval before beginning to conduct research. IRBs have the authority to audit, suspend, or terminate approval of research that is not conducted in accordance with IRB requirements or that has been associated with unexpected serious harm to subjects.
IRBs also have mechanisms for reviewing research in an expedited manner when the risk to research subjects is minimal (Code of Federal Regulations, 2009). Keep in mind that although a researcher may determine that a project involves minimal risk, the IRB makes the final determination, and the research may not be undertaken until approved. A full list of research categories eligible for expedited review is available from any IRB office. Examples include the following:
• Prospective collection of specimens by noninvasive procedure (e.g., buccal swab, deciduous teeth, hair/nail clippings)
• Research conducted in established educational settings in which subjects are de-identified
• Research involving materials collected for clinical purposes
• Research on taste, food quality, and consumer acceptance
• Collection of excreta and external secretions, including sweat
• Recording of data on subjects 18 years or older, using noninvasive procedures routinely employed in clinical practice
• Voice recordings
• Study of existing data, documents, records, pathological specimens, or diagnostic data
An expedited review does not automatically exempt the researcher from obtaining informed consent, and most importantly, the department or agency retains the final judgment as to whether or not a study may be exempt.
CRITICAL THINKING DECISION PATH Evaluating the Risk/Benefit Ratio of a Research Study
When critiquing research, it is important to be conversant with current regulations to determine whether ethical standards have been met. The Critical Thinking Decision Path illustrates the ethical decision-making process an IRB might use in evaluating the risk/benefit ratio of a research study.
Protecting basic human rights of vulnerable groups
Researchers are advised to consult their agency's IRB for the most recent federal and state rules and guidelines when considering research involving vulnerable groups who may have diminished autonomy, such as the elderly, children, pregnant women, the unborn, those who are emotionally or physically disabled, prisoners, the deceased, students, and persons with AIDS. In addition, researchers should consult the IRB before planning research that potentially involves an oversubscribed research population, such as organ transplantation patients or AIDS patients, or "captive" and convenient populations, such as prisoners. It should be emphasized that use of special populations does not preclude undertaking research; extra precautions must be taken to protect their rights.
Research with children. The age of majority differs from state to state, but there are some general rules for including children as subjects (Title 45, CFR46 Subpart D, USDHHS, 2009). Usually a child can assent between the ages of 7 and 18 years. Research in children requires parental permission and child assent. Assent contains the following fundamental elements:
1. A basic understanding of what the child will be expected to do and what will be done to the child
2. A comprehension of the basic purpose of the research
3. An ability to express a preference regarding participation
In contrast to assent, consent requires a relatively advanced level of cognitive ability. Informed consent reflects competency standards requiring abstract appreciation and reasoning regarding the information provided. The federal guidelines have specific criteria and standards that must be met for children to participate in research. If the research involves more than minimal risk and does not offer direct benefit to the individual child, both parents must give permission. When individuals reach maturity, usually at age 18 years, they may render their own consent. They may do so at a younger age if they have been legally declared emancipated minors. Questions regarding this are addressed by the IRB and/or research administration office and not left to the discretion of the researcher to answer.
Research with pregnant women, fetuses, and neonates. Research with pregnant women, fetuses, and neonates requires additional protection but may be conducted if specific criteria are met (HHS Code of Federal Regulations, Title 45, CFR46 Subpart B, 2009). Decisions are made relative to the direct or indirect benefit or lack of benefit to the pregnant woman and the fetus. For example, pregnant women may be involved in research if the research suggests the prospect of direct benefit to the pregnant women and fetus by providing data for assessing risks to pregnant women and fetuses. If the research suggests the prospect of direct benefit to the fetus solely, then both the mother and father must provide consent.
Research with prisoners. The federal guidelines also provide guidance to IRBs regarding research with prisoners. These guidelines address the issues of allowable research, understandable language, adequate assurances that participation does not affect parole decisions, and risks and benefits (HHS Code of Federal Regulations, Title 45 Part 46, Subpart C, 2009).
Research with the elderly. Elderly individuals have historically been, and potentially remain, vulnerable to abuse and as such require special consideration. There is no issue if the potential subject can supply legally effective informed consent. Competence, however, is not a clear-cut issue. The complexity of the study may affect one's ability to consent to participate. The capacity to give informed consent should be assessed in each individual for each research protocol being considered. For example, an elderly person may be able to consent to participate in a simple observational study but not in a clinical drug trial. The issue of the necessity of requiring the elderly to provide consent often arises, and each situation must be evaluated for its potential to preserve the rights of this population.
No vulnerable population may be singled out for study because it is convenient. For example, neither people with mental illness nor prisoners may be studied because they are an available and convenient group. Prisoners may be studied if the studies pertain to them—that is, studies concerning the effects and processes of incarceration. Similarly, people with mental illness may participate in studies that focus on expanding knowledge about psychiatric disorders and treatments. Students also are often a convenient group. They must not be singled out as research subjects because of convenience; the research questions must have some bearing on their status as students. In all cases, the burden is on the investigator to show the IRB that it is appropriate to involve vulnerable subjects in research.
HELPFUL HINT Keep in mind that researchers rarely mention explicitly that the study participants were vulnerable subjects or that special precautions were taken to appropriately safeguard the human rights of this vulnerable group. Research consumers need to be attentive to the special needs of groups who may be unable to act as their own advocates or are unable to adequately assess the risk/benefit ratio of a research study.
Appraisal for evidence-based practice: Legal and ethical aspects of a research study
Research reports do not contain detailed information regarding the ways in which the investigator adhered to the legal and ethical principles presented in this chapter. Lack of written evidence regarding the protection of human rights does not imply that appropriate steps were not taken.
The Critical Appraisal Criteria box provides guidelines for evaluating the legal and ethical aspects of a study. Due to space constraints, you will not see all of these areas explicitly addressed when reading a published article. Box 13.4 provides examples of statements in research articles that illustrate the brevity with which the legal and ethical component of a study is reported.

BOX 13.4 Examples of Legal and Ethical Content in Published Research Reports Found in the Appendices
• “The study was approved by the Institutional Review Board (IRB) from the university, the 4 recruitment facilities and the State Department of Health prior to recruitment of study participants” (Hawthorne et al., 2016, p. 76).
• “Following institutional ethics approvals from the University of Windsor in Ontario, Canada and the University of Western Ontario, Canada data were collected from the pediatric oncology patients” (Turner-Sack et al., 2016, p. 50).
CRITICAL APPRAISAL CRITERIA Legal and Ethical Issues
1. Was the study approved by an IRB or other agency committees?
2. Is there evidence that informed consent was obtained from all subjects or their representatives? How was it obtained?
3. Were the subjects protected from physical or emotional harm?
4. Were the subjects or their representatives informed about the purpose and nature of the study?
5. Were the subjects or their representatives informed about any potential risks that might result from participation in the study?
6. Is the research study designed to maximize the benefit(s) to human subjects and minimize the risks?
7. Were subjects coerced or unduly influenced to participate in this study? Did they have the right to refuse to participate or withdraw without penalty? Were vulnerable subjects used?
8. Were appropriate steps taken to safeguard the privacy of subjects? How have data been kept anonymous and/or confidential?
Information about the legal and ethical considerations of a study is usually presented in the methods section of an article. The subsection on the sample or data-collection methods is the most likely place for this information. The author most often indicates in a sentence that informed consent was obtained and that approval from an IRB was granted. To protect subject and institutional privacy, the locale of the study frequently is described in general terms in the sample subsection of the report. For example, the article might state that data were collected at a 1000-bed tertiary care center in the Southwest, without mentioning its name. Protection of subject privacy may be explicitly addressed by statements indicating that anonymity or confidentiality of data was maintained or that grouped data were used in the data analysis.
When considering the special needs of vulnerable subjects, you should be sensitive to whether the special needs of groups, unable to act on their own behalf, have been addressed. For instance, has the right of self-determination been addressed by the informed consent protocol identified in the research report?
When qualitative studies are reported, verbatim quotes from informants often are incorporated into the findings section of the article. In such cases, you will evaluate how effectively the author protected the informant’s identity, either by using a fictitious name or by withholding information such as age, gender, occupation, or other potentially identifying data (see Chapters 5, 6, and 7 for ethical issues related to qualitative research).
It should be apparent from the preceding sections that although the need for guidelines for the use of human subjects in research is evident and the principles themselves are clear, there are many instances when you must use your best judgment both as a patient advocate and as a research consumer when evaluating the ethical nature of a research project. When conflicts arise, you must feel free to raise suitable questions with appropriate resources and personnel. In an institution these may include contacting the researcher first and then, if there is no resolution, the director of nursing research and the chairperson of the IRB. In cases where ethical considerations in a research article are in question, clarification from a colleague, agency, or IRB is indicated. You should pursue your concerns until satisfied that the patient’s rights and your rights as a professional nurse are protected.
Key points
• Ethical and legal considerations in research first received attention after World War II during the Nuremberg Trials, from which developed the Nuremberg Code. This code became the standard for research guidelines protecting the human rights of research subjects.
• The Belmont Report discusses three basic ethical principles (respect for persons, beneficence, and justice) that underlie the conduct of research involving human subjects.
• Protection of human rights includes (1) right to self- determination, (2) right to privacy and dignity, (3) right to anonymity and confidentiality, (4) right to fair treatment, and (5) right to protection from discomfort and harm.
• Procedures for protecting human rights include gaining informed consent, which illustrates the ethical principle of respect, and obtaining IRB approval, which illustrates the ethical principles of respect, beneficence, and justice.
• Special consideration should be given to studies involving vulnerable populations, such as children, the elderly, prisoners, and those who are mentally or physically disabled.
• Nurses must be knowledgeable about the legal and ethical components of research so they can evaluate whether a researcher has ensured protection of patient rights.
Critical thinking challenges
• A state government official interested in determining the number of infants infected with the human immunodeficiency virus (HIV) has approached your hospital to participate in a state-wide funded study. The protocol will include the testing of all newborns for HIV, but the mothers will not be told that the test is being done, nor will they be told the results. Using the basic ethical principles found in Box 13.2, defend or refute the practice. How will the findings of the proposed study be affected if the protocol is carried out?
• As a research consumer, what kind of information related to the legal and ethical aspects of a research study would you expect to see written about in a published research study? How does that differ from the data the researcher would have to prepare for an IRB submission?
• A randomized clinical trial (RCT) testing the effectiveness of a new Lyme disease vaccine is being conducted as a multisite RCT. There are two vaccine intervention groups, each of which is receiving a different vaccine, and one control group that is receiving a placebo. Using the information in Table 13.2, identify the conditions under which the RCT would be halted because of potential legal and ethical risks to subjects.
• Your interprofessional QI team is asked to do a presentation about risk/benefit ratio and how it influences clinical decision making and resource allocation in your clinical organization.
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
References
1. Amdur R., Bankert E. A. Institutional review board member handbook. 3rd ed. Boston, MA: Jones & Bartlett; 2011.
2. American Nurses Association. Code for nurses with interpretive statements. Kansas City, MO: Author; 2001.
3. Brainard J., Miller D. W. U.S. regulators suspend medical studies at two universities. Chronicle of Higher Education. 2000:A30.
4. Code of Federal Regulations. Part 46, Vol. 1. 2009. Available at: http://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfcfr/cfresearch.cfm.
5. French H. W. AIDS research in Africa: Juggling risks and hopes. New York Times. 1997, October 9:A1-A12.
6. Hawthorne D., Youngblut J. M., Brooten D. Parent spirituality, grief, and mental health at 1 and 3 months after their infant's/child's death in an intensive care unit. Journal of Pediatric Nursing. 2016;31:73-80.
7. Hershey N., Miller R. D. Human experimentation and the law. Germantown, MD: Aspen; 1976.
8. Hilts P. J. Agency faults a UCLA study for suffering of mental patients. New York Times. 1995, March 9:A1-A11.
9. Levine R. J. Ethics and regulation of clinical research. 2nd ed. Baltimore, MD, and Munich, Germany: Urban & Schwarzenberg; 1986.
10. National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. Belmont report: Ethical principles and guidelines for research involving human subjects (DHEW Pub. No. 05 78-0012). Washington, DC: US Government Printing Office; 1978.
11. Turner-Sack A. M., Menna R., Setchell S. R., et al. Psychological functioning, post-traumatic growth, and coping in parents and siblings of adolescent cancer survivors. Oncology Nursing Forum. 2016;43(1):48-56.
12. US Department of Health and Human Services (USDHHS). 45 CFR 46: Code of Federal Regulations, protection of human subjects. Washington, DC: Author; 2009.
13. US Food and Drug Administration (FDA). A guide to informed consent. Code of Federal Regulations, Title 21, Part 50. 2012. Available at: www.fda.gov/oc/ohrt/irbs/informedconsent.html.
14. US Food and Drug Administration (FDA). Institutional review boards. Code of Federal Regulations, Title 21, Part 56. 2012. Available at: www.fda.gov/oc/ohrt/irbs/appendixc.html.
15. Wheeler D. L. Three medical organizations embroiled in controversy over use of placebos in AIDS studies abroad. Chronicle of Higher Education. 1997:A15-A16.
CHAPTER 14
Data collection methods Susan Sullivan-Bolyai, Carol Bova
Learning outcomes
After reading this chapter, you should be able to do the following:
• Define the types of data collection methods used in research.
• List the advantages and disadvantages of each data collection method.
• Compare how specific data collection methods contribute to the strength of evidence in a study.
• Identify potential sources of bias related to data collection.
• Discuss the importance of intervention fidelity in data collection.
• Critically evaluate the data collection methods used in published research studies.
KEY TERMS
anecdotes
closed-ended questions
concealment
consistency
content analysis
debriefing
demographic data
existing data
field notes
intervention
interview guide
interviews
Likert scales
measurement
measurement error
objective
observation
open-ended questions
operational definition
participant observation
physiological data
questionnaires
random error
reactivity
respondent burden
scale
scientific observation
self-report
systematic
systematic error
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
Nurses are always collecting information (or data) from patients. We collect data on blood pressure, age, weight, and laboratory values as part of our daily work. But data collected for practice purposes and data collected for research differ in several key ways. Data collection procedures in research must be objective (free from the researchers' personal biases, attitudes, and beliefs) and systematic. Systematic means that everyone involved in the data collection process collects the data from each subject in a uniform, consistent, or standard way; this consistency is called fidelity. When reading a study, the data collection methods should be identifiable, transparent, and repeatable. Thus, when reading the research literature to inform your evidence-based practice, there are several issues to consider regarding data collection methods.
It is important that researchers carefully define the concepts or variables they measure. The process of translating a concept into a measurable variable requires the development of an operational definition. An operational definition is how the researcher measures each variable. Example: ➤ Turner-Sack and colleagues (2016) (see Appendix D) conceptually defined coping for adolescents (cancer survivors) and their siblings as active, emotion-focused avoidant and acceptance coping; for parents, the definition was similar but slightly different, with active, social support, and emotion-focused avoidant and acceptance coping. They operationally defined coping as measured by the COPE, a measurement scale that assesses coping in adolescents and adults.
The purpose of this chapter is to familiarize you with the ways that researchers collect data from subjects. The chapter provides you with the tools for evaluating data collection procedures commonly used in research, their strengths and weaknesses, how consistent data collection operations (fidelity) can increase study rigor and decrease bias that affects study internal and external validity (see Chapter 8), and how useful each technique is for providing evidence for nursing practice. This information will help you critique the research literature and decide whether the findings provide evidence that is applicable to your practice setting.
Measuring variables of interest
The success of a study depends largely on the fidelity (consistency and quality) of the data collection methods or measurement used. Determining what measurement to use in a study may be the most difficult and time-consuming step in study design. Thus, the process of evaluating and selecting the instruments to measure variables of interest is of critical importance to the potential success of the study.
As you read research articles, look for consistency between the data collection techniques used and the study's aim, hypotheses, setting, and population. Data collection may be viewed as a two-step process. First, the researcher chooses the study's data collection method(s). An algorithm that influences a researcher's choice of data collection methods is diagrammed in the Critical Thinking Decision Path. The second step is deciding if the measurement scales are reliable and valid. Reliability and validity of instruments are discussed in Chapter 15 (for quantitative research) and in Chapter 6 (for qualitative research).
CRITICAL THINKING DECISION PATH Consumer of Research Literature Review
Data collection methods
When reading a study, be aware that investigators decide early in the process whether they need to collect their own data or whether data already exist in the form of records or databases. This decision is based on a thorough literature review and the availability of existing data. If the researcher determines that no data exist, new data can be collected through observation, self-report (interviewing or questionnaires), or by collecting physiological data using standardized instruments or testing procedures (e.g., laboratory tests, x-rays). Existing data can be collected by extracting data from medical records or local, state, and national databases. Each of these methods has a specific purpose, as well as pros and cons inherent in its use. It is important to remember that all data collection methods rely on the ability of the researcher to standardize these procedures to increase data accuracy and reduce measurement error.
Measurement error is the difference between what really exists and what is measured in a study. Every study has some amount of measurement error. Measurement error can be random or systematic (see Chapter 15). Random error occurs when scores vary in a random way, as when data collectors do not use standard procedures to collect data consistently among all subjects in a study. Systematic error occurs when scores are incorrect but in the same direction; an example is when all subjects in a study are weighed on a scale that reads 3 pounds below the true weight. Researchers attempt to design data collection methods that will be consistently applied across all subjects and time points to reduce measurement error.
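The distinction is easy to see in a small simulation. The following sketch (hypothetical values, not drawn from any study discussed in this book) contrasts a miscalibrated scale, which shifts every reading by the same amount, with inconsistent technique, which scatters readings around the true value:

```python
import random

TRUE_WEIGHT = 150.0  # a subject's actual weight in pounds

def miscalibrated_scale(true_value):
    """Systematic error: every reading is off by the same -3 pounds."""
    return true_value - 3.0

def inconsistent_technique(true_value):
    """Random error: readings scatter unpredictably around the true value."""
    return true_value + random.uniform(-2.0, 2.0)

print([miscalibrated_scale(TRUE_WEIGHT) for _ in range(5)])
# always 147.0: perfectly consistent (reliable) but wrong (invalid)
print([inconsistent_technique(TRUE_WEIGHT) for _ in range(5)])
# varies around 150.0: unbiased on average, but each reading is noisy
```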
HELPFUL HINT Remember that the researcher may not always present complete information about the way the data were collected, especially when established instruments were used. To learn about the instrument that was used in greater detail, you may need to consult the original article describing the instrument.
To help decipher the quality of the data collection section in a research article, we will discuss the three main methods used for collecting data: observation, self-report, and physiological measurement.
EVIDENCE-BASED PRACTICE TIP It is difficult to place confidence in a study’s findings if the data collection methods are not consistent.
Observational methods
Observation is a method for collecting data on how people behave under certain conditions. Observation can take place in a natural setting (e.g., in the home, in the community, on a nursing unit) or laboratory setting and includes collecting data on communication (verbal, nonverbal), behavior, and environmental conditions. Observation is also useful for collecting data that may have cultural or contextual influences. Example: ➤ If a researcher wanted to understand the emergence of obesity among immigrants in the United States, it might be useful to observe food preparation, exercise patterns, and shopping practices in the communities of the specific groups.
Although observing the environment is a normal part of living, scientific observation places a great deal of emphasis on the objective and systematic nature of the observation. The researcher is not merely looking at what is happening, but rather is watching with a trained eye for specific events. To be scientific, observations must fulfill the following four conditions:
1. Observations undertaken are consistent with the study’s aims/objectives.
2. There is a standardized and systematic plan for observation and data recording.
3. All observations are checked and controlled.
4. The observations are related to scientific concepts and theories.
Observational methods may be structured or unstructured. Unstructured observation methods are not characterized by a total absence of structure, but usually involve collecting descriptive information about the topic of interest. In participant observation, the observer keeps field notes (a short summary of observations) to record the activities, as well as the observer’s interpretations of these activities. Field notes usually are not restricted to any particular type of action or behavior; rather, they represent a narrative set of written notes intended to paint a picture of a social situation in a more general sense. Another type of unstructured observation is the use of anecdotes. Anecdotes are summaries of a particular observation that usually focus on the behaviors of interest and frequently add to the richness of research reports by illustrating a particular point (see Chapters 5 and 6 for more on qualitative data collection strategies). Structured observations involve specifying in advance what behaviors or events are to be observed. Typically standardized forms are used for record keeping and include categorization systems, checklists, or rating scales. Structured observation relies heavily on the formal training and standardization of the observers (see Chapter 15 for an explanation of interrater reliability).
Observational methods can also be distinguished by the role of the observer. The observer's role is determined by the amount of interaction between the observer and those being observed. These methods are illustrated in Fig. 14.1. Concealment refers to whether the subjects know they are being observed. Concealment has ethical implications for the study. Whether concealment is permitted in a study will be decided by an institutional review board. The decision will be based on the potential risk to the subjects, the scientific rationale for the concealment, as well as the plan to debrief the participants about the concealment once the study is completed. Intervention deals with whether the observer provokes actions from those who are being observed. Box 14.1 describes the basic types of observational roles implemented by the observer(s). These are distinguishable by the amount of concealment or intervention implemented by the observer.
FIG 14.1 Types of observational roles in research.
BOX 14.1 Basic Types of Observational Roles
1. Concealment without intervention. The researcher watches subjects without their knowledge and does not provoke the subject into action. Often such concealed observations use hidden television cameras, audio recording devices, or one-way mirrors. This method is often used in observational studies of children and their parents. You may be familiar with rooms with one-way mirrors in which a researcher can observe the behavior of the occupants of the room without being observed by them. Such studies allow for the observation of children's natural behavior and are often used in developmental research.
2. Concealment with intervention. Concealed observation with intervention involves staging a situation and observing the behaviors that are evoked in the subjects as a result of the intervention. Because the subjects are unaware of their participation in a research study, this type of observation has fallen into disfavor and rarely is used in nursing research.
3. No concealment without intervention. The researcher obtains informed consent from the subject to be observed and then simply observes his or her behavior.
4. No concealment with intervention. No concealment with intervention is used when the researcher is observing the effects of an intervention introduced for scientific purposes. Because the subjects know they are participating in a research study, there are few problems with ethical concerns; however, reactivity is a problem in this type of study.
Observing subjects without their knowledge may violate assumptions of informed consent, and therefore researchers face ethical problems with this approach. However, sometimes there is no other way to collect such data, and the data collected are unlikely to have negative consequences for the subject. In these cases, the disadvantages of the study are outweighed by the advantages. Further, the problem is often handled by informing subjects after the observation, allowing them the opportunity to refuse to have their data included in the study and discussing any questions they might have. This process is called debriefing.
When the observer is neither concealed nor intervening, the ethical question is not a problem. Here the observer makes no attempt to change the subjects' behavior and informs them that they are to be observed. Because the observer is present, this type of observation allows a greater depth of material to be studied than if the observer is separated from the subject by an artificial barrier, such as a one-way mirror. Participant observation is a commonly used observational technique in which the researcher functions as a part of a social group to study the group in question. The problem with this type of observation is reactivity (also referred to as the Hawthorne effect), or the distortion created when the subjects change behavior because they know they are being observed.
EVIDENCE-BASED PRACTICE TIP When reading a research report that uses observation as a data collection method, note evidence of consistency across data collectors through use of interrater reliability (see Chapter 15) data. When this is present, it increases your confidence that the data were collected systematically.
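As a rough illustration of what such interrater reliability data represent, the following sketch (with invented ratings, purely for illustration) computes simple percent agreement between two observers; formal indices such as kappa are discussed in Chapter 15:

```python
# Invented codes assigned by two trained observers to the same ten
# observed episodes (1 = target behavior present, 0 = absent).
observer_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
observer_b = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

agreements = sum(a == b for a, b in zip(observer_a, observer_b))
percent_agreement = agreements / len(observer_a) * 100
print(f"Interrater agreement: {percent_agreement:.0f}%")  # 80%
```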
Scientific observation has several advantages, the main one being that observation may be the only way for the researcher to study the variable of interest. Example: ➤ What people say they do often may not be what they really do. Therefore, if the study is designed to obtain substantive findings about human behavior, observation may be the only way to ensure the validity of the findings. In addition, no other data collection method can match the depth and variety of information that can be collected when using these techniques. Such techniques also are quite flexible in that they may be used in both experimental and nonexperimental designs. As with all data collection methods, observation also has its disadvantages. Data obtained by observational techniques are vulnerable to observer bias. Emotions, prejudices, and values can influence the way behaviors and events are observed and recorded. In general, the more the observer needs to make inferences and judgments about what is being observed, the more likely it is that distortions will occur. Thus in judging the adequacy of observation methods, it is important to consider how observation forms were constructed and how observers were trained and evaluated.
Ethical issues can also occur if subjects are not fully aware that they are being observed. For the most part, it is best to inform subjects of the study's purpose and the fact that they are being observed. However, in certain circumstances, informing the subjects will change behaviors (Hawthorne effect; see Chapter 8). Example: ➤ If a nurse researcher wanted to study hand-washing frequency in a nursing unit, telling the nurses that they were being observed for their rate of hand washing would likely increase the hand-washing rate and thereby make the study results less valid. Therefore, researchers must carefully balance full disclosure of all research procedures with the ability to obtain valid data through observational methods.
HIGHLIGHT It is important for members of your team to remember to look for evidence of fidelity: evidence that data collectors and those carrying out the intervention were trained to collect data and/or implement the intervention consistently. It is also important to determine that there was periodic supervision to make sure that this consistency was maintained.
Self-report methods
Self-report methods require subjects to respond directly to either interviews or questionnaires about their experiences, behaviors, feelings, or attitudes. Self-report methods are commonly used in nursing research and are most useful for collecting data on variables that cannot be directly observed or measured by physiological instruments. Some variables commonly measured by self-report in nursing research studies include quality of life, satisfaction with nursing care, social support, pain, resilience, and functional status.
The following are some considerations when evaluating self- report methods:
• Social desirability. There is no way to know for sure if a subject is telling the truth. People are known to respond to questions in a way that makes a favorable impression. Example: ➤ If a nurse researcher asks patients to describe the positive and negative aspects of nursing care received, the patient may want to please the researcher and respond with all positive responses, thus introducing bias into the data collection process. There is no way to tell whether the respondent is telling the truth or responding in a socially desirable way, so the accuracy of self-report measures is always open for scrutiny.
• Respondent burden is another concern for researchers who use self-report (Ulrich et al., 2012). Respondent burden occurs when the length of the questionnaire or interview is too long or the questions are too difficult to answer in a reasonable amount of time considering respondents' age, health condition, or mental status. It also occurs when there are multiple data collection points, as in longitudinal studies when the same questionnaires have to be completed multiple times. Respondent burden can result in incomplete or erroneous answers or missing data, jeopardizing the validity of the study findings.
Interviews and questionnaires
Interviews are a method of data collection in which a data collector asks subjects to respond to a set of open-ended or closed-ended questions as described in Box 14.2. Interviews are used in both quantitative and qualitative research, but are best used when the researcher may need to clarify the task for the respondent or is interested in obtaining more personal information from the respondent.
BOX 14.2 Uses for Open-Ended and Closed-Ended Questions
• Open-ended questions are used when the researcher wants the subjects to respond in their own words or when the researcher does not know all of the possible alternative responses. Interviews that use open-ended questions often use a list of questions and probes called an interview guide. Responses to the interview guide are often audio-recorded to capture the subject’s responses. An example of an open-ended question is used for the interview in Appendix D.
• Closed-ended questions are structured, fixed-response items with a fixed number of responses. Closed-ended questions are best used when the question has a finite number of responses and the respondent is to choose the one closest to the correct response. Fixed-response items have the advantage of simplifying the respondent's task but result in omission of important information about the subject. Interviews that use closed-ended questions typically record a subject's responses directly on the questionnaire. An example of a closed-ended item is found in Box 14.3.
Open-ended questions allow more varied information to be collected and require a qualitative or content analysis method to analyze responses (see Chapter 6). Content analysis is a method of analyzing narrative or word responses to questions and either counting similar responses or grouping the responses into themes or categories (also used in qualitative research). Interviews may take place face to face, over the telephone, or online via a web-based format.
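To make the counting-and-grouping idea concrete, here is a minimal sketch (with invented responses and categories, purely for illustration) of the tallying step of a content analysis after open-ended answers have been coded:

```python
from collections import Counter

# Invented example: open-ended answers already coded into categories
# by the research team during content analysis.
coded_responses = [
    "staff support", "pain control", "staff support",
    "information needs", "pain control", "staff support",
]

category_counts = Counter(coded_responses)
for category, count in category_counts.most_common():
    print(f"{category}: {count}")
# staff support: 3, pain control: 2, information needs: 1
```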
Questionnaires are paper-and-pencil instruments designed to gather data from individuals about knowledge, attitudes, beliefs, and feelings. Questionnaires, like interviews, may be open-ended or closed-ended, as presented in Box 14.2. Questionnaires are most useful when there is a finite set of questions. Individual items in a questionnaire must be clearly written so that the intent of the question and the nature of the response options are clear. Questionnaires may be composed of individual items that measure different variables or concepts (e.g., age, race, ethnicity, and years of education) or scales. Survey researchers rely almost entirely on questionnaires for data collection.
Questionnaires can be referred to as instruments, measures, scales, or tools. When multiple items are used to measure a single concept, such as quality of life or anxiety, and the scores on those items are combined mathematically to obtain an overall score, the questionnaire or measurement instrument is called a scale. The important issue is that each of the items must measure the same concept or variable. An intelligence test is an example of a scale that combines individual item responses to determine an overall quantification of intelligence.
Scales can have subscale scores or total scale scores. For instance, in the study by Turner-Sack and colleagues (2016) (see Appendix D), the COPE scale has four separate subscales to measure coping for adolescents (cancer survivors) and their siblings, and four for parents. Subjects respond on a four-point scale ranging from 1 to 4, with 1 indicating "I usually do not do this" and 4 indicating "I usually do this a lot." The investigators also added a religious coping subscale for adolescents, siblings, and parents. Higher scores reflect more use of that particular type of coping strategy. The response options for scales are typically lists of statements on which respondents indicate, for example, whether they "strongly agree," "agree," "disagree," or "strongly disagree." This type of response option is called a Likert-type scale.
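The arithmetic behind a subscale score is straightforward. The sketch below uses invented item responses and an invented item-to-subscale assignment (it does not reproduce the actual COPE scoring) to show how 1-to-4 Likert responses are combined into subscale totals:

```python
# Invented 1-4 Likert responses, keyed by item number.
responses = {1: 4, 2: 3, 3: 1, 4: 2, 5: 4, 6: 3}

# Invented assignment of items to subscales (not the real COPE layout).
subscales = {
    "active_coping": [1, 2, 5],
    "avoidant_coping": [3, 4, 6],
}

for name, items in subscales.items():
    total = sum(responses[i] for i in items)
    print(f"{name}: total = {total}, mean = {total / len(items):.2f}")
```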
EVIDENCE-BASED PRACTICE TIP Scales used in research should have evidence of adequate reliability and validity so that you feel confident that the findings reflect what the researcher intended to measure (see Chapter 15).
Box 14.3 shows three items from a survey of nursing job satisfaction. The first item is closed-ended and uses a Likert scale response format. The second item is also closed-ended, and it forces respondents to choose from a finite number of possible answers. The third item is open-ended, and respondents use their own words to answer the question, allowing an unlimited number of possible answers. Often researchers use a combination of Likert-type, closed-ended, and open-ended questions when collecting data in nursing research.
BOX 14.3 Examples of Open-Ended and Closed-Ended Questions
Open-ended questions
Please list the three most important reasons why you chose to stay in your current job:
1. ______________________________
2. ______________________________
3. ______________________________
Closed-ended questions (Likert scale)
How satisfied are you with your current position?
Closed-ended questions
On average, how many patients do you care for in 1 day?
1. 1 to 3
2. 4 to 6
3. 7 to 9
4. 10 to 12
5. 13 to 15
6. 16 to 18
7. 19 to 20
8. More than 20
Turner-Sack and colleagues (2016; see Appendix D) used all self-report instruments to examine differences among adolescents, siblings, and parents in their psychological functioning, post-traumatic growth, and coping strategies. They also collected demographic data. Demographic data include information that describes important characteristics about the subjects in a study (e.g., age, gender, race, ethnicity, education, marital status). It is important to collect demographic data in order to describe and compare different study samples so you can evaluate how similar a sample is to your patient population.
When reviewing articles with numerous questionnaires, remember (especially if the study deals with vulnerable populations) to assess if the author(s) addressed potential respondent burden such as:
• Reading level (eighth grade)
• Questionnaire font size (14-point font)
• Need to read and assist some subjects
• Time it took to complete the questionnaire (30 minutes)
• Multiple data collection points
This information is very important for judging the respondent burden associated with study participation. It is important to examine the benefits and caveats associated with using interviews and questionnaires as self-report methods. Interviews offer some advantages over questionnaires. The response rate is almost always higher with interviews, and there are fewer missing data, which helps reduce bias.
HELPFUL HINT Remember, sometimes researchers make trade-offs when determining the measures to be used. Example: ➤ A researcher may want to learn about an individual’s attitudes regarding job satisfaction; however, practicalities may preclude using an interview, so a questionnaire may be used instead.
Another advantage of the interview is that it can reach vulnerable populations, such as children, the blind, and those with low literacy, who may not be able to fill out a questionnaire. With an interview, the data collector knows who is giving the answers. When questionnaires are mailed, for example, anyone in the household could be the person who supplies the answers. Interviews also allow for some safeguards, such as clarifying misunderstood questions, and observing and recording the level of the respondent's understanding of the questions. In addition, the researcher has flexibility over the order of the questions.
With questionnaires, the respondent can answer questions in any order. Sometimes changing the order of the questions can change the response. Finally, interviews allow for richer and more complex data to be collected. This is particularly so when open-ended responses are sought. Even when closed-ended response items are used, interviewers can probe to understand why a respondent answered in a particular way.
Questionnaires also have certain advantages. They are much less expensive to administer than interviews, which require hiring and thoroughly training interviewers. Thus if a researcher has a fixed amount of time and money, a larger and more diverse sample can be obtained with questionnaires. Questionnaires may allow for more confidentiality and anonymity with sensitive issues that participants may be reluctant to discuss in an interview. Finally, the fact that no interviewer is present assures the researcher and the reader that there will be no interviewer bias. Interviewer bias occurs when the interviewer unwittingly leads the respondent to answer in a certain way. This problem can be especially pronounced in studies that use open-ended questions. The tone used to ask the question and/or nonverbal interviewer responses such as a subtle nod of the head could lead a respondent to change an answer to correspond with what the researcher wants to hear.
Finally, the use of Internet-based self-report data collection (both interviewing and questionnaire delivery) has gained momentum. An online format is economical and can capture subjects from different geographic areas without the expense of travel or mailings. Open-ended responses are already typed and do not require transcription, and closed-ended responses can often be imported directly into statistical analysis software, reducing data entry mistakes. The main concerns with Internet-based data collection procedures involve the difficulty of ensuring informed consent (e.g., Is checking a box indicating agreement to participate the same thing as signing an informed consent form?) and the protection of subject anonymity, which is difficult to guarantee with any Internet-based venue. In addition, the requirement that subjects have computer access limits the use of this method in certain age groups and populations. However, the advantages of increased efficiency and accuracy make Internet-based data collection a growing trend among nurse researchers.
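As a simple illustration of that last point, a sketch like the following (with a hypothetical file name and column layout) shows closed-ended responses exported from an online survey tool flowing straight into analysis software without manual re-entry:

```python
import pandas as pd

# Hypothetical CSV export from an online survey platform; each row is
# one respondent, each column one closed-ended item (coded 1-5).
df = pd.read_csv("survey_export.csv")

print(df.describe())                # item-level summary statistics
print(df.mean(numeric_only=True))   # mean response per item
```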
Physiological measurement
Physiological data collection involves the use of specialized equipment to determine the physical and biological status of subjects. Such measures can be physical, such as weight or temperature; chemical, such as blood glucose level; microbiological, as with cultures; or anatomical, as in radiological examinations. What separates these data collection procedures from others used in research is that they require special equipment to make the observation.
Physiological or biological measurement is particularly suited to the study of many types of nursing problems. Example: ➤ Examining different methods for taking a patient’s temperature or blood pressure or monitoring blood glucose levels may yield important information for determining the effectiveness of certain nursing monitoring procedures or interventions. However, it is important that the method be applied consistently to all subjects in the study. Example: ➤ Nurses are quite familiar with taking blood pressure measurements. However, for research studies that involve blood pressure measurement, the process must be standardized (Bern et al., 2007; Pickering et al., 2005). The subject must be positioned (sitting or lying down) the same way for a specified period of time, the same blood pressure instrument must be used, and often multiple blood pressure measurements are taken under the same conditions to obtain an average value.
The advantages of using physiological data collection methods include the objectivity, precision, and sensitivity associated with these measures. Unless there is a technical malfunction, two readings of the same instrument taken at the same time by two different nurses are likely to yield the same result. Because such instruments are intended to measure the variable being studied, they offer the advantage of being precise and sensitive enough to pick up subtle variations in the variable of interest. It is also unlikely that a subject in a study can deliberately distort physiological information.
Physiological measurements are not without inherent disadvantages and include the following:
• Some instruments may be quite expensive to obtain and use.
• Physiological instruments often require specialized training to be used accurately.
• The variable of interest may be altered as a result of using the instrument. Example: ➤ An individual's blood pressure may increase just because a health care professional enters the room (called white coat syndrome).
• Although often thought of as nonintrusive, the presence of some types of devices might change the measurement. Example: ➤ The presence of a heart rate monitoring device might make some patients anxious and increase their heart rate.
• All types of measuring devices are affected in some way by the environment. A simple thermometer can be affected by the subject drinking something hot or smoking a cigarette immediately before the temperature is taken. Thus it is important to consider whether the researcher controlled such environmental variables in the study.
Existing data
All of the data collection methods discussed thus far concern the ways that researchers gather new data to study phenomena of interest. Sometimes existing data can be examined in a new way to study a problem. Records (e.g., medical records, care plans, hospital records, death certificates) and databases (e.g., US Census, National Cancer Database, Minimum Data Set for Nursing Home Resident Assessment and Care Screening) are frequently used to answer research questions about clinical problems. Typically, this type of research design is referred to as secondary analysis.
The use of available data has advantages. First, data are already collected, thus eliminating subject burden and recruitment problems. Second, most databases contain large populations; therefore sample size is rarely a problem and random sampling is possible. Larger samples allow the researcher to use more sophisticated analytic procedures, and random sampling enhances generalizability of findings. Some records and databases collect standardized data in a uniform way and allow the researcher to examine trends over time. Finally, the use of available records has the potential to save significant time and money.
On the other hand, institutions may be reluctant to allow researchers to have access to their records. If the records are kept so that an individual cannot be identified (known as de-identified data), this is usually not a problem. However, the Health Insurance Portability and Accountability Act (HIPAA), a federal law, protects the rights of individuals who may be identified in records (Bova et al., 2012; see Chapter 13). Recent escalation in the computerization of health records has led to discussion about the desirability of access to such records for research. Currently, it is not clear how much computerized health data will be readily available for research purposes.
Another problem that affects the quality of available data is that the researcher has access only to those records that have survived. If the records available are not representative of all of the possible records, the researcher may have to make an intelligent guess as to their accuracy. Example: ➤ A researcher might be interested in studying socioeconomic factors associated with the suicide rate. Frequently, these data are underreported because of the stigma attached to suicide, so the records would be biased.
EVIDENCE-BASED PRACTICE TIP Critical appraisal of any data collection method includes evaluating the appropriateness, objectivity, and consistency of the method employed.
Construction of new instruments
Sometimes researchers cannot locate an instrument with acceptable reliability and validity to measure the variable of interest (see Chapter 15). In this situation, a new instrument or scale must be developed.
Instrument development is complex and time consuming. It consists of the following steps:
• Define the concept to be measured.
• Clarify the target population.
• Develop the items.
• Assess the items for content validity.
• Develop instructions for respondents and users.
• Pretest and pilot test the items.
• Estimate reliability and validity.
Defining the concept to be measured requires that the researcher develop expertise in the concept, which includes an extensive review of the literature and of all existing measurements that deal with related concepts. The researcher will use all of this information to synthesize the available knowledge so that the construct can be defined.
Once defined, the individual items measuring the concept can be developed. The researcher will develop many more items than are needed to address each aspect of the concept. The items are evaluated by a panel of experts in the field to determine if the items measure what they are intended to measure (content validity) (see Chapter 15). Items will be eliminated if they are not specific to the concept. In this phase, the researcher needs to ensure consistency among the items, as well as consistency in testing and scoring procedures.
Finally, the researcher pilot tests the new instrument to determine the quality of the instrument as a whole (reliability and validity), as well as the ability of each item to discriminate among individual respondents (variance in item response). Pilot testing can also yield important evidence about the reading level (too low or too high), length of the instrument (too short or too long), directions (clear or not clear), response rate (the percent of potential subjects who return a completed scale), and the appropriateness of culture or context. The researcher also may administer a related instrument to see if the new instrument is sufficiently different from the older one (construct validity). Instrument development and testing is an important part of nursing science because our ability to evaluate evidence related to practice depends on measuring nursing phenomena in a clear, consistent, and reliable way.
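Of these pilot-test checks, the response rate is the one with a standard calculation (a common convention, not a formula stated in this chapter):

\[ \text{response rate} = \frac{\text{completed instruments returned}}{\text{instruments distributed}} \times 100\% \]

For example, 75 completed scales returned out of 100 distributed gives a 75% response rate.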
Appraisal for evidence-based practice: Data collection methods
Assessing the adequacy of data collection methods is an important part of evaluating the results of studies that provide evidence for clinical practice. The data collection procedures provide a snapshot of the rigor with which the study was conducted. From an evidence-based practice perspective, you can judge whether the data collection procedures would fit within your clinical environment and with your patient population. The manner in which the data were collected affects the study's internal and external validity. A well-developed methods section of a study decreases bias in the findings. A key element for evidence-based practice is whether the procedures were completed consistently. Also consider the following:
• If observation was used, was an observation guide developed, and were the observers trained and supervised until there was a high level of interrater reliability? How was the training confirmed periodically throughout the study to maintain fidelity and decrease bias?
• Was a data collection procedure manual developed and used during the study?
• If the study tested an intervention, was there training for the interventionists and data collectors?
• If a physiological instrument was used, was the instrument properly calibrated throughout the study and the data collected in the same manner from each subject?
• If there were missing data, how were the data accounted for?
Some of these details may be difficult to discern in a research article, due to space limitations imposed by the journal. Typically, the interview guide, questionnaires, or scales are not available for review. However, research articles should indicate the following:
• Type(s) of data collection method used (self-report, observation, physiological, or existing data)
• Evidence of training and supervision for the data collectors and interventionists
• Consistency with which data collection procedures were applied across subjects
• Any threats to internal validity or bias related to issues of instrumentation or testing
• Any sources of bias related to external validity issues, such as the Hawthorne effect
• Scale reliability and validity discussed
• Interrater reliability across data collectors and time points (if observation was used)
When you review the data collection methods section of a study, it is important to think about the data strength and quality of the evidence. You should have confidence in the following:
• An appropriate data collection method was used
• Data collectors were appropriately trained and supervised
• Data were collected consistently by all data collectors
• Respondent burden, reactivity, and social desirability were minimized
You can critically appraise a study in terms of data collection bias being minimized, thereby strengthening potential applicability of the evidence provided by the findings. Because a research article does not always provide all of the details, it is not uncommon to contact the researcher to obtain added information that may assist you in using results in practice. Some helpful questions to ask are listed in the Critical Appraisal Criteria box.
CRITICAL APPRAISAL CRITERIA Data collection methods
1. Are all of the data collection instruments clearly identified and described?
2. Are operational definitions provided and clear?
3. Is the rationale for their selection given?
4. Is the method used appropriate to the problem being studied?
5. Were the methods used appropriate to the clinical situation?
6. Was a standardized manual used to guide data collection?
7. Were all data collectors adequately trained and supervised?
8. Are the data collection procedures the same for all subjects?
Observational methods
1. Who did the observing?
2. Were the observers trained to minimize bias?
3. Was there an observation guide?
4. Were the observers required to make inferences about what they saw?
5. Is there any reason to believe that the presence of the observers affected the subject’s behavior?
6. Were the observations performed using the principles of informed consent?
7. Was interrater agreement between observers established?
Self-report: Interviews
1. Is the interview schedule described adequately enough to know whether it covers the topic?
2. Is there clear indication that the subjects understood the task and the questions?
3. Who were the interviewers, and how were they trained?
4. Is there evidence of interviewer bias?
Self-report: Questionnaires
1. Is the questionnaire described well enough to know whether it covers the topic?
2. Is there evidence that subjects were able to answer the questions?
3. Are the majority of the items appropriately closed-ended or open-ended?
Physiological measurement
1. Is the instrument used appropriate to the research question or hypothesis?
2. Is a rationale given for why a particular instrument was selected?
3. Is there a provision for evaluating the accuracy of the instrument?
Existing data: Records and databases
1. Are the existing data used appropriately, considering the research question and hypothesis being studied?
2. Are the data examined in such a way as to provide new information?
3. Is there any indication of selection bias in the available records?
Key points
• Data collection methods are described as being both objective and systematic. The data collection methods of a study provide the operational definitions of the relevant variables.
• Types of data collection methods include observational, self- report, physiological, and existing data. Each method has advantages and disadvantages.
• Physiological measurement involves the use of technical instruments to collect data about patients’ physical, chemical, microbiological, or anatomical status. They are suited to studying patient clinical outcomes and how to improve the effectiveness of nursing care. Physiological measurements are objective, precise, and sensitive. Expertise, training, and consistent application of these tests or procedures are needed to reduce the measurement error associated with this data collection method.
• Observational methods are used in nursing research when the variables of interest deal with events or behaviors. Scientific observation requires preplanning, systematic recording, controlling the observations, and providing a relationship to scientific theory. This method is best suited to research problems that are difficult to view as a part of a whole. The advantages of observational methods are that they provide flexibility to measure many types of situations and they allow for depth and breadth of information to be collected. Disadvantages include that data may be distorted as a result of the observer’s presence and observations may be biased by the person who is doing the observing.
• Interviews are commonly used data collection methods in nursing research. Either open-ended or closed-ended questions may be used when asking the subject questions. The form of the question should be clear to the respondent, free of suggestion, and grammatically correct.
• Questionnaires, or paper-and-pencil tests, are useful when there are a finite number of questions to be asked. Questions need to be clear and specific. Questionnaires are less costly in terms of time and money to administer to large groups of subjects, particularly if the subjects are geographically widespread. Questionnaires also can be completely anonymous and prevent interviewer bias.
• Existing data in the form of records or large databases are an important source for research data. The use of available data may save the researcher considerable time and money when conducting a study. This method reduces problems with subject recruitment, access, and ethical concerns. However, records and available data are subject to problems of authenticity and accuracy.
Critical thinking challenges
• When a researcher opts to use observation as the data collection method, what steps must be taken to minimize bias?
• In a randomized clinical trial investigating the differential effect of an educational video intervention in comparison to a telephone counseling intervention, data were collected at four different hospitals by four different data collectors. What steps should the researcher take to ensure intervention fidelity?
• What are the strengths and weaknesses of collecting data using existing sources such as records, charts, and databases?
• Your interprofessional Journal Club just finished reading the research article by Nyamathi and colleagues in Appendix A. As part of your critical appraisal of this study, your team needed to identify the strengths and weaknesses of the data collection section. Discuss the sources of bias in the data collection procedures and evidence of fidelity.
• How does a training manual decrease the possibility of introducing bias into the data collection process, thereby increasing intervention fidelity?
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
References
1. Bern L., Brandt M., Mbelu N., et al. Differences in blood pressure values obtained with automated and manual methods in medical inpatients. MEDSURG Nursing. 2007;16:356-361.
2. Bova C., Drexler D., Sullivan-Bolyai S. Reframing the influence of HIPAA on research. Chest. 2012;141:782-786.
3. Pickering T., Hall J., Appel L., et al. Recommendations for blood pressure measurement in humans and experimental animals, part 1: Blood pressure measurement in humans: A statement for professionals from the Subcommittee of Professional and Public Education of the American Heart Association Council on High Blood Pressure Research. Hypertension. 2005;45:142-161.
4. Turner-Sack A., Menna R., Setchell S., et al. Psychological functioning, post-traumatic growth, and coping in parents and siblings of adolescent cancer survivors. Oncology Nursing Forum. 2016;43:48-56. doi:10.1188/16.ONF.48-56.
5. Ulrich C. M., Knafl K. A., Ratcliffe S. J., et al. Developing a model of the benefits and burdens of research participation in cancer clinical trials. American Journal of Bioethics Primary Research. 2012;3(2):10-23.
CHAPTER 15
Reliability and validity Geri LoBiondo-Wood, Judith Haber
Learning outcomes
After reading this chapter, you should be able to do the following:
• Discuss how measurement error can affect the outcomes of a study.
• Discuss the purposes of reliability and validity.
• Define reliability.
• Discuss the concepts of stability, equivalence, and homogeneity as they relate to reliability.
• Compare and contrast the estimates of reliability.
• Define validity.
• Compare and contrast content, criterion-related, and construct validity.
• Identify the criteria for critiquing the reliability and validity of measurement tools.
• Use the critical appraisal criteria to evaluate the reliability and validity of measurement tools.
• Discuss how reliability and validity contribute to the strength and quality of evidence provided by the findings of a research study.
KEY TERMS
chance (random) errors
concurrent validity
construct
construct validity
content validity
content validity index
contrasted-groups (known-groups) approach
convergent validity
criterion-related validity
Cronbach’s alpha
divergent/discriminant validity
equivalence
error variance
face validity
factor analysis
homogeneity
hypothesis-testing approach
internal consistency
interrater reliability
item to total correlations
kappa
Kuder-Richardson (KR-20) coefficient
Likert scale
observed test score
parallel or alternate form reliability
predictive validity
reliability
reliability coefficient
split-half reliability
stability
systematic (constant) error
test-retest reliability
validity
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
The measurement of phenomena is a major concern of nursing researchers. Unless measurement instruments validly (accurately) and reliably (consistently) reflect the concepts of the theory being tested, conclusions drawn from a study will be invalid or biased. Issues of reliability and validity are of central concern to researchers, as well as to appraisers of research. From either perspective, the instruments that are used in a study must be evaluated. Researchers often face the challenge of developing new instruments and, as part of that process, establishing the reliability and validity of those instruments.
When reading studies, you must assess the reliability and validity of the instruments to determine the soundness of these selections in relation to the concepts (concepts are often called constructs in instrument development studies) or variables under study. The appropriateness of instruments and the extent to which reliability and validity are demonstrated have a profound influence on the strength of the findings and the extent to which bias is present. Invalid measures produce invalid estimates of the relationships between variables, thus introducing bias, which affects the study’s internal and external validity. As such, the assessment of reliability
and validity is an extremely important critical appraisal skill for assessing the strength and quality of evidence provided by the design and findings of a study and its applicability to practice.
This chapter examines the types of reliability and validity and demonstrates the applicability of these concepts to the evaluation of instruments in research and evidence-based practice.
Reliability, validity, and measurement error
Reliability is the ability of an instrument to measure the attributes of a variable or construct consistently. Validity is the extent to which an instrument measures the attributes of a concept accurately. To understand reliability and validity, you need to understand potential errors related to instruments. Researchers may be concerned about whether the scores that were obtained for a sample of subjects were consistent, true measures of the behaviors and thus an accurate reflection of the differences among individuals. The extent of variability in test scores that is attributable to error rather than a true measure of the behaviors is the error variance. Error in measurement can occur in multiple ways.
An observed test score that is derived from a set of items actually consists of the true score plus error (Fig. 15.1). The error may be either chance (random) error or systematic (constant) error. Validity is concerned with systematic error, whereas reliability is concerned with random error. Chance or random errors are errors that are difficult to control (e.g., a respondent's anxiety level at the time of testing). Random errors are unsystematic in nature; they result from a transient state in the subject, the context of the study, or the administration of an instrument. Example: ➤ Perceptions or behaviors that occur at a specific point in time (e.g., anxiety) are known as state or transient characteristics and are often beyond the awareness and control of the examiner. Another example of random error occurs in a study that measures blood pressure: different readings could result from misplacing the cuff, not waiting a specified rest period before taking the reading, or positioning the arm inconsistently in relation to the heart.
FIG 15.1 Components of observed scores.
Systematic or constant error is measurement error that is attributable to relatively stable characteristics of the study sample that may bias their behavior and/or cause incorrect instrument calibration. Such error has a systematic biasing influence on the subjects' responses and thereby influences the validity of the instruments. For instance, level of education, socioeconomic status, social desirability, response set, or other characteristics may influence the validity of the instrument by altering measurement of the "true" responses in a systematic way. Example: ➤ A subject is completing a survey examining attitudes about caring for elderly patients. If the subject wants to please the investigator, items may consistently be answered in a socially desirable way rather than reflecting how the individual actually feels, thus making the estimate of validity inaccurate. Systematic error also occurs when an instrument is improperly calibrated. Consider a scale that consistently reads a person's weight as 2 pounds less than the actual body weight. The scale could be quite reliable (i.e., capable of reproducing the precise measurement), but the result is consistently invalid.
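To make the distinction concrete, here is a minimal simulation in Python (not from the text; the data and the 2-pound offset are invented to mirror the scale example). Random error adds noise around the true value, while systematic error shifts every reading by a constant, leaving consistency intact but accuracy off:

```python
import numpy as np

rng = np.random.default_rng(42)
true_weight = rng.normal(150, 15, size=200)        # "true" body weights (lb)

random_error = rng.normal(0, 5, size=200)          # transient, unsystematic noise
systematic_error = -2.0                            # miscalibrated scale: always 2 lb low

noisy_reading = true_weight + random_error         # unreliable but unbiased
biased_reading = true_weight + systematic_error    # reliable but invalid

print(np.corrcoef(true_weight, noisy_reading)[0, 1])   # below 1: random error lowers consistency
print(np.corrcoef(true_weight, biased_reading)[0, 1])  # ~1.0: perfectly consistent readings...
print((biased_reading - true_weight).mean())           # ...but every reading is 2 lb off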
The concept of error is important when appraising instruments in a study. The information regarding the instruments' reliability and validity is found in the instrument or measures section of a study, which can be separately titled or appear as a subsection of the methods section of a research report, unless the study is a psychometric or instrument development study (see Chapter 10).
HELPFUL HINT
Research articles vary considerably in the amount of detail included about reliability and validity. When the focus of a study is instrument development, psychometric evaluation—including reliability and validity data—is carefully documented and appears throughout the article rather than briefly in the "Instruments" or "Measures" section, as in a research article.
Validity
Validity is the extent to which an instrument measures the attributes of a concept accurately. When an instrument is valid, it reflects the concept it is supposed to measure. A valid instrument that is supposed to measure anxiety does so; it does not measure another concept, such as stress. A measure can be reliable but not valid. Let's say that a researcher wanted to measure anxiety in patients by measuring their body temperatures. The researcher could obtain highly accurate, consistent, and precise temperature recordings, but such a measure may not be a valid indicator of anxiety. Thus the high reliability of an instrument is not necessarily congruent with evidence of validity. A valid instrument, however, is reliable. An instrument cannot validly measure a variable if it is erratic, inconsistent, or inaccurate. There are three types of validity that vary according to the kind of information provided and the purpose of the instrument (i.e., content, criterion-related, and construct validity). As you appraise research articles, you will want to evaluate whether sufficient evidence of validity is present and whether the type of validity is appropriate to the study's design and the instruments used in the study.
As you read the instruments or measures sections of studies, you will notice that validity data are reported much less frequently than reliability data. DeVon and colleagues (2007) note that adequate validity is frequently claimed, but the method is rarely specified. This underreporting, largely due to publication space constraints, underscores the importance of critiquing the quality of the instruments and the conclusions (see Chapters 14 and 17).
EVIDENCE-BASED PRACTICE TIP
Selecting instruments that have strong evidence of validity increases your confidence in the study findings—that the researchers actually measured what they intended to measure.
Content validity
Content validity represents the universe of content or the domain of a given variable/construct. The universe of content provides the basis for developing the items that will adequately represent the content. When an investigator is developing an instrument and issues of content validity arise, the concern is whether the measurement instrument and the items it contains are representative of the content domain that the researcher intends to measure. The researcher begins by defining the concept and identifying the attributes or dimensions of the concept. The items that reflect the concept and its domain are developed.
The formulated items are submitted to content experts who judge them. Example: ➤ Researchers typically request that the experts indicate their agreement with the scope of the items and the extent to which the items reflect the concept under consideration. Box 15.1 provides an example of content validity.
BOX 15.1
Published Examples of Content Validity and Content Validity Index
Content validity
An expert panel of key stakeholders assisted with validation of the items on the adherence subscale on the modified version of the Fidelity Checklist. To determine adherence items, the expert panel of key stakeholders identified items that were deemed mandatory for clinicians to cover during the intervention. The mandatory items were used to develop the adherence subscale. Through in-person discussion, key stakeholders arrived at 100% agreement on the relevance of each item of the adherence subscale, ensuring the content validity of both the intervention and the adherence subscale (Clark et al., 2016).
Content validity index
For the Chinese Illness Perception Questionnaire Revised Trauma (the Chinese IPQ-Revised-Trauma), the Item-level Content Validity Index (I-CVI) was calculated by a panel of five trauma content experts. The ratings were on a four-point scale with a response format of 1 = not relevant to 4 = highly relevant, and the I-CVI for each item was computed as the percentage of experts giving a rating of 3 or 4, indicating item relevance. The experts scored an average of 88% across all subscale items, supporting the validity of the scale, and a few words were revised after expert review (Lee et al., 2016).
Another method used to establish content validity is the content validity index (CVI). The CVI moves beyond the level of agreement of a panel of expert judges and calculates an index of interrater agreement or relevance. This calculation gives a researcher more confidence or evidence that the instrument truly reflects the concept or construct. When reading the instrument section of a research article, note that the authors will comment if a CVI was used to assess content validity. When reading a psychometric study that reports the development of an instrument, you will find great detail and a much longer section indicating how exactly the researchers calculated the CVI and the acceptable item cutoffs. In the scientific literature there has been discussion of accepting a CVI of 0.78 to 1.0, depending on the number of experts (DeVon et al., 2007; Lynn, 1986). An example from a study that used CVI is presented in Box 15.1. A subtype of content validity is face validity, which is a rudimentary type of validity that basically verifies that the instrument gives the appearance of measuring the concept. It is an intuitive type of validity in which colleagues or subjects are asked to read the instrument and evaluate the content in terms of whether it appears to reflect the concept the researcher intends to measure.
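For readers who want to see the arithmetic, here is a small sketch of the I-CVI calculation in Python (the expert ratings below are invented; the 3-or-4 rule follows the Lee et al. example in Box 15.1):

```python
# Hypothetical relevance ratings (1-4) from five experts for three items
ratings = {
    "item_1": [4, 3, 4, 4, 3],
    "item_2": [4, 2, 3, 4, 4],
    "item_3": [3, 4, 4, 4, 4],
}

for item, scores in ratings.items():
    relevant = sum(1 for s in scores if s >= 3)   # experts rating the item 3 or 4
    i_cvi = relevant / len(scores)
    print(f"{item}: I-CVI = {i_cvi:.2f}")          # compare against the 0.78-1.0 range
```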
EVIDENCE-BASED PRACTICE TIP
If face and/or content validity, the most basic types of validity, were the only types of validity reported in a research article, you would not appraise the measurement instrument(s) as having strong psychometric properties, which would negatively influence your confidence in the study findings.
Criterion-related validity
Criterion-related validity indicates to what degree the subject's performance on the instrument and the subject's actual behavior are related. The criterion is usually a second measure that assesses the same concept under study. Two forms of criterion-related validity are concurrent and predictive.
Concurrent validity refers to the degree of correlation of one test with the scores of another more established instrument of the same concept when both are administered at the same time. A high correlation coefficient indicates agreement between the two measures and evidence of concurrent validity.
Predictive validity refers to the degree of correlation between the measure of the concept and some future measure of the same concept. Because of the passage of time, the correlation coefficients are likely to be lower for predictive validity studies. Examples of concurrent and predictive validity as they appear in research articles are illustrated in Box 15.2.
BOX 15.2
Published Examples of Reported Criterion-Related Validity
Concurrent validity
The Patient-Reported Outcomes Measurement Information System Fatigue-Short Form (PROMIS-F-SF) consists of seven items that measure both the experience of fatigue and the interference of fatigue on daily activities over the past week (NIH, 2007). Concurrent validity of the PROMIS-F-SF was established through correlations between the PROMIS-F-SF and the Multidimensional Fatigue Symptom Inventory-Short Form (MFSI-SF), as well as the Brief Fatigue Inventory (BFI). Correlations between the PROMIS-F-SF and the MFSI-SF ranged from r = 0.70 to 0.85, and correlations between the PROMIS-F-SF and the BFI ranged from r = 0.60 to 0.71. Correlations between measures of like constructs are expected to be strong. As all three were measures of fatigue, strong correlations were expected and provided evidence of concurrent validity (Ameringer et al., 2016).
Predictive validity
In a study modifying the Champion Health Belief Model Scale (Champion, 1993) to fit with prostate cancer screening (PCS), translate it into Arabic, and test the psychometric properties of the Champion Health Belief Model Scale for Prostate Cancer Screening (CHBMS-PCS), the predictive validity of the Arabic version was established by using regression analysis (Chapter 16) to predict the combined predictive effect of all seven subscales of the CHBMS-PCS on the performance of the PCS. All of the subscales were found to be significantly correlated with and predictive of the performance of the PCS at the p < .05 level or less (Abuadas et al., 2016).
Construct validity
Construct validity is based on the extent to which a test measures a theoretical construct, attribute, or trait. It attempts to validate the theory underlying the measurement by testing the hypothesized relationships. Testing confirms or fails to confirm the relationships that are predicted between and/or among concepts and, as such, provides more or less support for the construct validity of the instruments measuring those concepts. The establishment of construct validity is complex, often involving several studies and approaches. The hypothesis-testing, factor analytical, convergent and divergent, and contrasted-groups approaches are discussed in the following sections. Box 15.3 provides examples of different types of construct validity as reported in published research articles.
BOX 15.3
Published Examples of Reported Construct Validity
Contrasted groups (known groups)
In the study to examine the psychometric properties of the Patient-Reported Outcomes Measurement Information System Fatigue Short-Form across diverse populations, known-groups validity was established by comparing the levels of fatigue in the four study samples (e.g., fibromyalgia, sickle cell disease, cardiometabolic risk, pregnancy) with those of healthy controls. The study samples had significantly higher levels of fatigue than the healthy controls (Ameringer et al., 2016).
Convergent validity
"Convergent construct validity of the Spiritual Coping Strategies Scale (SCS) subscales is supported by correlations of 0.40 with the well-established Spiritual Well Being instrument (Baldacchino & Bulhagiar, 2003). In this study, parents' subscale internal consistencies at T1 and T2 were r = 0.87 to 0.90 for religious activities and r = 0.80 to 0.82 for spiritual activities" (Hawthorne et al., 2016; Appendix B).
Divergent (discriminant) validity
Pearson correlations between the Patient-Reported Outcomes Measurement Information System Fatigue Short-Form (PROMIS-F-SF) and both the Perceived Stress Scale (PSS) and a measure of depressive symptoms (CES-D) were calculated to assess discriminant validity. Correlations between measures of constructs that are related but not alike are expected to be weak to moderate; the correlations between the PROMIS-F-SF and the CES-D (r = 0.45 to 0.64) and between the PROMIS-F-SF and the PSS (r = 0.37 to 0.62) supported the discriminant validity of the PROMIS-F-SF (Ameringer et al., 2016).
Factor analysis
In a study assessing nurses' perceived leadership abilities during episodes of clinical deterioration, Hart and colleagues (2016) did psychometric testing of the Clinical Deterioration Leadership Ability Scale (CDLAS). Construct validity was supported by a principal components analysis with varimax rotation. The factor analysis identified a one-factor structure with factor loadings ranging from 0.655 to 0.792, all exceeding the factor loading cutoff of 0.40; this factor was named leadership abilities.
Hypothesis testing
In a study assessing nurses' perceived leadership abilities during episodes of clinical deterioration, it was hypothesized that nurses with 11 or more years of practice experience would score significantly higher on the Clinical Deterioration Leadership Ability Scale (CDLAS) than nurses with 10 years or less of practice experience. A statistically significant difference in mean CDLAS scores (t test, p = 0.047) supported this hypothesis, thereby providing evidence of construct validity (Hart et al., 2016).
Hypothesis-testing approach
When the hypothesis-testing approach is used, the investigator uses the theory or concept underlying the measurement instruments to validate the instrument. The investigator does this by developing hypotheses regarding the behavior of individuals with varying scores on the measurement instrument, collecting data to test the hypotheses, and making inferences on the basis of the findings concerning whether the rationale underlying the instrument's construction is adequate to explain the findings and thereby provide support for evidence of construct validity (see Box 15.3).
Convergent and divergent approaches
Strategies for assessing construct validity include convergent and divergent approaches. Convergent validity refers to a search for other measures of the same construct. Sometimes two or more instruments that theoretically measure the same construct are identified, and both are administered to the same subjects. A correlational analysis (i.e., test of relationship; see Chapter 16) is performed. If the measures are positively correlated, convergent validity is said to be supported.
Divergent validity, sometimes called discriminant validity, uses measurement approaches that differentiate one construct from others that may be similar. Sometimes researchers search for instruments that measure the opposite of the construct. If the divergent measure is negatively related to other measures, validity for the measure is strengthened.
HELPFUL HINT
When validity data about the measurements used in a study are not included in a research article, you have no way of determining whether the intended concept is actually being captured by the measurement. Before you use the results in such a case, it is important to go back to the original primary source to check the instrument's validity.
Contrasted-groups approach
When the contrasted-groups approach (sometimes called the known-groups approach) is used to test construct validity, the researcher identifies two groups of individuals who are expected to score extremely high or low on the characteristic being measured by the instrument. The instrument is administered to both the high-scoring and the low-scoring groups, and the differences in scores are examined. If the instrument is sensitive to individual differences in the trait being measured, the mean performance of these two groups should differ significantly, providing evidence of construct validity. A t test or analysis of variance could be used to statistically test the difference between the two groups (see Box 15.3 and Chapter 16).
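A minimal sketch of this known-groups comparison (invented scores; assuming the SciPy library is available):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Hypothetical scale scores for two groups expected to differ on the trait
high_group = rng.normal(40, 5, size=30)   # e.g., experienced nurses
low_group = rng.normal(33, 5, size=30)    # e.g., novice nurses

t_stat, p_value = stats.ttest_ind(high_group, low_group)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A significant difference in the expected direction supports construct validity.
```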
EVIDENCE-BASED PRACTICE TIP
When the instruments used in a study are presented, note whether the sample(s) used to develop the measurement instrument(s) is (are) similar to your patient population.
Factor analytical approach
A final approach to assessing construct validity is factor analysis. This is a procedure that gives the researcher information about the extent to which a set of items measures the same underlying concept (variable) of a construct. Factor analysis assesses the degree to which the individual items on a scale truly cluster around one or more concepts. Items designed to measure the same concept should load on the same factor; those designed to measure different concepts should load on different factors (Anastasi & Urbina, 1997; Furr & Bacharach, 2008; Nunnally & Bernstein, 1993). This analysis, as illustrated in the example in Box 15.3, will also indicate whether the items in the instrument reflect a single construct or several constructs.
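A rough sketch of an exploratory one-factor model (assuming a recent version of scikit-learn; the chapter does not prescribe software, and the simulated item responses are invented):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Simulate 200 respondents answering 5 items driven by one latent trait
latent = rng.normal(size=(200, 1))
loadings_true = np.array([[0.8, 0.7, 0.75, 0.65, 0.7]])
items = latent @ loadings_true + rng.normal(scale=0.5, size=(200, 5))

fa = FactorAnalysis(n_components=1, rotation="varimax")
fa.fit(items)
print(fa.components_.round(2))  # estimated loadings; values above ~0.40 "load" on the factor
```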
The Critical Thinking Decision Path will help you assess the appropriateness of the type of validity and reliability selected for use in a particular study.
CRITICAL THINKING DECISION PATH
Determining the Appropriate Type of Validity and Reliability Selected for a Study
Reliability
Reliable people are those whose behavior can be relied on to be consistent and predictable. Likewise, the reliability of an instrument is defined as the extent to which the instrument produces the same results if the behavior is repeatedly measured with the same scale. Reliability is concerned with consistency, accuracy, precision, stability, equivalence, and homogeneity. Concurrent with the questions of validity, or after they are answered, you ask about the reliability of the instrument. Reliability refers to the proportion of consistency to inconsistency in measurement. In other words, if we use the same or comparable instruments on more than one occasion to measure a set of behaviors that ordinarily remains relatively constant, we would expect similar results if the instruments are reliable.
The main attributes of a reliable scale are stability, homogeneity, and equivalence. The stability of an instrument refers to the instrument's ability to produce the same results with repeated testing. The homogeneity of an instrument means that all of the items in an instrument measure the same concept, variable, or characteristic. An instrument is said to exhibit equivalence if it produces the same results when equivalent or parallel instruments or procedures are used. An understanding of each of these attributes, and of how to interpret reliability, is essential.
Reliability coefficient interpretation
Reliability is concerned with the degree of consistency between scores that are obtained at two or more independent times of testing and is expressed as a correlation coefficient. The reliability coefficient ranges from 0 to 1 and expresses the relationship among the error variance, the true (score) variance, and the observed score. A zero correlation indicates that there is no relationship. When the error variance in a measurement instrument is low, the reliability coefficient will be closer to 1. The closer to 1 the coefficient is, the more reliable the instrument. Example: ➤ A reliability coefficient of an instrument is reported to be 0.89. This tells you that the error variance is small and the instrument has little measurement error. On the other hand, if the reliability coefficient of a measure is reported to be 0.49, the error variance is high and the instrument has a problem with measurement error. For a research instrument to be considered reliable, a reliability coefficient of 0.70 or above is necessary. For a clinical instrument, a reliability coefficient of 0.90 or higher is considered an acceptable level of reliability.
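In classical test theory, the reliability coefficient can be expressed as the ratio of true-score variance to observed-score (true plus error) variance. A tiny illustration with invented variance components shows how rising error variance pulls the coefficient from 0.89 toward 0.49:

```python
true_variance = 100.0

for error_variance in (12.5, 25.0, 104.0):
    observed_variance = true_variance + error_variance   # observed = true + error
    reliability = true_variance / observed_variance
    print(f"error variance {error_variance:>6}: reliability = {reliability:.2f}")
# Small error variance yields a coefficient near 1; large error variance drags it toward 0.49.
```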
The test of reliability used to calculate a reliability coefficient depends on the nature of the instrument. The tests are test-retest, parallel or alternate form, item to total correlation, split-half, Kuder-Richardson (KR-20), Cronbach's alpha, and interrater reliability. These tests, as they relate to stability, equivalence, and homogeneity, are listed in Box 15.4, and examples of the types of reliability are in Box 15.5. There is no single best means to assess reliability; the method that the researcher uses should be consistent with the study's aim and the instrument's format.
BOX 15.4
Measures Used to Test Reliability
Stability
Test-retest reliability
Parallel or alternate form
Homogeneity
Item to total correlation
Split-half reliability
Kuder-Richardson coefficient
Cronbach's alpha
Equivalence
Parallel or alternate form
Interrater reliability
BOX 15.5
Published Examples of Reported Reliability
Internal consistency
In a study by Bhandari and Kim (2016) investigating self-care behaviors of Nepalese adults with type 2 diabetes, self-care behaviors were measured by the DMSE scale (Bijl et al., 1999). Cronbach's alpha was 0.81 in a study of European adults with type 2 diabetes (Bijl et al., 1999); it was 0.86 in the current study.
Test-retest reliability
In a study by Ganz and colleagues (2016) that examined whether nurses fully implement their scope of practice, the Implementation of Scope of Practice Scale, developed by the researchers for the study, established strong test-retest reliability (r = 0.92) by administering the scale at baseline and again 3 weeks later.
Kuder-Richardson (KR-20)
A study by Jessee and Tanner (2016) aimed to develop the Clinical Coaching Interactions Inventory, a tool to evaluate one-to-one teaching, verbal questioning, and feedback behaviors of clinical faculty and/or preceptors interacting with students in clinical settings. The teaching-questioning dimension demonstrated a Kuder-Richardson Formula 20 (KR-20) coefficient of 0.70 overall, 0.63 for the faculty version, and 0.71 for the staff nurse preceptor version. Because the inventory is composed of binary items (e.g., yes/no), a lower KR-20 reliability estimate is not unexpected and can still be considered acceptable.
Interrater reliability and kappa
In the Johansson and colleagues (2016) study evaluating the oral health status of older adults in Sweden receiving elder care, the ROAG-J was used to assess oral health by evaluating the condition of the voice, lips, oral mucosa, tongue, gums, teeth, saliva, swallowing, and any prostheses. Moderate to good interrater reliability was reported for the trained examiners (mean kappa estimate 0.59); interrater reproducibility (kappa estimate 1.00) and high sensitivity and specificity within elderly care have been reported in previous studies (Anderson et al., 2002; Ribeiro et al., 2014).
Item to total
Abuadas and colleagues (2016) examined the item-to-total correlations as part of the assessment of reliability for the Arabic version of the Champion's Health Belief Model Scales for Prostate Cancer Screening (CHBMS-PCS). The aim was to identify poorly functioning items on the CHBMS-PCS. A cutoff score of 0.30 was established; all of the corrected item-to-total correlations were greater than 0.30, ranging from 0.60 to 0.79, indicating that the scale items were consistent with one another. This was reinforced by the Cronbach's alpha coefficient of 0.87 for the total scale.
Stability
An instrument is stable or exhibits stability when the same results are obtained on repeated administration of the instrument. Measurement over time is important when an instrument is used in a longitudinal study and therefore administered on several occasions. Stability is also a consideration when a researcher is conducting an intervention study designed to effect a change in a specific variable. In this case, the instrument is administered once and then again after the experimental intervention has been completed. The tests used to estimate stability are test-retest and parallel or alternate form.
Test-retest reliability
Test-retest reliability is the administration of the same instrument to the same subjects under similar conditions on two or more occasions. Scores from repeated testing are compared. This comparison is expressed by a correlation coefficient, usually a Pearson r (see Chapter 16). The interval between repeated administrations varies and depends on the variable being measured. Example: ➤ If the variable that the test measures is related to the developmental stages in children, the interval between tests should be short. The amount of time over which the variable was measured should also be identified in the study.
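A minimal sketch of the computation (invented scores for eight subjects; assuming SciPy is available):

```python
from scipy import stats

# Hypothetical scores for the same 8 subjects at two administrations, 3 weeks apart
time_1 = [22, 31, 18, 27, 35, 24, 29, 20]
time_2 = [24, 30, 17, 28, 33, 25, 28, 22]

r, p = stats.pearsonr(time_1, time_2)
print(f"test-retest r = {r:.2f}")  # values of 0.70 or above suggest acceptable stability
```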
HELPFUL HINT
When a longitudinal design with multiple data collection points is being conducted, look for evidence of test-retest or parallel form reliability.
Parallel or alternate form
Parallel or alternate form reliability is applicable and can be tested only if two comparable forms of the same instrument exist. Not many instruments have a parallel form, so it is unusual to find examples in the literature. It is similar to test-retest reliability in that the same individuals are tested within a specific interval, but it differs because a different form of the same test is given to the subjects on the second testing. Parallel forms or tests contain the same types of items based on the same concept, but the wording of the items is different. The development of parallel forms is desired if the instrument is intended to measure a variable for which a researcher believes that "test-wiseness" will be a problem (see Chapter 8). Example: ➤ Consider a study to establish the reliability and validity of the Social Attribution Task-Multiple Choice (SAT-MC), a measurement instrument of geometric figures designed to assess implicit social attribution formation while reducing the verbal and cognitive demands required by other common measures (Johannesen et al., 2013). The authors conducted a comparable analysis of the SAT-MC and the new SAT-MC II, a comparable form created for repeated testing to decrease threats to internal validity related to practice effect. External validation measures between the two forms were nearly identical, with evidence supporting convergent and divergent validity. Practically speaking, it is difficult to develop alternate forms of an instrument when one considers the many issues of reliability and validity. If alternate forms of a test exist, they should be highly correlated if the measures are to be considered reliable.
Internal consistency/homogeneity
Another attribute of an instrument related to reliability is internal consistency or homogeneity. In this case, the items within the scale reflect or measure the same concept, meaning that the items correlate with or complement one another and that the scale is unidimensional. A unidimensional scale is one that measures one concept, such as self-efficacy. Box 15.5 provides several examples of how internal consistency is reported. Internal consistency can be assessed using one of four methods: item to total correlations, split-half reliability, the Kuder-Richardson (KR-20) coefficient, or Cronbach's alpha.
EVIDENCE-BASED PRACTICE TIP
When the characteristics of a study sample differ significantly from the sample in the original study, check to see if the researcher has reestablished the reliability of the instrument with the current sample.
Item to total correlations
Item to total correlations measure the relationship between each of the items and the total scale. When item to total correlations are calculated, a correlation for each item on the scale is generated (Table 15.1). Items that do not achieve a high correlation may be deleted from the instrument. Usually not all of the item to total correlations are reported in a research study, unless the study is a report of a methodological study; the lowest and highest correlations are typically reported.
TABLE 15.1 Examples of Cronbach’s Alpha From the Alhusen Study (Appendix B)
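A sketch of the corrected item-to-total calculation just described (hypothetical responses; assuming the pandas library), in which each item is correlated with the sum of the remaining items:

```python
import pandas as pd

# Hypothetical responses from 6 subjects to a 4-item scale
items = pd.DataFrame({
    "item_1": [3, 4, 2, 5, 4, 3],
    "item_2": [2, 4, 2, 4, 5, 3],
    "item_3": [3, 3, 1, 5, 4, 2],
    "item_4": [4, 5, 2, 4, 4, 3],
})

for col in items.columns:
    rest = items.drop(columns=col).sum(axis=1)   # total of the remaining items
    r = items[col].corr(rest)                    # corrected item-to-total correlation
    print(f"{col}: r = {r:.2f}")                  # items below ~0.30 are candidates for deletion
```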
Cronbach's alpha
The most commonly used test of internal consistency is Cronbach's alpha, which is used when a measurement instrument has a Likert scale response format. Many scales used to measure psychosocial variables and attitudes have a Likert scale response format. A Likert scale asks the subject to respond to a question on a scale of varying degrees of intensity between two extremes. The two extremes are anchored by responses ranging from "strongly agree" to "strongly disagree" or "most like me" to "least like me." The points between the two extremes may range from 1 to 4, 1 to 5, or 1 to 7. Subjects are asked to identify the response closest to how they feel. Cronbach's alpha simultaneously compares each item in the scale with the others, and a total score is then used in the data analysis, as illustrated in Table 15.1. Alphas above 0.70 are sufficient evidence for supporting the internal consistency of the instrument. Fig. 15.2 provides examples of items from an instrument that uses a Likert scale format.
FIG 15.2 Examples of a Likert scale. (Redrawn from Roberts, K. T., & Aspy, C. B. (1993). Development of the serenity scale. Journal of Nursing Measurement, 1(2), 145–164.)
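Behind the reported coefficient, the alpha formula simply compares the sum of the item variances with the variance of the total score. A compact sketch with invented Likert responses (assuming NumPy):

```python
import numpy as np

# Hypothetical Likert responses: rows are subjects, columns are the k scale items
scores = np.array([
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [4, 4, 5, 4],
])

k = scores.shape[1]
item_variances = scores.var(axis=0, ddof=1)
total_variance = scores.sum(axis=1).var(ddof=1)

alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
print(f"Cronbach's alpha = {alpha:.2f}")  # values above 0.70 support internal consistency
```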
Split-half reliability
Split-half reliability involves dividing a scale into two halves and making a comparison. The halves may be odd-numbered and even-numbered items, a simple division of the first half from the second, or randomly selected halves that are analyzed opposite one another. The split-half method provides a measure of consistency. The two halves of the test, and the content in both halves, are assumed to be comparable, and a reliability coefficient is calculated. If the scores for the two halves are approximately equal, the test may be considered reliable. See Box 15.5 for an example.
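A sketch of an odd-even split on invented data (note that the Spearman-Brown step shown here is a common convention for projecting the half-test correlation to full length, not something this chapter specifies):

```python
import numpy as np

# Hypothetical responses from 8 subjects to a 6-item scale
scores = np.array([
    [3, 4, 3, 4, 3, 4],
    [2, 2, 3, 2, 2, 3],
    [5, 4, 5, 5, 4, 5],
    [3, 3, 2, 3, 3, 2],
    [4, 5, 4, 4, 5, 4],
    [2, 3, 2, 2, 3, 3],
    [5, 5, 4, 5, 4, 5],
    [3, 2, 3, 3, 2, 3],
])

odd_half = scores[:, 0::2].sum(axis=1)    # items 1, 3, 5
even_half = scores[:, 1::2].sum(axis=1)   # items 2, 4, 6

r_half = np.corrcoef(odd_half, even_half)[0, 1]
full_scale = 2 * r_half / (1 + r_half)    # Spearman-Brown correction for full length
print(f"half-test r = {r_half:.2f}, corrected = {full_scale:.2f}")
```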
Kuder-Richardson (KR-20) coefficient
The Kuder-Richardson (KR-20) coefficient is the estimate of homogeneity used for instruments that have a dichotomous response format. A dichotomous response format is one in which the question asks for a "yes/no" or "true/false" response. The technique yields a correlation that is based on the consistency of responses to all the items of a single form of a test that is administered one time. The minimum acceptable KR-20 score is r = 0.70 (see Box 15.5).
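A sketch of the KR-20 arithmetic on invented yes/no responses (assuming NumPy):

```python
import numpy as np

# Hypothetical dichotomous responses (1 = yes, 0 = no): rows are subjects, columns items
answers = np.array([
    [1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 1, 0, 1],
    [1, 0, 0, 1, 1],
])

k = answers.shape[1]
p = answers.mean(axis=0)                      # proportion answering "yes" per item
q = 1 - p
total_variance = answers.sum(axis=1).var(ddof=1)

kr20 = (k / (k - 1)) * (1 - (p * q).sum() / total_variance)
print(f"KR-20 = {kr20:.2f}")  # compare against the 0.70 minimum
```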
HIGHLIGHT
Your team is critically appraising a research study reporting on an innovative intervention for reducing the risk for hospital-acquired pressure ulcers. Data are collected using observation and multiple observers. You want to find evidence that the observers have been trained until there is a high level of interrater reliability, so that you are confident that they were observing the subjects' skin according to standardized criteria and completing their checklist ratings in a consistent way across observers.
Equivalence
Equivalence is either the consistency or agreement among observers using the same measurement instrument or the consistency or agreement between alternate forms of an instrument. An instrument is thought to demonstrate equivalence when two or more observers have a high percentage of agreement about an observed behavior or when alternate forms of a test yield a high correlation. There are two methods to test equivalence: interrater reliability and alternate or parallel form.
Interrater reliability
Some measurement instruments are not self-administered questionnaires but are direct measurements of observed behavior. Instruments that depend on direct observation of a behavior that is to be systematically recorded must be tested for interrater reliability. To accomplish interrater reliability, two or more individuals should make an observation, or one observer should examine the behavior on several occasions. The observers should be trained or oriented to the definition and operationalization of the behavior to be observed. The consistency or reliability of the observations among observers is extremely important. Interrater reliability tests the consistency of the observers rather than the reliability of the instrument. Interrater reliability is expressed as a percentage of agreement between scorers or as a correlation coefficient of the scores assigned to the observed behaviors.
Kappa (κ) expresses the level of agreement observed beyond the level that would be expected by chance alone. κ ranges from +1 (perfect agreement) through 0 (agreement no better than chance). A κ of 0.80 or better indicates good interrater reliability. A κ between 0.68 and 0.80 is considered acceptable/substantial agreement; below 0.68, only tentative conclusions can be drawn, although lower levels are at times accepted (McDowell & Newell, 1996) (see Box 15.5).
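A minimal sketch of the kappa computation (invented ratings from two observers; the scikit-learn routine used here is one of several packages that implement it):

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical ratings (1 = behavior observed, 0 = not observed) from two trained observers
rater_a = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
rater_b = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]

kappa = cohen_kappa_score(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")  # 0.80 or better indicates good interrater reliability
```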
EVIDENCE-BASED PRACTICE TIP
Interrater reliability is an important approach to minimizing bias.
Parallel or alternate form
Parallel or alternate form reliability was described earlier in this chapter in the discussion of stability. Use of parallel forms is a measure of both stability and equivalence. The procedures for assessing equivalence using parallel forms are the same.
Classic test theory versus item response theory
The methods of reliability and validity described in this chapter are considered classical test theory (CTT) methods. There are newer methods that you will find described in research articles under the category of item response theory (IRT). The two approaches share basic characteristics, but some psychometricians consider IRT methods superior for discriminating among test items. Several terms and concepts linked with IRT are Rasch models and one- (or two-) parameter logistic models. The methodology is beyond the scope of this text, but several references are cited for future use (DeVellis, 2012; Furr & Bacharach, 2008).
How validity and reliability are reported
When reading a research article, you typically will not find a lengthy discussion of how the different types of reliability and validity were obtained. What you will find in the methods section is the instrument's title, a definition of the concept/construct that it measures, and a sentence or two about its reliability and validity, which is appropriate. Examples of what you will see include the following:
• "Tedeschi and Calhoun (1996) reported an internal consistency coefficient of 0.9 for the full scale of the PTGI (Post-traumatic Growth Inventory) and a test-retest reliability of 0.71 after two months. Yaskowich (2003) reported an internal consistency for the full scale of the modified PTGI in a sample of 35 adolescent cancer survivors. The internal consistency of the modified PTGI was 0.94 for survivors and siblings and 0.96 for parents in the current study" (Turner-Sack et al., 2016, p. 51; Appendix D).
• The Connor-Davidson Resilience Scale (CD-RISC) reports that "Cronbach's alpha for the full scale is reported to be .89 and item-total correlations range from .30 to .70. The CD-RISC possesses good validity and reliability in the Iranian population (Khoshouei, 2009) and Cronbach's alpha for the scale in the current study was .93" (Barahmand & Ahmad, 2016).
• "Content, construct, and criterion-related validity have been documented for the Bakas Caregiving Outcomes Scale (BCOS), which measures life changes (e.g., changes in social functioning, subjective well-being, and physical health). Evidence of internal consistency reliability has been documented in primary care and with stroke caregivers. The Cronbach alpha for the BCOS in this study was 0.87" (Bakas et al., 2015).
Appraisal for evidence-based practice: Reliability and validity
Reliability and validity are crucial aspects in the critical appraisal of a measurement instrument. Criteria for critiquing reliability and validity are presented in the Critical Appraisal Criteria box. When reviewing a research article, you need to appraise each instrument's reliability and validity. In a research article, the reliability and validity for each measure should be presented, or a reference should be provided where they are described in more detail. If this information has not been presented at all, you must seriously question the merit and use of the instrument and the evidence provided by the study's results.
CRITICAL APPRAISAL CRITERIA
Reliability and validity
1. Was an appropriate method used to test the reliability of the instrument?
2. Is the reliability of the instrument adequate?
3. Was an appropriate method(s) used to test the validity of the instrument?
4. Is the validity of the measurement instrument adequate?
5. If the sample from the developmental stage of the instrument was different from the current sample, were the reliability and validity recalculated to determine if the instrument is appropriate for use in a different population?
6. What kinds of threats to internal and/or external validity are presented by weaknesses in reliability and/or validity?
7. Are strengths and weaknesses of the reliability and validity of the instruments appropriately addressed in the “Discussion,” “Limitations,” or “Recommendations” sections of the report?
8. How do the reliability and/or validity affect the strength and quality of the evidence provided by the study findings?
The amount of information provided for each instrument will vary depending on the study type and the instrument. In a psychometric study (an instrument development study) you will find great detail regarding how the researchers established the reliability and validity of the instrument. When reading a research article in which the instruments are used to test a research question or hypothesis, you may find only brief reference to the type of reliability and validity of the instrument. If the instrument is a well-known, reliable, and valid instrument, it is not uncommon that only a passing comment is made, which is appropriate. Example: ➤ In the study by Vermeesch and colleagues (2015) examining biological and sociocultural differences in perceived barriers to physical activity among fifth- to seventh-grade urban girls, the researchers noted that acceptable face, content, and construct validity, and reliability estimated by a Cronbach's alpha of 0.78, had been reported (Robbins et al., 2008, 2009). As in this example, authors often cite a reference that you can locate if you are interested in detailed data about the instrument's reliability or validity. If a study does not use reliable and valid questionnaires, you need to consider the sources of bias that may exist as threats to internal or external validity. It is very difficult to place confidence in the evidence generated by a study's findings if the measures used did not have established validity and reliability. The following discussion highlights key areas related to reliability and validity that should be evident as you read a research article.
The investigator determines which type of reliability procedures to use in the study, depending on the nature of the measurement instrument and how it will be used. Example: ➤ If the instrument is to be administered twice, you would expect to read that test-retest reliability was used to establish the stability of the instrument. If an alternate form has been developed for use in a repeated-measures design, evidence of alternate form reliability should be presented to determine the equivalence of the parallel forms. If the degree of internal consistency among the items is relevant, an appropriate test of internal consistency should be presented. In some instances, more than one type of reliability will be presented, but as you assess the instruments section of a research report, you should determine whether all are appropriate. Example: ➤ The Kuder-Richardson formula implies that there is a single right or wrong answer, making it inappropriate to use with scales that provide a format of three or more possible responses. In the latter case, another formula is applied, such as Cronbach's coefficient alpha. Another important consideration is the acceptable level of reliability, which varies according to the type of test. Reliability coefficients of 0.70 or higher are desirable. The validity of an instrument is limited by its reliability; that is, less confidence can be placed in scores from tests with low reliability coefficients.
Satisfactory evidence of validity will probably be the most difficult item for you to ascertain. It is this aspect of measurement that is most likely to fall short of meeting the required criteria; page count limitations often account for this brevity. Detailed validity data usually are reported only in studies focused on instrument development; therefore validity data are mentioned only briefly or, sometimes, not at all. The most common type of reported validity is content validity. When reviewing a study, you want to find evidence of content validity. Once again, you will find the detailed reporting of content validity and the CVI in psychometric studies; Box 15.1 provides a good example of how content validity is reported in a psychometric study. Such procedures provide you with assurance that the instrument is psychometrically sound and that the content of the items is consistent with the conceptual framework and construct definitions. In studies where several instruments are used, the reporting of content validity is either absent or very brief.
Construct validity and criterion-related validity are more precise statistical tests of whether the instrument measures what it is supposed to measure. Ideally, an instrument should provide evidence of content validity, as well as criterion-related or construct validity, before one invests a high level of confidence in the instrument. You will see evidence that the reliability and validity of a measurement instrument are reestablished periodically, as you can see in the examples that appear in Boxes 15.2 to 15.5. You would expect to see the strengths and weaknesses of instrument reliability and validity presented in the "Discussion," "Limitations," and/or "Recommendations" sections of an article. In this context, the reliability and validity might be discussed in terms of bias—that is, threats to internal and/or external validity that affect the study findings. Example: ➤ In the study by Hart and colleagues (2016), evaluating the psychometric properties of the Clinical Deterioration Leadership Ability Scale (CDLAS), the authors note that despite satisfactory reliability and validity findings, limitations include the homogeneous sample of mostly white, female registered nurses practicing in one integrated five-hospital health system in the southeastern United States. This sample limits the generalizability of the results to other populations. The authors suggest that further research is needed with diverse groups of nurses in multiple geographic locations. In addition, further research should focus on conducting test-retest reliability to further establish the psychometric properties of the CDLAS.
The findings of any study in which the reliability and validity data are sparse have limited generalizability, but such studies still add to our knowledge regarding future research directions. Finally, recommendations for improving future studies in relation to instrument reliability and validity may be proposed.
As you can see, the area of reliability and validity is complex. You should not feel intimidated by the complexity of this topic; use the guidelines presented in this chapter to systematically assess the reliability and validity aspects of a research study. Collegial dialogue is also an approach for evaluating the merits and shortcomings of an existing, as well as a newly developed, instrument that is reported in the nursing literature. Such an exchange promotes the understanding of methodologies and techniques of reliability and validity, stimulates the acquisition of a basic knowledge of psychometrics, and encourages the exploration of alternative methods of observation and use of reliable and valid instruments in clinical practice.
Key points
• Reliability and validity are crucial aspects of conducting and critiquing research.
• Validity is the extent to which an instrument measures the attributes of a concept accurately. Three types of validity are content validity, criterion-related validity, and construct validity.
• The choice of a method for establishing reliability or validity is important and is made by the researcher on the basis of the characteristics of the measurement instrument and its intended use.
• Reliability is the ability of an instrument to measure the attributes of a concept or construct consistently. The major tests of reliability are test-retest, parallel or alternate form, split-half, item to total correlation, Kuder-Richardson, Cronbach’s alpha, and interrater reliability.
• The selection of a method for establishing reliability or validity depends on the characteristics of the instrument, the testing method that is used for collecting data from the sample, and the kinds of data that are obtained.
• Critical appraisal of instrument reliability and validity in a research report focuses on internal and external validity as sources of bias that contribute to the strength and quality of evidence provided by the findings.
Critical thinking challenges
• Discuss the types of validity that must be established before you invest a high level of confidence in the measurement instruments used in a research study.
• What are the major tests of reliability? Why is it important to establish the appropriate type of reliability for a measurement instrument?
• A journal club just finished reading the research report by Thomas and colleagues in Appendix A. As part of their critical appraisal of this study, they needed to identify the strengths and weaknesses of the reliability and validity section of this research report. If you were a member of this journal club, how would you assess the reliability and validity of the instruments used in this study?
• How does the strength and quality of evidence related to reliability and validity influence the applicability of findings to clinical practice?
• When your QI Team finds that a researcher does not report reliability or validity data, which threats to internal and/or external validity should your team consider? In your judgment, how would these threats affect your evaluation of the strength and quality of evidence provided by the study and your team’s confidence in applying the findings to practice?
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
References
1. Abuadas M. H, Petro-Nustas W., Albikawi Z. F, Nabolski M. Transcultural adaptation and validation of Champion's health belief model scales for prostate cancer screening. Journal of Nursing Measurement 2016;24(2):296-313.
2. Ameringer S., Elswick R. K, Jr., Menzies V., et al. Psychometric evaluation of the patient-reported outcomes measurement information system fatigue-short form across diverse populations. Nursing Research 2016;65(4):279-289.
3. Anastasi A., Urbina S. Psychological testing. 7th ed. New York, NY: Macmillan; 1997.
4. Bakas T., Austin J. K, Habermann B., et al. Telephone assessment and skill-building kit for stroke caregivers: A randomized controlled clinical trial. Stroke 2015;46:3478-3487.
5. Baldacchino D. R, Bulhagiar A. Psychometric evaluation of the Spiritual Coping Strategies scale in English, Maltese, back-translation and bilingual versions. Journal of Advanced Nursing 2003;42:558-570.
6. Barahmand U., Ahmad R. H. S. Psychotic-like experiences and psychological distress: The role of resilience. Journal of the American Psychiatric Nurses Association 2016;22(4):312-319.
7. Bhandari P., Kim M. Self-care behaviors of Nepalese adults with type 2 diabetes: A mixed methods analysis. Nursing Research 2016;65(3):202-214.
8. Bijl J. V, Poelgeest-Eeltink A. V, Shortridge-Baggett L. The psychometric properties of the diabetes management self-efficacy scale for patients with type 2 diabetes mellitus. Journal of Advanced Nursing 1999;30:352-359.
9. Champion V. L. Instrument refinement for breast cancer screening behaviors. Nursing Research 1993;42(3):139-143.
10. Clark A., Breitenstein S., Martsolf D. S, Winstanley E. L. Assessing fidelity of a community-based opioid overdose prevention program: Modification of the fidelity checklist. Journal of Nursing Scholarship 2016;48(4):377-378.
11. DeVon H. A, Block M. E, Moyle-Wright P., et al. A psychometric toolbox for testing validity and reliability. Journal of Nursing Scholarship 2007;39(2):155-164.
12. DeVellis R. F. Scale development: Theory and applications. Los Angeles, CA: Sage Publications; 2012.
13. Furr M. R, Bacharach V. R. Psychometrics: An introduction. Los Angeles, CA: Sage Publications; 2008.
14. Ganz F. D, Toren O., Fadion F. Factors associated with full implementation of scope of practice. Journal of Nursing Scholarship 2016;48(3):285-293.
15. Hart P. L, Spiva L. A, Mareno N. Clinical deterioration leadership ability scale: A psychometric study. Journal of Nursing Measurement 2016;24(2):314-322.
16. Hawthorne D. M, Youngblut J. M, Brooten D. Parent spirituality, grief, and mental health at 1 and 3 months after their infant's/child's death in an intensive care unit. Journal of Pediatric Nursing 2016;31:73-80.
17. Johansson I., Jansson H., Lindmark U. Oral health status of older adults in Sweden receiving elder care: Findings from nursing assessments. Nursing Research 2016;65(3):215-223.
18. Johannesen J. K, Lurie J. B, Fiszdon J. M, Bell M. D. The social attribution task-multiple choice (SAT-MC): A psychometric and equivalence study of an alternate form. ISRN Psychiatry 2013;2013:1-9.
19. Khoshouei M. S. Psychometric evaluation of the Connor-Davidson resilience scale (CD-RISC) using Iranian students. International Journal of Testing 2009;9(1):60-66.
20. Lee C. E, Von Ah D., Szuck B., Lau Y. J. Determinants of physical activity maintenance in breast cancer survivors after a community-based intervention. Oncology Nursing Forum 2016;43(1):93-102.
21. Lynn M. R. Determination and quantification of content validity. Nursing Research 1986;35:382-385.
22. McDowell I., Newell C. Measuring health: A guide to rating scales and questionnaires. New York, NY: Oxford University Press; 1996.
23. Nunnally J. C, Bernstein I. H. Psychometric theory. 3rd ed. New York, NY: McGraw-Hill; 1993.
24. Robbins L. B, Sikorski A., Hamel L. M, et al. Gender comparisons of perceived benefits of and barriers to physical activity in middle school youth. Research in Nursing and Health 2009;32:163-176.
25. Robbins L. B, Sikorski A., Morely B. Psychometric assessment of the adolescent physical activity perceived benefits and barriers scale. Journal of Nursing Measurement 2008;16:98-112.
26. Tedeschi R. G, Calhoun L. G. The post-traumatic growth inventory: Measuring the positive legacy of trauma. Journal of Traumatic Stress 1996;9:455-471.
27. Turner-Sack A. M, Menna R., Setchell S. R, et al. Psychological functioning, post-traumatic growth, and coping in parents and siblings of adolescent cancer survivors. Oncology Nursing Forum 2016;43(1):48-56.
28. Vermeesch A. L, Ling J., Voskull V. R, et al. Biological and sociocultural differences in perceived barriers to physical activity among fifth- to seventh-grade urban girls. Nursing Research 2015;64(5):342-350.
29. Yaskowich E. Posttraumatic growth in children and adolescents with cancer. Digital Dissertation 2003;63:3948.
CHAPTER 16
Data analysis: Descriptive and inferential statistics
Susan Sullivan-Bolyai, Carol Bova
Learning outcomes
After reading this chapter, you should be able to do the following:
• Differentiate between descriptive and inferential statistics.
• State the purposes of descriptive statistics.
• Identify the levels of measurement in a study.
• Describe a frequency distribution.
• List measures of central tendency and their use.
• List measures of variability and their use.
• State the purpose of inferential statistics.
• Explain the concept of probability as it applies to the analysis of sample data.
• Distinguish between a type I and a type II error and their effects on a study's outcome.
• Distinguish between parametric and nonparametric tests.
• List some commonly used statistical tests and their purposes.
• Critically appraise the statistics used in published research studies.
• Evaluate the strength and quality of the evidence provided by the findings of a research study and determine applicability to practice.
KEY TERMS
analysis of covariance
analysis of variance
categorical variable
chi-square (χ²)
continuous variable
correlation
degrees of freedom
descriptive statistics
dichotomous variable
factor analysis
Fisher exact probability test
frequency distribution
inferential statistics
interval measurement
levels of measurement
level of significance (alpha level)
mean
measures of central tendency
measures of variability
median
measurement
modality
mode
multiple analysis of variance
multiple regression
multivariate statistics
nominal measurement
nonparametric statistics
normal curve
null hypothesis
ordinal measurement
parameter
parametric statistics
Pearson correlation coefficient (Pearson r; Pearson product moment correlation coefficient)
percentile
probability
range
ratio measurement
sampling error
scientific hypothesis
standard deviation
statistic
t statistic
type I error
type II error
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
It is important to understand the principles underlying statistical methods used in quantitative research. This understanding allows you to critically analyze the results of research that may be useful in practice. Researchers link the statistical analyses they choose with the type of research question, design, and level of data collected.
As you read a research article, you will find a discussion of the statistical procedures used in both the methods and results sections. In the methods section, you will find the planned statistical analyses. In the results section, you will find the data generated from testing the hypotheses or research questions. The data are analyzed using both descriptive and inferential statistics.
Procedures that allow researchers to describe and summarize data are known as descriptive statistics. Descriptive statistics include measures of central tendency, such as mean, median, and mode; measures of variability, such as range and standard deviation (SD); and some correlation techniques, such as scatter plots. For example, Nyamathi and colleagues (2015; Appendix A) used descriptive statistics to inform the reader about the 345 subjects who were eligible for the HAV/HBV vaccine in their study (51% African American, 31% Latino, 59% not married, with a mean education of 11.6 years).
Statistical procedures that allow researchers to estimate how reliably they can make predictions and generalize findings based on the data are known as inferential statistics. Inferential statistics are used to analyze the data collected, test hypotheses, and answer the research questions in a research study. With inferential statistics, the researcher is trying to draw conclusions that extend beyond the study’s data.
This chapter describes how researchers use descriptive and inferential statistics in studies. This will help you determine the appropriateness of the statistics used and to interpret the strength and quality of the reported findings, as well as the clinical significance and applicability of the results for your evidence-based practice.
Levels of measurement
Measurement is the process of assigning numbers to variables or events according to rules. Every variable in a study that is assigned a specific number must be similar to every other variable assigned that number. The measurement level is determined by the nature of the object or event being measured. Understanding the levels of measurement is an important first step when you evaluate the statistical analyses used in a study. There are four levels of measurement: nominal, ordinal, interval, and ratio (Table 16.1). The level of measurement of each variable determines the statistic that can be used to answer a research question or test a hypothesis. The higher the level of measurement, the greater the flexibility the researcher has in choosing statistical procedures. The following Critical Thinking Decision Path illustrates the relationship between levels of measurement and the appropriate use of descriptive statistics.
TABLE 16.1 Level of Measurement Summary Table
CRITICAL THINKING DECISION PATH Descriptive Statistics
Nominal measurement is used to classify variables or events into categories. The categories are mutually exclusive; the variable or event either has or does not have the characteristic. The numbers assigned to each category are only labels; such numbers do not indicate more or less of a characteristic. Nominal-level measurement is used to categorize a sample on such information as gender, marital status, or religious affiliation. For example, Hawthorne and colleagues (2016; Appendix B) measured race using a nominal level of measurement. Nominal-level measurement is the lowest level and allows for the least amount of statistical manipulation. When using nominal-level variables, the frequency and percent are typically calculated. For example, Hawthorne and colleagues (2016) found that among their sample of mothers, 44% were black, non-Hispanic; 37% Hispanic; and 19% white, non-Hispanic.
A variable at the nominal level can also be categorized as either a dichotomous or a categorical variable. A dichotomous (nominal) variable is one that has only two true values, such as true/false or yes/no. For example, in the Turner-Sack and colleagues (2016; Appendix D) study, the variable gender (male/female) is dichotomous because it has only two possible values. On the other hand, nominal variables that are categorical still have mutually exclusive categories but have more than two true values, such as religion in the Hawthorne and colleagues study (Protestant, Catholic, Jewish, other, none).
Ordinal measurement is used to show relative rankings of variables or events. The numbers assigned to each category can be compared, and a member of a higher category can be said to have more of an attribute than a person in a lower category. The intervals between numbers on the scale are not necessarily equal, and there is no absolute zero. For example, ordinal measurement is used to formulate class rankings, where one student can be ranked higher or lower than another. However, the difference in actual grade point average between students may differ widely. Another example is ranking individuals by their level of wellness or by their ability to carry out activities of daily living. Hawthorne and colleagues used an ordinal variable to measure the total family annual income of families in their study and found that 37% (n = 34) of the sample had household incomes greater or equal to $50,000. Ordinal-level data are limited in the amount of mathematical manipulation possible. Frequencies, percentages, medians, percentiles, and rank order coefficients of correlation can be calculated for ordinal-level data.
Interval measurement shows rankings of events or variables on a scale with equal intervals between the numbers. The zero point remains arbitrary and not absolute. For example, interval measurements are used in measuring temperatures on the Fahrenheit scale. The distances between degrees are equal, but the zero point is arbitrary and does not represent the absence of temperature. Test scores also represent interval data. The differences between test scores represent equal intervals, but a zero does not represent the total absence of knowledge.
HELPFUL HINT
The term continuous variable is also used to represent a measure that contains a range of values along a continuum and may include ordinal-, interval-, and ratio-level data (Plichta & Kelvin, 2012). An example is heart rate.
In many areas of science, including nursing, the classification of the level of measurement of scales that use Likert-type response options to measure concepts such as quality of life, depression, functional status, or social support is controversial, with some regarding these measurements as ordinal and others as interval. You need to be aware of this controversy and look at each study individually in terms of how the data are analyzed. Interval-level data allow more manipulation of data, including the addition and subtraction of numbers and the calculation of means. This additional manipulation is why many argue for classifying behavioral scale data as interval level. For example, Turner-Sack and colleagues (2016) used the Brief Symptom Inventory (BSI) to evaluate psychological distress of adolescent cancer survivors and siblings. The BSI has 53 items and uses a five-point Likert scale from 0 (not at all) to 4 (extremely), with higher scores indicating greater psychological distress. They reported the mean BSI score as 47.31 for cancer survivors and 48.94 for siblings.
Ratio measurement shows rankings of events or variables on scales with equal intervals and absolute zeros. The number represents the actual amount of the property the object possesses. Ratio measurement is the highest level of measurement, but it is most often used in the physical sciences. Examples of ratio-level data that are commonly used in nursing research are height, weight, pulse, and blood pressure. All mathematical procedures can be performed on data from ratio scales. Therefore, the use of any statistical procedure is possible as long as it is appropriate to the design of the study.
HELPFUL HINT
Descriptive statistics assist in summarizing data. The descriptive statistics calculated must be appropriate to the purpose of the study and the level of measurement.
Descriptive statistics
Frequency distribution
One way of organizing descriptive data is by using a frequency distribution. In a frequency distribution the number of times each event occurs is counted. The data can also be grouped and the frequency of each group reported. Table 16.2 shows the results of an examination given to a class of 51 students. The results of the examination are reported in several ways. The columns on the left give the raw data tally and the frequency for each grade, and the columns on the right give the grouped data tally and grouped frequencies.
TABLE 16.2 Frequency Distribution
Mean, 73.1; standard deviation, 12.1; median, 74; mode, 72; range, 36 (54–90).
When data are grouped, it is necessary to define the size of the group or the interval width so that no score will fall into two groups and each group will be mutually exclusive. The grouping of the data in Table 16.2 prevents overlap; each score falls into only one group. The grouping should allow for a precise presentation of the data without a serious loss of information.
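To see how this works in practice, the following minimal sketch (in Python; the exam scores are hypothetical, not taken from Table 16.2) tallies a raw frequency distribution and then groups the scores into mutually exclusive intervals of width 10.

```python
from collections import Counter

# Hypothetical exam scores (assumption for illustration)
scores = [54, 60, 62, 66, 68, 70, 72, 72, 72, 74, 75, 78, 80, 84, 90]

# Raw frequency distribution: count how many times each score occurs
raw_freq = Counter(scores)
print(sorted(raw_freq.items()))

# Grouped frequency distribution: intervals of width 10 are mutually
# exclusive, so no score can fall into two groups
grouped = Counter((score // 10) * 10 for score in scores)
for lower in sorted(grouped):
    print(f"{lower}-{lower + 9}: {grouped[lower]}")
```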
Information about frequency distributions may be presented in the form of a table, such as Table 16.2, or in graphic form. Fig. 16.1 illustrates the most common graphic forms: the histogram and the frequency polygon. The two graphic methods are similar in that both plot scores, or percentages of occurrence, against frequency. The greater the number of points plotted, the smoother the resulting graph. The shape of the resulting graph allows for observations that further describe the data.
FIG 16.1 Frequency distributions. A, Histogram. B, Frequency polygon.
Measures of central tendency
Measures of central tendency are used to describe the pattern of responses among a sample. Measures of central tendency include the mean, median, and mode. They yield a single number that describes the middle of the group and summarize the members of a sample. Each measure of central tendency has a specific use and is most appropriate to specific kinds of measurement and types of distributions.
The mean is the arithmetical average of all the scores (add all of the values in a distribution and divide by the total number of values) and is used with interval or ratio data. The mean is the most widely used measure of central tendency. Most statistical tests of significance use the mean. The mean is affected by every score and can change greatly with extreme scores, especially in studies that have a limited sample size. The mean is generally considered the single best point for summarizing data when using interval- or ratio-level data. You can find the mean in research reports by looking for the symbol x̅.
The median is the score where 50% of the scores are above it and 50% of the scores are below it. The median is not sensitive to extremes in high and low scores. It is best used when the data are skewed (see the Normal Distribution section in this chapter) and the researcher is interested in the “typical” score. For example, if age is a variable and there is a wide range with extreme scores that may affect the mean, it would be appropriate to also report the median. The median is easy to find either by inspection or by calculation and can be used with ordinal-, interval-, and ratio-level data.
The mode is the most frequent value in a distribution. The mode is determined by inspection of the frequency distribution (not by mathematical calculation). For example, in Table 16.2 the mode would be a score of 72 because nine students received this score and it represents the score that was attained by the greatest number of students. It is important to note that a sample distribution can have more than one mode. The number of modes contained in a distribution is called the modality of the distribution. It is also possible to have no mode when all scores in a distribution are different. The mode is most often used with nominal data but can be used with all levels of measurement. The mode cannot be used for calculations, and it is unstable; that is, the mode can fluctuate widely from sample to sample from the same population.
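The three measures are simple to compute with software. The sketch below uses Python’s standard library on a hypothetical set of scores; multimode is used because, as noted above, a distribution can have more than one mode.

```python
import statistics

scores = [54, 62, 70, 72, 72, 74, 78, 85, 90]  # hypothetical scores

print(statistics.mean(scores))      # arithmetic average; interval/ratio data
print(statistics.median(scores))    # middle score: 50% above, 50% below
print(statistics.multimode(scores)) # most frequent value(s); there may be several
```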
HELPFUL HINT
Of the three measures of central tendency, the mean is affected by every score and is the most useful. The mean can only be calculated with interval and ratio data.
When you examine a study, the measures of central tendency provide you with important information about the distribution of scores in a sample. If the distribution is symmetrical and unimodal, the mean, median, and mode will coincide. If the distribution is skewed (asymmetrical), the mean will be pulled in the direction of the long tail of the distribution and will differ from the median. With a skewed distribution, all three statistics should be reported.
HELPFUL HINT
Measures of central tendency are descriptive statistics that describe the characteristics of a sample.
Normal distribution
The concept of the normal distribution is based on the observation that data from repeated measures of interval- or ratio-level data group themselves about a midpoint in a distribution in a manner that closely approximates the normal curve illustrated in Fig. 16.2. The normal curve is one that is symmetrical about the mean and is unimodal. The mean, median, and mode are equal. An additional characteristic of the normal curve is that a fixed percentage of the scores fall within a given distance of the mean. As shown in Fig. 16.2, about 68% of the scores or means will fall within 1 SD of the mean, 95% within 2 SD of the mean, and 99.7% within 3 SD of the mean. The presence or absence of a normal distribution is a fundamental issue when examining the appropriate use of inferential statistical procedures.
FIG 16.2 The normal distribution and associated standard deviations.
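The fixed percentages of the normal curve can be checked empirically. The following sketch (an illustration under assumed values, not data from the studies cited here) simulates a large set of normally distributed scores and counts how many fall within 1, 2, and 3 SD of the mean.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
scores = rng.normal(loc=100, scale=15, size=100_000)  # simulated scores

mean, sd = scores.mean(), scores.std()
for k in (1, 2, 3):
    within = np.mean(np.abs(scores - mean) <= k * sd)
    print(f"within {k} SD of the mean: {within:.1%}")  # ~68%, ~95%, ~99.7%
```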
EVIDENCE-BASED PRACTICE TIP
Inspection of descriptive statistics for the sample will indicate whether the sample data are skewed.
Interpreting measures of variability
Variability or dispersion is concerned with the spread of data. Measures of variability answer questions such as: “Is the sample homogeneous (similar) or heterogeneous (different)?” If a researcher measures oral temperatures in two samples, one sample drawn from a healthy population and one sample from a hospitalized population, it is possible that the two samples will have the same mean. However, it is likely that there will be a wider range of temperatures in the hospitalized sample than in the healthy sample. Measures of variability are used to describe these differences in the dispersion of data. As with measures of central tendency, the various measures of variability are appropriate to specific kinds of measurement and types of distributions.
HELPFUL HINT
The descriptive statistics related to variability will enable you to evaluate the homogeneity or heterogeneity of a sample.
The range is the simplest but most unstable measure of variability. Range is the difference between the highest and lowest scores. A change in either of these two scores would change the range. The range should always be reported with other measures of variability. The range in Table 16.2 is 36, but this could easily change with an increase or decrease in the high score of 90 or the low score of 54. Turner-Sack and colleagues (2016; Appendix D) reported the range of BSI scores among their sample of adolescent cancer survivors (range = 25 to 79).
A percentile represents the percentage of cases a given score exceeds. The median is the 50th percentile, and in Table 16.2 it is a score of 74. A score in the 90th percentile is exceeded by only 10% of the scores. The zero percentile and the 100th percentile are usually dropped.
The standard deviation (SD) is the most frequently used measure of variability, and it is based on the concept of the normal curve (see Fig. 16.2). It is a measure of average deviation of the scores from the mean and as such should always be reported with the mean. The SD considers all scores and can be used to interpret individual scores. The SD is used in the calculation of many inferential statistics.
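As a brief illustration, the sketch below computes the range, a percentile, and the sample SD for a hypothetical set of oral temperatures; the data are assumptions for illustration only.

```python
import numpy as np

temps = np.array([97.8, 98.2, 98.6, 98.6, 99.1, 100.4, 101.2])  # hypothetical

print(temps.max() - temps.min())  # range: unstable; one extreme score changes it
print(np.percentile(temps, 50))   # 50th percentile, i.e., the median
print(temps.std(ddof=1))          # sample SD: average deviation from the mean
```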
HELPFUL HINT
Many measures of variability exist. The SD is the most useful because it helps you visualize how the scores disperse around the mean.
Inferential statistics
Inferential statistics allow researchers to test hypotheses about a population using data obtained from probability samples. Statistical inference is generally used for two purposes: to estimate the probability that the statistics in the sample accurately reflect the population parameter and to test hypotheses about a population.
A parameter is a characteristic of a population, whereas a statistic is a characteristic of a sample. We use statistics to estimate
population parameters. Suppose we randomly sample 100 people with chronic lung disease and use an interval-level scale to study their knowledge of the disease. If the mean score for these subjects is 65, the mean represents the sample statistic. If we were able to study every subject with chronic lung disease, we could calculate an average knowledge score, and that score would be the parameter for the population. As you know, a researcher rarely is able to study an entire population, so inferential statistics provide evidence that allows the researcher to make statements about the larger population from studying the sample.
CRITICAL THINKING DECISION PATH Inferential Statistics—Difference Questions
The example given alludes to two important qualifications of how a study must be conducted so that inferential statistics may be used. First, it was stated that the sample was selected using
probability methods (see Chapter 12). Because you are already familiar with the advantages of probability sampling, it should be clear that if we wish to make statements about a population from a sample, that sample must be representative. All procedures for inferential statistics are based on the assumption that the sample was drawn with a known probability. Second, the scale used has to be at either an interval or a ratio level of measurement. This is because the mathematical operations involved in calculating inferential statistics require this higher level of measurement. It should be noted that in studies that use nonprobability methods of sampling, inferential statistics are also used. To compensate for the use of nonprobability sampling methods, researchers use techniques such as sample size estimation using power analysis. The following two Critical Thinking Decision Paths examine inferential statistics and provide matrices that researchers use for statistical decision making.
CRITICAL THINKING DECISION PATH Inferential Statistics—Relationship Questions
Hypothesis testing
Inferential statistics are used for hypothesis testing. Statistical hypothesis testing allows researchers to make objective decisions about the data from their study. The use of statistical hypothesis testing answers questions such as the following: “How much of this effect is the result of chance?” “How strongly are these two variables associated with each other?” “What is the effect of the intervention?”
HIGHLIGHT
Members of your interprofessional team may have diverse data analysis preparation. Capitalizing on everybody’s background, try to figure out whether the statistical tests chosen for the studies your team is critically appraising are appropriate for the design, type of data collection, and level of measurement.
The procedures used when making inferences are based on principles of negative inference. In other words, if a researcher studied the effect of a new educational program for patients with chronic lung disease, the researcher would actually have two hypotheses: the scientific hypothesis and the null hypothesis. The research or scientific hypothesis is that which the researcher believes will be the outcome of the study. In our example, the scientific hypothesis would be that the educational intervention would have a marked effect on the outcome in the experimental group beyond that in the control group. The null hypothesis, which is the hypothesis that actually can be tested by statistical methods, would state that there is no difference between the groups. Inferential statistics use the null hypothesis to test the validity of a scientific hypothesis. The null hypothesis states that there is no relationship between the variables and that any observed relationship or difference is merely a function of chance.
HELPFUL HINT
Most samples used in clinical research are samples of convenience, but often researchers use inferential statistics. Although such use violates one of the assumptions of such tests, the tests are robust
enough to not seriously affect the results unless the data are skewed in unknown ways.
Probability
Probability theory underlies all of the procedures discussed in this chapter. The probability of an event is its long-run relative frequency (0% to 100%) in repeated trials under similar conditions. In other words, what are the chances of obtaining the same result from a study that can be carried out many times under identical conditions? It is the notion of repeated trials that allows researchers to use probability to test hypotheses.
Statistical probability is based on the concept of sampling error. Remember that the use of inferential statistics is based on random sampling. However, even when samples are randomly selected, there is always the possibility of some error in sampling. Therefore, the characteristics of any given sample may be different from those of the entire population. The tendency for statistics to fluctuate from one sample to another is known as sampling error.
EVIDENCE-BASED PRACTICE TIP
The strength and quality of evidence are enhanced by repeated trials that have consistent findings, thereby increasing generalizability of the findings and applicability to clinical practice.
Type I and type II errors
Statistical inference is always based on incomplete information about a population, and it is possible for errors to occur. There are two types of errors in statistical inference—type I and type II errors. A type I error occurs when a researcher rejects a null hypothesis when it is actually true (i.e., accepts the premise that there is a difference when actually there is no difference between groups). A type II error occurs when a researcher accepts a null hypothesis that is actually false (i.e., accepts the premise that there is no difference between the groups when a difference actually exists). The relationship of the two types of errors is shown in Fig. 16.3.
FIG 16.3 Outcome of statistical decision making.
When critiquing a study to see if there is a possibility of a type I error having occurred (rejecting the null hypothesis when it is actually true), one should consider the reliability and validity of the instruments used. For example, if the instruments did not accurately measure the intervention variables, one could conclude that the intervention made a difference when in reality it did not. It is critical to consider the reliability and validity of all the measurement instruments reported (see Chapter 15). For example, Turner-Sack and colleagues (2016) reported the reliability of the BSI in their sample and found it was reliable, as evidenced by a Cronbach’s alpha of 0.97 for survivors and siblings and 0.98 for parents (refer to Chapter 15 to review scale reliability). This gives the reader greater confidence in the study’s results.
In a practice discipline, type I errors usually are considered more serious because if a researcher declares that differences exist where none are present, the potential exists for patient care to be affected adversely. Type II errors (accepting the null hypothesis when it is false) often occur when the sample is too small, thereby limiting the opportunity to measure the treatment effect, the true difference between two groups. A larger sample size improves the ability to detect the treatment effect—that is, the difference between two groups. If no significant difference is found between two groups with a large sample, it provides stronger evidence (than with a small sample) not to reject the null hypothesis.
Level of significance
The researcher does not know when an error in statistical decision making has occurred. It is possible to know only that the null hypothesis is indeed true or false if data from the total population are available. However, the researcher can control the risk of
making type I errors by setting the level of significance before the study begins (a priori).
The level of significance (alpha level) is the probability of making a type I error, the probability of rejecting a true null hypothesis. The minimum level of significance acceptable for most research is .05. If the researcher sets alpha, or the level of significance, at .05, the researcher is willing to accept the fact that if the study were done 100 times, the decision to reject the null hypothesis would be wrong 5 times out of those 100 trials. As is sometimes the case, if the researcher wants to have a smaller risk of rejecting a true null hypothesis, the level of significance may be set at .01. In this case the researcher is willing to be wrong only once in 100 trials.
The decision as to how strictly the alpha level should be set depends on how important it is to avoid errors. For example, if the results of a study are to be used to determine whether a great deal of money should be spent in an area of patient care, the researcher may decide that the accuracy of the results is so important that an alpha level of .01 is needed. In most studies, however, alpha is set at .05.
Perhaps you are thinking that researchers should always use the lowest alpha level possible to keep the risk of both types of errors at a minimum. Unfortunately, decreasing the risk of making a type I error increases the risk of making a type II error. Therefore the researcher always has to accept more of a risk of one type of error when setting the alpha level.
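The meaning of the alpha level can also be demonstrated by simulation. In the sketch below, both groups are drawn from the same population, so the null hypothesis is true by construction; with alpha set at .05, roughly 5% of the tests reject it purely by chance. The data and parameters are assumptions for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)
alpha, rejections, trials = 0.05, 0, 1_000

for _ in range(trials):
    group_a = rng.normal(50, 10, size=30)  # both groups come from the same
    group_b = rng.normal(50, 10, size=30)  # population: any "difference" is
    _, p = stats.ttest_ind(group_a, group_b)  # only sampling error
    if p < alpha:
        rejections += 1

print(rejections / trials)  # close to 0.05, the type I error rate
```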
HELPFUL HINT
Decreasing the alpha level acceptable for a study increases the chance that a type II error will occur. When a researcher is doing many statistical tests, the probability of some of the tests being significant increases as the number of tests increases. Therefore, when a number of tests are being conducted, the researcher may decrease the alpha level to .01.
Clinical and statistical significance
It is important for you to realize that there is a difference between statistical significance and clinical significance. When a researcher
tests a hypothesis and finds that it is statistically significant, it means that the finding is unlikely to have happened by chance. For example, if a study was designed to test an intervention to help a large sample of patients lose weight, and the researchers found that a change in weight of 1.02 pounds was statistically significant, one might find this questionable because few would say that a change in weight of just over 1 pound would represent a clinically significant difference. Therefore as a consumer of research it is important for you to evaluate the clinical significance as well as the statistical significance of findings.
Some people believe that if findings are not statistically significant, they have no practical value. However, knowing that something does not work is important information to share with the scientific community. Nonsupported hypotheses provide as much information about the intervention as supported hypotheses. Nonsignificant results (sometimes called negative findings) force the researcher to return to the literature and consider alternative explanations for why the intervention did not work as planned.
EVIDENCE-BASED PRACTICE TIP
You will study the results to determine whether the new treatment is effective, the size of the effect, and whether the effect is clinically important.
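One common way to judge whether an effect is large enough to matter clinically is an effect-size index such as Cohen's d, sketched below with hypothetical weight-change data; this is an illustrative technique, not a method reported by the studies in the appendices.

```python
import numpy as np

treatment = np.array([1.5, 0.8, 1.2, 1.0, 0.9, 1.3])  # hypothetical pounds lost
control = np.array([0.1, 0.0, 0.3, 0.2, -0.1, 0.2])

# Cohen's d: mean difference scaled by the pooled SD (equal group sizes here)
pooled_sd = np.sqrt((treatment.var(ddof=1) + control.var(ddof=1)) / 2)
cohens_d = (treatment.mean() - control.mean()) / pooled_sd
print(cohens_d)  # a significant p value with a tiny d may not matter clinically
```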
Parametric and nonparametric statistics
Tests of significance may be parametric or nonparametric. Parametric statistics have the following attributes:
1. Involve the estimation of at least one population parameter
2. Require measurement on at least an interval scale
3. Involve certain assumptions about the variables being studied
One important assumption is that the variable is normally distributed in the overall population.
In contrast to parametric tests, nonparametric statistics are not based on the estimation of population parameters, so they involve less restrictive assumptions about the underlying distribution.
Nonparametric tests usually are applied when the variables have been measured on a nominal or ordinal scale, or when the distribution of scores is severely skewed.
HELPFUL HINT
Just because a researcher has used nonparametric statistics does not mean that the study is not useful. The use of nonparametric statistics is appropriate when measurements are not made at the interval level or the variable under study is not normally distributed.
There has been some debate about the relative merits of the two types of statistical tests. The moderate position taken by most researchers and statisticians is that nonparametric statistics are best used when data are not at the interval level of measurement, when the sample is small, and data do not approximate a normal distribution. However, most researchers prefer to use parametric statistics whenever possible (as long as data meet the assumptions) because they are more powerful and more flexible than nonparametric statistics.
Tables 16.3 and 16.4 list the commonly used inferential statistics. The test used depends on the level of the measurement of the variables in question and the type of hypothesis being studied. These statistics test two types of hypotheses: that there is a difference between groups (see Table 16.3) or that there is a relationship between two or more variables (see Table 16.4).
TABLE 16.3 Tests of Differences Between Means
ANOVA, Analysis of variance; ANCOVA, analysis of covariance; MANOVA, multiple analysis of variance.
TABLE 16.4 Tests of Association
EVIDENCE-BASED PRACTICE TIP
Try to discern whether the test chosen for analyzing the data was chosen because it gave a significant p value. A statistical test should be chosen on the basis of its appropriateness for the type of data collected, not because it gives the answer that the researcher hoped to obtain.
Tests of difference
The type of test used for any particular study depends primarily on whether the researcher is examining differences in one, two, or three or more groups and whether the data to be analyzed are nominal, ordinal, or interval (see Table 16.3). Suppose a researcher
has conducted an experimental study (see Chapter 9). What the researcher hopes to determine is that the two randomly assigned groups are different after the introduction of the experimental treatment. If the measurements taken are at the interval level, the researcher would use the t test to analyze the data. If the t statistic was so high that it was unlikely to have occurred by chance, the researcher would reject the null hypothesis and conclude that the two groups were indeed more different than would have been expected on the basis of chance alone. In other words, the researcher would conclude that the experimental treatment had the desired effect.
EVIDENCE-BASED PRACTICE TIP
Tests of difference are most commonly used in experimental and quasi-experimental designs that provide Level II and Level III evidence.
The t statistic tests whether two group means are different. Thus this statistic is used when the researcher has two groups, and the question is whether the mean scores on some measure are more different than would be expected by chance. To use this test, the dependent variable (DV) must have been measured at the interval or ratio level, and the two groups must be independent. By independent we mean that nothing in one group helps determine who is in the other group. If the groups are related, as when samples are matched, and the researcher also wants to determine differences between the two groups, a paired or correlated t test would be used. The degrees of freedom (the freedom of a score’s value to vary given what is known about the other scores and the sum of the scores; often df = N − 1) are reported with the t statistic and the probability value (p). Degrees of freedom is usually abbreviated as df.
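A minimal sketch of both forms of the t test follows, using hypothetical scores; as described above, the choice between the independent and the paired version depends on whether the two groups are independent or related.

```python
from scipy import stats

# Independent t test: nothing about one group determines who is in the other
experimental = [68, 72, 75, 70, 74, 71]  # hypothetical interval-level scores
control = [65, 66, 70, 64, 69, 67]
t, p = stats.ttest_ind(experimental, control)
print(f"t = {t:.2f}, p = {p:.3f}")

# Paired (correlated) t test: the same or matched subjects measured twice
before = [120, 135, 128, 140, 132]
after = [115, 130, 126, 133, 129]
t, p = stats.ttest_rel(before, after)
print(f"t = {t:.2f}, p = {p:.3f}")  # df = N - 1 pairs
```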
The t statistic illustrates one of the major purposes of research in nursing—to demonstrate that there are differences between groups. Groups may be naturally occurring collections, such as gender, or they may be experimentally created, such as the treatment and control groups. Sometimes a researcher has more than two groups, or measurements are taken more than once, and then analysis of variance (ANOVA) is used. ANOVA is similar to the t test. Like the
t statistic, ANOVA tests whether group means differ, but rather than testing each pair of means separately, ANOVA considers the variation between groups and within groups.
HELPFUL HINT
A research report may not always contain the test that was done. You can find this information by looking at the tables. For example, a table with t statistics will contain a column for “t” values, and an ANOVA table will contain “F” values.
Analysis of covariance (ANCOVA) is used to measure differences among group means, but it also uses a statistical technique to equate the groups under study on an important variable. Another expansion of the notion of ANOVA is multiple analysis of variance (MANOVA), which also is used to determine differences in group means, but it is used when there is more than one DV.
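For the simplest case, a one-way ANOVA comparing three or more independent groups, a minimal sketch with hypothetical data looks like this (ANCOVA and MANOVA require fuller model-fitting tools and are not shown):

```python
from scipy import stats

group1 = [23, 25, 28, 22, 26]  # hypothetical scores for three groups
group2 = [30, 33, 29, 31, 34]
group3 = [24, 27, 26, 25, 28]

# One-way ANOVA compares variation between groups with variation within groups
f, p = stats.f_oneway(group1, group2, group3)
print(f"F = {f:.2f}, p = {p:.3f}")
```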
Nonparametric statistics
When data are at the nominal level and the researcher wants to determine whether groups are different, the researcher uses the chi-square (χ2). Chi-square is a nonparametric statistic used to determine whether the frequency in each category is different from what would be expected by chance. As with the t test and ANOVA, if the calculated chi-square is high enough, the researcher would conclude that the frequencies found would not be expected on the basis of chance alone, and the null hypothesis would be rejected. Although this test is quite robust and can be used in many different situations, it cannot be used to compare frequencies when samples are small and expected frequencies are less than six in each cell. In these instances the Fisher exact probability test is used.
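The sketch below illustrates both tests on hypothetical 2 × 2 tables of nominal counts, reserving the Fisher exact test for the table whose expected cell frequencies are small.

```python
from scipy import stats

# Rows: treatment/control; columns: improved/not improved (hypothetical counts)
observed = [[30, 10],
            [18, 22]]
chi2, p, df, expected = stats.chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {df}, p = {p:.3f}")

# With small expected cell frequencies, use the Fisher exact probability test
small = [[2, 5],
         [6, 1]]
odds_ratio, p = stats.fisher_exact(small)
print(f"Fisher exact p = {p:.3f}")
```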
When the data are ranks, or are at the ordinal level, researchers have several other nonparametric tests at their disposal. These include the Kolmogorov-Smirnov test, the sign test, the Wilcoxon matched pairs test, the signed rank test for related groups, the median test, and the Mann-Whitney U test for independent groups. Explanation of these tests is beyond the scope of this chapter; those readers who desire further information should consult a general statistics book.
HELPFUL HINT
Chi-square is the test of difference commonly used for nominal level demographic variables such as gender, marital status, religion, ethnicity, and others.
Tests of relationships
Researchers often are interested in exploring the relationship between two or more variables. Such studies use statistics that determine the correlation, or the degree of association, between two or more variables. Tests of the relationships between variables are sometimes considered to be descriptive statistics when they are used to describe the magnitude and direction of a relationship of two variables in a sample and the researcher does not wish to make statements about the larger population. Such statistics also can be inferential when they are used to test hypotheses about the correlations that exist in the target population.
EVIDENCE-BASED PRACTICE TIP
You will often note that in the results or findings section of a research study, parametric (e.g., t tests, ANOVA) and nonparametric (e.g., chi-square, Fisher exact probability test) measures will be used to test differences among variables depending on their level of measurement. For example, chi-square may be used to test differences among nominal level demographic variables, t tests will be used to test the hypotheses or research questions about differences between two groups, and ANOVA will be used to test differences among groups when there are multiple comparisons.
Null hypothesis tests of the relationships between variables assume that there is no relationship between the variables. Thus when a researcher rejects this type of null hypothesis, the conclusion is that the variables are in fact related. Suppose a researcher is interested in the relationship between the age of patients and the length of time it takes them to recover from surgery. As with other statistics discussed, the researcher would design a study to collect the appropriate data and then analyze the data using measures of association. In this example, age and length
of time until recovery would be considered interval-level measurements. The researcher would use a test called the Pearson correlation coefficient, Pearson r, or Pearson product moment correlation coefficient. Once the Pearson r is calculated, the researcher consults the distribution for this test to determine whether the value obtained is likely to have occurred by chance. Again, the researcher reports both the value of the correlation and its probability of occurring by chance.
Correlation coefficients can range in value from −1.0 to +1.0 and also can be zero. A zero coefficient means that there is no relationship between the variables. A perfect positive correlation is indicated by a +1.0 coefficient, and a perfect negative correlation by a −1.0 coefficient. We can illustrate the meaning of these coefficients by using the example from the previous paragraph. If there were no relationship between the age of the patient and the time required for the patient to recover from surgery, the researcher would find a correlation of zero. However, if the correlation was +1.0, it would mean that the older the patient, the longer the recovery time. A negative coefficient would imply that the younger the patient, the longer the recovery time.
Of course, relationships are rarely perfect. The magnitude of the relationship is indicated by how close the correlation comes to the absolute value of 1. Thus a correlation of −.76 is just as strong as a correlation of +.76, but the direction of the relationship is opposite. In addition, a correlation of .76 is stronger than a correlation of .32. When a researcher tests hypotheses about the relationships between two variables, the test considers whether the magnitude of the correlation is large enough not to have occurred by chance. This is the meaning of the probability value or the p value reported with correlation coefficients. As with other statistical tests of significance, the larger the sample, the greater the likelihood of finding a significant correlation. Therefore researchers also report the df associated with the test performed.
Nominal and ordinal data also can be tested for relationships by nonparametric statistics. When two variables being tested have only two levels (e.g., male/female; yes/no), the phi coefficient can be used to test relationships. When the researcher is interested in the relationship between a nominal variable and an interval variable,
the point-biserial correlation is used. Spearman rho is used to determine the degree of association between two sets of ranks, as is Kendall’s tau. All of these correlation coefficients may range in value from −1.0 to +1.0.
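A minimal sketch of these coefficients follows, using hypothetical age and recovery-time data that echo the surgery example above.

```python
from scipy import stats

age = [25, 34, 47, 52, 61, 68, 73]            # hypothetical data
recovery_days = [4, 5, 6, 8, 9, 11, 12]

r, p = stats.pearsonr(age, recovery_days)      # interval/ratio data
print(f"Pearson r = {r:.2f}, p = {p:.3f}")     # positive: older age, longer recovery

rho, p = stats.spearmanr(age, recovery_days)   # ranks
tau, p = stats.kendalltau(age, recovery_days)  # ranks
print(f"Spearman rho = {rho:.2f}, Kendall tau = {tau:.2f}")
```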
EVIDENCE-BASED PRACTICE TIP
Tests of relationship are usually associated with nonexperimental designs that provide Level IV evidence. Establishing a strong statistically significant relationship between variables often lends support for replicating the study to increase the consistency of the findings and provide a foundation for developing an intervention study.
Advanced statistics
Nurse researchers are often interested in health problems that are very complex and require that we analyze many different variables at once using advanced statistical procedures called multivariate statistics. Computer software has made the use of multivariate statistics quite accessible to researchers. When researchers are interested in understanding more about a problem than just the relationship between two variables, they often use a technique called multiple regression, which measures the relationship between one interval-level DV and several independent variables (IVs). Multiple regression is the expansion of correlation to include more than two variables, and it is used when the researcher wants to determine what variables contribute to the explanation of the DV and to what degree. For example, a researcher may be interested in determining what factors help women decide to breastfeed their infants. A number of variables, such as the mother’s age, previous experience with breastfeeding, number of other children, and knowledge of the advantages of breastfeeding, might be measured and analyzed to see whether they separately and together predict the duration of breastfeeding. Such a study would require the use of multiple regression.
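As an illustration of the breastfeeding example, the sketch below fits a multiple regression with NumPy's least-squares solver; the predictors and data are hypothetical assumptions, not results from an actual study.

```python
import numpy as np

# Hypothetical predictors: mother's age, number of other children, knowledge score
X = np.array([[24, 0, 7], [31, 1, 9], [28, 2, 5],
              [35, 1, 8], [22, 0, 6], [40, 3, 9]], dtype=float)
y = np.array([12, 26, 8, 20, 10, 30], dtype=float)  # weeks of breastfeeding

# Add an intercept column, then solve for the least-squares coefficients
X = np.column_stack([np.ones(len(X)), X])
coefficients, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coefficients)  # intercept plus one weight per independent variable
```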
Another advanced technique often used in nursing research is factor analysis. There are two types of factor analysis, exploratory and confirmatory factor analysis. Exploratory factor analysis is used to reduce a set of data so that it may be easily described and used. It
is also used in the early phases of instrument development and theory development. Factor analysis is used to determine whether a scale actually measured the concepts that it is intended to measure. Confirmatory factor analysis resembles structural equation modeling and is used in instrument development to examine construct validity and reliability and to compare factor structures across groups (Plichta & Kelvin, 2012).
Many studies use statistical modeling procedures to answer research questions. Causal modeling is used most often when researchers want to test hypotheses and theoretically derived relationships. Path analysis, structural equation modeling (SEM), and linear structural relations analysis (LISREL) are different types of modeling procedures used in nursing research.
Many other statistical techniques are available for nurse researchers. It is beyond the scope of this chapter to review all statistical analyses available. You should consider having several statistical texts available to you as you sort through the evidence reported in studies that are important to your clinical practice (e.g., Field, 2013; Plichta & Kelvin, 2012).
Appraisal for evidence-based practice: Descriptive and inferential statistics
Nurses are challenged to understand the results of studies that use sophisticated statistical procedures. Understanding the principles that guide statistical analysis is the first step in this process. Statistics are used to describe the samples of studies and to test for hypothesized differences or associations in the sample. Knowing the characteristics of the sample of a study allows you to determine whether the results are potentially useful for your patients. For example, if a study sample was primarily white with a mean age of 42 years (SD 2.5), the findings may not be applicable if your patients are mostly elderly and African American. Cultural, demographic, or clinical factors of an elderly population of a different ethnic group may contribute to different results. Thus understanding the descriptive statistics of a study will assist you in determining the applicability of findings to your practice setting.
Statistics are also used to test hypotheses. Inferential statistics
used to analyze data and the associated significance level (p values) indicate the likelihood that the association or difference found in a study is due to chance or to a true difference among groups. The closer the p value is to zero, the less likely the association or difference of a study is due to chance. Thus inferential statistics provide an objective way to determine if the results of the study are likely to be a true representation of reality. However, it is still important for you to judge the clinical significance of the findings. Was there a big enough effect (difference between the experimental and control groups) to warrant changing current practice?
CRITICAL APPRAISAL CRITERIA Descriptive and Inferential Statistics
1. Were appropriate descriptive statistics used?
2. What level of measurement was used to measure each of the major variables?
3. Is the sample size large enough to prevent one extreme score from affecting the summary statistics used?
4. What descriptive statistics are reported?
5. Were these descriptive statistics appropriate to the level of measurement for each variable?
6. Are there appropriate summary statistics for each major variable (e.g., demographic variables) and any other relevant data?
7. Does the hypothesis indicate that the researcher is interested in testing for differences between groups or in testing for relationships? What is the level of significance?
8. Does the level of measurement permit the use of parametric statistics?
9. Is the size of the sample large enough to permit the use of parametric statistics?
10. Has the researcher provided enough information to decide whether the appropriate statistics were used?
11. Are the statistics used appropriate to the hypothesis, the research question, the method, the sample, and the level of measurement?
12. Are the results for each of the research questions or hypotheses presented clearly and appropriately?
13. If tables and graphs are used, do they agree with the text and extend it, or do they merely repeat it?
14. Are the results understandable?
15. Is a distinction made between clinical significance and statistical significance? How is it made?
The systematic review and meta-analysis by Al-Mallah and colleagues (2016; Appendix E) provides an excellent example of how a meta-analysis (the summarization of many studies) can help us understand the mortality and morbidity of patients who are cared for at nurse-led clinics.
EVIDENCE-BASED PRACTICE TIP
A basic understanding of statistics will improve your ability to think about the effect of the IV on the DV and related patient outcomes for your patient population and practice setting.
There are a few steps to follow when critiquing the statistics used in studies (see the Critical Appraisal Criteria box). Before a decision can be made as to whether the statistics that were used make sense, it is important to return to the beginning of the research study and review the purpose of the study. Just as the hypotheses or research questions should flow from the purpose of a study, so should the hypotheses or research questions suggest the type of analysis that will follow. The hypotheses or the research questions should indicate the major variables that are expected to be tested and presented in the “Results” section. Both the summary descriptive
statistics and the results of the inferential testing of each of the variables should be in the “Results” section with appropriate information.
After reviewing the hypotheses or research questions, you should proceed to the “Methods” section. Next, try to determine the level of measurement for each variable. From this information it is possible to determine the measures of central tendency and variability that should be used to summarize the data. For example, you would not expect to see a mean used as a summary statistic for the nominal variable of gender. In all likelihood, gender would be reported as a frequency distribution. However, you would expect to find a mean and SD for a variable measured with a questionnaire. The means and SD should be provided for measurements performed at the interval level. The sample size is another aspect of the “Methods” section that is important to review when evaluating the researcher’s use of descriptive statistics. The sample is usually described using descriptive summary statistics. Remember, the larger the sample, the less chance that one outlying score will affect the summary statistics. It is also important to note whether the researchers indicated that they did a power analysis to estimate the sample size needed to conduct the study.
If tables or graphs are used, they should agree with the information presented in the text. Evaluate whether the tables and graphs are clearly labeled. If the researcher presents grouped frequency data, the groups should be logical and mutually exclusive. The size of the interval in grouped data should not obscure the pattern of the data, nor should it create an artificial pattern. Each table and graph should be referred to in the text, but each should add to the text—not merely repeat it.
The following are some simple steps for reading a table:
1. Look at the title of the table and see if it matches the purpose of the table.
2. Review the column headings and assess whether the headings follow logically from the title.
3. Look at the abbreviations used. Are they clear and easy to understand? Are any nonstandard abbreviations explained?
4. Evaluate whether the statistics contained in the table are appropriate to the level of measurement for each variable.
After evaluating the descriptive statistics, inferential statistics can then be evaluated. The best place to begin appraising the inferential statistical analysis of a research study is with the hypothesis or research question. If the hypothesis or research question indicates that a relationship will be found, you should expect to find tests of correlation. If the study is experimental or quasi-experimental, the hypothesis or research question would indicate that the author is looking for significant differences between the groups studied, and you would expect to find statistical tests of differences between means that test the effect of the intervention. Then as you read the “Methods” section of the paper, again consider what level of measurement the author has used to measure the important variables. If the level of measurement is interval or ratio, the statistics most likely will be parametric statistics. On the other hand, if the variables are measured at the nominal or ordinal level, the statistics used should be nonparametric. Also consider the size of the sample, and remember that samples have to be large enough to permit the assumption of normality. If the sample is quite small (e.g., 5 to 10 subjects), the researcher may have violated the assumptions necessary for inferential statistics to be used. Thus the important question is whether the researcher has provided enough justification to use the statistics presented.
Finally, consider the results as they are presented. There should be enough data presented for each hypothesis or research question studied to determine whether the researcher actually examined each hypothesis or research question. The tables should accurately reflect the procedure performed and be in harmony with the text. For example, the text should not indicate that a test reached statistical significance while the tables indicate that the probability value of the test was above .05. If the researcher has used analyses that are not discussed in this text, you may want to refer to a statistics text to decide whether the analysis was appropriate to the hypothesis or research question and the level of measurement.
There are two other aspects of the data analysis section that you should appraise. The results of the study in the text of the article
should be clear. In addition, the author should attempt to make a distinction between the clinical and statistical significance of the evidence related to the findings. Some results may be statistically significant, but their clinical importance may be doubtful in terms of applicability for a patient population or clinical setting. If this is so, the author should note it. Alternatively, you may find yourself reading a research study that is elegantly presented, but you come away with a “So what?” feeling. From an evidence-based practice perspective, a significant hypothesis or research question should contribute to improving patient care and clinical outcomes. The important question to ask is “What is the strength and quality of the evidence provided by the findings of this study and their applicability to practice?”
Note that the critical analysis of a research paper’s statistical analysis is not done in a vacuum. It is possible to judge the adequacy of the analysis only in relationship to the other important aspects of the paper: the problem, the hypotheses, the research question, the design, the data collection methods, and the sample. Without consideration of these aspects of the research process, the statistics themselves have very little meaning.
Key points
• Descriptive statistics are a means of describing and organizing data gathered in research.
• The four levels of measurement are nominal, ordinal, interval, and ratio. Each has appropriate descriptive techniques associated with it.
• Measures of central tendency describe the average member of a sample. The mode is the most frequent score, the median is the middle score, and the mean is the arithmetical average of the scores. The mean is the most stable and useful of the measures of central tendency and, combined with the standard deviation, forms the basis for many of the inferential statistics.
• The frequency distribution presents data in tabular or graphic form and allows for the calculation or observation of characteristics of the distribution of the data, including skew, symmetry, and modality.
• In nonsymmetrical distributions, the degree and direction of the off-center peak are described in terms of positive or negative skew.
• The range reflects differences between high and low scores.
• The SD is the most stable and useful measure of variability. It is derived from the concept of the normal curve. In the normal curve, sample scores and the means of large numbers of samples group themselves around the midpoint in the distribution, with a fixed percentage of the scores falling within given distances of the mean. This tendency of means to approximate the normal curve is called the sampling distribution of the means.
• Inferential statistics are a tool to test hypotheses about populations from sample data.
• Because the sampling distribution of the means follows a normal curve, researchers are able to estimate the probability that a certain sample will have the same properties as the total population of interest. Sampling distributions provide the basis for all inferential statistics.
• Inferential statistics allow researchers to estimate population parameters and to test hypotheses. The use of these statistics allows researchers to make objective decisions about the outcome of the study. Such decisions are based on the rejection or acceptance of the null hypothesis, which states that there is no relationship between the variables.
• If the null hypothesis is accepted, this result indicates that the findings are likely to have occurred by chance. If the null hypothesis is rejected, the researcher accepts the scientific hypothesis that a relationship exists between the variables that is unlikely to have been found by chance.
• Statistical hypothesis testing is subject to two types of errors: type I and type II.
• A type I error occurs when the researcher rejects a null hypothesis that is actually true.
• A type II error occurs when the researcher accepts a null hypothesis that is actually false.
• The researcher controls the risk of making a type I error by setting the alpha level, or level of significance; however, reducing the risk of a type I error by reducing the level of significance increases the risk of making a type II error.
• The results of statistical tests are reported to be significant or nonsignificant. Statistically significant results are those whose probability of occurring is less than .05 or .01, depending on the level of significance set by the researcher.
• Commonly used parametric and nonparametric statistical tests include those that test for differences between means, such as the t test and ANOVA, and those that test for differences in proportions, such as the chi-square test.
• Tests that examine data for the presence of relationships include the Pearson r, the sign test, the Wilcoxon matched-pairs signed rank test, and multiple regression.
• The most important aspect of critiquing statistical analyses is the relationship of the statistics employed to the problem, design, and method used in the study. Clues to the appropriate statistical test to be used by the researcher should stem from the researcher’s hypotheses. The reader also should determine if all of the hypotheses have been presented in the paper.
• A basic understanding of statistics will improve your ability to think about the level of evidence provided by the study design and findings and their relevance to patient outcomes for your patient population and practice setting.
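To make several of these key points concrete, the following is a minimal sketch in Python; the sample numbers are invented for illustration, and the statistics/scipy calls are simply one common way to compute quantities that packages such as SPSS also report.

import statistics

import numpy as np
from scipy import stats

# Measures of central tendency and variability for a hypothetical sample.
scores = [2, 3, 3, 5, 5, 5, 6, 7, 9]
print("mode =", statistics.mode(scores))            # most frequent score
print("median =", statistics.median(scores))        # middle score
print("mean =", statistics.mean(scores))            # arithmetical average
print("range =", max(scores) - min(scores))         # high score minus low score
print("SD =", round(statistics.stdev(scores), 2))   # sample standard deviation

# Sampling distribution of the means: the means of many samples drawn from
# the same population cluster around the population mean in a normal curve.
rng = np.random.default_rng(0)
sample_means = [rng.normal(100, 15, size=30).mean() for _ in range(1000)]
print("mean of 1,000 sample means =", round(float(np.mean(sample_means)), 1))

# Alpha (the level of significance) sets the accepted risk of a type I error.
control = rng.normal(100, 15, size=30)              # hypothetical group scores
treated = rng.normal(110, 15, size=30)
t, p = stats.ttest_ind(treated, control)
alpha = 0.05
verdict = "statistically significant" if p < alpha else "nonsignificant"
print(f"t = {t:.2f}, P = {p:.3f} ({verdict})")

Note that lowering alpha from .05 to .01 would make a type I error less likely but would make a type II error (missing a real difference) more likely.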
Critical thinking challenges
• When reading a research study, what is the significance of applying findings if a nurse researcher made a type I error in statistical inference?
• What is the relationship between the level of measurement a researcher uses and the choice of statistics used? As you read a research study, identify the statistics, level of measurement, and the associated level of evidence provided by the design.
• When reviewing a study, you may find that the sample size does not seem adequate. Before you reach this final decision, think about how the design type (e.g., pilot study, intervention study), the data collection methods, the number of variables, and the sensitivity of the data collection instruments can affect your decision.
• When your team finishes critically appraising a research study, those team members responsible for the critique report that the findings are not statistically significant. Consider how those findings are or are not applicable to your practice.
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
References
1. Al-Mallah M. H., Farah I., Al-Madani W., et al. The impact of nurse-led clinics on mortality and morbidity of patients with cardiovascular diseases: A systematic review and meta-analysis. Journal of Cardiovascular Nursing 2016;31:89-95. doi:10.1097/JCN.0000000000000224
2. Field A. Discovering statistics using SPSS. 4th ed. Thousand Oaks, CA: Sage; 2013.
3. Hawthorne D. M., Youngblut J. M., Brooten D. Parent spirituality, grief, and mental health at 1 and 3 months after their infant’s/child’s death in an intensive care unit. Journal of Pediatric Nursing 2016;31:73-80. doi:10.1016/j.pedn.2015.07.008
4. Nyamathi A., Salem B. E., Zhang S., et al. Nursing case management, peer coaching, and hepatitis A and B vaccine completion among homeless men recently released on parole. Nursing Research 2015;64:177-189. doi:10.1097/NNR.0000000000000083
5. Plichta S. B., Kelvin E. Munro’s statistical methods for health care research. 6th ed. Philadelphia, PA: Lippincott Williams & Wilkins; 2012.
6. Turner-Sack A. M., Menna R., Setchell S. R., et al. Psychological functioning, posttraumatic growth, and coping in parents and siblings of adolescent cancer survivors. Oncology Nursing Forum 2016;43:48-57. doi:10.1188/16.ONF.48-56
CHAPTER 17
Understanding research findings
Geri LoBiondo-Wood
Learning outcomes
After reading this chapter, you should be able to do the following:
• Discuss the difference between the “Results” and the “Discussion” sections of a research study.
• Determine if findings are objectively discussed.
• Describe how tables and figures are used in a research report.
• List the criteria of a meaningful table.
• Identify the purpose and components of the “Discussion” section.
• Discuss the importance of including generalizability and limitations of a study in the report.
• Determine the purpose of including recommendations in the study report.
• Discuss how the strength, quality, and consistency of evidence provided by the findings are related to a study’s results, limitations, generalizability, and applicability to practice.
KEY TERMS
confidence interval
findings
generalizability
limitations
recommendations
Go to Evolve at http://evolve.elsevier.com/LoBiondo/ for review questions, critiquing exercises, and additional research articles for practice in reviewing and critiquing.
The ultimate goal of nursing research is to develop knowledge that advances evidence-based nursing practice and quality patient care. From a clinical application perspective, analysis, interpretation, discussion, and generalizability of the results become highly important pieces of the research study. After the analysis of the data, the researcher puts the final pieces of the jigsaw puzzle together to view the total picture with a critical eye. This process is analogous to evaluation, the last step in the nursing process. You may view these last sections as an easier step for the investigator, but it is here that a most critical and creative process comes to the forefront. In the final sections of the report, after the statistical procedures have been applied, the researcher relates the findings to the research question, hypotheses, theoretical framework, literature, methods, and analyses; reviews the findings for any potential bias; and makes decisions about the application of the findings to future research and practice.
The final sections of published studies are generally titled “Results” and “Discussion.” Other topics, such as conclusions, limitations of findings, recommendations, and implications for future research and nursing practice, may be addressed separately or included in these sections. The presentation format is a function of the author’s and the journal’s stylistic considerations. The function of these final sections is to integrate all aspects of the research process, as well as to discuss, interpret, and identify the limitations, the threats related to bias, and the generalizability relevant to the investigation, thereby furthering evidence-based practice. The process that both an investigator and you use to assess the results of a study is depicted in the Critical Thinking Decision Path.
The goal of this chapter is to introduce the purpose and content of the final sections of a research study where data are presented, interpreted, discussed, and generalized.
Findings
The findings of a study are the results, conclusions, interpretations, recommendations, and implications for future research and nursing practice, which are addressed by separating the presentation into two major areas: the results and the discussion of the results. The “Results” section focuses on the statistical findings of a study, and the “Discussion” section focuses on the remaining topics. For both sections, the rule applies, as it does to all other sections of a report, that the content must be presented clearly, concisely, and logically.
EVIDENCE-BASED PRACTICE TIP
Evidence-based practice is an active process that requires you to consider how, and if, research findings are applicable to your patient population and practice setting.
Results
The “Results” section of a study is the data-bound section of the report and is where the quantitative data or numbers generated by the descriptive and inferential statistical tests are presented. Other headings that may be used for this section are “Statistical Analyses,” “Data Analysis,” or “Analysis.” The results of the data analysis set the stage for the interpretation or discussion and the limitations sections that follow. The “Results” section should reflect the analysis of each research question and/or hypothesis tested, and the information for each should be presented sequentially. The tests used to analyze the data should be identified; if the exact test used is not explicitly stated, the values obtained should be noted. The researcher does this by providing the numerical values of the statistics and stating the specific test value and probability level achieved (see Chapter 16). Examples ➤ of these statistical results can be found in Table 17.1. The numbers are important, but there is much more to the research process than the numbers; they are one piece of the whole. Chapter 16 conceptually presents the meanings of the numbers found in studies. Whether you have only a superficial understanding of statistics or in-depth knowledge, the results should be clearly stated, and the presence or absence of statistically significant results should be noted.
TABLE 17.1 Examples of Reported Statistical Results

Statistical Test        Examples of Reported Results
Mean                    M = 118.28
Standard deviation      SD = 62.5
Pearson correlation     r = .49, P < .01
Analysis of variance    F = 3.59, df = 2, 48, P < .05
t test                  t = 2.65, P < .01
Chi-square              χ² = 2.52, df = 1, P < .05
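For readers who wonder where numbers like these come from, here is a minimal, hypothetical sketch: the data are invented, and Python’s scipy is used only as a stand-in for whatever statistical package (e.g., SPSS) a study actually reports.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

sample = rng.normal(118, 62, size=60)          # hypothetical scores
print(f"M = {sample.mean():.2f}, SD = {sample.std(ddof=1):.2f}")

x = rng.normal(size=40)                        # hypothetical paired measures
y = 0.5 * x + rng.normal(size=40)
r, p = stats.pearsonr(x, y)                    # Pearson correlation
print(f"r = {r:.2f}, P = {p:.3f}")

g1, g2, g3 = rng.normal(0, 1, size=(3, 17))    # three hypothetical groups
f, p = stats.f_oneway(g1, g2, g3)              # one-way analysis of variance
print(f"F = {f:.2f}, df = 2, {3 * 17 - 3}, P = {p:.3f}")

t, p = stats.ttest_ind(g1, g2)                 # independent-samples t test
print(f"t = {t:.2f}, P = {p:.3f}")

counts = np.array([[20, 30], [35, 15]])        # hypothetical 2 x 2 counts
chi2, p, dof, _ = stats.chi2_contingency(counts)
print(f"chi-square = {chi2:.2f}, df = {dof}, P = {p:.3f}")

Each printed line mirrors a row of Table 17.1; a published report presents only these summary values, never the raw data.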
CRITICAL THINKING DECISION PATH
Assessing Study Results
HELPFUL HINT
In the results section of a research report, the descriptive statistics results are generally presented first; then the inferential results of each hypothesis or research question that was tested are presented.
At times the researchers will begin the “Results” or “Data Analysis” section by identifying the name of the statistical software program they used to analyze the data. This is not a statistical test but a computer program specifically designed to analyze a variety of statistical tests. Example: ➤ Li and colleagues (2016) state that “SPSS version 22.0 software and Mplus7 were used for the statistical analysis” (see Chapter 16). Information on the statistical tests used is presented after this information.
The researcher will present the data for all of the hypotheses tested or research questions asked (e.g., whether the hypotheses or research questions were accepted, rejected, supported, or partially supported). If the data supported the hypotheses or research questions, you may be tempted to assume that the hypotheses or research questions were proven; however, this is not true. It only means that the hypotheses or research questions were supported. The results suggest that the relationships or differences tested, derived from the theoretical framework, were statistically significant and probably logical for that study’s sample. You may think that if a study’s results are not supported statistically or are only partially supported, the study is irrelevant or possibly should not have been published, but this also is not true. If the hypotheses are not supported by the data, you should not expect the researcher to bury the work in a file. It is as important for you, as well as the researcher, to review and understand studies whose hypotheses or research questions are not supported by the findings. Information obtained from these studies is often as useful as data obtained from studies with supported hypotheses and research questions.
Studies with findings that do not support one or more hypotheses or research questions can be used to suggest limitations (issues with the study’s validity, bias, or study weaknesses) of particular aspects of a study’s design and procedures. Findings from studies with data that do not support the hypotheses or research questions may suggest that current modes of practice or current theory are not supported by research evidence and therefore must be reexamined, researched further, and not used at this time to support practice changes. Data help generate new knowledge and evidence, as well as prevent knowledge stagnation. Generally, the results are interpreted in a separate section of the report. At times, you may find that the “Results” section contains both the results and the researcher’s interpretations, which are generally found in the “Discussion” section. Integrating the results with the discussion is the author’s or journal editor’s decision. Both sections may be integrated when a study contains several segments that may be viewed as fairly separate subproblems of a major overall problem.
The investigator should also demonstrate objectivity in the presentation of the results. The investigators would be accused of lacking objectivity if they stated the results in the following manner: “The results were not surprising, as we found that the mean scores were significantly different in the comparison group, as we expected.” Opinions or reactionary statements about the data are therefore avoided in the “Results” section. Box 17.1 provides examples of objectively stated results. As you appraise a study, you should consider the following points when reading the “Results” section:
• Investigators responded objectively to the results in the discussion of the findings.
• Investigators interpreted the evidence provided by the results, with a careful reflection on all aspects of the study that preceded the results. Data presented are summarized. Much data are generated, but only the critical summary numbers for each test are presented. Examples of summarized demographic data are the means and standard deviations of age, education, and income. Including all data is too cumbersome. The results should be viewed as a summary.
• Reduction of data is provided in the text and through the use of tables and figures. Tables and figures facilitate the presentation of large amounts of data.
• Results for the descriptive and inferential statistics for each hypothesis or research question are presented. No data are omitted, even if they are not significant. Untoward events during the course of the study should be reported.
BOX 17.1 Examples of Results Section
• “Parents’ psychological distress was positively associated with age (r = 0.53, P < 0.01) and avoidant coping (e.g., denial, disengagement) (r = 0.53, P < 0.01)” (Turner-Sack et al., 2016).
• “Bereaved fathers’ greater use of spiritual activities was significantly related to lower symptoms of grief (despair, detachment and disorganization at T1 and T2 [Table 2])” (Hawthorne et al., 2016).
In their study, Hawthorne and colleagues (2016) developed tables to present the results visually. Table 17.2 provides a portion of the descriptive results about the subjects’ demographics. Table 17.3 provides the correlations among the study’s variables. Tables allow researchers to provide a more visually thorough explanation and discussion of the results. If tables and figures are used, they must be concise. Although the article’s text is the major mode of communicating the results, the tables and figures serve a supplementary but independent role. The role of tables and figures is to report results with some detail that the investigator does not explore in the text. This does not mean that tables and figures should not be mentioned in the text. The amount of detail that an author uses in the text to describe the specific tabled data varies according to the needs of the author. A good table is one that meets the following criteria:
• Supplements and economizes the text
• Has precise titles and headings
• Does not repeat the text
TABLE 17.2 Description of the Sample
From Hawthorne, D. M., Youngblut, J. M., & Brooten, D. (2016). Parent spirituality, grief, and mental health at 1 and 3 months after their infant’s/child’s death in an intensive care unit. Journal of Pediatric Nursing, 31, 73–80.
TABLE 17.3 Correlations of Parents’ Use of Spiritual and Religious Activities With Grief, Mental Health, and Personal Growth at 1 (T1) and 3 (T2) Months Post-Death
*P < .05. **P < .01. From Hawthorne, D. M., Youngblut, J. M., & Brooten, D. (2016). Parent spirituality, grief, and mental health at 1 and 3 months after their infant’s/child’s death in an intensive care unit. Journal of Pediatric Nursing, 31, 73–80.
Tables are found in each of the studies in the appendices. Each of these tables helps to economize and supplement the text clearly, with precise data that help you to visualize the variables quickly and to assess the results.
EVIDENCE-BASED PRACTICE TIP
As you reflect on the results of a study, think about how the results fit with previous research on the topic and the strength and quality of available evidence on which to base clinical practice decisions.
Discussion
In this section, the investigator interprets and discusses the study’s results. The researcher makes the data come alive, giving meaning to and providing interpretations for the numbers in quantitative studies or the concepts in qualitative studies. This section contains a discussion of the findings, the study’s limitations, and recommendations for practice and future research. At times these topics are separated as stand-alone sections of the research report, or they may be integrated under the title of “Discussion.” You may ask where the investigator extracted the meaning that is applied in this section. If the researcher does the job properly, you will find a return to the beginning of the study. The researcher returns to the earlier points in the study where the purpose, objective, and research questions and/or hypotheses were identified, and where independent and dependent variables were linked on the basis of a theoretical framework and literature review (see Chapters 3 and 4). It is in this section that the researcher discusses
• Both the supported and nonsupported data
• Limitations or weaknesses (threats to internal or external validity) of a study in light of the design, sample, instruments, data collection procedures, and fidelity
• How the theoretical framework was supported or not supported
• How the data may suggest additional or previously unrealized findings
• Strength and quality of the evidence provided by the study and its findings interpreted in relation to its applicability to practice and future research
Even if the data are supported, this is not the final word. Statistical significance is not the endpoint of a researcher’s thinking; a low P value by itself is not indicative of a research breakthrough. It is important to think beyond statistical significance to clinical significance: statistical significance in a study does not always indicate that the results are clinically significant. A key step in the process of evaluation is the ability to analyze critically beyond the test of significance by assessing a research study’s applicability to practice. Chapters 19 through 21 review the methods used to analyze the usefulness and applicability of research findings. Within the nursing and health care literature, discussions of clinical significance, evidence-based practice, and quality improvement are focal points (Titler, 2012). As indicated throughout this text, many important pieces in the research puzzle must fit together for a study to be evaluated as well done. The evidence generated by the findings of a study is appraised in order to validate current practice or support the need for a change in practice. Results of unsupported hypotheses or research questions do not require the investigator to go on a fault-finding tour of each piece of the study; this can become an overdone process. All research studies have weaknesses as well as strengths, and the final discussion is an attempt to identify both the strengths and the weaknesses or bias of the study.
HELPFUL HINT
A well-written “Results” section is systematic, logical, concise, and drawn from all of the analyzed data. The writing in the “Results” section should allow the data to reflect the testing of the research questions and hypotheses. The length of this section depends on the scope and breadth of the analysis.
Researchers and appraisers should accept statistical significance with prudence. Statistically significant findings are not the sole means of establishing a study’s merit. Remember that a statistically significant result means only that the observed difference or relationship is unlikely to be due to chance; it does not automatically confirm the merit of a study, and a nonsignificant result does not necessarily negate the value of a study (see Chapter 12). Another method to assess the merit of a study and determine whether the findings from one study can be generalized is to calculate a confidence interval. A confidence interval quantifies the uncertainty of a statistic, or the probable range of values within which a population parameter is expected to lie (see Chapter 19). The process used to calculate a confidence interval is beyond the scope of this text, but references are provided for further explanation (Altman, 2005; Altman et al., 2005; Kline, 2004). Other aspects, such as the sample, instruments, data collection methods, and fidelity, must also be considered.
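As an illustration only, here is a minimal sketch with hypothetical numbers; the t distribution (via Python’s scipy) is one standard way to compute such an interval and is not drawn from any study cited in this chapter.

import numpy as np
from scipy import stats

scores = np.array([112, 95, 130, 118, 124, 101, 140, 116])  # hypothetical data
m = scores.mean()
se = stats.sem(scores)                      # standard error of the mean
# 95% CI for the population mean, using the t distribution with n - 1 df.
lo, hi = stats.t.interval(0.95, df=len(scores) - 1, loc=m, scale=se)
print(f"M = {m:.1f}, 95% CI [{lo:.1f}, {hi:.1f}]")

The narrower the interval, the more precise the single-study estimate of the population parameter; a wide interval signals that generalizing from one sample is risky.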
Whether the results are or are not statistically supported, in this section, the researcher returns to the conceptual/theoretical framework and analyzes each step of the research process to accomplish a discussion of the following issues:
• Suggest what the possible or actual problems are in the study.
• Whether findings are supported or not supported, the researcher is obliged to review the study’s processes.
• Was the theoretical thinking correct? (See Chapters 3 and 4.)
• Was the correct design chosen? (See Chapters 9 and 10.)
• In terms of sampling methods (see Chapter 12), was the sample size adequate? Were the inclusion and exclusion criteria delineated well?
• Did any bias arise during the course of the study; that is, threats to internal and external validity? (See Chapter 8.)
• Was data collection consistent, and did it exhibit fidelity? (See Chapter 14.)
• Were the instruments sensitive to what was being tested? Were they reliable and valid? (See Chapters 14 and 15.)
• Were the analysis choices appropriate? (See Chapter 16.)
The purpose of this section is not to show humility or one’s technical competence but rather to enable you to judge the validity of the interpretations drawn from the data and the general worth of the study. It is in this section of the report that the researcher ties together all the loose ends of the study and returns to the beginning to assess whether the findings support, extend, or counter the theoretical framework of the study. It is from this point that you can begin to think about clinical relevance, the need for replication, or the germination of an idea for further research. The researcher also includes generalizability and recommendations for future research, as well as a summary or a conclusion.
Generalizations (generalizability) are inferences that the data are representative of similar phenomena in a population beyond the study’s sample. Rarely, if ever, can one study serve as the basis for a recommendation for action. Beware of research studies that overgeneralize. Generalizations that draw conclusions and make inferences for a specific group within a particular situation and at a particular time are appropriate. An example ➤ of such a limitation is drawn from the study conducted by Hawthorne and colleagues (2016; Appendix B). The researchers appropriately noted the following:
There are several additional limitations of the study. At 1 and 3 months post-death, parents were in early stages of grieving. Thus, these findings may not be applicable to parents who are later in the grieving process.
This type of statement is important for consumers of research. It helps to guide our thinking in terms of a study’s clinical relevance and also suggests areas for research. One study does not provide all of the answers, nor should it. In fact, the risk versus the benefit of the potential change in practice must be considered in terms of the strength and quality of the evidence (see Chapter 19). The greater the risk involved in making a change in practice, the stronger the evidence needs to be to justify the merit of implementing a practice change. The final steps of evaluation are critical links to the refinement of practice and the generation of future research. Evaluation of research, like evaluation of the nursing process, is not the last link in the chain but a connection between the strength of the evidence that may serve to improve patient care and inform clinical decision making and support an evidence-based practice.
HIGHLIGHT
Your team should remember the saying that a good study is one that raises more questions than it answers. So your team should not view a researcher’s review of a study’s limitations and recommendations for future research as evidence of the researcher’s lack of research skills. Rather, it reflects the next steps in building a strong body of evidence.
The final element that the investigator integrates into the “Discussion” is the recommendations. The recommendations are the investigator’s suggestions for the study’s application to practice, theory, and further research. This requires the investigator to reflect on the following questions:
• What contribution does this study make to clinical practice?
• What are the strengths, quality, and consistency of the evidence provided by the findings?
• Does the evidence provided in the findings validate current practice or support the need for change in practice?
Box 17.2 provides examples ➤ of recommendations for future research and implications for nursing practice. This evaluation places the study into the realm of what is known and what needs to be known before being used. Nursing knowledge and evidence-based practice have grown tremendously over the last century through the efforts of many nurse researchers and scholars.

BOX 17.2 Examples of Research Recommendations and Practice Implications

Research recommendations
• “The findings support the need to continue examining the effects of childhood and adolescent cancer on the entire family. Additional studies would benefit from having all members of each family participate to obtain a true family systems perspective on the impact of childhood and adolescent cancer” (Turner-Sack et al., 2016).
• “Further research is needed to determine if any changes, whether negative or positive, occurred in parents’ use of religious and spiritual activities to cope and the effect on their grief response, mental health and personal growth in the later stages of bereavement” (Hawthorne et al., 2016).
Practice implications
• “The results from this longitudinal study with a racially and ethnically diverse sample provide evidence for healthcare professionals about the importance of spiritual coping activities for bereaved mothers and fathers” (Hawthorne et al., 2016).
• “Healthcare providers have contact not only with their patients, but also with their patients’ family members. These findings demonstrate the need to be aware of the potential impact of cancer on all family members” (Turner-Sack et al., 2016).
Appraisal for evidence-based practice: Research findings
The “Results” and the “Discussion” sections are the researcher’s opportunity to examine the logic of the hypothesis (or hypotheses) or research question(s) posed, the theoretical framework, the methods, and the analysis (see the Critical Appraisal Criteria box). This final section requires as much logic, conciseness, and specificity as employed in the preceding steps of the research process. You should be able to identify statements of the type of analysis that was used and whether the data statistically supported the hypothesis or research question. These statements should be straightforward and should not reflect bias (see Tables 17.2 and 17.3). Auxiliary data or serendipitous findings also may be presented. If such auxiliary findings are presented, they should be reported as dispassionately as the hypothesis and research question data.
CRITICAL APPRAISAL CRITERIA
Research Findings
1. Are the results of each of the hypotheses presented?
2. Is the information regarding the results concisely and sequentially presented?
3. Are the tests that were used to analyze the data presented?
4. Are the results presented objectively?
5. If tables or figures are used, do they meet the following standards?
a. They supplement and economize the text.
b. They have precise titles and headings.
c. They are not repetitious of the text.
6. Are the results interpreted in light of the hypotheses, research questions, and theoretical framework, and all of the other steps that preceded the results?
7. If the hypotheses or research questions are supported, does the investigator provide a discussion of how the theoretical framework was supported?
8. How does the investigator attempt to identify the study’s weaknesses (i.e., threats to internal and external validity) and strengths, as well as suggest possible solutions for the research area?
9. Does the researcher discuss the study’s clinical relevance?
10. Are any generalizations made, and if so, are they within the scope of the findings or beyond the findings?
11. Are any recommendations for future research stated or implied?
12. What is the study’s strength of evidence?
The statistical test(s) used should also be noted, and the numerical values of the obtained data should be presented (see Tables 17.1 to 17.3). The presentation of the tests, the numerical values found, and the statements of support or nonsupport should be clear, concise, and systematically reported. For illustrative purposes that facilitate readability, researchers should present extensive findings in tables. If the findings were not supported, you should, as the researcher did, attempt to identify, without finding fault, possible methodological problems (e.g., a sample too small to detect a treatment effect).
From a consumer perspective, the “Discussion” section at the end of a research article is very important for determining the potential application to practice. The “Discussion” section should interpret the study’s data for future research and implications for practice, including the strength, quality, gaps, and limitations of the evidence and the conclusions of the study. Statements reflecting the underlying theory are necessary, whether or not the hypotheses were supported. Included in this discussion are the limitations for practice. This discussion should reflect each step of the research process and potential threats to internal validity or bias and external validity or generalizability.
This last presentation can help you begin to rethink clinical practice, provoke discussion in clinical settings (see Chapters 19 and 20), and find similar studies that may support or refute the phenomena being studied to more fully understand the problem.
One study alone does not lead to a practice change. Evidence-based practice and quality improvement require you to critically read and understand each study: the quality of the study, the strength of the evidence generated by the findings and its consistency with other studies in the area, and the number of studies conducted in the area. This assessment, along with the active use of clinical judgment and patient preference, leads to evidence-based practice.
Key points
• The analysis of the findings is the final step of a study. It is in this section that the results will be presented in a straightforward manner.
• All results should be reported whether or not they support the hypothesis. Tables and figures may be used to illustrate and condense data for presentation.
• Once the results are reported, the researcher interprets the results. In this presentation, usually titled “Discussion,” readers should be able to identify the key topics being discussed. The key topics, which include an interpretation of the results, are the limitations, generalizations, implications, and recommendations for future research.
• The researcher draws together the theoretical framework and makes interpretations based on the findings and theory in the section on the interpretation of the results. Both statistically supported and unsupported results should be interpreted. If the results are n