a case study methodology is useful in

The Ultimate Guide to Qualitative Research - Part 1: The Basics

Case studies

Case studies are essential to qualitative research, offering a lens through which researchers can investigate complex phenomena within their real-life contexts. This chapter explores the concept, purpose, applications, examples, and types of case studies and provides guidance on how to conduct case study research effectively.

Whereas quantitative methods look at phenomena at scale, case study research looks at a concept or phenomenon in considerable detail. While analyzing a single case can help understand one perspective regarding the object of research inquiry, analyzing multiple cases can help obtain a more holistic sense of the topic or issue. Let's provide a basic definition of a case study, then explore its characteristics and role in the qualitative research process.

Definition of a case study

A case study in qualitative research is a strategy of inquiry that involves an in-depth investigation of a phenomenon within its real-world context. It provides researchers with the opportunity to acquire an in-depth understanding of intricate details that might not be as apparent or accessible through other methods of research. The specific case or cases being studied can be a single person, group, or organization; demarcating what constitutes a relevant case worth studying depends on the researcher and their research question.

Among qualitative research methods, a case study relies on multiple sources of evidence, such as documents, artifacts, interviews, or observations, to present a complete and nuanced understanding of the phenomenon under investigation. The objective is to illuminate the readers' understanding of the phenomenon beyond its abstract statistical or theoretical explanations.

Characteristics of case studies

Case studies typically possess a number of distinct characteristics that set them apart from other research methods. These characteristics include a focus on holistic description and explanation, flexibility in the design and data collection methods, reliance on multiple sources of evidence, and emphasis on the context in which the phenomenon occurs.

Furthermore, case studies can often involve a longitudinal examination of the case, meaning they study the case over a period of time. These characteristics allow case studies to yield comprehensive, in-depth, and richly contextualized insights about the phenomenon of interest.

The role of case studies in research

Case studies hold a unique position in the broader landscape of research methods aimed at theory development. They are instrumental when the primary research interest is to gain an intensive, detailed understanding of a phenomenon in its real-life context.

In addition, case studies can serve different purposes within research: they can be used for exploratory, descriptive, or explanatory purposes, depending on the research question and objectives. This flexibility and depth make case studies a valuable tool in the toolkit of qualitative researchers.

Remember, a well-conducted case study can offer a rich, insightful contribution to both academic and practical knowledge through theory development or theory verification, thus enhancing our understanding of complex phenomena in their real-world contexts.

What is the purpose of a case study?

Case study research aims for a more comprehensive understanding of phenomena, requiring various research methods to gather information for qualitative analysis. Ultimately, a case study can allow the researcher to gain insight into a particular object of inquiry and develop a theoretical framework relevant to the research inquiry.

Why use case studies in qualitative research?

Using case studies as a research strategy depends mainly on the nature of the research question and the researcher's access to the data.

Case study research provides a level of detail and contextual richness that other research methods might not offer. Case studies are beneficial when there's a need to understand complex social phenomena within their natural contexts.

The explanatory, exploratory, and descriptive roles of case studies

Case studies can take on various roles depending on the research objectives. They can be exploratory when the research aims to discover new phenomena or define new research questions; they can be descriptive when the objective is to depict a phenomenon within its context in a detailed manner; and they can be explanatory if the goal is to understand specific relationships within the studied context. Thus, the versatility of case studies allows researchers to approach their topic from different angles, offering multiple ways to uncover and interpret the data.

The impact of case studies on knowledge development

Case studies play a significant role in knowledge development across various disciplines. Analysis of cases provides an avenue for researchers to explore phenomena within their context based on the collected data.

This can result in the production of rich, practical insights that can be instrumental in both theory-building and practice. Case studies allow researchers to delve into the intricacies and complexities of real-life situations, uncovering insights that might otherwise remain hidden.

Types of case studies

In qualitative research, a case study is not a one-size-fits-all approach. Depending on the nature of the research question and the specific objectives of the study, researchers might choose to use different types of case studies. These types differ in their focus, methodology, and the level of detail they provide about the phenomenon under investigation.

Understanding these types is crucial for selecting the most appropriate approach for your research project and effectively achieving your research goals. Let's briefly look at the main types of case studies.

Exploratory case studies

Exploratory case studies are typically conducted to develop a theory or framework around an understudied phenomenon. They can also serve as a precursor to a larger-scale research project. Exploratory case studies are useful when a researcher wants to identify the key issues or questions which can spur more extensive study or be used to develop propositions for further research. These case studies are characterized by flexibility, allowing researchers to explore various aspects of a phenomenon as they emerge, which can also form the foundation for subsequent studies.

Descriptive case studies

Descriptive case studies aim to provide a complete and accurate representation of a phenomenon or event within its context. These case studies are often based on an established theoretical framework, which guides how data is collected and analyzed. The researcher is concerned with describing the phenomenon in detail, as it occurs naturally, without trying to influence or manipulate it.

Explanatory case studies

Explanatory case studies are focused on explanation: they seek to clarify how or why certain phenomena occur. Often used in complex, real-life situations, they can be particularly valuable in clarifying causal relationships among concepts and understanding the interplay between different factors within a specific context.

Intrinsic, instrumental, and collective case studies

These three categories of case studies focus on the nature and purpose of the study. An intrinsic case study is conducted when a researcher has an inherent interest in the case itself. Instrumental case studies are employed when the case is used to provide insight into a particular issue or phenomenon. A collective case study, on the other hand, involves studying multiple cases simultaneously to investigate a general phenomenon.

Each type of case study serves a different purpose and has its own strengths and challenges. The selection of the type should be guided by the research question and objectives, as well as the context and constraints of the research.

Applications for case study research

The flexibility, depth, and contextual richness offered by case studies make this approach an excellent research method for various fields of study. They enable researchers to investigate real-world phenomena within their specific contexts, capturing nuances that other research methods might miss. Across numerous fields, case studies provide valuable insights into complex issues.

Critical information systems research

Case studies provide a detailed understanding of the role and impact of information systems in different contexts. They offer a platform to explore how information systems are designed, implemented, and used and how they interact with various social, economic, and political factors. Case studies in this field often focus on examining the intricate relationship between technology, organizational processes, and user behavior, helping to uncover insights that can inform better system design and implementation.

Health research

Health research is another field where case studies are highly valuable. They offer a way to explore patient experiences, healthcare delivery processes, and the impact of various interventions in a real-world context.

Case studies can provide a deep understanding of a patient's journey, giving insights into the intricacies of disease progression, treatment effects, and the psychosocial aspects of health and illness.

Asthma research studies

Specifically within medical research, studies on asthma often employ case studies to explore the individual and environmental factors that influence asthma development, management, and outcomes. A case study can provide rich, detailed data about individual patients' experiences, from the triggers and symptoms they experience to the effectiveness of various management strategies. This can be crucial for developing patient-centered asthma care approaches.

Other fields

Apart from the fields mentioned, case studies are also extensively used in business and management research, education research, and political sciences, among many others. They provide an opportunity to delve into the intricacies of real-world situations, allowing for a comprehensive understanding of various phenomena.

Case studies, with their depth and contextual focus, offer unique insights across these varied fields. They allow researchers to illuminate the complexities of real-life situations, contributing to both theory and practice.

What is a good case study?

Understanding the key elements of case study design is crucial for conducting rigorous and impactful case study research. A well-structured design guides the researcher through the process, ensuring that the study is methodologically sound and its findings are reliable and valid. The main elements of case study design include the research question, propositions, units of analysis, and the logic linking the data to the propositions.

Research question

The research question is the foundation of any research study. A good research question guides the direction of the study and informs the selection of the case, the methods of collecting data, and the analysis techniques. A well-formulated research question in case study research is typically clear, focused, and complex enough to merit further detailed examination of the relevant case(s).

Propositions

Propositions, though not necessary in every case study, provide a direction by stating what we might expect to find in the data collected. They guide how data is collected and analyzed by helping researchers focus on specific aspects of the case. They are particularly important in explanatory case studies, which seek to understand the relationships among concepts within the studied phenomenon.

Units of analysis

The unit of analysis refers to the case, or the main entity or entities that are being analyzed in the study. In case study research, the unit of analysis can be an individual, a group, an organization, a decision, an event, or even a time period. It's crucial to clearly define the unit of analysis, as it shapes the qualitative data analysis process by allowing the researcher to analyze a particular case and synthesize analysis across multiple case studies to draw conclusions.

Argumentation

This refers to the inferential model that allows researchers to draw conclusions from the data. The researcher needs to ensure that there is a clear link between the data, the propositions (if any), and the conclusions drawn. This argumentation is what enables the researcher to make valid and credible inferences about the phenomenon under study.

Understanding and carefully considering these elements in the design phase of a case study can significantly enhance the quality of the research. It can help ensure that the study is methodologically sound and its findings contribute meaningful insights about the case.

Process of case study design

Conducting a case study involves several steps, from defining the research question and selecting the case to collecting and analyzing data. This section outlines these key stages, providing a practical guide on how to conduct case study research.

Defining the research question

The first step in case study research is defining a clear, focused research question. This question should guide the entire research process, from case selection to analysis. It's crucial to ensure that the research question is suitable for a case study approach. Typically, such questions are exploratory, descriptive, or explanatory in nature and focus on understanding a phenomenon within its real-life context.

Selecting and defining the case

The selection of the case should be based on the research question and the objectives of the study. It involves choosing a unique example or a set of examples that provide rich, in-depth data about the phenomenon under investigation. After selecting the case, it's crucial to define it clearly, setting the boundaries of the case, including the time period and the specific context.

Previous research can help guide the case study design. A case examined in earlier case study research can serve as a model for defining cases in a new research inquiry, and recently published examples can show how to select and define cases effectively.

Developing a detailed case study protocol

A case study protocol outlines the procedures and general rules to be followed during the case study. This includes the data collection methods to be used, the sources of data, and the procedures for analysis. Having a detailed case study protocol ensures consistency and reliability in the study.

The protocol should also consider how to work with the people involved in the research context so that the research team can gain access to collect data. As mentioned in previous sections of this guide, establishing rapport is an essential component of qualitative research as it shapes the overall potential for collecting and analyzing data.

Collecting data

Gathering data in case study research often involves multiple sources of evidence, including documents, archival records, interviews, observations, and physical artifacts. This allows for a comprehensive understanding of the case. The process for gathering data should be systematic and carefully documented to ensure the reliability and validity of the study.

Analyzing and interpreting data

The next step is analyzing the data. This involves organizing the data, categorizing it into themes or patterns, and interpreting these patterns to answer the research question. The analysis might also involve comparing the findings with prior research or theoretical propositions.

Writing the case study report

The final step is writing the case study report. This should provide a detailed description of the case, the data, the analysis process, and the findings. The report should be clear, organized, and carefully written to ensure that the reader can understand the case and the conclusions drawn from it.

Each of these steps is crucial in ensuring that the case study research is rigorous, reliable, and provides valuable insights about the case.

Data collection

The type, depth, and quality of data in your study can significantly influence the validity and utility of the study. In case study research, data is usually collected from multiple sources to provide a comprehensive and nuanced understanding of the case. This section will outline the various methods of collecting data used in case study research and discuss considerations for ensuring the quality of the data.

Interviews

Interviews are a common method of gathering data in case study research. They can provide rich, in-depth data about the perspectives, experiences, and interpretations of the individuals involved in the case. Interviews can be structured, semi-structured, or unstructured, depending on the research question and the degree of flexibility needed.

Observations

Observations involve the researcher observing the case in its natural setting, providing first-hand information about the case and its context. Observations can provide data that might not be revealed in interviews or documents, such as non-verbal cues or contextual information.

Documents and artifacts

Documents and archival records provide a valuable source of data in case study research. They can include reports, letters, memos, meeting minutes, email correspondence, and various public and private documents related to the case.

These records can provide historical context, corroborate evidence from other sources, and offer insights into the case that might not be apparent from interviews or observations.

Physical artifacts refer to any physical evidence related to the case, such as tools, products, or physical environments. These artifacts can provide tangible insights into the case, complementing the data gathered from other sources.

Ensuring the quality of data collection

Ensuring the quality of data in case study research requires careful planning and execution. It's crucial to ensure that the data is reliable, accurate, and relevant to the research question. This involves selecting appropriate methods of collecting data, properly training interviewers or observers, and systematically recording and storing the data. It also includes considering ethical issues related to collecting and handling data, such as obtaining informed consent and ensuring the privacy and confidentiality of the participants.

Data analysis

Analyzing case study research involves making sense of the rich, detailed data to answer the research question. This process can be challenging due to the volume and complexity of case study data. However, a systematic and rigorous approach to analysis can ensure that the findings are credible and meaningful. This section outlines the main steps and considerations in analyzing data in case study research.

Organizing the data

The first step in the analysis is organizing the data. This involves sorting the data into manageable sections, often according to the data source or the theme. This step can also involve transcribing interviews, digitizing physical artifacts, or organizing observational data.

Categorizing and coding the data

Once the data is organized, the next step is to categorize or code the data. This involves identifying common themes, patterns, or concepts in the data and assigning codes to relevant data segments. Coding can be done manually or with the help of software tools; qualitative analysis software can greatly facilitate the coding process. Coding helps to reduce the data to a set of themes or categories that can be more easily analyzed.
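To make the idea of assigning codes to data segments concrete, here is a minimal Python sketch. It assumes a simple keyword-based codebook, which is only one possible starting point; the codebook, code names, and interview excerpts are all hypothetical, and in practice a researcher would read and code each segment with judgment rather than rely on keyword matches alone.

```python
from collections import defaultdict

# Hypothetical codebook: each code maps to indicator keywords (an assumption
# for illustration, not a recommended coding scheme).
CODEBOOK = {
    "access_to_care": ["appointment", "clinic", "waiting"],
    "triggers": ["pollen", "smoke", "dust"],
    "self_management": ["inhaler", "routine", "diary"],
}


def code_segments(segments, codebook):
    """Map each code to the indices of segments mentioning any of its keywords."""
    coded = defaultdict(list)
    for index, text in enumerate(segments):
        lowered = text.lower()
        for code, keywords in codebook.items():
            if any(keyword in lowered for keyword in keywords):
                coded[code].append(index)
    return dict(coded)


# Hypothetical interview excerpts.
segments = [
    "I keep my inhaler by the door so it is part of my routine.",
    "The smoke from the neighbor's fireplace sets it off every winter.",
    "Getting an appointment at the clinic takes weeks.",
]

coded = code_segments(segments, CODEBOOK)
for code, indices in sorted(coded.items()):
    print(code, indices)
```

A keyword pass like this can serve as a first sweep through a large data set; the resulting assignments would still need to be reviewed and refined by the researcher.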

Identifying patterns and themes

After coding the data, the researcher looks for patterns or themes in the coded data. This involves comparing and contrasting the codes and looking for relationships or patterns among them. The identified patterns and themes should help answer the research question.
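One simple way to look for relationships among codes is to count how often two codes are applied to the same data segment. The sketch below assumes each segment has already been coded and is represented as a list of code names; the code names and data are hypothetical, and co-occurrence counts are only a prompt for closer reading, not a finding in themselves.

```python
from collections import Counter
from itertools import combinations


def code_cooccurrence(coded_segments):
    """Count how often each pair of codes is applied to the same segment."""
    pair_counts = Counter()
    for codes in coded_segments:
        # Sort and deduplicate so each pair is counted once per segment
        # and always appears in the same order.
        for pair in combinations(sorted(set(codes)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical coded segments: each inner list holds the codes on one segment.
coded_segments = [
    ["triggers", "self_management"],
    ["triggers", "access_to_care"],
    ["self_management", "triggers"],
    ["access_to_care"],
]

pairs = code_cooccurrence(coded_segments)
for pair, count in pairs.most_common():
    print(pair, count)
```

Pairs that co-occur frequently point to segments worth rereading together when developing themes.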

Interpreting the data

Once patterns and themes have been identified, the next step is to interpret these findings. This involves explaining what the patterns or themes mean in the context of the research question and the case. This interpretation should be grounded in the data, but it can also involve drawing on theoretical concepts or prior research.

Verification of the data

The last step in the analysis is verification. This involves checking the accuracy and consistency of the analysis process and confirming that the findings are supported by the data. This can involve re-checking the original data, checking the consistency of codes, or seeking feedback from research participants or peers.
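When two researchers code the same segments, one common way to check the consistency of codes is an agreement statistic such as Cohen's kappa. The guide does not prescribe a particular measure, so the following is a sketch under that assumption, with hypothetical code labels; it treats each segment as having exactly one code per coder.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Compute Cohen's kappa for two coders labeling the same segments."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of segments.")
    n = len(coder_a)
    # Proportion of segments on which the two coders chose the same code.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two researchers to the same six segments.
coder_a = ["triggers", "triggers", "access", "self", "self", "triggers"]
coder_b = ["triggers", "access", "access", "self", "triggers", "triggers"]

kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 3))
```

A low value suggests the code definitions need discussion and refinement before the analysis continues.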

Benefits and limitations of case studies

Like any research method, case study research has its strengths and limitations. Researchers must be aware of these, as they can influence the design, conduct, and interpretation of the study.

Understanding the strengths and limitations of case study research can also guide researchers in deciding whether this approach is suitable for their research question. This section outlines some of the key strengths and limitations of case study research.

Benefits include the following:

  • Rich, detailed data: One of the main strengths of case study research is that it can generate rich, detailed data about the case. This can provide a deep understanding of the case and its context, which can be valuable in exploring complex phenomena.
  • Flexibility: Case study research is flexible in terms of design, data collection, and analysis. A sufficient degree of flexibility allows the researcher to adapt the study according to the case and the emerging findings.
  • Real-world context: Case study research involves studying the case in its real-world context, which can provide valuable insights into the interplay between the case and its context.
  • Multiple sources of evidence: Case study research often involves collecting data from multiple sources, which can enhance the robustness and validity of the findings.

On the other hand, researchers should consider the following limitations:

  • Generalizability: A common criticism of case study research is that its findings might not be generalizable to other cases due to the specificity and uniqueness of each case.
  • Time and resource intensive: Case study research can be time and resource intensive due to the depth of the investigation and the amount of collected data.
  • Complexity of analysis: The rich, detailed data generated in case study research can make analyzing the data challenging.
  • Subjectivity: Given the nature of case study research, there may be a higher degree of subjectivity in interpreting the data, so researchers need to reflect on this and transparently convey to audiences how the research was conducted.

Being aware of these strengths and limitations can help researchers design and conduct case study research effectively and interpret and report the findings appropriately.

5 Benefits of Learning Through the Case Study Method

  • 28 Nov 2023

While several factors make HBS Online unique, including a global Community and real-world outcomes, active learning through the case study method rises to the top.

In a 2023 City Square Associates survey, 74 percent of HBS Online learners who also took a course from another provider said HBS Online’s case method and real-world examples were better by comparison.

Here’s a primer on the case method, five benefits you could gain, and how to experience it for yourself.

What Is the Harvard Business School Case Study Method?

The case study method, or case method, is a learning technique in which you’re presented with a real-world business challenge and asked how you’d solve it. After working through it yourself and with peers, you’re told how the scenario played out.

HBS pioneered the case method in 1922. Shortly before, in 1921, the first case was written.

“How do you go into an ambiguous situation and get to the bottom of it?” says HBS Professor Jan Rivkin, former senior associate dean and chair of HBS's master of business administration (MBA) program, in a video about the case method. “That skill, the skill of figuring out a course of inquiry to choose a course of action, that skill is as relevant today as it was in 1921.”

In 2014, HBS Online adapted the case method, originally developed for the in-person MBA classroom, into an engaging, interactive online learning experience.

In HBS Online courses, you learn about each case from the business professional who experienced it. After reviewing their videos, you’re prompted to take their perspective and explain how you’d handle their situation.

You then get to read peers’ responses, “star” them, and comment to further the discussion. Afterward, you learn how the professional handled it and their key takeaways.

HBS Online’s adaptation of the case method incorporates the famed HBS “cold call,” in which you’re called on at random to make a decision without time to prepare.

“Learning came to life!” said Sheneka Balogun, chief administration officer and chief of staff at LeMoyne-Owen College, of her experience taking the Credential of Readiness (CORe) program. “The videos from the professors, the interactive cold calls where you were randomly selected to participate, and the case studies that enhanced and often captured the essence of objectives and learning goals were all embedded in each module. This made learning fun, engaging, and student-friendly.”

If you’re considering taking a course that leverages the case study method, here are five benefits you could experience.

5 Benefits of Learning Through Case Studies

1. Take New Perspectives

The case method prompts you to consider a scenario from another person’s perspective. To work through the situation and come up with a solution, you must consider their circumstances, limitations, risk tolerance, stakeholders, resources, and potential consequences to assess how to respond.

Taking on new perspectives can help you navigate not only your own challenges but also others’. Putting yourself in someone else’s situation to understand their motivations and needs can go a long way when collaborating with stakeholders.

2. Hone Your Decision-Making Skills

Another skill you can build is the ability to make decisions effectively. The case study method forces you to use limited information to decide how to handle a problem, just like in the real world.

Throughout your career, you’ll need to make difficult decisions with incomplete or imperfect information—and sometimes, you won’t feel qualified to do so. Learning through the case method allows you to practice this skill in a low-stakes environment. When facing a real challenge, you’ll be better prepared to think quickly, collaborate with others, and present and defend your solution.

3. Become More Open-Minded

As you collaborate with peers on responses, it becomes clear that not everyone solves problems the same way. Exposing yourself to various approaches and perspectives can help you become a more open-minded professional.

When you’re part of a diverse group of learners from around the world, your experiences, cultures, and backgrounds contribute to a range of opinions on each case.

On the HBS Online course platform, you’re prompted to view and comment on others’ responses, and discussion is encouraged. This practice of considering others’ perspectives can make you more receptive in your career.

“You’d be surprised at how much you can learn from your peers,” said Ratnaditya Jonnalagadda, a software engineer who took CORe.

In addition to interacting with peers in the course platform, Jonnalagadda was part of the HBS Online Community, where he networked with other professionals and continued discussions sparked by course content.

“You get to understand your peers better, and students share examples of businesses implementing a concept from a module you just learned,” Jonnalagadda said. “It’s a very good way to cement the concepts in one's mind.”

4. Enhance Your Curiosity

One byproduct of taking on different perspectives is that it enables you to picture yourself in various roles, industries, and business functions.

“Each case offers an opportunity for students to see what resonates with them, what excites them, what bores them, which role they could imagine inhabiting in their careers,” says former HBS Dean Nitin Nohria in the Harvard Business Review. “Cases stimulate curiosity about the range of opportunities in the world and the many ways that students can make a difference as leaders.”

Through the case method, you can “try on” roles you may not have considered and feel more prepared to change or advance your career.

5. Build Your Self-Confidence

Finally, learning through the case study method can build your confidence. Each time you assume a business leader’s perspective, aim to solve a new challenge, and express and defend your opinions and decisions to peers, you prepare to do the same in your career.

According to a 2022 City Square Associates survey, 84 percent of HBS Online learners report feeling more confident making business decisions after taking a course.

“Self-confidence is difficult to teach or coach, but the case study method seems to instill it in people,” Nohria says in the Harvard Business Review. “There may well be other ways of learning these meta-skills, such as the repeated experience gained through practice or guidance from a gifted coach. However, under the direction of a masterful teacher, the case method can engage students and help them develop powerful meta-skills like no other form of teaching.”

How to Experience the Case Study Method

If the case method seems like a good fit for your learning style, experience it for yourself by taking an HBS Online course. Offerings span seven subject areas, including:

  • Business essentials
  • Leadership and management
  • Entrepreneurship and innovation
  • Finance and accounting
  • Business in society

No matter which course or credential program you choose, you’ll examine case studies from real business professionals, work through their challenges alongside peers, and gain valuable insights to apply to your career.

Are you interested in discovering how HBS Online can help advance your career? Explore our course catalog and download our free guide —complete with interactive workbook sections—to determine if online learning is right for you and which course to take.

What the Case Study Method Really Teaches

  • Nitin Nohria

Seven meta-skills that stick even if the cases fade from memory.

It’s been 100 years since Harvard Business School began using the case study method. Beyond teaching specific subject matter, the case study method excels in instilling meta-skills in students. This article explains the importance of seven such skills: preparation, discernment, bias recognition, judgement, collaboration, curiosity, and self-confidence.

During my decade as dean of Harvard Business School, I spent hundreds of hours talking with our alumni. To enliven these conversations, I relied on a favorite question: “What was the most important thing you learned from your time in our MBA program?”

  • Nitin Nohria is the George F. Baker Jr. and Distinguished Service University Professor. He served as the 10th dean of Harvard Business School, from 2010 to 2020.

  • Open access
  • Published: 27 June 2011

The case study approach

  • Sarah Crowe 1 ,
  • Kathrin Cresswell 2 ,
  • Ann Robertson 2 ,
  • Guro Huby 3 ,
  • Anthony Avery 1 &
  • Aziz Sheikh 2  

BMC Medical Research Methodology volume 11, Article number: 100 (2011)

The case study approach allows in-depth, multi-faceted explorations of complex issues in their real-life settings. The value of the case study approach is well recognised in the fields of business, law and policy, but somewhat less so in health services research. Based on our experiences of conducting several health-related case studies, we reflect on the different types of case study design, the specific research questions this approach can help answer, the data sources that tend to be used, and the particular advantages and disadvantages of employing this methodological approach. The paper concludes with key pointers to aid those designing and appraising proposals for conducting case study research, and a checklist to help readers assess the quality of case study reports.

Introduction

The case study approach is particularly useful to employ when there is a need to obtain an in-depth appreciation of an issue, event or phenomenon of interest, in its natural real-life context. Our aim in writing this piece is to provide insights into when to consider employing this approach and an overview of key methodological considerations in relation to the design, planning, analysis, interpretation and reporting of case studies.

The illustrative 'grand round', 'case report' and 'case series' have a long tradition in clinical practice and research. Presenting detailed critiques, typically of one or more patients, aims to provide insights into aspects of the clinical case and, in doing so, illustrate broader lessons that may be learnt. In research, the conceptually-related case study approach can be used, for example, to describe in detail a patient's episode of care, explore professional attitudes to and experiences of a new policy initiative or service development or more generally to 'investigate contemporary phenomena within its real-life context' [ 1 ]. Based on our experiences of conducting a range of case studies, we reflect on when to consider using this approach, discuss the key steps involved and illustrate, with examples, some of the practical challenges of attaining an in-depth understanding of a 'case' as an integrated whole. In keeping with previously published work, we acknowledge the importance of theory to underpin the design, selection, conduct and interpretation of case studies[ 2 ]. In so doing, we make passing reference to the different epistemological approaches used in case study research by key theoreticians and methodologists in this field of enquiry.

This paper is structured around the following main questions: What is a case study? What are case studies used for? How are case studies conducted? What are the potential pitfalls and how can these be avoided? We draw in particular on four of our own recently published examples of case studies (see Tables 1 , 2 , 3 and 4 ) and those of others to illustrate our discussion[ 3 – 7 ].

What is a case study?

A case study is a research approach that is used to generate an in-depth, multi-faceted understanding of a complex issue in its real-life context. It is an established research design that is used extensively in a wide variety of disciplines, particularly in the social sciences. A case study can be defined in a variety of ways (Table 5 ), the central tenet being the need to explore an event or phenomenon in depth and in its natural context. It is for this reason sometimes referred to as a "naturalistic" design; this is in contrast to an "experimental" design (such as a randomised controlled trial) in which the investigator seeks to exert control over and manipulate the variable(s) of interest.

Stake's work has been particularly influential in defining the case study approach to scientific enquiry. He has helpfully characterised three main types of case study: intrinsic, instrumental and collective[ 8 ]. An intrinsic case study is typically undertaken to learn about a unique phenomenon. The researcher should define the uniqueness of the phenomenon, which distinguishes it from all others. In contrast, the instrumental case study uses a particular case (some of which may be better than others) to gain a broader appreciation of an issue or phenomenon. The collective case study involves studying multiple cases simultaneously or sequentially in an attempt to generate a still broader appreciation of a particular issue.

These are however not necessarily mutually exclusive categories. In the first of our examples (Table 1 ), we undertook an intrinsic case study to investigate the issue of recruitment of minority ethnic people into the specific context of asthma research studies, but it developed into an instrumental case study through seeking to understand the issue of recruitment of these marginalised populations more generally, generating a number of findings that are potentially transferable to other disease contexts[ 3 ]. In contrast, the other three examples (see Tables 2 , 3 and 4 ) employed collective case study designs to study the introduction of workforce reconfiguration in primary care, the implementation of electronic health records into hospitals, and the ways in which healthcare students learn about patient safety considerations[ 4 – 6 ]. Although our study focusing on the introduction of General Practitioners with Specialist Interests (Table 2 ) was explicitly collective in design (four contrasting primary care organisations were studied), it was also instrumental in that this particular professional group was studied as an exemplar of the more general phenomenon of workforce redesign[ 4 ].

What are case studies used for?

According to Yin, case studies can be used to explain, describe or explore events or phenomena in the everyday contexts in which they occur[ 1 ]. These can, for example, help to understand and explain causal links and pathways resulting from a new policy initiative or service development (see Tables 2 and 3 , for example)[ 1 ]. In contrast to experimental designs, which seek to test a specific hypothesis through deliberately manipulating the environment (for example, in a randomised controlled trial, giving a new drug to randomly selected individuals and then comparing outcomes with controls),[ 9 ] the case study approach lends itself well to capturing information on more explanatory 'how', 'what' and 'why' questions, such as 'how is the intervention being implemented and received on the ground?'. The case study approach can offer additional insights into what gaps exist in its delivery or why one implementation strategy might be chosen over another. This in turn can help develop or refine theory, as shown in our study of the teaching of patient safety in undergraduate curricula (Table 4 )[ 6 , 10 ]. Key questions to consider when selecting the most appropriate study design are whether it is desirable or indeed possible to undertake a formal experimental investigation in which individuals and/or organisations are allocated to an intervention or control arm, or whether the wish is to obtain a more naturalistic understanding of an issue. The former is ideally studied using a controlled experimental design, whereas the latter is more appropriately studied using a case study design.

Case studies may be approached in different ways depending on the epistemological standpoint of the researcher, that is, whether they take a critical (questioning one's own and others' assumptions), interpretivist (trying to understand individual and shared social meanings) or positivist approach (orientating towards the criteria of natural sciences, such as focusing on generalisability considerations) (Table 6 ). Whilst such a schema can be conceptually helpful, it may be appropriate to draw on more than one approach in any case study, particularly in the context of conducting health services research. Doolin has, for example, noted that in the context of undertaking interpretative case studies, researchers can usefully draw on a critical, reflective perspective which seeks to take into account the wider social and political environment that has shaped the case[ 11 ].

How are case studies conducted?

Here, we focus on the main stages of research activity when planning and undertaking a case study; the crucial stages are: defining the case; selecting the case(s); collecting and analysing the data; interpreting data; and reporting the findings.

Defining the case

Carefully formulated research question(s), informed by the existing literature and a prior appreciation of the theoretical issues and setting(s), are all important in appropriately and succinctly defining the case[ 8 , 12 ]. Crucially, each case should have a pre-defined boundary which clarifies the nature and time period covered by the case study (i.e. its scope, beginning and end), the relevant social group, organisation or geographical area of interest to the investigator, the types of evidence to be collected, and the priorities for data collection and analysis (see Table 7 )[ 1 ]. A theory driven approach to defining the case may help generate knowledge that is potentially transferable to a range of clinical contexts and behaviours; using theory is also likely to result in a more informed appreciation of, for example, how and why interventions have succeeded or failed[ 13 ].

For example, in our evaluation of the introduction of electronic health records in English hospitals (Table 3 ), we defined our cases as the NHS Trusts that were receiving the new technology[ 5 ]. Our focus was on how the technology was being implemented. However, if the primary research interest had been on the social and organisational dimensions of implementation, we might have defined our case differently as a grouping of healthcare professionals (e.g. doctors and/or nurses). The precise beginning and end of the case may however prove difficult to define. Pursuing this same example, when does the process of implementation and adoption of an electronic health record system really begin or end? Such judgements will inevitably be influenced by a range of factors, including the research question, theory of interest, the scope and richness of the gathered data and the resources available to the research team.
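One way to keep a case boundary explicit during fieldwork is to write it down as a small structured record. The sketch below is only an illustration of that idea: the class name, field names and all values are hypothetical and are not taken from Table 7 or from any of the studies discussed. An open-ended start or end date is allowed, reflecting the point that the precise beginning and end of a case may be difficult to fix in advance.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class CaseDefinition:
    """Hypothetical record of a case boundary; field names are illustrative."""
    label: str
    unit_of_analysis: str
    setting: str
    start: Optional[date] = None  # None means the beginning is not yet fixed
    end: Optional[date] = None    # None means the end is not yet fixed
    evidence_types: List[str] = field(default_factory=list)

    def covers(self, when: date) -> bool:
        """Return True if `when` falls inside the (possibly open-ended) period."""
        after_start = self.start is None or when >= self.start
        before_end = self.end is None or when <= self.end
        return after_start and before_end


# Illustrative values only; not drawn from the studies described in the text.
case = CaseDefinition(
    label="Site A",
    unit_of_analysis="hospital organisation",
    setting="secondary care",
    start=date(2020, 1, 1),
    evidence_types=["interviews", "observations", "documents"],
)

print(case.covers(date(2021, 5, 1)))    # inside the open-ended period
print(case.covers(date(2019, 12, 31)))  # before the stated start
```

Redefining the case (for instance, as a grouping of healthcare professionals rather than an organisation) would then amount to changing `unit_of_analysis` and revisiting the other fields.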

Selecting the case(s)

The decision on how to select the case(s) to study is a very important one that merits some reflection. In an intrinsic case study, the case is selected on its own merits[ 8 ]. The case is selected not because it is representative of other cases, but because of its uniqueness, which is of genuine interest to the researchers. This was, for example, the case in our study of the recruitment of minority ethnic participants into asthma research (Table 1 ) as our earlier work had demonstrated the marginalisation of minority ethnic people with asthma, despite evidence of disproportionate asthma morbidity[ 14 , 15 ]. In another example of an intrinsic case study, Hellstrom et al.[ 16 ] studied an elderly married couple living with dementia to explore how dementia had impacted on their understanding of home, their everyday life and their relationships.

For an instrumental case study, selecting a "typical" case can work well[ 8 ]. In contrast to the intrinsic case study, the particular case which is chosen is of less importance than selecting a case that allows the researcher to investigate an issue or phenomenon. For example, in order to gain an understanding of doctors' responses to health policy initiatives, Som undertook an instrumental case study interviewing clinicians who had a range of responsibilities for clinical governance in one NHS acute hospital trust[ 17 ]. Sampling a "deviant" or "atypical" case may however prove even more informative, potentially enabling the researcher to identify causal processes, generate hypotheses and develop theory.

In collective or multiple case studies, a number of cases are carefully selected. This offers the advantage of allowing comparisons to be made across several cases and/or replication. Choosing a "typical" case may enable the findings to be generalised to theory (i.e. analytical generalisation) or to test theory by replicating the findings in a second or even a third case (i.e. replication logic)[ 1 ]. Yin suggests two or three literal replications (i.e. predicting similar results) if the theory is straightforward and five or more if the theory is more subtle. However, critics might argue that selecting 'cases' in this way is insufficiently reflexive and ill-suited to the complexities of contemporary healthcare organisations.

The selected case study site(s) should allow the research team access to the group of individuals, the organisation, the processes or whatever else constitutes the chosen unit of analysis for the study. Access is therefore a central consideration; the researcher needs to come to know the case study site(s) well and to work cooperatively with them. Selected cases need to be not only interesting but also hospitable to the inquiry [ 8 ] if they are to be informative and answer the research question(s). Case study sites may also be pre-selected for the researcher, with decisions being influenced by key stakeholders. For example, our selection of case study sites in the evaluation of the implementation and adoption of electronic health record systems (see Table 3 ) was heavily influenced by NHS Connecting for Health, the government agency that was responsible for overseeing the National Programme for Information Technology (NPfIT)[ 5 ]. This prominent stakeholder had already selected the NHS sites (through a competitive bidding process) to be early adopters of the electronic health record systems and had negotiated contracts that detailed the deployment timelines.

It is also important to consider in advance the likely burden and risks associated with participation for those who (or the site(s) which) comprise the case study. Of particular importance is the obligation for the researcher to think through the ethical implications of the study (e.g. the risk of inadvertently breaching anonymity or confidentiality) and to ensure that potential participants/participating sites are provided with sufficient information to make an informed choice about joining the study. The outcome of providing this information might be that the emotive burden associated with participation, or the organisational disruption associated with supporting the fieldwork, is considered so high that the individuals or sites decide against participation.

In our example of evaluating implementations of electronic health record systems, given the restricted number of early adopter sites available to us, we sought purposively to select a diverse range of implementation cases among those that were available[ 5 ]. We chose a mixture of teaching, non-teaching and Foundation Trust hospitals, and examples of each of the three electronic health record systems procured centrally by the NPfIT. At one recruited site, it quickly became apparent that access was problematic because of competing demands on that organisation. Recognising the importance of full access and co-operative working for generating rich data, the research team decided not to pursue work at that site and instead to focus on other recruited sites.

Collecting the data

In order to develop a thorough understanding of the case, the case study approach usually involves the collection of multiple sources of evidence, using a range of quantitative (e.g. questionnaires, audits and analysis of routinely collected healthcare data) and more commonly qualitative techniques (e.g. interviews, focus groups and observations). The use of multiple sources of data (data triangulation) has been advocated as a way of increasing the internal validity of a study (i.e. the extent to which the method is appropriate to answer the research question)[ 8 , 18 – 21 ]. An underlying assumption is that data collected in different ways should lead to similar conclusions, and approaching the same issue from different angles can help develop a holistic picture of the phenomenon (Table 2 )[ 4 ].

Brazier and colleagues used a mixed-methods case study approach to investigate the impact of a cancer care programme[ 22 ]. Here, quantitative measures were collected with questionnaires before, and five months after, the start of the intervention; these did not yield any statistically significant results. Qualitative interviews with patients, however, helped provide an insight into potentially beneficial process-related aspects of the programme, such as greater perceived patient involvement in care. The authors reported how this case study approach provided a number of contextual factors likely to influence the effectiveness of the intervention and which were not likely to have been obtained from quantitative methods alone.

In collective or multiple case studies, data collection needs to be flexible enough to allow a detailed description of each individual case to be developed (e.g. the nature of different cancer care programmes), before considering the emerging similarities and differences in cross-case comparisons (e.g. to explore why one programme is more effective than another). It is important that data sources from different cases are, where possible, broadly comparable for this purpose even though they may vary in nature and depth.

Analysing, interpreting and reporting case studies

Making sense and offering a coherent interpretation of the typically disparate sources of data (whether qualitative alone or together with quantitative) is far from straightforward. Repeated reviewing and sorting of the voluminous and detail-rich data are integral to the process of analysis. In collective case studies, it is helpful to analyse data relating to the individual component cases first, before making comparisons across cases. Attention needs to be paid to variations within each case and, where relevant, the relationship between different causes, effects and outcomes[ 23 ]. Data will need to be organised and coded to allow the key issues, both derived from the literature and emerging from the dataset, to be easily retrieved at a later stage. An initial coding frame can help capture these issues and can be applied systematically to the whole dataset with the aid of a qualitative data analysis software package.
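The idea of applying an initial coding frame systematically, so that material can be retrieved later, can be sketched in a few lines. This is a deliberately crude illustration and not a substitute for interpretive coding or for a qualitative data analysis package: the codes, keywords, segment identifiers and excerpts are all invented, and simple substring matching will produce false positives (for example, "time" also matches "sometimes").

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical coding frame: code label -> indicative keywords (all invented).
CODING_FRAME: Dict[str, List[str]] = {
    "access": ["access", "gatekeeper"],
    "workload": ["time", "busy", "workload"],
}


def index_segments(segments: Dict[str, str],
                   frame: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each code to the ids of segments containing any of its keywords."""
    index = defaultdict(list)
    for seg_id, text in segments.items():
        lowered = text.lower()
        for code, keywords in frame.items():
            if any(keyword in lowered for keyword in keywords):
                index[code].append(seg_id)
    return dict(index)


# Invented excerpts, keyed by a source-and-position identifier.
segments = {
    "int01-p3": "We were too busy to attend the training.",
    "int02-p1": "Getting access to the ward took weeks.",
    "obs01-n4": "Staff said there was no time to log in.",
}

index = index_segments(segments, CODING_FRAME)
for code, seg_ids in index.items():
    print(code, seg_ids)
```

A first pass of this kind only flags candidate segments; the researcher still needs to read each one in context and revise the frame as new issues emerge from the dataset.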

The Framework approach is a practical approach, comprising five stages (familiarisation; identifying a thematic framework; indexing; charting; mapping and interpretation), to managing and analysing large datasets particularly if time is limited, as was the case in our study of recruitment of South Asians into asthma research (Table 1 )[ 3 , 24 ]. Theoretical frameworks may also play an important role in integrating different sources of data and examining emerging themes. For example, we drew on a socio-technical framework to help explain the connections between different elements - technology; people; and the organisational settings within which they worked - in our study of the introduction of electronic health record systems (Table 3 )[ 5 ]. Our study of patient safety in undergraduate curricula drew on an evaluation-based approach to design and analysis, which emphasised the importance of the academic, organisational and practice contexts through which students learn (Table 4 )[ 6 ].
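The charting stage of the Framework approach arranges indexed material into a matrix of cases by themes, which makes within-case and cross-case reading easier. The following is a minimal sketch of that arrangement only, assuming indexing has already been done by hand. The function name, case labels and summaries are hypothetical; the theme labels simply echo the socio-technical elements mentioned above.

```python
from typing import Dict, List, Tuple

Chart = Dict[str, Dict[str, List[str]]]


def build_chart(indexed: List[Tuple[str, str, str]]) -> Chart:
    """Group (case, theme, summary) entries into a case-by-theme chart."""
    chart: Chart = {}
    for case, theme, summary in indexed:
        chart.setdefault(case, {}).setdefault(theme, []).append(summary)
    return chart


# Invented, already-indexed summaries; not data from any study in the text.
indexed = [
    ("Case 1", "technology", "system slow at login"),
    ("Case 1", "people", "nurses developed workarounds"),
    ("Case 2", "technology", "frequent outages reported"),
]

chart = build_chart(indexed)
themes = sorted({theme for _, theme, _ in indexed})

# Print one row per case, with an empty cell where a theme has no material.
for case, cells in chart.items():
    row = [", ".join(cells.get(theme, [])) or "(none)" for theme in themes]
    print(case, "|", " | ".join(row))
```

Empty cells are kept visible on purpose: a theme with no material in one case can be as informative for cross-case comparison as a well-populated one.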

Case study findings can have implications both for theory development and theory testing. They may establish, strengthen or weaken historical explanations of a case and, in certain circumstances, allow theoretical (as opposed to statistical) generalisation beyond the particular cases studied[ 12 ]. These theoretical lenses should not, however, constitute a strait-jacket and the cases should not be "forced to fit" the particular theoretical framework that is being employed.

When reporting findings, it is important to provide the reader with enough contextual information to understand the processes that were followed and how the conclusions were reached. In a collective case study, researchers may choose to present the findings from individual cases separately before amalgamating across cases. Care must be taken to ensure the anonymity of both case sites and individual participants (if agreed in advance) by allocating appropriate codes or withholding descriptors. In the example given in Table 3 , we decided against providing detailed information on the NHS sites and individual participants in order to avoid the risk of inadvertent disclosure of identities[ 5 , 25 ].

What are the potential pitfalls and how can these be avoided?

The case study approach is, as with all research, not without its limitations. When investigating the formal and informal ways undergraduate students learn about patient safety (Table 4 ), for example, we rapidly accumulated a large quantity of data. The volume of data, together with the time restrictions in place, impacted on the depth of analysis that was possible within the available resources. This highlights a more general point of the importance of avoiding the temptation to collect as much data as possible; adequate time also needs to be set aside for data analysis and interpretation of what are often highly complex datasets.

Case study research has sometimes been criticised for lacking scientific rigour and providing little basis for generalisation (i.e. producing findings that may be transferable to other settings)[ 1 ]. There are several ways to address these concerns, including: the use of theoretical sampling (i.e. drawing on a particular conceptual framework); respondent validation (i.e. participants checking emerging findings and the researcher's interpretation, and providing an opinion as to whether they feel these are accurate); and transparency throughout the research process (see Table 8 )[ 8 , 18 – 21 , 23 , 26 ]. Transparency can be achieved by describing in detail the steps involved in case selection, data collection, the reasons for the particular methods chosen, and the researcher's background and level of involvement (i.e. being explicit about how the researcher has influenced data collection and interpretation). Seeking potential, alternative explanations, and being explicit about how interpretations and conclusions were reached, help readers to judge the trustworthiness of the case study report. Stake provides a critique checklist for a case study report (Table 9 )[ 8 ].

Conclusions

The case study approach allows, amongst other things, critical events, interventions, policy developments and programme-based service reforms to be studied in detail in a real-life context. It should therefore be considered when an experimental design is either inappropriate to answer the research questions posed or impossible to undertake. Considering the frequency with which implementations of innovations are now taking place in healthcare settings and how well the case study approach lends itself to in-depth, complex health service research, we believe this approach should be more widely considered by researchers. Though inherently challenging, the research case study can, if carefully conceptualised and thoughtfully undertaken and reported, yield powerful insights into many important aspects of health and healthcare delivery.

References

Yin RK: Case study research, design and method. 2009, London: Sage Publications Ltd., 4th edition

Keen J, Packwood T: Qualitative research; case study evaluation. BMJ. 1995, 311: 444-446.

Sheikh A, Halani L, Bhopal R, Netuveli G, Partridge M, Car J, et al: Facilitating the Recruitment of Minority Ethnic People into Research: Qualitative Case Study of South Asians and Asthma. PLoS Med. 2009, 6 (10): 1-11.

Pinnock H, Huby G, Powell A, Kielmann T, Price D, Williams S, et al: The process of planning, development and implementation of a General Practitioner with a Special Interest service in Primary Care Organisations in England and Wales: a comparative prospective case study. Report for the National Co-ordinating Centre for NHS Service Delivery and Organisation R&D (NCCSDO). 2008, [ http://www.sdo.nihr.ac.uk/files/project/99-final-report.pdf ]

Robertson A, Cresswell K, Takian A, Petrakaki D, Crowe S, Cornford T, et al: Prospective evaluation of the implementation and adoption of NHS Connecting for Health's national electronic health record in secondary care in England: interim findings. BMJ. 2010, 341: c4564.

Pearson P, Steven A, Howe A, Sheikh A, Ashcroft D, Smith P, the Patient Safety Education Study Group: Learning about patient safety: organisational context and culture in the education of healthcare professionals. J Health Serv Res Policy. 2010, 15: 4-10. 10.1258/jhsrp.2009.009052.

van Harten WH, Casparie TF, Fisscher OA: The evaluation of the introduction of a quality management system: a process-oriented case study in a large rehabilitation hospital. Health Policy. 2002, 60 (1): 17-37. 10.1016/S0168-8510(01)00187-7.

Stake RE: The art of case study research. 1995, London: Sage Publications Ltd.

Sheikh A, Smeeth L, Ashcroft R: Randomised controlled trials in primary care: scope and application. Br J Gen Pract. 2002, 52 (482): 746-51.

King G, Keohane R, Verba S: Designing Social Inquiry. 1996, Princeton: Princeton University Press

Doolin B: Information technology as disciplinary technology: being critical in interpretative research on information systems. Journal of Information Technology. 1998, 13: 301-311. 10.1057/jit.1998.8.

George AL, Bennett A: Case studies and theory development in the social sciences. 2005, Cambridge, MA: MIT Press

Eccles M, the Improved Clinical Effectiveness through Behavioural Research Group (ICEBeRG): Designing theoretically-informed implementation interventions. Implementation Science. 2006, 1: 1-8. 10.1186/1748-5908-1-1.

Netuveli G, Hurwitz B, Levy M, Fletcher M, Barnes G, Durham SR, Sheikh A: Ethnic variations in UK asthma frequency, morbidity, and health-service use: a systematic review and meta-analysis. Lancet. 2005, 365 (9456): 312-7.

Sheikh A, Panesar SS, Lasserson T, Netuveli G: Recruitment of ethnic minorities to asthma studies. Thorax. 2004, 59 (7): 634.

Hellström I, Nolan M, Lundh U: 'We do things together': A case study of 'couplehood' in dementia. Dementia. 2005, 4: 7-22. 10.1177/1471301205049188.

Som CV: Nothing seems to have changed, nothing seems to be changing and perhaps nothing will change in the NHS: doctors' response to clinical governance. International Journal of Public Sector Management. 2005, 18: 463-477. 10.1108/09513550510608903.

Lincoln Y, Guba E: Naturalistic inquiry. 1985, Newbury Park: Sage Publications

Barbour RS: Checklists for improving rigour in qualitative research: a case of the tail wagging the dog?. BMJ. 2001, 322: 1115-1117. 10.1136/bmj.322.7294.1115.

Mays N, Pope C: Qualitative research in health care: Assessing quality in qualitative research. BMJ. 2000, 320: 50-52. 10.1136/bmj.320.7226.50.

Mason J: Qualitative researching. 2002, London: Sage

Brazier A, Cooke K, Moravan V: Using Mixed Methods for Evaluating an Integrative Approach to Cancer Care: A Case Study. Integr Cancer Ther. 2008, 7: 5-17. 10.1177/1534735407313395.

Miles MB, Huberman M: Qualitative data analysis: an expanded sourcebook. 1994, CA: Sage Publications Inc., 2nd edition

Pope C, Ziebland S, Mays N: Analysing qualitative data. Qualitative research in health care. BMJ. 2000, 320: 114-116. 10.1136/bmj.320.7227.114.

Cresswell KM, Worth A, Sheikh A: Actor-Network Theory and its role in understanding the implementation of information technology developments in healthcare. BMC Med Inform Decis Mak. 2010, 10 (1): 67. 10.1186/1472-6947-10-67.

Malterud K: Qualitative research: standards, challenges, and guidelines. Lancet. 2001, 358: 483-488. 10.1016/S0140-6736(01)05627-6.

Yin R: Case study research: design and methods. 1994, Thousand Oaks, CA: Sage Publishing, 2nd edition

Yin R: Enhancing the quality of case studies in health services research. Health Serv Res. 1999, 34: 1209-1224.

Green J, Thorogood N: Qualitative methods for health research. 2009, Los Angeles: Sage, 2nd edition

Howcroft D, Trauth E: Handbook of Critical Information Systems Research, Theory and Application. 2005, Cheltenham, UK; Northampton, MA, USA: Edward Elgar

Blaikie N: Approaches to Social Enquiry. 1993, Cambridge: Polity Press

Doolin B: Power and resistance in the implementation of a medical management information system. Info Systems J. 2004, 14: 343-362. 10.1111/j.1365-2575.2004.00176.x.

Bloomfield BP, Best A: Management consultants: systems development, power and the translation of problems. Sociological Review. 1992, 40: 533-560.

Shanks G, Parr A: Positivist, single case study research in information systems: A critical analysis. Proceedings of the European Conference on Information Systems. 2003, Naples

Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/11/100/prepub

Acknowledgements

We are grateful to the participants and colleagues who contributed to the individual case studies that we have drawn on. This work received no direct funding, but it has been informed by projects funded by Asthma UK, the NHS Service Delivery Organisation, NHS Connecting for Health Evaluation Programme, and Patient Safety Research Portfolio. We would also like to thank the expert reviewers for their insightful and constructive feedback. Our thanks are also due to Dr. Allison Worth who commented on an earlier draft of this manuscript.

Author information

Authors and affiliations.

Division of Primary Care, The University of Nottingham, Nottingham, UK

Sarah Crowe & Anthony Avery

Centre for Population Health Sciences, The University of Edinburgh, Edinburgh, UK

Kathrin Cresswell, Ann Robertson & Aziz Sheikh

School of Health in Social Science, The University of Edinburgh, Edinburgh, UK

Corresponding author

Correspondence to Sarah Crowe.

Additional information

Competing interests.

The authors declare that they have no competing interests.

Authors' contributions

AS conceived this article. SC, KC and AR wrote this paper with GH, AA and AS all commenting on various drafts. SC and AS are guarantors.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article.

Crowe, S., Cresswell, K., Robertson, A. et al. The case study approach. BMC Med Res Methodol 11, 100 (2011). https://doi.org/10.1186/1471-2288-11-100

Received: 29 November 2010

Accepted: 27 June 2011

Published: 27 June 2011

DOI: https://doi.org/10.1186/1471-2288-11-100

  • Case Study Approach
  • Electronic Health Record System
  • Case Study Design
  • Case Study Site
  • Case Study Report

BMC Medical Research Methodology

ISSN: 1471-2288

Case Study | Definition, Examples & Methods

Published on 5 May 2022 by Shona McCombes. Revised on 30 January 2023.

A case study is a detailed study of a specific subject, such as a person, group, place, event, organisation, or phenomenon. Case studies are commonly used in social, educational, clinical, and business research.

A case study research design usually involves qualitative methods, but quantitative methods are sometimes also used. Case studies are good for describing, comparing, evaluating, and understanding different aspects of a research problem.

Table of contents

  • When to do a case study
  • Step 1: Select a case
  • Step 2: Build a theoretical framework
  • Step 3: Collect your data
  • Step 4: Describe and analyse the case

When to do a case study

A case study is an appropriate research design when you want to gain concrete, contextual, in-depth knowledge about a specific real-world subject. It allows you to explore the key characteristics, meanings, and implications of the case.

Case studies are often a good choice in a thesis or dissertation. They keep your project focused and manageable when you don’t have the time or resources to do large-scale research.

You might use just one complex case study where you explore a single subject in depth, or conduct multiple case studies to compare and illuminate different aspects of your research problem.

Case study examples

Research question | Case study
What are the ecological effects of wolf reintroduction? | Case study of wolf reintroduction in Yellowstone National Park in the US
How do populist politicians use narratives about history to gain support? | Case studies of Hungarian prime minister Viktor Orbán and US president Donald Trump
How can teachers implement active learning strategies in mixed-level classrooms? | Case study of a local school that promotes active learning
What are the main advantages and disadvantages of wind farms for rural communities? | Case studies of three rural wind farm development projects in different parts of the country
How are viral marketing strategies changing the relationship between companies and consumers? | Case study of the iPhone X marketing campaign
How do experiences of work in the gig economy differ by gender, race, and age? | Case studies of Deliveroo and Uber drivers in London

Step 1: Select a case

Once you have developed your problem statement and research questions, you should be ready to choose the specific case that you want to focus on. A good case study should have the potential to:

  • Provide new or unexpected insights into the subject
  • Challenge or complicate existing assumptions and theories
  • Propose practical courses of action to resolve a problem
  • Open up new directions for future research

Unlike quantitative or experimental research, a strong case study does not require a random or representative sample. In fact, case studies often deliberately focus on unusual, neglected, or outlying cases which may shed new light on the research problem.

However, you can also choose a more common or representative case to exemplify a particular category, experience, or phenomenon.

If you find yourself aiming to investigate and solve an issue at the same time, consider conducting action research. As its name suggests, action research involves researching and taking action at the same time, and is highly iterative and flexible.

Step 2: Build a theoretical framework

While case studies focus more on concrete details than general theories, they should usually have some connection with theory in the field. This way the case study is not just an isolated description, but is integrated into existing knowledge about the topic. It might aim to:

  • Exemplify a theory by showing how it explains the case under investigation
  • Expand on a theory by uncovering new concepts and ideas that need to be incorporated
  • Challenge a theory by exploring an outlier case that doesn’t fit with established assumptions

To ensure that your analysis of the case has a solid academic grounding, you should conduct a literature review of sources related to the topic and develop a theoretical framework. This means identifying key concepts and theories to guide your analysis and interpretation.

Step 3: Collect your data

There are many different research methods you can use to collect data on your subject. Case studies tend to focus on qualitative data using methods such as interviews, observations, and analysis of primary and secondary sources (e.g., newspaper articles, photographs, official records). Sometimes a case study will also collect quantitative data.

The aim is to gain as thorough an understanding as possible of the case and its context.

Step 4: Describe and analyse the case

In writing up the case study, you need to bring together all the relevant aspects to give as complete a picture as possible of the subject.

How you report your findings depends on the type of research you are doing. Some case studies are structured like a standard scientific paper or thesis, with separate sections or chapters for the methods, results, and discussion.

Others are written in a more narrative style, aiming to explore the case from various angles and analyse its meanings and implications (for example, by using textual analysis or discourse analysis).

In all cases, though, make sure to give contextual details about the case, connect it back to the literature and theory, and discuss how it fits into wider patterns or debates.

Cite this Scribbr article

McCombes, S. (2023, January 30). Case Study | Definition, Examples & Methods. Scribbr. Retrieved 7 June 2024, from https://www.scribbr.co.uk/research-methods/case-studies/

Qualitative Case Study Methodology: Study Design and Implementation for Novice Researchers

  • January 2010
  • The Qualitative Report 13(4)

Pamela Elizabeth Baxter at McMaster University

Susan M Jack at McMaster University

First Online: 27 October 2022

R. M. Channaveer & Rajendra Baikady

This chapter reviews the strengths and limitations of the case study as a research method in the social sciences. It draws on an evidence base to explain why a case study is well suited to some research questions and not to others. It gives a holistic view of the method by covering how to design a case study around the research context, define its structure and modality, conduct the study, collect data through triangulation, analyse and interpret the data, and build theory at the end. The chapter also covers the types of case study and when and where to use the case study as a research method in social science research.

Ang, C. S., Lee, K. F., & Dipolog-Ubanan, G. F. (2019). Determinants of first-year student identity and satisfaction in higher education: A quantitative case study. SAGE Open, 9 (2), 215824401984668. https://doi.org/10.1177/2158244019846689

Baxter, P., & Jack, S. (2015). Qualitative case study methodology: Study design and implementation for novice researchers. The Qualitative Report . Published. https://doi.org/10.46743/2160-3715/2008.1573

Bhatta, T. P. (2018). Case study research, philosophical position and theory building: A methodological discussion. Dhaulagiri Journal of Sociology and Anthropology, 12 , 72–79. https://doi.org/10.3126/dsaj.v12i0.22182

Article   Google Scholar  

Bromley, P. D. (1990). Academic contributions to psychological counselling. A philosophy of science for the study of individual cases. Counselling Psychology Quarterly , 3 (3), 299–307.

Google Scholar  

Crowe, S., Cresswell, K., Robertson, A., Huby, G., Avery, A., & Sheikh, A. (2011). The case study approach. BMC Medical Research Methodology, 11 (1), 1–9.

Grässel, E., & Schirmer, B. (2006). The use of volunteers to support family carers of dementia patients: Results of a prospective longitudinal study investigating expectations towards and experience with training and professional support. Zeitschrift Fur Gerontologie Und Geriatrie, 39 (3), 217–226.

Greenwood, D., & Lowenthal, D. (2005). Case study as a means of researching social work and improving practitioner education. Journal of Social Work Practice, 19 (2), 181–193. https://doi.org/10.1080/02650530500144782

Gülseçen, S., & Kubat, A. (2006). Teaching ICT to teacher candidates using PBL: A qualitative and quantitative evaluation. Journal of Educational Technology & Society, 9 (2), 96–106.

Gomm, R., Hammersley, M., & Foster, P. (2000). Case study and generalization. Case study method , 98–115.

Hamera, J., Denzin, N. K., & Lincoln, Y. S. (2011). Performance ethnography . SAGE.

Hayes, N. (2000). Doing psychological research (p. 133). Open University Press.

Harrison, H., Birks, M., Franklin, R., & Mills, J. (2017). Case study research: Foundations and methodological orientations. In Forum qualitative sozialforschung/forum: Qualitative social research (Vol. 18, No. 1).

Iwakabe, S., & Gazzola, N. (2009). From single-case studies to practice-based knowledge: Aggregating and synthesizing case studies. Psychotherapy Research, 19 (4–5), 601–611. https://doi.org/10.1080/10503300802688494

Johnson, M. P. (2006). Decision models for the location of community corrections centers. Environment and Planning b: Planning and Design, 33 (3), 393–412. https://doi.org/10.1068/b3125

Kaarbo, J., & Beasley, R. K. (1999). A practical guide to the comparative case study method in political psychology. Political Psychology, 20 (2), 369–391. https://doi.org/10.1111/0162-895x.00149

Lovell, G. I. (2006). Justice excused: The deployment of law in everyday political encounters. Law Society Review, 40 (2), 283–324. https://doi.org/10.1111/j.1540-5893.2006.00265.x

McDonough, S., & McDonough, S. (1997). Research methods as part of English language teacher education. English Language Teacher Education and Development, 3 (1), 84–96.

Meredith, J. (1998). Building operations management theory through case and field research. Journal of Operations Management, 16 (4), 441–454. https://doi.org/10.1016/s0272-6963(98)00023-0

Mills, A. J., Durepos, G., & Wiebe, E. (Eds.). (2009). Encyclopedia of case study research . Sage Publications.

Ochieng, P. A. (2009). An analysis of the strengths and limitation of qualitative and quantitative research paradigms. Problems of Education in the 21st Century , 13 , 13.

Page, E. B., Webb, E. J., Campell, D. T., Schwart, R. D., & Sechrest, L. (1966). Unobtrusive measures: Nonreactive research in the social sciences. American Educational Research Journal, 3 (4), 317. https://doi.org/10.2307/1162043

Rashid, Y., Rashid, A., Warraich, M. A., Sabir, S. S., & Waseem, A. (2019). Case study method: A step-by-step guide for business researchers. International Journal of Qualitative Methods, 18 , 160940691986242. https://doi.org/10.1177/1609406919862424

Ridder, H. G. (2017). The theory contribution of case study research designs. Business Research, 10 (2), 281–305. https://doi.org/10.1007/s40685-017-0045-z

Sadeghi Moghadam, M. R., Ghasemnia Arabi, N., & Khoshsima, G. (2021). A Review of case study method in operations management research. International Journal of Qualitative Methods, 20 , 160940692110100. https://doi.org/10.1177/16094069211010088

Sommer, B. B., & Sommer, R. (1997). A practical guide to behavioral research: Tools and techniques . Oxford University Press.

Stake, R. E. (2010). Qualitative research: Studying how things work .

Stake, R. E. (1995). The Art of Case Study Research . Sage Publications.

Stoecker, R. (1991). Evaluating and rethinking the case study. The Sociological Review, 39 (1), 88–112.

Suryani, A. (2013). Comparing case study and ethnography as qualitative research approaches .

Taylor, S., & Berridge, V. (2006). Medicinal plants and malaria: An historical case study of research at the London School of Hygiene and Tropical Medicine in the twentieth century. Transactions of the Royal Society of Tropical Medicine and Hygiene, 100 (8), 707–714. https://doi.org/10.1016/j.trstmh.2005.11.017

Tellis, W. (1997). Introduction to case study. The Qualitative Report . Published. https://doi.org/10.46743/2160-3715/1997.2024

Towne, L., & Shavelson, R. J. (2002). Scientific research in education . National Academy Press Publications Sales Office.

Widdowson, M. D. J. (2011). Case study research methodology. International Journal of Transactional Analysis Research, 2 (1), 25–34.

Yin, R. K. (2004). The case study anthology . Sage.

Yin, R. K. (2003). Design and methods. Case Study Research , 3 (9.2).

Yin, R. K. (1994). Case study research: Design and methods (2nd ed.). Sage Publishing.

Yin, R. (1984). Case study research: Design and methods . Sage Publications Beverly Hills.

Yin, R. (1993). Applications of case study research . Sage Publishing.

Zainal, Z. (2003). An investigation into the effects of discipline-specific knowledge, proficiency and genre on reading comprehension and strategies of Malaysia ESP Students. Unpublished Ph. D. Thesis. University of Reading , 1 (1).

Zeisel, J. (1984). Inquiry by design: Tools for environment-behaviour research (No. 5). CUP archive.

Download references

Author information

Authors and affiliations.

Department of Social Work, Central University of Karnataka, Kadaganchi, India

R. M. Channaveer

Department of Social Work, University of Johannesburg, Johannesburg, South Africa

Rajendra Baikady

Corresponding author

Correspondence to R. M. Channaveer.

Editor information

Editors and affiliations.

Centre for Family and Child Studies, Research Institute of Humanities and Social Sciences, University of Sharjah, Sharjah, United Arab Emirates

M. Rezaul Islam

Department of Development Studies, University of Dhaka, Dhaka, Bangladesh

Niaz Ahmed Khan

Department of Social Work, School of Humanities, University of Johannesburg, Johannesburg, South Africa

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Channaveer, R.M., Baikady, R. (2022). Case Study. In: Islam, M.R., Khan, N.A., Baikady, R. (eds) Principles of Social Research Methodology. Springer, Singapore. https://doi.org/10.1007/978-981-19-5441-2_21

DOI: https://doi.org/10.1007/978-981-19-5441-2_21

Published: 27 October 2022

Publisher Name: Springer, Singapore

Print ISBN: 978-981-19-5219-7

Online ISBN: 978-981-19-5441-2

eBook Packages: Social Sciences

What is a case study?

Volume 21, Issue 1

  • Roberta Heale 1,
  • Alison Twycross 2
  • 1 School of Nursing, Laurentian University, Sudbury, Ontario, Canada
  • 2 School of Health and Social Care, London South Bank University, London, UK
  • Correspondence to Dr Roberta Heale, School of Nursing, Laurentian University, Sudbury, ON P3E2C6, Canada; rheale{at}laurentian.ca

https://doi.org/10.1136/eb-2017-102845

What is it?

Case study is a research methodology, typically seen in social and life sciences. There is no one definition of case study research. 1 However, very simply… ‘a case study can be defined as an intensive study about a person, a group of people or a unit, which is aimed to generalize over several units’. 1 A case study has also been described as an intensive, systematic investigation of a single individual, group, community or some other unit in which the researcher examines in-depth data relating to several variables. 2

Often there are several similar cases to consider such as educational or social service programmes that are delivered from a number of locations. Although similar, they are complex and have unique features. In these circumstances, the evaluation of several, similar cases will provide a better answer to a research question than if only one case is examined, hence the multiple-case study. Stake asserts that the cases are grouped and viewed as one entity, called the quintain . 6  ‘We study what is similar and different about the cases to understand the quintain better’. 6

The steps when using case study methodology are the same as for other types of research. 6 The first step is defining the single case or identifying a group of similar cases that can then be incorporated into a multiple-case study. A search to determine what is known about the case(s) is typically conducted. This may include a review of the literature, grey literature, media, reports and more, which serves to establish a basic understanding of the cases and informs the development of research questions. Data in case studies are often, but not exclusively, qualitative in nature. In multiple-case studies, analysis within cases and across cases is conducted. Themes arise from the analyses and assertions about the cases as a whole, or the quintain, emerge. 6
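The within-case and cross-case analysis described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the site names, the theme labels, and the helper functions `within_case_counts`, `cross_case_themes`, and `shared_themes` are all hypothetical, and the sketch assumes that excerpts have already been coded by hand into (case, theme) pairs. It only shows the bookkeeping: tallying themes inside each case, then checking which themes recur across cases.

```python
from collections import Counter

# Hypothetical coded segments: (case, theme) pairs a researcher might
# produce after manually coding interview or document excerpts.
# All names and values here are invented for illustration.
coded_segments = [
    ("Site A", "staffing"),
    ("Site A", "staffing"),
    ("Site A", "funding"),
    ("Site B", "funding"),
    ("Site B", "patient access"),
    ("Site C", "funding"),
    ("Site C", "staffing"),
]


def within_case_counts(segments):
    """Within-case step: count how often each theme was coded in each case."""
    counts = {}
    for case, theme in segments:
        counts.setdefault(case, Counter())[theme] += 1
    return counts


def cross_case_themes(segments):
    """Cross-case step: map each theme to the sorted cases where it appears."""
    cases_by_theme = {}
    for case, theme in segments:
        cases_by_theme.setdefault(theme, set()).add(case)
    return {theme: sorted(cases) for theme, cases in cases_by_theme.items()}


def shared_themes(segments):
    """Themes present in every case, i.e. candidates for assertions
    about the cases as a whole (the quintain)."""
    all_cases = {case for case, _ in segments}
    return sorted(
        theme
        for theme, cases in cross_case_themes(segments).items()
        if set(cases) == all_cases
    )


if __name__ == "__main__":
    for case, counts in within_case_counts(coded_segments).items():
        print(case, dict(counts))
    print(cross_case_themes(coded_segments))
    print(shared_themes(coded_segments))
```

A tally like this supports the interpretive work and does not replace it: a theme that appears in every case still needs to be read in context before it becomes an assertion about the quintain.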

Benefits and limitations of case studies

If a researcher wants to study a specific phenomenon arising from a particular entity, then a single-case study is warranted and will allow for an in-depth understanding of the single phenomenon and, as discussed above, would involve collecting several different types of data. This is illustrated in example 1 below.

Using a multiple-case research study allows for a more in-depth understanding of the cases as a unit, through comparison of similarities and differences of the individual cases embedded within the quintain. Evidence arising from multiple-case studies is often stronger and more reliable than from single-case research. Multiple-case studies allow for more comprehensive exploration of research questions and theory development. 6

Despite the advantages of case studies, there are limitations. The sheer volume of data is difficult to organise, and data analysis and integration strategies need to be carefully thought through. There is also sometimes a temptation to veer away from the research focus. 2 Reporting of findings from multiple-case research studies is also challenging at times, 1 particularly in relation to the word limits for some journal papers.

Examples of case studies

Example 1: nurses’ paediatric pain management practices.

One of the authors of this paper (AT) has used a case study approach to explore nurses’ paediatric pain management practices. This involved collecting several datasets:

Observational data to gain a picture about actual pain management practices.

Questionnaire data about nurses’ knowledge about paediatric pain management practices and how well they felt they managed pain in children.

Questionnaire data about how critical nurses perceived pain management tasks to be.

These datasets were analysed separately and then compared 7–9 and demonstrated that nurses’ level of theoretical knowledge did not impact on the quality of their pain management practices. 7 Nor did individual nurses’ perceptions of how critical a task was affect the likelihood of them carrying out this task in practice. 8 There was also a difference in self-reported and observed practices 9 ; actual (observed) practices did not conform to best practice guidelines, whereas self-reported practices tended to.

Example 2: quality of care for complex patients at Nurse Practitioner-Led Clinics (NPLCs)

The other author of this paper (RH) has conducted a multiple-case study to determine the quality of care for patients with complex clinical presentations in NPLCs in Ontario, Canada. 10 Five NPLCs served as individual cases that, together, represented the quintain. Three types of data were collected including:

Review of documentation related to the NPLC model (media, annual reports, research articles, grey literature and regulatory legislation).

Interviews with nurse practitioners (NPs) practising at the five NPLCs to determine their perceptions of the impact of the NPLC model on the quality of care provided to patients with multimorbidity.

Chart audits conducted at the five NPLCs to determine the extent to which evidence-based guidelines were followed for patients with diabetes and at least one other chronic condition.

The three sources of data collected from the five NPLCs were analysed and themes arose related to the quality of care for complex patients at NPLCs. The multiple-case study confirmed that nurse practitioners are the primary care providers at the NPLCs, and this positively impacts the quality of care for patients with multimorbidity. Healthcare policy, such as lack of an increase in salary for NPs for 10 years, has resulted in issues in recruitment and retention of NPs at NPLCs. This, along with insufficient resources in the communities where NPLCs are located and high patient vulnerability at NPLCs, has a negative impact on the quality of care. 10

These examples illustrate how collecting data about a single case or multiple cases helps us to better understand the phenomenon in question. Case study methodology serves to provide a framework for evaluation and analysis of complex issues. It shines a light on the holistic nature of nursing practice and offers a perspective that informs improved patient care.

  • Gustafsson J
  • Calanzaro M
  • Sandelowski M

Competing interests None declared.

Provenance and peer review Commissioned; internally peer reviewed.

What Is a Case Study?

Weighing the pros and cons of this method of research

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

Cara Lustik is a fact-checker and copywriter.

A case study is an in-depth study of one person, group, or event. In a case study, nearly every aspect of the subject's life and history is analyzed to seek patterns and causes of behavior. Case studies can be used in many different fields, including psychology, medicine, education, anthropology, political science, and social work.

The point of a case study is to learn as much as possible about an individual or group so that the information can be generalized to many others. Unfortunately, case studies tend to be highly subjective, and it is sometimes difficult to generalize results to a larger population.

While case studies focus on a single individual or group, they follow a format similar to other types of psychology writing. If you are writing a case study, we got you—here are some rules of APA format to reference.  

At a Glance

A case study, or an in-depth study of a person, group, or event, can be a useful research tool when used wisely. In many cases, case studies are best used in situations where it would be difficult or impossible for you to conduct an experiment. They are helpful for looking at unique situations and allow researchers to gather a lot of information about a specific individual or group of people. However, it's important to be cautious about the conclusions we draw from them, as they are highly subjective.

What Are the Benefits and Limitations of Case Studies?

A case study can have its strengths and weaknesses. Researchers must consider these pros and cons before deciding if this type of study is appropriate for their needs.

One of the greatest advantages of a case study is that it allows researchers to investigate things that are often difficult or impossible to replicate in a lab. Some other benefits of a case study:

  • Allows researchers to capture information on the 'how,' 'what,' and 'why,' of something that's implemented
  • Gives researchers the chance to collect information on why one strategy might be chosen over another
  • Permits researchers to develop hypotheses that can be explored in experimental research

On the other hand, a case study can have some drawbacks:

  • It cannot necessarily be generalized to the larger population
  • It cannot demonstrate cause and effect
  • It may not be scientifically rigorous
  • It can lead to bias

Researchers may choose to perform a case study if they want to explore a unique or recently discovered phenomenon. Through their insights, researchers develop additional ideas and study questions that might be explored in future studies.

It's important to remember that the insights from case studies cannot be used to determine cause-and-effect relationships between variables. However, case studies may be used to develop hypotheses that can then be addressed in experimental research.

Case Study Examples

There have been a number of notable case studies in the history of psychology. Much of Freud's work and theories were developed through individual case studies. Some great examples of case studies in psychology include:

  • Anna O: Anna O. was a pseudonym of a woman named Bertha Pappenheim, a patient of a physician named Josef Breuer. While she was never a patient of Freud's, Freud and Breuer discussed her case extensively. The woman was experiencing symptoms of a condition that was then known as hysteria and found that talking about her problems helped relieve her symptoms. Her case played an important part in the development of talk therapy as an approach to mental health treatment.
  • Phineas Gage: Phineas Gage was a railroad employee who experienced a terrible accident in which an explosion sent a metal rod through his skull, damaging important portions of his brain. Gage recovered from his accident but was left with serious changes in both personality and behavior.
  • Genie: Genie was a young girl subjected to horrific abuse and isolation. The case study of Genie allowed researchers to study whether language learning was possible, even after missing critical periods for language development. Her case also served as an example of how scientific research may interfere with treatment and lead to further abuse of vulnerable individuals.

Such cases demonstrate how case research can be used to study things that researchers could not replicate in experimental settings. In Genie's case, her horrific abuse denied her the opportunity to learn a language at critical points in her development.

This is clearly not something researchers could ethically replicate, but conducting a case study on Genie allowed researchers to study phenomena that are otherwise impossible to reproduce.

There are a few different types of case studies that psychologists and other researchers might use:

  • Collective case studies: These involve studying a group of individuals. Researchers might study a group of people in a certain setting or look at an entire community. For example, psychologists might explore how access to resources in a community has affected the collective mental well-being of those who live there.
  • Descriptive case studies: These involve starting with a descriptive theory. The subjects are then observed, and the information gathered is compared to the pre-existing theory.
  • Explanatory case studies: These are often used to do causal investigations. In other words, researchers are interested in looking at factors that may have caused certain things to occur.
  • Exploratory case studies: These are sometimes used as a prelude to further, more in-depth research. This allows researchers to gather more information before developing their research questions and hypotheses.
  • Instrumental case studies: These occur when the individual or group allows researchers to understand more than what is initially obvious to observers.
  • Intrinsic case studies: This type of case study is when the researcher has a personal interest in the case. Jean Piaget's observations of his own children are good examples of how an intrinsic case study can contribute to the development of a psychological theory.

The three main case study types often used are intrinsic, instrumental, and collective. Intrinsic case studies are useful for learning about unique cases. Instrumental case studies help look at an individual to learn more about a broader issue. A collective case study can be useful for looking at several cases simultaneously.

The type of case study that psychology researchers use depends on the unique characteristics of the situation and the case itself.

There are a number of different sources and methods that researchers can use to gather information about an individual or group. Six major sources that have been identified by researchers are:

  • Archival records: Census records, survey records, and name lists are examples of archival records.
  • Direct observation: This strategy involves observing the subject, often in a natural setting. While an individual observer is sometimes used, it is more common to utilize a group of observers.
  • Documents: Letters, newspaper articles, administrative records, etc., are the types of documents often used as sources.
  • Interviews: Interviews are one of the most important methods for gathering information in case studies. An interview can involve structured survey questions or more open-ended questions.
  • Participant observation: When the researcher serves as a participant in events and observes the actions and outcomes, it is called participant observation.
  • Physical artifacts: Tools, objects, instruments, and other artifacts are often observed during a direct observation of the subject.

If you have been directed to write a case study for a psychology course, be sure to check with your instructor for any specific guidelines you need to follow. If you are writing your case study for a professional publication, check with the publisher for their specific guidelines for submitting a case study.

Here is a general outline of what should be included in a case study.

Section 1: A Case History

This section will have the following structure and content:

Background information : The first section of your paper will present your client's background. Include factors such as age, gender, work, health status, family mental health history, family and social relationships, drug and alcohol history, life difficulties, goals, and coping skills and weaknesses.

Description of the presenting problem : In the next section of your case study, you will describe the problem or symptoms that the client presented with.

Describe any physical, emotional, or sensory symptoms reported by the client. Thoughts, feelings, and perceptions related to the symptoms should also be noted. Any screening or diagnostic assessments that are used should also be described in detail and all scores reported.

Your diagnosis : Provide your diagnosis and give the appropriate Diagnostic and Statistical Manual code. Explain how you reached your diagnosis, how the client's symptoms fit the diagnostic criteria for the disorder(s), or any possible difficulties in reaching a diagnosis.

Section 2: Treatment Plan

This portion of the paper will address the chosen treatment for the condition. This might also include the theoretical basis for the chosen treatment or any other evidence that might exist to support why this approach was chosen.

  • Cognitive behavioral approach: Explain how a cognitive behavioral therapist would approach treatment. Offer background information on cognitive behavioral therapy and describe the treatment sessions, client response, and outcome of this type of treatment. Make note of any difficulties or successes encountered by your client during treatment.
  • Humanistic approach: Describe a humanistic approach that could be used to treat your client, such as client-centered therapy. Provide information on the type of treatment you chose, the client's reaction to the treatment, and the end result of this approach. Explain why the treatment was successful or unsuccessful.
  • Psychoanalytic approach: Describe how a psychoanalytic therapist would view the client's problem. Provide some background on the psychoanalytic approach and cite relevant references. Explain how psychoanalytic therapy would be used to treat the client, how the client would respond to therapy, and the effectiveness of this treatment approach.
  • Pharmacological approach: If treatment primarily involves the use of medications, explain which medications were used and why. Provide background on the effectiveness of these medications and how monotherapy may compare with an approach that combines medications with therapy or other treatments.

This section of a case study should also include information about the treatment goals, process, and outcomes.

When you are writing a case study, you should also include a section where you discuss the case study itself, including the strengths and limitations of the study. You should note how the findings of your case study might support previous research.

In your discussion section, you should also describe some of the implications of your case study. What ideas or findings might require further exploration? How might researchers go about exploring some of these questions in additional studies?

Need More Tips?

Here are a few additional pointers to keep in mind when formatting your case study:

  • Never refer to the subject of your case study as "the client." Instead, use their name or a pseudonym.
  • Read examples of case studies to gain an idea about the style and format.
  • Remember to use APA format when citing references.

Crowe S, Cresswell K, Robertson A, Huby G, Avery A, Sheikh A. The case study approach . BMC Med Res Methodol . 2011 Jun 27;11:100. doi:10.1186/1471-2288-11-100

Gagnon, Yves-Chantal.  The Case Study as Research Method: A Practical Handbook . Canada, Chicago Review Press Incorporated DBA Independent Pub Group, 2010.

Yin, Robert K. Case Study Research and Applications: Design and Methods . United States, SAGE Publications, 2017.

By Kendra Cherry, MSEd Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

Case Study Research Method in Psychology

Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

Case studies are in-depth investigations of a person, group, event, or community. Typically, data is gathered from various sources using several methods (e.g., observations & interviews).

The case study research method originated in clinical medicine (the case history, i.e., the patient’s personal history). In psychology, case studies are often confined to the study of a particular individual.

The information is mainly biographical and relates to events in the individual’s past (i.e., retrospective), as well as to significant events that are currently occurring in his or her everyday life.

The case study is not itself a research method, but researchers select methods of data collection and analysis that will generate material suitable for case studies.

Freud (1909a, 1909b) conducted very detailed investigations into the private lives of his patients in an attempt to both understand and help them overcome their illnesses.

This makes it clear that the case study is a method that should only be used by a psychologist, therapist, or psychiatrist, i.e., someone with a professional qualification.

There is an ethical issue of competence. Only someone qualified to diagnose and treat a person can conduct a formal case study relating to atypical (i.e., abnormal) behavior or atypical development.

 Famous Case Studies

  • Anna O – One of the most famous case studies, documenting psychoanalyst Josef Breuer’s treatment of “Anna O” (real name Bertha Pappenheim) for hysteria in the late 1800s using early psychoanalytic theory.
  • Little Hans – A child psychoanalysis case study published by Sigmund Freud in 1909 analyzing his five-year-old patient Herbert Graf’s horse phobia as related to the Oedipus complex.
  • Bruce/Brenda – Gender identity case of the boy (Bruce) whose botched circumcision led psychologist John Money to advise gender reassignment and raise him as a girl (Brenda) in the 1960s.
  • Genie Wiley – Linguistics/psychological development case of the victim of extreme isolation abuse who was studied in 1970s California for effects of early language deprivation on acquiring speech later in life.
  • Phineas Gage – One of the most famous neuropsychology case studies analyzes personality changes in railroad worker Phineas Gage after an 1848 brain injury involving a tamping iron piercing his skull.

Clinical Case Studies

  • Studying the effectiveness of psychotherapy approaches with an individual patient
  • Assessing and treating mental illnesses like depression, anxiety disorders, PTSD
  • Neuropsychological cases investigating brain injuries or disorders

Child Psychology Case Studies

  • Studying psychological development from birth through adolescence
  • Cases of learning disabilities, autism spectrum disorders, ADHD
  • Effects of trauma, abuse, deprivation on development

Types of Case Studies

  • Explanatory case studies: Used to explore causation in order to find underlying principles. Helpful for doing qualitative analysis to explain presumed causal links.
  • Exploratory case studies: Used to explore situations where an intervention being evaluated has no clear set of outcomes. It helps define questions and hypotheses for future research.
  • Descriptive case studies: Describe an intervention or phenomenon and the real-life context in which it occurred. It is helpful for illustrating certain topics within an evaluation.
  • Multiple-case studies: Used to explore differences between cases and replicate findings across cases. Helpful for comparing and contrasting specific cases.
  • Intrinsic: Used to gain a better understanding of a particular case. Helpful for capturing the complexity of a single case.
  • Collective: Used to explore a general phenomenon using multiple case studies. Helpful for jointly studying a group of cases in order to inquire into the phenomenon.

Where Do You Find Data for a Case Study?

There are several places to find data for a case study. The key is to gather data from multiple sources to get a complete picture of the case and corroborate facts or findings through triangulation of evidence. Most of this information is likely qualitative (i.e., verbal description rather than measurement), but the psychologist might also collect numerical data.

1. Primary sources

  • Interviews – Interviewing key people related to the case to get their perspectives and insights. The interview is an extremely effective procedure for obtaining information about an individual, and it may be used to collect comments from the person’s friends, parents, employer, workmates, and others who have a good knowledge of the person, as well as to obtain facts from the person him or herself.
  • Observations – Observing behaviors, interactions, processes, etc., related to the case as they unfold in real-time.
  • Documents & Records – Reviewing private documents, diaries, public records, correspondence, meeting minutes, etc., relevant to the case.

2. Secondary sources

  • News/Media – News coverage of events related to the case study.
  • Academic articles – Journal articles, dissertations etc. that discuss the case.
  • Government reports – Official data and records related to the case context.
  • Books/films – Books, documentaries or films discussing the case.

3. Archival records

Searching historical archives, museum collections and databases to find relevant documents, visual/audio records related to the case history and context.

Public archives such as newspapers, organizational records, and photographic collections could all include potentially relevant information that sheds light on attitudes, cultural perspectives, common practices, and historical contexts related to psychology.

4. Organizational records

Organizational records offer the advantage of often having large datasets collected over time that can reveal or confirm psychological insights.

Of course, privacy and ethical concerns regarding confidential data must be navigated carefully.

However, with proper protocols, organizational records can provide invaluable context and empirical depth to qualitative case studies exploring the intersection of psychology and organizations.

  • Organizational/industrial psychology research: Organizational records like employee surveys, turnover/retention data, policies, incident reports, etc. may provide insight into topics like job satisfaction, workplace culture and dynamics, leadership issues, employee behaviors, etc.
  • Clinical psychology: Therapists/hospitals may grant access to anonymized medical records to study aspects like assessments, diagnoses, treatment plans, etc. This could shed light on clinical practices.
  • School psychology: Studies could utilize anonymized student records like test scores, grades, disciplinary issues, and counseling referrals to study child development, learning barriers, effectiveness of support programs, and more.

How do I Write a Case Study in Psychology?

Follow specified case study guidelines provided by a journal or your psychology tutor. General components of clinical case studies include: background, symptoms, assessments, diagnosis, treatment, and outcomes. Interpreting the information means the researcher decides what to include or leave out. A good case study should always clarify which information is the factual description and which is an inference or the researcher’s opinion.

1. Introduction

  • Provide background on the case context and why it is of interest, presenting background information like demographics, relevant history, and presenting problem.
  • Compare briefly to similar published cases if applicable. Clearly state the focus/importance of the case.

2. Case Presentation

  • Describe the presenting problem in detail, including symptoms, duration, and impact on daily life.
  • Include client demographics like age and gender, information about social relationships, and mental health history.
  • Describe all physical, emotional, and/or sensory symptoms reported by the client.
  • Use patient quotes to describe the initial complaint verbatim. Follow with full-sentence summaries of relevant history details gathered, including key components that led to a working diagnosis.
  • Summarize clinical exam results, namely orthopedic/neurological tests, imaging, lab tests, etc. Note actual results rather than subjective conclusions. Provide images if clearly reproducible/anonymized.
  • Clearly state the working diagnosis or clinical impression before transitioning to management.

3. Management and Outcome

  • Indicate the total duration of care and number of treatments given over what timeframe. Use specific names/descriptions for any therapies/interventions applied.
  • Present the results of the intervention, including any quantitative or qualitative data collected.
  • For outcomes, utilize visual analog scales for pain, medication usage logs, etc., if possible. Include patient self-reports of improvement/worsening of symptoms. Note the reason for discharge/end of care.

4. Discussion

  • Analyze the case, exploring contributing factors, limitations of the study, and connections to existing research.
  • Analyze the effectiveness of the intervention, considering factors like participant adherence, limitations of the study, and potential alternative explanations for the results.
  • Identify any questions raised in the case analysis and relate insights to established theories and current research if applicable. Avoid definitive claims about physiological explanations.
  • Offer clinical implications, and suggest future research directions.

5. Additional Items

  • Thank specific assistants for writing support only. No patient acknowledgments.
  • References should directly support any key claims or quotes included.
  • Use tables/figures/images only if substantially informative. Include permissions and legends/explanatory notes.

Strengths

  • Provides detailed (rich qualitative) information.
  • Provides insight for further research.
  • Permits investigation of otherwise impractical (or unethical) situations.

Case studies allow a researcher to investigate a topic in far more detail than might be possible if they were trying to deal with a large number of research participants (nomothetic approach) with the aim of ‘averaging’.

Because of their in-depth, multi-sided approach, case studies often shed light on aspects of human thinking and behavior that would be unethical or impractical to study in other ways.

Research that only looks into the measurable aspects of human behavior is not likely to give us insights into the subjective dimension of experience, which is important to psychoanalytic and humanistic psychologists.

Case studies are often used in exploratory research. They can help us generate new ideas (that might be tested by other methods). They are an important way of illustrating theories and can help show how different aspects of a person’s life are related to each other.

The method is, therefore, important for psychologists who adopt a holistic point of view (i.e., humanistic psychologists).

Limitations

  • Lacks scientific rigor and provides little basis for generalization of results to the wider population.
  • Researchers’ own subjective feelings may influence the case study (researcher bias).
  • Difficult to replicate.
  • Time-consuming and expensive.
  • The volume of data, together with time restrictions, can limit the depth of analysis that is possible within the available resources.

Because a case study deals with only one person/event/group, we can never be sure if the case study investigated is representative of the wider body of “similar” instances. This means the conclusions drawn from a particular case may not be transferable to other settings.

Because case studies are based on the analysis of qualitative (i.e., descriptive) data, a lot depends on the psychologist’s interpretation of the information she has acquired.

This means that there is a lot of scope for observer bias, and it could be that the subjective opinions of the psychologist intrude in the assessment of what the data means.

For example, Freud has been criticized for producing case studies in which the information was sometimes distorted to fit particular behavioral theories (e.g., Little Hans ).

This is also true of Money’s interpretation of the Bruce/Brenda case study (Diamond & Sigmundson, 1997), in which he ignored evidence that went against his theory.

Breuer, J., & Freud, S. (1895). Studies on hysteria. Standard Edition 2: London.

Curtiss, S. (1981). Genie: The case of a modern wild child.

Diamond, M., & Sigmundson, K. (1997). Sex Reassignment at Birth: Long-term Review and Clinical Implications. Archives of Pediatrics & Adolescent Medicine, 151(3), 298-304.

Freud, S. (1909a). Analysis of a phobia of a five year old boy. In The Pelican Freud Library (1977), Vol 8, Case Histories 1, pages 169-306.

Freud, S. (1909b). Bemerkungen über einen Fall von Zwangsneurose (Der “Rattenmann”). Jb. psychoanal. psychopathol. Forsch., I, p. 357-421; GW, VII, p. 379-463; Notes upon a case of obsessional neurosis, SE, 10: 151-318.

Harlow, J. M. (1848). Passage of an iron rod through the head. Boston Medical and Surgical Journal, 39, 389–393.

Harlow, J. M. (1868). Recovery from the Passage of an Iron Bar through the Head. Publications of the Massachusetts Medical Society, 2(3), 327-347.

Money, J., & Ehrhardt, A. A. (1972). Man & Woman, Boy & Girl: The Differentiation and Dimorphism of Gender Identity from Conception to Maturity. Baltimore, Maryland: Johns Hopkins University Press.

Money, J., & Tucker, P. (1975). Sexual signatures: On being a man or a woman.

Further Information

  • Case Study Approach
  • Case Study Method
  • Enhancing the Quality of Case Studies in Health Services Research
  • “We do things together” A case study of “couplehood” in dementia
  • Using mixed methods for evaluating an integrative approach to cancer care: a case study

The case study approach

Sarah Crowe

1 Division of Primary Care, The University of Nottingham, Nottingham, UK

Kathrin Cresswell

2 Centre for Population Health Sciences, The University of Edinburgh, Edinburgh, UK

Ann Robertson

3 School of Health in Social Science, The University of Edinburgh, Edinburgh, UK

Anthony Avery

Aziz Sheikh

The case study approach allows in-depth, multi-faceted explorations of complex issues in their real-life settings. The value of the case study approach is well recognised in the fields of business, law and policy, but somewhat less so in health services research. Based on our experiences of conducting several health-related case studies, we reflect on the different types of case study design, the specific research questions this approach can help answer, the data sources that tend to be used, and the particular advantages and disadvantages of employing this methodological approach. The paper concludes with key pointers to aid those designing and appraising proposals for conducting case study research, and a checklist to help readers assess the quality of case study reports.

Introduction

The case study approach is particularly useful to employ when there is a need to obtain an in-depth appreciation of an issue, event or phenomenon of interest, in its natural real-life context. Our aim in writing this piece is to provide insights into when to consider employing this approach and an overview of key methodological considerations in relation to the design, planning, analysis, interpretation and reporting of case studies.

The illustrative 'grand round', 'case report' and 'case series' have a long tradition in clinical practice and research. Presenting detailed critiques, typically of one or more patients, aims to provide insights into aspects of the clinical case and, in doing so, illustrate broader lessons that may be learnt. In research, the conceptually-related case study approach can be used, for example, to describe in detail a patient's episode of care, explore professional attitudes to and experiences of a new policy initiative or service development or more generally to 'investigate contemporary phenomena within its real-life context' [ 1 ]. Based on our experiences of conducting a range of case studies, we reflect on when to consider using this approach, discuss the key steps involved and illustrate, with examples, some of the practical challenges of attaining an in-depth understanding of a 'case' as an integrated whole. In keeping with previously published work, we acknowledge the importance of theory to underpin the design, selection, conduct and interpretation of case studies[ 2 ]. In so doing, we make passing reference to the different epistemological approaches used in case study research by key theoreticians and methodologists in this field of enquiry.

This paper is structured around the following main questions: What is a case study? What are case studies used for? How are case studies conducted? What are the potential pitfalls and how can these be avoided? We draw in particular on four of our own recently published examples of case studies (see Tables 1, 2, 3 and 4) and those of others to illustrate our discussion[ 3 - 7 ].

Table 1. Example of a case study investigating the reasons for differences in recruitment rates of minority ethnic people in asthma research[ 3 ]

Context: Minority ethnic people experience considerably greater morbidity from asthma than the White majority population. Research has shown however that these minority ethnic populations are likely to be under-represented in research undertaken in the UK; there is comparatively less marginalisation in the US.
Objective: To investigate approaches to bolster recruitment of South Asians into UK asthma studies through qualitative research with US and UK researchers, and UK community leaders.
Study design: Single intrinsic case study.
The case: Centred on the issue of recruitment of South Asian people with asthma.
Data collection: In-depth interviews were conducted with asthma researchers from the UK and US. A supplementary questionnaire was also provided to researchers.
Analysis: Framework approach.
Key findings: Barriers to ethnic minority recruitment were found to centre around:
 1. The attitudes of the researchers towards inclusion: The majority of UK researchers interviewed were generally supportive of the idea of recruiting ethnically diverse participants but expressed major concerns about the practicalities of achieving this; in contrast, the US researchers appeared much more committed to the policy of inclusion.
 2. Stereotypes and prejudices: We found that some of the UK researchers' perceptions of ethnic minorities may have influenced their decisions on whether to approach individuals from particular ethnic groups. These stereotypes centred on issues to do with, amongst others, language barriers and lack of altruism.
 3. Demographic, political and socioeconomic contexts of the two countries: Researchers suggested that the demographic profile of ethnic minorities, their political engagement and the different configuration of the health services in the UK and the US may have contributed to differential rates.
 4. Above all, however, it appeared that the overriding importance of the US National Institutes of Health's policy to mandate the inclusion of minority ethnic people (and women) had a major impact on shaping the attitudes and in turn the experiences of US researchers; the absence of any similar mandate in the UK meant that UK-based researchers had not been forced to challenge their existing practices and they were hence unable to overcome any stereotypical/prejudicial attitudes through experiential learning.

Table 2. Example of a case study investigating the process of planning and implementing a service in Primary Care Organisations[ 4 ]

Context: Health work forces globally are needing to reorganise and reconfigure in order to meet the challenges posed by the increased numbers of people living with long-term conditions in an efficient and sustainable manner. Through studying the introduction of General Practitioners with a Special Interest in respiratory disorders, this study aimed to provide insights into this important issue by focusing on community respiratory service development.
Objective: To understand and compare the process of workforce change in respiratory services and the impact on patient experience (specifically in relation to the role of general practitioners with special interests) in a theoretically selected sample of Primary Care Organisations (PCOs), in order to derive models of good practice in planning and the implementation of a broad range of workforce issues.
Study design: Multiple-case design of respiratory services in health regions in England and Wales.
The cases: Four PCOs.
Data collection: Face-to-face and telephone interviews, e-mail discussions, local documents, patient diaries, news items identified from local and national websites, national workshop.
Analysis: Reading, coding and comparison progressed iteratively.
Key findings:
 1. In the screening phase of this study (which involved semi-structured telephone interviews with the person responsible for driving the reconfiguration of respiratory services in 30 PCOs), the barriers of financial deficit, organisational uncertainty, disengaged clinicians and contradictory policies proved insurmountable for many PCOs seeking to develop sustainable services. A key rationale for PCO re-organisation in 2006 was to strengthen their commissioning function and those of clinicians through Practice-Based Commissioning. However, the turbulence which surrounded reorganisation was found to have the opposite of the desired effect.
 2. Implementing workforce reconfiguration was strongly influenced by the negotiation and contest among local clinicians and managers about "ownership" of work and income.
 3. Despite the intention to make the commissioning system more transparent, personal relationships based on common professional interests, past work history, friendships and collegiality remained key drivers for sustainable innovation in service development.
Main limitations: It was only possible to undertake in-depth work in a selective number of PCOs and, even within these selected PCOs, it was not possible to interview all informants of potential interest and/or obtain all relevant documents. This work was conducted in the early stages of a major NHS reorganisation in England and Wales and thus, events are likely to have continued to evolve beyond the study period; we therefore cannot claim to have seen any of the stories through to their conclusion.

Table 3. Example of a case study investigating the introduction of electronic health records[ 5 ]

Context: Healthcare systems globally are moving from paper-based record systems to electronic health record systems. In 2002, the NHS in England embarked on the most ambitious and expensive IT-based transformation in healthcare in history, seeking to introduce electronic health records into all hospitals in England by 2010.
Objective: To describe and evaluate the implementation and adoption of detailed electronic health records in secondary care in England and thereby provide formative feedback for local and national rollout of the NHS Care Records Service.
Study design: A mixed methods, longitudinal, multi-site, socio-technical collective case study.
The case: Five NHS acute hospital and mental health Trusts that have been the focus of early implementation efforts.
Data collection: Semi-structured interviews, documentary data and field notes, observations and quantitative data.
Analysis: Qualitative data were analysed thematically using a socio-technical coding matrix, combined with additional themes that emerged from the data.
Key findings:
 1. Hospital electronic health record systems have developed and been implemented far more slowly than was originally envisioned.
 2. The top-down, government-led standardised approach needed to evolve to admit more variation and greater local choice for hospitals in order to support local service delivery.
 3. A range of adverse consequences were associated with the centrally negotiated contracts, which excluded the hospitals in question.
 4. The unrealistic, politically driven timeline (implementation over 10 years) was found to be a major source of frustration for developers, implementers and healthcare managers and professionals alike.
Main limitations: We were unable to access details of the contracts between government departments and the Local Service Providers responsible for delivering and implementing the software systems. This, in turn, made it difficult to develop a holistic understanding of some key issues impacting on the overall slow roll-out of the NHS Care Records Service. Early adopters may also have differed in important ways from NHS hospitals that planned to join the National Programme for Information Technology and implement the NHS Care Records Service at a later point in time.

Table 4. Example of a case study investigating the formal and informal ways students learn about patient safety[ 6 ]

Context: There is a need to reduce the disease burden associated with iatrogenic harm and, considering that healthcare education represents perhaps the most sustained patient safety initiative ever undertaken, it is important to develop a better appreciation of the ways in which undergraduate and newly qualified professionals receive and make sense of the education they receive.
Objective: To investigate the formal and informal ways pre-registration students from a range of healthcare professions (medicine, nursing, physiotherapy and pharmacy) learn about patient safety in order to become safe practitioners.
Study design: Multi-site, mixed method collective case study.
The case: Eight case studies (two for each professional group) were carried out in educational provider sites considering different programmes, practice environments and models of teaching and learning.
Data collection and analysis: Structured in phases relevant to the three knowledge contexts:
 Academic context: Documentary evidence (including undergraduate curricula, handbooks and module outlines), complemented with a range of views (from course leads, tutors and students) and observations in a range of academic settings.
 Organisational context: Policy and management views of patient safety and influences on patient safety education and practice. NHS policies included, for example, implementation of the National Patient Safety Agency's guidance, which encourages organisations to develop an organisational safety culture in which staff members feel comfortable identifying dangers and reporting hazards.
 Practice context: The cultures to which students are exposed, i.e. patient safety in relation to day-to-day working. NHS initiatives included, for example, a hand washing initiative or introduction of infection control measures.
Key findings:
 1. Practical, informal learning opportunities were valued by students. On the whole, however, students were neither exposed to nor engaged with important NHS initiatives such as risk management activities and incident reporting schemes.
 2. NHS policy appeared to have been taken seriously by course leaders. Patient safety materials were incorporated into both formal and informal curricula, albeit largely implicitly rather than explicitly.
 3. Resource issues and peer pressure were found to influence safe practice. Variations were also found to exist in students' experiences and the quality of the supervision available.
Main limitations: The curriculum and organisational documents collected differed between sites, which possibly reflected gatekeeper influences at each site. The recruitment of participants for focus group discussions proved difficult, so interviews or paired discussions were used as a substitute.

What is a case study?

A case study is a research approach that is used to generate an in-depth, multi-faceted understanding of a complex issue in its real-life context. It is an established research design that is used extensively in a wide variety of disciplines, particularly in the social sciences. A case study can be defined in a variety of ways (Table 5), the central tenet being the need to explore an event or phenomenon in depth and in its natural context. It is for this reason sometimes referred to as a "naturalistic" design; this is in contrast to an "experimental" design (such as a randomised controlled trial) in which the investigator seeks to exert control over and manipulate the variable(s) of interest.

Table 5. Definitions of a case study

Author (source of definition)
Stake (p. 237)
Yin (Yin 1999, p. 1211; Yin 1994, p. 13; Yin 2009, p. 18)
Miles and Huberman (p. 25)
Green and Thorogood (p. 284)
George and Bennett (p. 17)

Stake's work has been particularly influential in defining the case study approach to scientific enquiry. He has helpfully characterised three main types of case study: intrinsic, instrumental and collective[ 8 ]. An intrinsic case study is typically undertaken to learn about a unique phenomenon. The researcher should define the uniqueness of the phenomenon, which distinguishes it from all others. In contrast, the instrumental case study uses a particular case (some of which may be better than others) to gain a broader appreciation of an issue or phenomenon. The collective case study involves studying multiple cases simultaneously or sequentially in an attempt to generate a still broader appreciation of a particular issue.

These are, however, not necessarily mutually exclusive categories. In the first of our examples (Table 1), we undertook an intrinsic case study to investigate the issue of recruitment of minority ethnic people into the specific context of asthma research studies, but it developed into an instrumental case study through seeking to understand the issue of recruitment of these marginalised populations more generally, generating a number of findings that are potentially transferable to other disease contexts[ 3 ]. In contrast, the other three examples (see Tables 2, 3 and 4) employed collective case study designs to study the introduction of workforce reconfiguration in primary care, the implementation of electronic health records into hospitals, and the ways in which healthcare students learn about patient safety considerations[ 4 - 6 ]. Although our study focusing on the introduction of General Practitioners with Specialist Interests (Table 2) was explicitly collective in design (four contrasting primary care organisations were studied), it was also instrumental in that this particular professional group was studied as an exemplar of the more general phenomenon of workforce redesign[ 4 ].

What are case studies used for?

According to Yin, case studies can be used to explain, describe or explore events or phenomena in the everyday contexts in which they occur[ 1 ]. Case studies can, for example, help to understand and explain causal links and pathways resulting from a new policy initiative or service development (see Tables 2 and 3, for example)[ 1 ]. In contrast to experimental designs, which seek to test a specific hypothesis through deliberately manipulating the environment (for example, in a randomised controlled trial, giving a new drug to randomly selected individuals and then comparing outcomes with controls)[ 9 ], the case study approach lends itself well to capturing information on more explanatory 'how', 'what' and 'why' questions, such as 'how is the intervention being implemented and received on the ground?'. The case study approach can offer additional insights into what gaps exist in its delivery or why one implementation strategy might be chosen over another. This in turn can help develop or refine theory, as shown in our study of the teaching of patient safety in undergraduate curricula (Table 4)[ 6 , 10 ]. Key questions to consider when selecting the most appropriate study design are whether it is desirable or indeed possible to undertake a formal experimental investigation in which individuals and/or organisations are allocated to an intervention or control arm, or whether the wish is to obtain a more naturalistic understanding of an issue. The former is ideally studied using a controlled experimental design, whereas the latter is more appropriately studied using a case study design.

Case studies may be approached in different ways depending on the epistemological standpoint of the researcher, that is, whether they take a critical (questioning one's own and others' assumptions), interpretivist (trying to understand individual and shared social meanings) or positivist approach (orientating towards the criteria of natural sciences, such as focusing on generalisability considerations) (Table 6). Whilst such a schema can be conceptually helpful, it may be appropriate to draw on more than one approach in any case study, particularly in the context of conducting health services research. Doolin has, for example, noted that in the context of undertaking interpretative case studies, researchers can usefully draw on a critical, reflective perspective which seeks to take into account the wider social and political environment that has shaped the case[ 11 ].

Table 6. Example of epistemological approaches that may be used in case study research

Approach: Critical
Characteristics: Involves questioning one's own assumptions, taking into account the wider political and social environment. Interprets the limiting conditions in relation to power and control that are thought to influence behaviour.
Criticisms: It can possibly neglect other factors by focussing only on power relationships and may give the researcher a position that is too privileged.
Key references: Howcroft and Trauth; Blakie; Doolin; Bloomfield and Best

Approach: Interpretivist
Characteristics: Involves understanding meanings/contexts and processes as perceived from different perspectives, trying to understand individual and shared social meanings. Focus is on theory building.
Criticisms: It is often difficult to explain unintended consequences, and the approach can neglect surrounding historical contexts.
Key references: Stake; Doolin

Approach: Positivist
Characteristics: Involves establishing which variables one wishes to study in advance and seeing whether they fit in with the findings. Focus is often on testing and refining theory on the basis of case study findings.
Criticisms: It does not take into account the role of the researcher in influencing findings.
Key references: Yin; Shanks and Parr

How are case studies conducted?

Here, we focus on the main stages of research activity when planning and undertaking a case study; the crucial stages are: defining the case; selecting the case(s); collecting and analysing the data; interpreting data; and reporting the findings.

Defining the case

Carefully formulated research question(s), informed by the existing literature and a prior appreciation of the theoretical issues and setting(s), are all important in appropriately and succinctly defining the case[ 8 , 12 ]. Crucially, each case should have a pre-defined boundary which clarifies the nature and time period covered by the case study (i.e. its scope, beginning and end), the relevant social group, organisation or geographical area of interest to the investigator, the types of evidence to be collected, and the priorities for data collection and analysis (see Table 7)[ 1 ]. A theory driven approach to defining the case may help generate knowledge that is potentially transferable to a range of clinical contexts and behaviours; using theory is also likely to result in a more informed appreciation of, for example, how and why interventions have succeeded or failed[ 13 ].

Table 7. Example of a checklist for rating a case study proposal[ 8 ]

Clarity: Does the proposal read well?
Integrity: Do its pieces fit together?
Attractiveness: Does it pique the reader's interest?
The case: Is the case adequately defined?
The issues: Are major research questions identified?
Data Resource: Are sufficient data sources identified?
Case Selection: Is the selection plan reasonable?
Data Gathering: Are data-gathering activities outlined?
Validation: Is the need and opportunity for triangulation indicated?
Access: Are arrangements for start-up anticipated?
Confidentiality: Is there sensitivity to the protection of people?
Cost: Are time and resource estimates reasonable?

For example, in our evaluation of the introduction of electronic health records in English hospitals (Table 3), we defined our cases as the NHS Trusts that were receiving the new technology[ 5 ]. Our focus was on how the technology was being implemented. However, if the primary research interest had been on the social and organisational dimensions of implementation, we might have defined our case differently as a grouping of healthcare professionals (e.g. doctors and/or nurses). The precise beginning and end of the case may however prove difficult to define. Pursuing this same example, when does the process of implementation and adoption of an electronic health record system really begin or end? Such judgements will inevitably be influenced by a range of factors, including the research question, theory of interest, the scope and richness of the gathered data and the resources available to the research team.

Selecting the case(s)

The decision on how to select the case(s) to study is a very important one that merits some reflection. In an intrinsic case study, the case is selected on its own merits[ 8 ]. The case is selected not because it is representative of other cases, but because of its uniqueness, which is of genuine interest to the researchers. This was, for example, the case in our study of the recruitment of minority ethnic participants into asthma research (Table 1) as our earlier work had demonstrated the marginalisation of minority ethnic people with asthma, despite evidence of disproportionate asthma morbidity[ 14 , 15 ]. In another example of an intrinsic case study, Hellstrom et al.[ 16 ] studied an elderly married couple living with dementia to explore how dementia had impacted on their understanding of home, their everyday life and their relationships.

For an instrumental case study, selecting a "typical" case can work well[ 8 ]. In contrast to the intrinsic case study, the particular case which is chosen is of less importance than selecting a case that allows the researcher to investigate an issue or phenomenon. For example, in order to gain an understanding of doctors' responses to health policy initiatives, Som undertook an instrumental case study interviewing clinicians who had a range of responsibilities for clinical governance in one NHS acute hospital trust[ 17 ]. Sampling a "deviant" or "atypical" case may however prove even more informative, potentially enabling the researcher to identify causal processes, generate hypotheses and develop theory.

In collective or multiple case studies, a number of cases are carefully selected. This offers the advantage of allowing comparisons to be made across several cases and/or replication. Choosing a "typical" case may enable the findings to be generalised to theory (i.e. analytical generalisation) or to test theory by replicating the findings in a second or even a third case (i.e. replication logic)[ 1 ]. Yin suggests two or three literal replications (i.e. predicting similar results) if the theory is straightforward and five or more if the theory is more subtle. However, critics might argue that selecting 'cases' in this way is insufficiently reflexive and ill-suited to the complexities of contemporary healthcare organisations.

The selected case study site(s) should allow the research team access to the group of individuals, the organisation, the processes or whatever else constitutes the chosen unit of analysis for the study. Access is therefore a central consideration; the researcher needs to come to know the case study site(s) well and to work cooperatively with them. Selected cases need to be not only interesting but also hospitable to the inquiry[ 8 ] if they are to be informative and answer the research question(s). Case study sites may also be pre-selected for the researcher, with decisions being influenced by key stakeholders. For example, our selection of case study sites in the evaluation of the implementation and adoption of electronic health record systems (see Table 3) was heavily influenced by NHS Connecting for Health, the government agency that was responsible for overseeing the National Programme for Information Technology (NPfIT)[ 5 ]. This prominent stakeholder had already selected the NHS sites (through a competitive bidding process) to be early adopters of the electronic health record systems and had negotiated contracts that detailed the deployment timelines.

It is also important to consider in advance the likely burden and risks associated with participation for those who (or the site(s) which) comprise the case study. Of particular importance is the obligation for the researcher to think through the ethical implications of the study (e.g. the risk of inadvertently breaching anonymity or confidentiality) and to ensure that potential participants/participating sites are provided with sufficient information to make an informed choice about joining the study. The outcome of providing this information might be that the emotive burden associated with participation, or the organisational disruption associated with supporting the fieldwork, is considered so high that the individuals or sites decide against participation.

In our example of evaluating implementations of electronic health record systems, given the restricted number of early adopter sites available to us, we sought purposively to select a diverse range of implementation cases among those that were available[ 5 ]. We chose a mixture of teaching, non-teaching and Foundation Trust hospitals, and examples of each of the three electronic health record systems procured centrally by the NPfIT. At one recruited site, it quickly became apparent that access was problematic because of competing demands on that organisation. Recognising the importance of full access and co-operative working for generating rich data, the research team decided not to pursue work at that site and instead to focus on other recruited sites.

Collecting the data

In order to develop a thorough understanding of the case, the case study approach usually involves the collection of multiple sources of evidence, using a range of quantitative (e.g. questionnaires, audits and analysis of routinely collected healthcare data) and more commonly qualitative techniques (e.g. interviews, focus groups and observations). The use of multiple sources of data (data triangulation) has been advocated as a way of increasing the internal validity of a study (i.e. the extent to which the method is appropriate to answer the research question)[ 8 , 18 - 21 ]. An underlying assumption is that data collected in different ways should lead to similar conclusions, and approaching the same issue from different angles can help develop a holistic picture of the phenomenon (Table 2)[ 4 ].

Brazier and colleagues used a mixed-methods case study approach to investigate the impact of a cancer care programme[ 22 ]. Here, quantitative measures were collected with questionnaires before, and five months after, the start of the intervention; these did not yield any statistically significant results. Qualitative interviews with patients, however, helped provide an insight into potentially beneficial process-related aspects of the programme, such as greater perceived patient involvement in care. The authors reported how this case study approach identified a number of contextual factors likely to influence the effectiveness of the intervention, which were not likely to have been obtained from quantitative methods alone.

In collective or multiple case studies, data collection needs to be flexible enough to allow a detailed description of each individual case to be developed (e.g. the nature of different cancer care programmes), before considering the emerging similarities and differences in cross-case comparisons (e.g. to explore why one programme is more effective than another). It is important that data sources from different cases are, where possible, broadly comparable for this purpose even though they may vary in nature and depth.

Analysing, interpreting and reporting case studies

Making sense and offering a coherent interpretation of the typically disparate sources of data (whether qualitative alone or together with quantitative) is far from straightforward. Repeated reviewing and sorting of the voluminous and detail-rich data are integral to the process of analysis. In collective case studies, it is helpful to analyse data relating to the individual component cases first, before making comparisons across cases. Attention needs to be paid to variations within each case and, where relevant, the relationship between different causes, effects and outcomes[ 23 ]. Data will need to be organised and coded to allow the key issues, both derived from the literature and emerging from the dataset, to be easily retrieved at a later stage. An initial coding frame can help capture these issues and can be applied systematically to the whole dataset with the aid of a qualitative data analysis software package.
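As a rough illustration of what applying an initial coding frame systematically and retrieving coded material later might look like, the following Python sketch indexes text segments by code. Everything in it is hypothetical: the code labels, keywords, site names and sample sentences are invented for illustration, and in practice coding is an interpretive judgement made by the researcher, usually within a dedicated qualitative data analysis package, rather than a keyword match.

```python
from collections import defaultdict

# Hypothetical coding frame: code label -> indicative keywords (illustrative only).
CODING_FRAME = {
    "access": ["access", "gatekeeper"],
    "workload": ["time", "workload", "busy"],
}


def apply_coding_frame(segments, frame):
    """Index (case_id, text) segments under every code whose keywords they mention.

    Matching is a naive, case-insensitive substring test, used here only to
    show the retrieval structure; it stands in for the researcher's judgement.
    """
    index = defaultdict(list)
    for case_id, text in segments:
        lowered = text.lower()
        for code, keywords in frame.items():
            if any(word in lowered for word in keywords):
                index[code].append((case_id, text))
    return dict(index)


# Invented example segments from two hypothetical case sites.
segments = [
    ("Site A", "Access to the ward was negotiated through a gatekeeper."),
    ("Site B", "Staff said they had no time to attend training."),
    ("Site B", "The new system was welcomed by managers."),
]

coded = apply_coding_frame(segments, CODING_FRAME)

# Retrieve everything filed under one code, grouped by case for later comparison.
for case_id, text in coded.get("workload", []):
    print(case_id, "|", text)
```

Keeping the case identifier alongside each coded segment is what allows the within-case analysis to be done first and the cross-case comparison afterwards.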

The Framework approach is a practical approach, comprising five stages (familiarisation; identifying a thematic framework; indexing; charting; mapping and interpretation), to managing and analysing large datasets, particularly if time is limited, as was the case in our study of recruitment of South Asians into asthma research (Table 1)[ 3 , 24 ]. Theoretical frameworks may also play an important role in integrating different sources of data and examining emerging themes. For example, we drew on a socio-technical framework to help explain the connections between different elements (technology, people, and the organisational settings within which they worked) in our study of the introduction of electronic health record systems (Table 3)[ 5 ]. Our study of patient safety in undergraduate curricula drew on an evaluation-based approach to design and analysis, which emphasised the importance of the academic, organisational and practice contexts through which students learn (Table 4)[ 6 ].
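The charting stage of the Framework approach can be pictured as filling a case-by-theme matrix with summaries of indexed data. The Python sketch below is a minimal illustration under stated assumptions: the theme names echo the socio-technical elements mentioned above, but the case labels and summaries are invented, and the function name `chart` is hypothetical rather than part of any published tool.

```python
def chart(indexed_data, cases, themes):
    """Build a case-by-theme matrix; each cell lists the summaries charted there."""
    matrix = {case: {theme: [] for theme in themes} for case in cases}
    for case, theme, summary in indexed_data:
        matrix[case][theme].append(summary)
    return matrix


# Themes taken from the socio-technical elements named in the text.
themes = ["technology", "people", "organisation"]
# Hypothetical case labels and invented summaries, for illustration only.
cases = ["Trust 1", "Trust 2"]
indexed_data = [
    ("Trust 1", "technology", "Software delivered later than planned"),
    ("Trust 2", "people", "Clinicians developed workarounds"),
    ("Trust 1", "organisation", "Limited local choice over configuration"),
]

matrix = chart(indexed_data, cases, themes)

# Empty cells show where a case has no charted data for a theme, which can
# prompt a return to the raw data before mapping and interpretation.
empty_cells = [(c, t) for c in cases for t in themes if not matrix[c][t]]

for case in cases:
    for theme in themes:
        print(f"{case} / {theme}: {'; '.join(matrix[case][theme]) or '(none)'}")
```

Reading down a column of such a matrix supports comparison across cases, while reading along a row keeps each case's account intact.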

Case study findings can have implications both for theory development and theory testing. They may establish, strengthen or weaken historical explanations of a case and, in certain circumstances, allow theoretical (as opposed to statistical) generalisation beyond the particular cases studied[ 12 ]. These theoretical lenses should not, however, constitute a strait-jacket and the cases should not be "forced to fit" the particular theoretical framework that is being employed.

When reporting findings, it is important to provide the reader with enough contextual information to understand the processes that were followed and how the conclusions were reached. In a collective case study, researchers may choose to present the findings from individual cases separately before amalgamating across cases. Care must be taken to ensure the anonymity of both case sites and individual participants (if agreed in advance) by allocating appropriate codes or withholding descriptors. In the example given in Table 3, we decided against providing detailed information on the NHS sites and individual participants in order to avoid the risk of inadvertent disclosure of identities[ 5 , 25 ].

What are the potential pitfalls and how can these be avoided?

The case study approach is, as with all research, not without its limitations. When investigating the formal and informal ways undergraduate students learn about patient safety (Table 4), for example, we rapidly accumulated a large quantity of data. The volume of data, together with the time restrictions in place, impacted on the depth of analysis that was possible within the available resources. This highlights a more general point of the importance of avoiding the temptation to collect as much data as possible; adequate time also needs to be set aside for data analysis and interpretation of what are often highly complex datasets.

Case study research has sometimes been criticised for lacking scientific rigour and providing little basis for generalisation (i.e. producing findings that may be transferable to other settings)[ 1 ]. There are several ways to address these concerns, including: the use of theoretical sampling (i.e. drawing on a particular conceptual framework); respondent validation (i.e. participants checking emerging findings and the researcher's interpretation, and providing an opinion as to whether they feel these are accurate); and transparency throughout the research process (see Table 8)[ 8 , 18 - 21 , 23 , 26 ]. Transparency can be achieved by describing in detail the steps involved in case selection, data collection, the reasons for the particular methods chosen, and the researcher's background and level of involvement (i.e. being explicit about how the researcher has influenced data collection and interpretation). Seeking potential alternative explanations, and being explicit about how interpretations and conclusions were reached, help readers to judge the trustworthiness of the case study report. Stake provides a critique checklist for a case study report (Table 9)[ 8 ].

Table 8. Potential pitfalls and mitigating actions when undertaking case study research

Potential pitfall: Mitigating action
Selecting/conceptualising the wrong case(s) resulting in lack of theoretical generalisations: Developing in-depth knowledge of theoretical and empirical literature, justifying choices made
Collecting large volumes of data that are not relevant to the case or too little to be of any value: Focus data collection in line with research questions, whilst being flexible and allowing different paths to be explored
Defining/bounding the case: Focus on related components (either by time and/or space), be clear what is outside the scope of the case
Lack of rigour: Triangulation, respondent validation, the use of theoretical sampling, transparency throughout the research process
Ethical issues: Anonymise appropriately as cases are often easily identifiable to insiders, informed consent of participants
Integration with theoretical framework: Allow for unexpected issues to emerge and do not force fit, test out preliminary explanations, be clear about epistemological positions in advance

Table 9. Stake's checklist for assessing the quality of a case study report[ 8 ]

1. Is this report easy to read?
2. Does it fit together, each sentence contributing to the whole?
3. Does this report have a conceptual structure (i.e. themes or issues)?
4. Are its issues developed in a serious and scholarly way?
5. Is the case adequately defined?
6. Is there a sense of story to the presentation?
7. Is the reader provided some vicarious experience?
8. Have quotations been used effectively?
9. Are headings, figures, artefacts, appendices, indexes effectively used?
10. Was it edited well, then again with a last minute polish?
11. Has the writer made sound assertions, neither over- nor under-interpreting?
12. Has adequate attention been paid to various contexts?
13. Were sufficient raw data presented?
14. Were data sources well chosen and in sufficient number?
15. Do observations and interpretations appear to have been triangulated?
16. Is the role and point of view of the researcher nicely apparent?
17. Is the nature of the intended audience apparent?
18. Is empathy shown for all sides?
19. Are personal intentions examined?
20. Does it appear individuals were put at risk?

Conclusions

The case study approach allows, amongst other things, critical events, interventions, policy developments and programme-based service reforms to be studied in detail in a real-life context. It should therefore be considered when an experimental design is either inappropriate to answer the research questions posed or impossible to undertake. Considering the frequency with which implementations of innovations are now taking place in healthcare settings and how well the case study approach lends itself to in-depth, complex health service research, we believe this approach should be more widely considered by researchers. Though inherently challenging, the research case study can, if carefully conceptualised and thoughtfully undertaken and reported, yield powerful insights into many important aspects of health and healthcare delivery.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

AS conceived this article. SC, KC and AR wrote this paper with GH, AA and AS all commenting on various drafts. SC and AS are guarantors.

Pre-publication history

The pre-publication history for this paper can be accessed here:

http://www.biomedcentral.com/1471-2288/11/100/prepub

Acknowledgements

We are grateful to the participants and colleagues who contributed to the individual case studies that we have drawn on. This work received no direct funding, but it has been informed by projects funded by Asthma UK, the NHS Service Delivery Organisation, NHS Connecting for Health Evaluation Programme, and Patient Safety Research Portfolio. We would also like to thank the expert reviewers for their insightful and constructive feedback. Our thanks are also due to Dr. Allison Worth who commented on an earlier draft of this manuscript.

  • Yin RK. Case study research, design and method. 4. London: Sage Publications Ltd.; 2009. [ Google Scholar ]
  • Keen J, Packwood T. Qualitative research; case study evaluation. BMJ. 1995; 311 :444–446. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Sheikh A, Halani L, Bhopal R, Netuveli G, Partridge M, Car J. et al. Facilitating the Recruitment of Minority Ethnic People into Research: Qualitative Case Study of South Asians and Asthma. PLoS Med. 2009; 6 (10):1–11. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Pinnock H, Huby G, Powell A, Kielmann T, Price D, Williams S, The process of planning, development and implementation of a General Practitioner with a Special Interest service in Primary Care Organisations in England and Wales: a comparative prospective case study. Report for the National Co-ordinating Centre for NHS Service Delivery and Organisation R&D (NCCSDO) 2008. http://www.sdo.nihr.ac.uk/files/project/99-final-report.pdf
  • Robertson A, Cresswell K, Takian A, Petrakaki D, Crowe S, Cornford T. et al. Prospective evaluation of the implementation and adoption of NHS Connecting for Health's national electronic health record in secondary care in England: interim findings. BMJ. 2010; 41 :c4564. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Pearson P, Steven A, Howe A, Sheikh A, Ashcroft D, Smith P. the Patient Safety Education Study Group. Learning about patient safety: organisational context and culture in the education of healthcare professionals. J Health Serv Res Policy. 2010; 15 :4–10. doi: 10.1258/jhsrp.2009.009052. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • van Harten WH, Casparie TF, Fisscher OA. The evaluation of the introduction of a quality management system: a process-oriented case study in a large rehabilitation hospital. Health Policy. 2002; 60 (1):17–37. doi: 10.1016/S0168-8510(01)00187-7. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Stake RE. The art of case study research. London: Sage Publications Ltd.; 1995. [ Google Scholar ]
  • Sheikh A, Smeeth L, Ashcroft R. Randomised controlled trials in primary care: scope and application. Br J Gen Pract. 2002; 52 (482):746–51. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • King G, Keohane R, Verba S. Designing Social Inquiry. Princeton: Princeton University Press; 1996. [ Google Scholar ]
  • Doolin B. Information technology as disciplinary technology: being critical in interpretative research on information systems. Journal of Information Technology. 1998; 13 :301–311. doi: 10.1057/jit.1998.8. [ CrossRef ] [ Google Scholar ]
  • George AL, Bennett A. Case studies and theory development in the social sciences. Cambridge, MA: MIT Press; 2005. [ Google Scholar ]
  • Eccles M. the Improved Clinical Effectiveness through Behavioural Research Group (ICEBeRG) Designing theoretically-informed implementation interventions. Implementation Science. 2006; 1 :1–8. doi: 10.1186/1748-5908-1-1. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Netuveli G, Hurwitz B, Levy M, Fletcher M, Barnes G, Durham SR, Sheikh A. Ethnic variations in UK asthma frequency, morbidity, and health-service use: a systematic review and meta-analysis. Lancet. 2005; 365 (9456):312–7. [ PubMed ] [ Google Scholar ]
  • Sheikh A, Panesar SS, Lasserson T, Netuveli G. Recruitment of ethnic minorities to asthma studies. Thorax. 2004; 59 (7):634. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Hellström I, Nolan M, Lundh U. 'We do things together': A case study of 'couplehood' in dementia. Dementia. 2005; 4 :7–22. doi: 10.1177/1471301205049188. [ CrossRef ] [ Google Scholar ]
  • Som CV. Nothing seems to have changed, nothing seems to be changing and perhaps nothing will change in the NHS: doctors' response to clinical governance. International Journal of Public Sector Management. 2005; 18 :463–477. doi: 10.1108/09513550510608903. [ CrossRef ] [ Google Scholar ]
  • Lincoln Y, Guba E. Naturalistic inquiry. Newbury Park: Sage Publications; 1985. [ Google Scholar ]
  • Barbour RS. Checklists for improving rigour in qualitative research: a case of the tail wagging the dog? BMJ. 2001; 322 :1115–1117. doi: 10.1136/bmj.322.7294.1115. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Mays N, Pope C. Qualitative research in health care: Assessing quality in qualitative research. BMJ. 2000; 320 :50–52. doi: 10.1136/bmj.320.7226.50. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Mason J. Qualitative researching. London: Sage; 2002. [ Google Scholar ]
  • Brazier A, Cooke K, Moravan V. Using Mixed Methods for Evaluating an Integrative Approach to Cancer Care: A Case Study. Integr Cancer Ther. 2008; 7 :5–17. doi: 10.1177/1534735407313395. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Miles MB, Huberman M. Qualitative data analysis: an expanded sourcebook. 2. CA: Sage Publications Inc.; 1994. [ Google Scholar ]
  • Pope C, Ziebland S, Mays N. Analysing qualitative data. Qualitative research in health care. BMJ. 2000; 320 :114–116. doi: 10.1136/bmj.320.7227.114. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Cresswell KM, Worth A, Sheikh A. Actor-Network Theory and its role in understanding the implementation of information technology developments in healthcare. BMC Med Inform Decis Mak. 2010; 10 (1):67. doi: 10.1186/1472-6947-10-67. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Malterud K. Qualitative research: standards, challenges, and guidelines. Lancet. 2001; 358 :483–488. doi: 10.1016/S0140-6736(01)05627-6. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Yin R. Case study research: design and methods. 2. Thousand Oaks, CA: Sage Publishing; 1994. [ Google Scholar ]
  • Yin R. Enhancing the quality of case studies in health services research. Health Serv Res. 1999; 34 :1209–1224. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Green J, Thorogood N. Qualitative methods for health research. 2. Los Angeles: Sage; 2009. [ Google Scholar ]
  • Howcroft D, Trauth E. Handbook of Critical Information Systems Research, Theory and Application. Cheltenham, UK: Northampton, MA, USA: Edward Elgar; 2005. [ Google Scholar ]
  • Blakie N. Approaches to Social Enquiry. Cambridge: Polity Press; 1993. [ Google Scholar ]
  • Doolin B. Power and resistance in the implementation of a medical management information system. Info Systems J. 2004; 14 :343–362. doi: 10.1111/j.1365-2575.2004.00176.x. [ CrossRef ] [ Google Scholar ]
  • Bloomfield BP, Best A. Management consultants: systems development, power and the translation of problems. Sociological Review. 1992; 40 :533–560. [ Google Scholar ]
  • Shanks G, Parr A. Proceedings of the European Conference on Information Systems. Naples; 2003. Positivist, single case study research in information systems: A critical analysis. [ Google Scholar ]
  • Open access
  • Published: 05 June 2024

Experiences of medical students and faculty regarding the use of long case as a formative assessment method at a tertiary care teaching hospital in a low resource setting: a qualitative study

  • Jacob Kumakech 1 ,
  • Ian Guyton Munabi 2 ,
  • Aloysius Gonzaga Mubuuke 3 &
  • Sarah Kiguli 4  

BMC Medical Education volume 24, Article number: 621 (2024)

Introduction

The long case is used to assess medical students’ proficiency in performing clinical tasks. As a formative assessment, the purpose is to offer feedback on performance, aiming to enhance and expedite clinical learning. The long case stands out as one of the primary formative assessment methods for clinical clerkship in low-resource settings but has received little attention in the literature.

Objective

To explore the experiences of medical students and faculty regarding the use of the long case as a formative assessment method at a tertiary care teaching hospital in a low-resource setting.

Methodology

A qualitative study design was used. The study was conducted at Makerere University, a low-resource setting. The study participants were third- and fifth-year medical students as well as lecturers. Purposive sampling was utilized to recruit participants. Data collection comprised six Focus Group Discussions with students and five Key Informant Interviews with lecturers. The qualitative data were analyzed by inductive thematic analysis.

Results

Three themes emerged from the study: ward placement, case presentation, and case assessment and feedback. The findings revealed that students conduct their long cases at patients’ bedside within specific wards/units assigned for the entire clerkship. Effective supervision, feedback, and marks were highlighted as crucial practices that positively impact the learning process. However, challenges such as insufficient orientation to the long case, the super-specialization of the hospital wards, pressure to hunt for marks, and inadequate feedback practices were identified.

Conclusion

The long case offers students exposure to real patients in a clinical setting. However, in tertiary care teaching hospitals, it’s crucial to ensure proper design and implementation of this practice to enable students’ exposure to a variety of cases. Adequate and effective supervision and feedback create valuable opportunities for each learner to present cases and receive corrections.

The long case serves as an authentic assessment method for evaluating medical students’ competence in clinical tasks [ 1 ]. This form of assessment requires students to independently spend time with patients taking their medical history, conducting physical examinations, and formulating diagnosis and management plans. Subsequently, students present their findings to senior clinicians for discussion and questioning [ 2 , 3 ]. While developed countries increasingly adopt simulation-based assessments for formative evaluation, logistical challenges hinder the widespread use of such methods in developing countries [ 4 ]. Consequently, low-resource countries rely heavily on real patient encounters for formative assessment. The long case is one such method, predominantly used as a primary formative assessment method during clinical clerkship, and it offers a great opportunity for feedback [ 5 ]. The assessment grounds students’ learning in practice by providing them with rich opportunities to interact with patients and get a feel for medical practice. The long case thus bridges the gap between theory and practice, immersing students in the real tasks of a physician [ 1 ]. The complexity of clinical scenarios and the anxiety associated with patient encounters may not be well replicated in simulation-based assessments because diseases often have atypical presentations not found in textbooks. Assessment methods should thus utilize authentic learning experiences to provide learners with applications of learning that they would expect to encounter in real life [ 6 ]. This requires medical education and the curriculum to focus attention on assessment because it plays a significant role in driving learning [ 7 ]. The long case thus remains crucial in medical education as one of the best ways of preparing for practice. It exposes the student repeatedly to taking medical history, examining patients, making clinical judgments, deciding treatment plans, and collaborating with senior clinicians.

The long case, however, has faced significant criticism in the medical education literature due to perceived psychometric deficiencies [ 8 , 9 , 10 ]. Concerns over its low reliability, generalizability, and validity, coupled with rising litigation and student appeals [ 11 , 12 ], have led many universities to adopt assessment methods that yield more reliable and easily defensible results [ 2 ]. Despite these shortcomings, the long case remains an educationally valuable assessment tool that provides diagnostic feedback essential for the learning process during clinical clerkship [ 13 ]. Teachers can utilize long-case results to pinpoint neglected areas or teaching deficiencies and to align teaching with course outcomes.

However, there is a paucity of research into the long case as a formative assessment tool. A few studies conducted in developed countries highlighted its role in promoting a holistic approach to patient care, fostering students’ clinical skills, and serving as a driving force for students to spend time with patients [ 2 , 13 ]. There is a notable absence of literature on the use of the long case as a formative assessment method in low-resource countries, and no published work is available at Makerere University, where it has been used for decades. This underscores the importance of conducting research in this area to provide insight into its effectiveness, challenges, and potential for improvement. Therefore, this study aimed to investigate the experiences of medical students and faculty regarding the utilization of the long case as a formative assessment method within the context of a tertiary care teaching hospital in a low-resource setting.

Study design

This was an exploratory qualitative study.

Study setting

The research was conducted at Makerere University within the Department of Internal Medicine. The Bachelor of Medicine and Bachelor of Surgery (MBChB) degree at Makerere University is a five-year program, with the first two years devoted to pre-clinical (biomedical sciences) courses and the last three years dedicated to clinical clerkship. Medical students do Internal Medicine clerkships in the third and fifth years at the two tertiary teaching hospitals, namely Mulago and Kiruddu National Referral Hospitals. The students are introduced to the long case in the third year as Junior Clerks and encounter it again in the fifth year as Senior Clerks. During clerkship, students are assigned to various medical wards, where they interact with patients, take medical history from them, perform physical examinations, and develop diagnosis and management plans. Subsequently, students present their long cases to lecturers or postgraduate students, often in the presence of their peers, followed by feedback and comprehensive case discussions. Students are afforded ample time to prepare and present their cases during ward rounds, at their discretion. The students are formatively assessed, and a mark is awarded on a scale of one to ten in the student’s logbook. Each student is required to complete a minimum of ten long cases over the seven weeks of clerkship.

Study participants

The study participants were third- and fifth-year medical students who had completed junior and senior clerkship respectively, as well as lecturers who possessed at least five years of experience with the long case. The participants were selected through purposive sampling. The sample size for the study was determined by data saturation.

Data collection

Data were collected through Focus Group Discussions (FGDs) and Key Informant Interviews (KIIs). A total of 36 medical students participated in FGDs, reflecting on their experiences with the long case. Five faculty members participated in individual KIIs. The students were mobilized by their class representative and a brief recruitment presentation was made at the study site while the lecturers were approached via email and telephone invitation.

Six FGDs were conducted, three for junior clerks and three for senior clerks. Each FGD comprised 5–7 participants with balanced male and female representation. Data saturation was achieved by the fifth FGD, at which point no additional new information emerged. A research assistant proficient in qualitative research methods moderated the FGDs. The discussions lasted between 55 min and 1 h 10 min and were audio recorded. The Principal Investigator attended all the FGDs to document interactions and to record his observations and the participants’ non-verbal cues.

Semi-structured KIIs were used to collect data from Internal Medicine faculty. Five KIIs were conducted, and data saturation was achieved by the fourth interview, at which point no new theme emerged. The Principal Investigator conducted the KIIs via Zoom. Each interview lasted between 25 and 50 min and all were audio recorded. A research assistant proficient in qualitative methods attended all the Zoom meetings. The data collected were securely stored on a hard drive and Google Drive with password protection to prevent unauthorized access.

Data analysis

Data were analyzed using inductive thematic analysis. Following each FGD or KII session, the data collection team listened to the recordings to familiarize themselves with the data and develop general ideas regarding the participants’ perspectives. The data were transcribed verbatim by the researchers to generate text data. Two separate transcripts were generated by the Principal Investigator and a research assistant. The research team then compared the two transcripts and manually checked their accuracy against the audio recordings. After transcript harmonization, data cleaning was done for both the FGD and KII transcripts.

The transcribed data from both FGDs and KIIs underwent inductive thematic analysis as aggregated data. This involved initial line-by-line coding, followed by focused coding where the relationships between initial codes were explored and similar codes were grouped. Throughout the analysis, the principle of constant comparison was applied, where emerging codes were compared for similarities and differences.

Study results

Socio-demographics

A total of 36 medical students participated in the FGDs, comprising 18 junior clerks and 19 senior clerks. The participants were aged between 21 and 25 years, except for two participants who were aged above 25 (30 and 36 years old). Among the third-year students, there were 10 male and 9 female participants, while the fifth-year students comprised 8 male and 10 female participants.

Five lecturers participated in the Key Informant Interviews, three of whom were female and two male. They were aged between 40 and 50 years, and all had over 10 years of experience with the long case. The faculty members included one consultant physician, one associate professor, two senior lecturers, and one lecturer.

Themes that emerged

Three themes emerged from the study: ward placement, case presentations, and case assessment and feedback.

Themes and their codes:

  • Theme 1, ward placement: allocation to specific ward, specialization of the wards, orientation on the ward, and exposure to other wards.
  • Theme 2, case presentation: variation in the mode of presentation, limited observation of skills, and unreliable presence of lecturers.
  • Theme 3, case assessment and feedback: marks awarded for the long case, case write-up, marks as motivators, and pressure to hunt for marks; feedback given to the student, feedback to the lecturer, and limitations of the feedback practice.

Theme 1: Ward placement

The study findings showed that medical students are assigned to specific wards for the duration of their clerkship. The specialization of medical wards was found to significantly restrict students’ exposure to the limited range of disease conditions found in their allocated ward.

With the super-specialization of the units, there is some bias on what they do learn; if a particular group is rotating on the cardiology unit, they will obviously have a bias to learn the history and physical exam related to cardiovascular disease (KII 1).

The students, particularly junior clerks, expressed dissatisfaction with the lack of proper and standardized orientation to the long case on the wards. This deficiency led to wastage of time and a feeling of being unwelcome in the clerkship.

Some orient you when you reach the ward but others you reach and you are supposed to pick up on your own. I expect orientation, then taking data from us, what they expect us to do, and what we expect from them, taking us through the clerkship sessions (FGD 4 Participant 1).

Students’ exposure to cases in other wards poses significant challenges; the study found that while some lecturers facilitate visits to different wards for scheduled teaching sessions, others do not, resulting in missed learning opportunities. Additionally, some lecturers leave it to students’ own initiative to explore cases in other wards.

We actually encourage them to go through the different specialties because when you are faced with a patient, you will not have to choose which one to see and not to see (KII 4).

Imagine landing on a stroke patient when you have been in the infectious disease ward or getting a patient with renal condition when you have been in the endocrinology ward can create problems (FGD 6 Participant 3).

Theme 2: Case presentation

Medical students present their long case to lecturers and postgraduate students. However, participants revealed variations among lecturers regarding their preferences on how they want students to present their cases. While some prefer to listen to the entire history and examination, others prefer only a summary, and some prefer starting from the diagnosis.

The practice varies depending on the lecturer, as everyone does it their own way. There are some, who listen to your history, examination, and diagnosis, and then they go into basic discussion of the case; others want only a summary. Some lecturers come and tell you to start straight away from your diagnosis, and then they start treating you backward (FGD 6 Participant 3).

The students reported limited observation of their skills due to the little emphasis examiners place on physical examination techniques, and they reported not being given the opportunity to propose treatment plans.

When we are doing these physical examinations on the ward no one is seeing you. You present your physical examination findings, but no one saw how you did it. You may think you are doing the right thing during the ward rotations, but actually your skills are bad (FGD 4 Participant 6).

They don’t give us time to propose management plans. The only time they ask for how you manage a patient is during the summative long case, yet during the ward rotation, they were not giving us the freedom to give our opinion on how we would manage the patient (FGD 2 Participant 6).

Supervision was reportedly dependent on the ward to which the student was allocated. Additionally, the participants believe that the large student-to-lecturer ratio negatively affects the opportunity to present.

My experience was different in years three and five. In year three, we had a specialist every day on the ward, but in year five, we would have a specialist every other day, sometimes even once a week. When I compare year five with year three, I think I was even a better doctor in year three than right now (FGD 1 Participant 1).

Clinical training is like nurturing somebody to behave or conduct themselves in a certain way. Therefore, if the numbers are large, the impacts per person decrease, and the quality decreases (KII 5).

Theme 3: Case assessment and feedback

The study found that a student’s long case is assessed both during the case presentation on the ward and through the case write-up, with marks awarded accordingly.

They present to the supervisor and then also write it up, so at a later time you also mark the sheet where they have written up the cases; so they are assessed at presentation and write up (KII 2).

The mark awarded was reportedly a significant motivator for students to visit wards and clerk patients, but students also believe that the pressure to hunt for marks tends to override the goal of the formative assessment.

Your goal there is to learn, but most of us go with the goal of getting signatures; signature-based learning. The learning, you realize probably comes on later if you have the individual morale to go and learn (FGD 1 participant 1).

Feedback is an integral part of any formative assessment. While students receive feedback from lecturers, the participants were concerned about the absence of a formal channel for soliciting feedback from students.

Of course, teachers provide feedback to students because it is a normal part of teaching. However, it is not a common routine to solicit feedback about how teaching has gone. So maybe that is something that needs to be improved so that we know if we have been effective teachers (KII 3).

Although the feedback prompts students to read more to close their knowledge gaps, they decried several encounters with demeaning, intimidating, insulting, demotivating, and embarrassing feedback from assessors.

Since we are given a specific target of case presentation we are supposed to make in my training, if I make the ten, I wouldn’t want to present again. Why would I receive other negative comments for nothing? They truly have a personality effect on the student, and students feel low self-esteem (FGD 1 Participant 4).

This study aimed to investigate the experiences of medical students and faculty regarding the use of the long case as a formative assessment method at a tertiary care teaching hospital in a low-resource setting. This qualitative research provides valuable insights into the current practices surrounding the long case as a formative assessment method in such a setting.

The study highlighted the patient bedside as the primary learning environment for medical students. Bedside teaching plays a crucial role in fostering the development of skills such as history-taking and physical examination, as well as modeling professional behaviors and directly observing learners [ 14 , 15 ]. However, the specialization of wards in tertiary hospitals means that students may not be exposed to certain conditions found in other wards. This lack of exposure can lead to issues of case specificity, which has been reported in various literature as a cause of low reliability and generalizability of the long case [ 16 , 17 ]. Participants in the study expressed feeling like pseudo-specialists based on their ward allocations. This is partly attributed to missing scheduled teachings and poor management of opportunities to clerk and present patients on other wards. Addressing these challenges is essential for enhancing the effectiveness of the long case as a formative assessment method in medical education.

Proper orientation at the beginning of a clerkship is crucial for clarifying the structure and organization, defining students’ roles, and providing insights into clinical supervisors’ perspectives [ 18 ]. However, the study revealed that orientation into the long case was unsatisfactory, resulting in time wastage and potentially hindering learning. Effective orientation requires dedicated time and should involve defining expectations and goals, as well as guiding students through the steps of history-taking and physical examination during the initial weeks of the rotation. Contrary to this ideal approach, the medical students reported being taken through systemic examinations when the clerkship was nearing its end, highlighting a significant gap in the orientation process. Proper orientation is very important since previous studies have also documented the positive impact of orientation on student performance [ 19 ]. Therefore, addressing the shortcomings in orientation practices identified in this study is essential for optimizing learning outcomes and ensuring that students are adequately prepared to engage in the long case.

There was reportedly a significant variation in the way students present their long cases, with some lecturers preferring only a case summary, while others expect a complete presentation or begin with a diagnosis. While this diversity in learning styles may expose students to both familiar and unfamiliar approaches, providing a balance of comfort and tension [ 20 ], it’s essential for students to first be exposed to familiar methods before transitioning to less familiar ones to expand their ability to use diverse learning styles. The variation observed in this context may be attributed to time constraints, as lecturers may aim to accommodate the large number of students within the available time. Additionally, a lack of standardized practices could also contribute to this variation. Therefore, there is a pressing need for standardized long-case practices to ensure a consistent experience for students and to meet the desired goals of the assessment. Standardizing the long case practice would not only provide a uniform experience for students but also enhance the reliability, validity, and perception of fairness of the assessment [ 9 , 21 ]. It would ensure that all students are evaluated using the same criteria, reducing potential biases and disparities in grading. Additionally, standardized practices facilitate better alignment with learning objectives and promote more effective feedback mechanisms [ 22 ].

Related to the above, students reported limited observation of skills and little emphasis placed on them to learn physical examination techniques. This finding resonates with the research conducted by Abdalla and Shorbagi in 2018, where many students reported a lack of observation during history-taking and physical examination [ 23 ]. The importance of observation is underscored by the fact that students often avoid conducting physical examinations, as highlighted in Pavlakis & Laurent’s study among postgraduate trainees in 2001 [ 24 ]. This study sheds more light on the critical role of observation in forcing medical students to master clinical assessment and practical skills. The study also uncovered that students are rarely given the opportunity to propose management plans during case presentations, which hampers their confidence and learning of clinical decision-making. These findings likely stem from the large student-to-lecturer ratio and little attention given to these aspects of the long case during the planning of the assessment method. The result is students not receiving the necessary guidance and support to develop their clinical and decision-making skills. Therefore, addressing these issues by putting more emphasis on observation of student-patient interaction, management plan, and having a smaller student group is vital to ensure that medical students receive comprehensive training and are adequately prepared for their future roles as physicians.

The study found that the marks awarded for the long case serve as the primary motivator for students. This finding aligns with previous research indicating that the knowledge that each long case is part of assessment drives students to perform their duties diligently [ 2 , 25 ]. It underscores the crucial role that assessment plays in driving learning processes. However, the pressure to obtain marks and signatures reportedly hinders students’ engagement in learning. This could be attributed to instances where some lecturers relax their supervision or are absent, leaving students struggling to find someone to assess them. Inadequate supervision by attending physicians has been identified in prior studies as one of the causes of insufficient clinical experience [ 26 ], a problem that needs to be addressed. While the marks awarded are a motivating factor, it is essential to understand the other underlying motivations of medical students to engage in the long case and their impact on the learning process.

Feedback is crucial for the long case to fulfill its role as an assessment for learning. The study participants reported that feedback is provided promptly as students present their cases. This immediate feedback is essential for identifying errors and learning appropriate skills to enhance subsequent performance. However, the feedback process appears to be unilateral, with students receiving feedback from lecturers but lacking a structured mechanism for providing feedback themselves. One reason for the lack of student feedback may be a perceived intimidating approach from lecturers which discourages students from offering their input. It is thus important to establish a conducive environment where students feel comfortable providing feedback without fear of negative repercussions. The study underscores the significance of feedback from students in improving the learning process. This aligns with the findings of Hattie and Timperley (2007), who emphasized that feedback received from learners contributes significantly to improvements in student learning [ 27 ]. Therefore, it is essential to implement strategies to encourage and facilitate bidirectional feedback between students and lecturers in the context of the long case assessment. This could involve creating formal channels for students to provide feedback anonymously or in a structured format, fostering open communication, and addressing any perceived barriers to feedback exchange [ 28 ]. By promoting a culture of feedback reciprocity, educators can enhance the effectiveness of the long case as an assessment tool.

Conclusions

In conclusion, the long case remains a cornerstone of formative assessment during clerkship in many medical schools, particularly in low-resource countries. However, its effectiveness is challenged by limitations such as case specificity in tertiary care hospitals, which can affect the assessment’s reliability and generalizability. The practice of awarding marks in formative assessment serves as a strong motivator for students but also creates tension, especially when there is inadequate contact with lecturers. This can lead to a focus on hunting for marks at the expense of genuine learning. Thus, adequate supervision and feedback practices are vital for ensuring the success of the long case as an assessment for learning.

Furthermore, there is a need to foster standardized long case practice to ensure that scheduled learning activities are completed and that all students clerk and present patients with different conditions from various wards. This will promote accountability among both lecturers and students and ensure a consistent and uniform experience with the long case as an assessment for learning, regardless of the ward a student is assigned.

Data availability

The data supporting the study results of this article can be accessed from the Makerere University repository, titled “Perceptions of Medical Students and Lecturers of the Long Case Practices as Formative Assessment in Internal Medicine Clerkship at Makerere University,” available on DSpace. The identifier is http://hdl.handle.net/10570/13032 . Additionally, the raw data are securely stored with the researchers in Google Drive.

Dare AJ, Cardinal A, Kolbe J, Bagg W. What can the history tell us? An argument for observed history-taking in the trainee intern long case assessment. N Z Med J. 2008;121 1282:51–7.

Tey C, Chiavaroli N, Ryan A. Perceived educational impact of the medical student long case: a qualitative study. BMC Med Educ. 2020;20(1):1–9.

Jayasinghe R. Mastering the Medical Long Case. Elsevier Health Sciences; 2009.

Martinerie L, Rasoaherinomenjanahary F, Ronot M, Fournier P, Dousset B, Tesnière A, Mariette C, Gaujoux S, Gronnier C. Health care simulation in developing countries and low-resource situations. J Continuing Educ Health Professions. 2018;38(3):205–12.

van der Vleuten C. Making the best of the long case. Lancet (London England). 1996;347(9003):704–5.

Reeves TC, Okey JR. Alternative assessment for constructivist learning environments. Constructivist Learn Environments: Case Stud Instructional Des. 1996;191:202.

Biggs J. What the student does: teaching for enhanced learning. High Educ Res Dev. 1999;18(1):141.

Michael A, Rao R, Goel V. The long case: a case for revival? Psychiatrist. 2013;37(12):377–81.

Benning T, Broadhurst M. The long case is dead–long live the long case: loss of the MRCPsych long case and holism in psychiatry. Psychiatr Bull. 2007;31(12):441–2.

Burn W, Brittlebank A. The long case: the case against its revival: Commentary on… the long case. Psychiatrist. 2013;37(12):382–3.

Norcini JJ. The death of the long case? Bmj 2002;324(7334):408–9.

Pell G, Roberts T. Setting standards for student assessment. Int J Res Method Educ. 2006;29(1):91–103.

Masih CS, Benson C. The long case as a formative Assessment Tool–views of medical students. Ulster Med J. 2019;88(2):124.

Peters M, Ten Cate O. Bedside teaching in medical education: a literature review. Perspect Med Educ. 2014;3(2):76–88.

Wölfel T, Beltermann E, Lottspeich C, Vietz E, Fischer MR, Schmidmaier R. Medical ward round competence in internal medicine–an interview study towards an interprofessional development of an Entrustable Professional Activity (EPA). BMC Med Educ. 2016;16(1):1–10.

Wilkinson TJ, Campbell PJ, Judd SJ. Reliability of the long case. Med Educ. 2008;42(9):887–93.

Sood R. Long case examination-can it be improved. J Indian Acad Clin Med. 2001;2(4):252–5.

Atherley AE, Hambleton IR, Unwin N, George C, Lashley PM, Taylor CG. Exploring the transition of undergraduate medical students into a clinical clerkship using organizational socialization theory. Perspect Med Educ. 2016;5:78–87.

Owusu GA, Tawiah MA, Sena-Kpeglo C, Onyame JT. Orientation impact on performance of undergraduate students in University of Cape Coast (Ghana). Int J Educational Adm Policy Stud. 2014;6(7):131–40.

Vaughn L, Baker R. Teaching in the medical setting: balancing teaching styles, learning styles and teaching methods. Med Teach. 2001;23(6):610–2.

Olson CJ, Rolfe I, Hensley. The effect of a structured question grid on the validity and perceived fairness of a medical long case assessment. Med Educ. 2000;34(1):46–52.

Jensen-Doss A, Hawley KM. Understanding barriers to evidence-based assessment: clinician attitudes toward standardized assessment tools. J Clin Child Adolesc Psychol. 2010;39(6):885–96.

Abdalla ME, Shorbagi S. Challenges faced by medical students during their first clerkship training: a cross-sectional study from a medical school in the Middle East. J Taibah Univ Med Sci. 2018;13(4):390–4.

Pavlakis N, Laurent R. Role of the observed long case in postgraduate medical training. Intern Med J. 2001;31(9):523–8.

Teoh NC, Bowden FJ. The case for resurrecting the long case. BMJ. 2008;336(7655):1250–1250.

Mulindwa F, Andia I, McLaughlin K, Kabata P, Baluku J, Kalyesubula R, Kagimu M, Ocama P. A quality improvement project assessing a new mode of lecture delivery to improve postgraduate clinical exposure time in the Department of Internal Medicine, Makerere University, Uganda. BMJ Open Qual. 2022;11(2):e001101.

Hattie J, Timperley H. The power of feedback. Rev Educ Res. 2007;77(1):81–112.

Weallans J, Roberts C, Hamilton S, Parker S. Guidance for providing effective feedback in clinical supervision in postgraduate medical education: a systematic review. Postgrad Med J. 2022;98(1156):138–49.

Acknowledgements

Not applicable.

This research was supported by the Fogarty International Centre of the National Institutes of Health under award number 1R25TW011213. The content is solely the responsibility of the author and does not necessarily represent the official views of the National Institutes of Health.

Author information

Authors and affiliations.

School of Medicine, Department of Paediatrics & Child Health, Makerere University, Kampala, Uganda

Jacob Kumakech

School of Biomedical Sciences, Department of Anatomy, Makerere University, Kampala, Uganda

Ian Guyton Munabi

School of Medicine, Department of Radiology, Makerere University, Kampala, Uganda

Aloysius Gonzaga Mubuuke

School of Medicine, Department of Pediatrics & Child Health, Makerere University, Kampala, Uganda

Sarah Kiguli

Contributions

JK contributed to the conception and design of the study, as well as the acquisition, analysis, and interpretation of the data. He also drafted the initial version of the work and approved the submitted version. He agrees to be personally accountable for his contribution and to ensure that any questions related to the accuracy or integrity of any part of the work, even those in which he was not personally involved, are appropriately investigated and resolved, with the resolution documented in the literature.

IMG contributed to the analysis and interpretation of the data. He also made major corrections to the first draft of the manuscript and approved the submitted version. He agrees to be personally accountable for his contribution and to ensure that any questions related to the accuracy or integrity of any part of the work, even those in which he was not personally involved, are appropriately investigated and resolved, with the resolution documented in the literature.

MA contributed to the analysis and interpretation of the data. He made major corrections to the first draft of the manuscript and approved the submitted version. He agrees to be personally accountable for his contribution and to ensure that any questions related to the accuracy or integrity of any part of the work, even those in which he was not personally involved, are appropriately investigated and resolved, with the resolution documented in the literature.

SK made major corrections to the first draft and the final corrections for the submitted version of the work. She agrees to be personally accountable for her contribution and to ensure that any questions related to the accuracy or integrity of any part of the work, even those in which she was not personally involved, are appropriately investigated and resolved, with the resolution documented in the literature.

Corresponding author

Correspondence to Jacob Kumakech .

Ethics declarations

Ethical approval.

Ethical approval to conduct the study was obtained from the Makerere University School of Medicine Research and Ethics Committee, with ethics ID Mak-SOMREC-2022-524. Informed consent was obtained from all participants using the Mak-SOMREC informed consent form.

Consent for publication

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Kumakech, J., Munabi, I.G., Mubuuke, A.G. et al. Experiences of medical students and faculty regarding the use of long case as a formative assessment method at a tertiary care teaching hospital in a low resource setting: a qualitative study. BMC Med Educ 24 , 621 (2024). https://doi.org/10.1186/s12909-024-05589-7

Received : 04 April 2024

Accepted : 22 May 2024

Published : 05 June 2024

DOI : https://doi.org/10.1186/s12909-024-05589-7

  • Formative assessment
  • Medical education
  • Low-resource setting

BMC Medical Education

ISSN: 1472-6920

  • Open access
  • Published: 04 June 2024

RNA-clique: a method for computing genetic distances from RNA-seq data

  • Andrew C. Tapia 1 ,
  • Jerzy W. Jaromczyk 1 ,
  • Neil Moore 1 &
  • Christopher L. Schardl 2  

BMC Bioinformatics volume  25 , Article number:  205 ( 2024 ) Cite this article

Although RNA-seq data are traditionally used for quantifying gene expression levels, the same data could be useful in an integrated approach to compute genetic distances as well. Challenges to using mRNA sequences for computing genetic distances include the relatively high conservation of coding sequences and the presence of paralogous and, in some species, homeologous genes.

We developed a new computational method, RNA-clique, for calculating genetic distances using assembled RNA-seq data and assessed the efficacy of the method using biological and simulated data. The method employs reciprocal BLASTn followed by graph-based filtering to ensure that only orthologous genes are compared. Each vertex in the graph constructed for filtering represents a gene in a specific sample under comparison, and an edge connects a pair of vertices if the genes they represent are best matches for each other in their respective samples. The distance computation is a function of the BLAST alignment statistics and the constructed graph and incorporates only those genes that are present in some complete connected component of this graph. As a biological testbed we used RNA-seq data of tall fescue ( Lolium arundinaceum ), an allohexaploid plant ( \(2n = 14\text { Gb}\) ), and bluehead wrasse ( Thalassoma bifasciatum ), a teleost fish. RNA-clique reliably distinguished individual tall fescue plants by genotype and distinguished bluehead wrasse RNA-seq samples by individual. In tests with simulated RNA-seq data, the ground truth phylogeny was accurately recovered from the computed distances. Moreover, tests of the algorithm parameters indicated that, even with stringent filtering for orthologs, sufficient sequence data were retained for the distance computations. Although comparisons with an alternative method revealed that RNA-clique has relatively high time and memory requirements, the comparisons also showed that RNA-clique’s results were at least as reliable as the alternative’s for tall fescue data and were much more reliable for the bluehead wrasse data.

Results of this work indicate that RNA-clique works well as a way of deriving genetic distances from RNA-seq data, thus providing a methodological integration of functional and genetic diversity studies.

In this paper, we describe and evaluate RNA-clique, a new approach for computing genetic distance matrices using only RNA-seq data. The method employs rigorous filtering for alignments of orthologous transcripts and uses as its input sets of RNA-seq samples from individuals being compared. The computed distance is a function of alignment statistics and a graph representing inferred orthologies between genes in the set of samples.

This work is key to an NSF-funded project in the Dimensions of Biodiversity program by providing a novel approach to integrate studies of functional diversity (in this case, RNA-seq) and genetic diversity. The technique is to be applied to plant population surveys to assess how plant genetic diversity interacts with responses to environmental variables and diverse symbiotic microbes. Typically, genetic distances are computed using whole, or, more often, partial genomic DNA sequences. Genomic DNA sequences are well-suited for such calculations: they allow us to detect precisely the differences in the genome sequences of two or more individuals. Unfortunately, obtaining genomic DNA sequences can also be costly, especially for organisms with large genomes such as vertebrates or vascular plants.

RNA-seq data are typically used for identifying and measuring expression levels of genes, and RNA-seq studies compare gene expression among multiple individuals or the same individual under different conditions. Since transcripts mostly reflect genomic DNA (aside from splicing and, rarely, RNA-editing), there is potential for using RNA-seq for computing genetic distances as well. A way of computing genetic distances using RNA-seq data would be convenient and economical for projects that need RNA-seq data for other purposes but do not need genomic DNA sequences for any other applications.

The method we propose takes a cautious approach by stringently filtering the sequences used for estimating distances. Thus, the way we use RNA-seq data is analogous to reduced-representation genome sequencing [ 1 ]. Because we filter so much data and because most transcribed sequence is coding sequence, which is more highly conserved than other regions of the genome, a potential problem is retaining sufficient variation to discriminate between individuals. Hence, we tested RNA-clique with multiple RNA-seq samples from each of four plants derived from one ecotype. The results indicated the feasibility of the approach described (Fig.  1 ).

figure 1

PCoA plot for the distance matrix computed for a set of 16 RNA-seq samples. Each sample represents a clone of one of four genotypes of the grass tall fescue ( Lolium arundinaceum ). Genotypes are designated CTE27, CTE46, FATG4, and NTE. Presence ( \(+\) ) or absence (−) of endophyte (the symbiotic fungus Epichloë coenophiala ) was relevant to the original gene expression studies [ 2 ]

Existing tools for computing genetic distances using RNA-seq data alone are scarce. One possible option is the approach implemented in the Cnidaria software of Aflitos et al. [ 3 ]. Cnidaria can operate on either raw RNA-seq reads or assembled transcriptomes. The software uses a “ k -mer counting” approach. The simplest variation of the approach implemented in Cnidaria computes the distance between two samples as the Jaccard distance between the intersections of the sets of k -mers that appear in the sequences of the two samples with those that appear in at least two samples. (The Jaccard distance is taken to be 1 minus the Jaccard similarity. The Jaccard similarity is the number of elements in the sets’ intersection divided by the number of elements in the sets’ union. Since the similarity is a ratio of counts of elements,  k -mers, in this case, both the similarity and distance are dimensionless.)
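
The k -mer set comparison described above can be illustrated with a minimal sketch. This is not Cnidaria's implementation: the function names and toy sequences are hypothetical, and the sketch ignores reverse complements and any k -mer counting optimizations a real tool would use.

```python
from collections import Counter
from itertools import combinations


def kmers(sequences, k):
    """Return the set of all k-mers occurring in a list of sequences."""
    found = set()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            found.add(seq[i:i + k])
    return found


def jaccard_distance(a, b):
    """1 minus |a & b| / |a | b|; taken to be 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def kmer_distances(samples, k):
    """Pairwise Jaccard distances over k-mers seen in at least two samples."""
    kmer_sets = {name: kmers(seqs, k) for name, seqs in samples.items()}
    # Count the number of samples in which each k-mer occurs.
    occurrences = Counter(km for s in kmer_sets.values() for km in s)
    shared = {km for km, count in occurrences.items() if count >= 2}
    # Restrict every sample to the k-mers shared by at least two samples.
    restricted = {name: s & shared for name, s in kmer_sets.items()}
    return {
        (x, y): jaccard_distance(restricted[x], restricted[y])
        for x, y in combinations(sorted(restricted), 2)
    }


# Toy input (hypothetical): one short sequence per sample.
toy = {"s1": ["ACGTAC"], "s2": ["ACGTTT"], "s3": ["GGGTAC"]}
distances = kmer_distances(toy, k=3)
```

In this toy input, samples s2 and s3 share no 3-mers, so their distance is 1, while each shares half of the retained 3-mers with s1. Nothing in the computation checks whether two matching k -mers come from orthologous genes.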

Cnidaria computes distances without alignment—the input sequences are neither aligned to a reference genome nor to each other. The k -mer counting approach instead works on the principle that similar sequences share more k -mers. This means that orthologous sequences are not directly identified and compared, and we are concerned that results might be influenced by paralogous genes or, in the case of polyploid organisms, sets of homeologs. In this paper, we propose an approach in which orthologous sequences from RNA-seq data are identified and compared directly. We also compare RNA-clique with Cnidaria in terms of accuracy of results and resource usage (“ Distance tests and Resource usage tests ” sections).

RNA-clique utilizes a graph to represent orthology relationships among genes in the samples considered. The graph produced as part of our method bears some resemblance to those built for finding the Clusters of Orthologous Groups (COGs) of Tatusov et al. [ 4 ]. The graph of RNA-clique differs from that of COG in that the edges represent a bidirectional best match between genes (or a non-empty intersection between the top N best matches in both directions if the parameter \(N > 1\) ), whereas the edges in the COG graph may represent a unidirectional best match between proteins. Additionally, the eponymous subgraphs identified by the COG method consist of proteins inferred to be related as either orthologs or paralogs. In contrast, the “ideal” components of our method (described in “ Computing distances for multiple samples ” section) contain genes inferred to be related only as orthologs. COG does also identify some subgraphs presumed to be related as orthologs only: triangles (cliques with exactly three vertices) are “minimal COGs” in which each pair of proteins is orthologous. The ideal components of our method may be viewed as an extension of this idea, since every ideal component is a clique. Furthermore, every ideal component is a COG (ignoring the distinction between genes and proteins), but not vice versa.

Although graphical representations of homology relationships are not new, their application to genetic distance computation with RNA-seq data is a contribution of the method described here. RNA-clique is designed to offer robustness in the presence of similar non-orthologous sequences. Unlike Cnidaria, RNA-clique explicitly identifies and compares orthologous transcripts using graph-based filtering. The graphs constructed by RNA-clique are also distinct from those of COG, which does not differentiate between orthologous and paralogous sequences. Identifying only orthologs allows RNA-clique to avoid overestimation of distances that could result from comparing paralogs or homeologs.

The purpose of the algorithm developed is to compute values that quantify the similarity or distance among two or more individuals using sequences of RNA transcripts from those individuals captured with RNA-seq. The output of the algorithm is a matrix of values between 0 and 1 for each pair of individuals under consideration; we refer to these values as “genetic distances.” The genetic distance for a pair of individuals is interpreted as the degree of dissimilarity between the individuals’ genomes. The output distance matrix is then useful for downstream analyses such as genotyping and phylogenetics—the distances may be used to distinguish individuals by genotype or infer evolutionary relationships. Requirements of the method were that it be applicable to RNA-seq data from organisms with large and complex genomes and that pairwise comparisons for genetic distance calculations be between orthologs only, and not involve comparisons of paralogs or homeologs (which occur in allopolyploid species).

We first describe in general terms how RNA-clique uses RNA-seq data to compute pairwise genetic distances in “ Distance computation algorithm ” section. Descriptions of the data with which we tested our method and the tests performed are presented in the following “ Data and Tests performed ” sections, respectively.

Distance computation algorithm

Assembling transcriptomes and selecting top genes.

Each “sample” is an RNA-seq dataset from an individual, and different samples may be from the same individual (biological replicates) or different individuals. As in gene expression studies, it is important to include biological replicates for each individual. The dataset from each sample is first assembled into a “transcriptome,” which consists of many assembled transcripts or isotigs and is partitioned into “isotig sets” (i.e., genes). Each isotig in an isotig set is assumed to represent a splice variant or an allelic variant from the same gene, and every isotig in a transcriptome is assumed to have an associated “ k -mer coverage”, which quantifies the amount of sequence from the input sequence reads that contributes to the assembled isotig. The k -mer coverage of a gene is defined as the maximum k -mer coverage among the isotigs of that gene, and, after assembly, the top n genes are identified based on k -mer coverage.
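
As an illustration of the selection step, the following minimal sketch computes each gene's k -mer coverage as the maximum over its isotigs and keeps the top n genes. The tuple layout, the function name, and the tie-breaking by gene ID are assumptions made for the example, not details taken from the method.

```python
def top_genes(isotigs, n):
    """Return the IDs of the n genes with the highest k-mer coverage.

    isotigs: iterable of (gene_id, isotig_id, kmer_coverage) tuples.
    A gene's coverage is the maximum coverage among its isotigs.
    """
    gene_coverage = {}
    for gene_id, _isotig_id, coverage in isotigs:
        previous = gene_coverage.get(gene_id, coverage)
        gene_coverage[gene_id] = max(previous, coverage)
    # Sort by descending coverage; ties are broken by gene ID (an assumption).
    ranked = sorted(gene_coverage, key=lambda g: (-gene_coverage[g], g))
    return ranked[:n]


# Toy transcriptome (hypothetical values): gene g1 has two isotigs.
isotigs = [
    ("g1", "i1", 5.0),
    ("g1", "i2", 12.0),
    ("g2", "i1", 9.0),
    ("g3", "i1", 3.5),
]
selected = top_genes(isotigs, n=2)
```

Here g1 is ranked by its better-covered isotig (coverage 12.0), so the two genes selected are g1 and g2.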

Computing distance for a pair of samples

Distance computation for a pair of samples is described below. The next subsection (“ Computing distances for multiple samples ” section) describes modifications to this basic approach for computing pairwise distances among more than two samples.

The top n genes (see “ Assembling transcriptomes and selecting top genes ” section) from both samples are used as the query and subject sequences in two BLASTn searches [ 5 , 6 ]. In the first search, the top n genes from the first sample are BLASTed against the top n genes from the second sample, and in the second search, the top n genes from the second sample are BLASTed against the top n genes from the first sample. The result of either BLAST search is a table (dataframe) representing high-scoring segment pairs (HSPs). Partial example results for forward and reverse HSPs are shown in Tables 1 and 2 . Note that although what we refer to as HSPs are commonly known as “hits,” in the terminology used by NCBI BLAST+, a hit may consist of one or more HSPs. Each HSP (i.e., each row in the table) specifies a query gene ID, query isotig ID, subject gene ID, subject isotig ID, bitscore, number of identical nucleotides, length, and gaps for the alignment. The bitscore measures the quality of an alignment in a way that does not depend on the size of the database (in this case, the subject transcriptome) and thus can be used to compare HSPs from different BLAST searches.

For both tables of HSPs, we select the top N HSPs for each query gene ID, where N is a positive integer and a configurable parameter of the algorithm. For this paper, we always use \(N = 1\) , though future work may explore other settings for this parameter. Results of selecting the top HSP of each query gene ID in the example are shown in Tables 3 and 4 .
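
A minimal sketch of the top- N selection follows, assuming each HSP is held as a dictionary with hypothetical keys `query_gene`, `subject_gene`, and `bitscore`. How ties at the N th position are resolved is not specified above; this sketch simply keeps the first N rows in bitscore order.

```python
import heapq
from collections import defaultdict


def top_hsps_per_query_gene(hsps, top_n=1):
    """Keep the top_n highest-bitscore HSPs for every query gene ID."""
    by_gene = defaultdict(list)
    for hsp in hsps:
        by_gene[hsp["query_gene"]].append(hsp)
    selected = []
    for gene in sorted(by_gene):
        # nlargest returns the rows in descending bitscore order.
        selected.extend(
            heapq.nlargest(top_n, by_gene[gene], key=lambda h: h["bitscore"])
        )
    return selected


# Toy forward-search table (hypothetical values).
forward = [
    {"query_gene": "a1", "subject_gene": "b1", "bitscore": 250.0},
    {"query_gene": "a1", "subject_gene": "b7", "bitscore": 180.0},
    {"query_gene": "a2", "subject_gene": "b2", "bitscore": 310.0},
]
best = top_hsps_per_query_gene(forward, top_n=1)
```

With `top_n=1`, gene a1 keeps only its HSP against b1, and a2 keeps its single HSP against b2.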

Note that each row in both tables contains one gene ID from the first sample and one gene ID from the second sample. We rename the columns in both tables to reflect this. In the table for the first search, the query gene ID and subject gene ID become the sample 1 gene ID and sample 2 gene ID, respectively. In the table for the second search, the query gene ID and subject gene ID become the sample 2 gene ID and sample 1 gene ID, respectively. The example tables become Tables 5 and 6 after renaming.

Then, we filter both lists of HSPs to include only HSPs for which there is an HSP in both lists with the same sample 1 gene ID and sample 2 gene ID. The rows of the two tables are then merged into a single table. Note that the resulting table has at least two rows with the same sample 1 and sample 2 gene ID (Table 7 ).

We then select the row with highest bitscore for each pair of sample 1 and sample 2 IDs present in the concatenated table. The result is a table that maps each pair of sample 1 and sample 2 IDs to a single best bitscore for that pair of genes (Table 8 ). Note that we may keep multiple rows in the case of ties, but in such cases there will still be a unique best bitscore for each gene pair.

Finally, we select the row with highest bitscore for each sample 1 gene (Table 9 ). In the resulting dataframe, we interpret each row as the most likely ortholog in sample 2 of the gene in sample 1. Again, we may keep multiple rows in the case of ties. We refer to the resulting table as the gene matches table for the two samples.
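
The renaming, reciprocal filtering, and best-row selection steps can be sketched as follows. The sketch assumes both tables have already been reduced to the top HSP per query gene and uses hypothetical dictionary keys. As a simplification, it keeps one row per gene pair rather than all tied rows, although ties between different sample 2 genes for the same sample 1 gene are all kept.

```python
def gene_matches(forward, reverse):
    """Build a gene matches table from two top-HSP tables.

    forward rows: query gene from sample 1, subject gene from sample 2.
    reverse rows: query gene from sample 2, subject gene from sample 1.
    """
    # Rename columns so both tables are keyed by (sample 1 gene, sample 2 gene).
    fwd = [dict(h, gene1=h["query_gene"], gene2=h["subject_gene"]) for h in forward]
    rev = [dict(h, gene1=h["subject_gene"], gene2=h["query_gene"]) for h in reverse]

    # Keep only gene pairs found in both search directions, then merge.
    fwd_pairs = {(h["gene1"], h["gene2"]) for h in fwd}
    rev_pairs = {(h["gene1"], h["gene2"]) for h in rev}
    reciprocal = fwd_pairs & rev_pairs
    merged = [h for h in fwd + rev if (h["gene1"], h["gene2"]) in reciprocal]

    # Highest-bitscore row for each (sample 1 gene, sample 2 gene) pair.
    best_pair = {}
    for h in merged:
        key = (h["gene1"], h["gene2"])
        if key not in best_pair or h["bitscore"] > best_pair[key]["bitscore"]:
            best_pair[key] = h

    # Highest-bitscore row(s) for each sample 1 gene.
    best_score = {}
    for h in best_pair.values():
        current = best_score.get(h["gene1"], h["bitscore"])
        best_score[h["gene1"]] = max(current, h["bitscore"])
    return [h for h in best_pair.values() if h["bitscore"] == best_score[h["gene1"]]]


# Toy tables (hypothetical values); a3/b2 and b3/a3 are not reciprocal.
forward = [
    {"query_gene": "a1", "subject_gene": "b1", "bitscore": 250.0},
    {"query_gene": "a2", "subject_gene": "b2", "bitscore": 310.0},
    {"query_gene": "a3", "subject_gene": "b2", "bitscore": 120.0},
]
reverse = [
    {"query_gene": "b1", "subject_gene": "a1", "bitscore": 260.0},
    {"query_gene": "b2", "subject_gene": "a2", "bitscore": 300.0},
    {"query_gene": "b3", "subject_gene": "a3", "bitscore": 90.0},
]
table = gene_matches(forward, reverse)
```

In the toy data, only the pairs (a1, b1) and (a2, b2) are matched in both directions, and each keeps the better of its two bitscores.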

The similarity between the two samples is then the sum of the number of identical nucleotides for all rows in the table divided by the sum of the difference between the alignment lengths and gaps for all rows in the table. Equivalently, in symbols, let \(\iota _i\) , \(\lambda _i\) , and \(\gamma _i\) represent the number of identical nucleotides, alignment length, and total gap length, respectively, for the i th row in the table. Then, the similarity S between the two samples is

\[S = \frac{\sum _i \iota _i}{\sum _i \left( \lambda _i - \gamma _i\right) }.\]

The distance (or dissimilarity) D between the two samples is then defined as \(D = 1 - S\) . Since \(\iota _i\) , \(\lambda _i\) , and \(\gamma _i\) are counts of base pairs, the resulting similarity is a dimensionless ratio of base pairs.
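
A minimal sketch of this computation follows, assuming each row of the gene matches table is a dictionary with hypothetical keys `identical`, `length`, and `gaps`. The error raised for an empty table is a choice made for the example.

```python
def similarity(rows):
    """Identical nucleotides divided by gap-free aligned length, over all rows."""
    identical = sum(r["identical"] for r in rows)
    aligned = sum(r["length"] - r["gaps"] for r in rows)
    if aligned <= 0:
        raise ValueError("no aligned positions to compare")
    return identical / aligned


def distance(rows):
    """Dissimilarity D = 1 - S."""
    return 1.0 - similarity(rows)


# Toy gene matches table (hypothetical values).
rows = [
    {"identical": 980, "length": 1000, "gaps": 0},
    {"identical": 490, "length": 505, "gaps": 5},
]
s = similarity(rows)  # 1470 identical bases over 1500 ungapped positions
d = distance(rows)
```

Because the sums are taken over all rows before dividing, long alignments contribute more to the similarity than short ones; the result is not an average of per-gene identities.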

Computing distances for multiple samples

Of course, one straightforward way to find pairwise distances for more than two samples would be to apply the above procedure for finding the distance between two samples for each possible pair of samples. Although such an approach would be simple, we anticipate that this approach would give “unfair” comparisons because the homologous genes used for the comparison differ among pairs of samples. To address potential fairness problems, we employ a graph-based algorithm to find a subset of orthologous genes shared by all samples.

We construct a graph, that is, a collection of vertices connected by edges, in which each vertex represents a gene in a particular sample; we can uniquely identify any vertex by its sample ID and gene ID. We draw an edge between two vertices if and only if the gene pair represented by the two vertices appears in the gene match table for the samples represented by the vertices. Intuitively, we can interpret an edge as indicating that the genes represented by its incident vertices are likely orthologs. We will refer to the resulting graph as the gene matches graph for the set of samples being considered. Figure  2 shows an example of a single connected component (a maximal set of vertices in which each pair of vertices is connected via a path of edges) from a gene matches graph.
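
The graph construction and the search for connected components can be sketched as follows. The input layout (a mapping from a pair of sample IDs to the gene pairs in that pair's gene matches table) and the function names are assumptions made for the example.

```python
from collections import defaultdict


def build_graph(match_tables):
    """Build an adjacency map whose vertices are (sample ID, gene ID) tuples.

    match_tables maps (sample_a, sample_b) to a list of (gene_a, gene_b) pairs.
    """
    adjacency = defaultdict(set)
    for (sample_a, sample_b), pairs in match_tables.items():
        for gene_a, gene_b in pairs:
            u, v = (sample_a, gene_a), (sample_b, gene_b)
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


def connected_components(adjacency):
    """Return the connected components as a list of vertex sets."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first traversal from an unvisited vertex.
        stack, component = [start], set()
        seen.add(start)
        while stack:
            vertex = stack.pop()
            component.add(vertex)
            for neighbor in adjacency[vertex]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


# Toy gene matches tables for three samples A, B, and C (hypothetical).
tables = {
    ("A", "B"): [("g1", "g4"), ("g2", "g5")],
    ("A", "C"): [("g1", "g9")],
    ("B", "C"): [("g4", "g9")],
}
graph = build_graph(tables)
components = connected_components(graph)
```

The toy graph has two components: a triangle joining one gene from each of the three samples, and a pair of genes from samples A and B with no match in C.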

figure 2

Example component of a gene matches graph. Vertex labels show sample ID and gene ID, and vertex colors indicate sample ID

We can classify the components of the gene matches graph according to number of vertices. We define a small component as one with fewer vertices than there are samples, and, likewise, we define a large component as one with at least as many vertices as there are samples. Examples of small and large components for the case in which we have five samples are shown in Figs.  3 and 4 , respectively.

figure 3

Examples of small components in the case of five samples

figure 4

Examples of large components in the case of five samples

Additionally, we classify some components as ideal components . We define an ideal component as a component that is a complete subgraph (that is, a clique, a subgraph with an edge between every pair of vertices) with exactly one gene from each sample. Note that this definition implies that an ideal component must also be a large component because an ideal component has exactly as many vertices as there are samples. An example ideal component (for the case of five samples) is shown in Fig.  5 .

figure 5

An example ideal component for five samples

Since no two genes from the same sample may be connected by an edge, any complete component with exactly as many vertices as there are samples must have exactly one gene from each sample. Hence, we can equivalently consider an ideal component to be any component that is a complete subgraph and has as many vertices as there are samples.
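
The three component classes can be told apart with a short check. This sketch assumes the graph is stored as an adjacency map from (sample ID, gene ID) vertices to sets of neighbors, with no self-loops. The function name is hypothetical, and it returns the most specific label, so an ideal component is reported as "ideal" even though it is also large.

```python
def classify(component, adjacency, n_samples):
    """Return 'small', 'large', or 'ideal' for one connected component."""
    if len(component) < n_samples:
        return "small"
    samples = {sample for sample, _gene in component}
    # In a clique, every vertex is adjacent to all other vertices.
    is_clique = all(
        len(adjacency[v] & component) == len(component) - 1 for v in component
    )
    if len(component) == n_samples and len(samples) == n_samples and is_clique:
        return "ideal"
    return "large"


# Toy graph for three samples A, B, and C (hypothetical).
adjacency = {
    ("A", "g1"): {("B", "g4"), ("C", "g9")},
    ("B", "g4"): {("A", "g1"), ("C", "g9")},
    ("C", "g9"): {("A", "g1"), ("B", "g4")},
    ("A", "g2"): {("B", "g5")},
    ("B", "g5"): {("A", "g2")},
    ("A", "g3"): {("B", "g6")},
    ("B", "g6"): {("A", "g3"), ("C", "g7")},
    ("C", "g7"): {("B", "g6")},
}
triangle = {("A", "g1"), ("B", "g4"), ("C", "g9")}
pair = {("A", "g2"), ("B", "g5")}
path = {("A", "g3"), ("B", "g6"), ("C", "g7")}
labels = [classify(c, adjacency, 3) for c in (triangle, pair, path)]
```

The path component has one gene from each sample but lacks the edge between its A and C genes, so it is large without being ideal.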

The intent is that the vertices of an ideal component should represent genes for which exactly one ortholog is identified in every sample. Thus, in computing distances for multiple samples, we use only those rows of gene match tables whose sample 1 and sample 2 genes appear in some ideal component of the gene matches graph. The result of filtering the example data from Table 9 in this way is shown in Table 10 .
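
A minimal sketch of this filtering step is shown below, assuming gene match rows are dictionaries with hypothetical keys `gene1` and `gene2` and that ideal components are given as sets of (sample ID, gene ID) vertices.

```python
def filter_to_ideal(rows, sample1, sample2, ideal_components):
    """Keep rows whose two genes both belong to some ideal component."""
    ideal_vertices = set().union(*ideal_components)
    return [
        r
        for r in rows
        if (sample1, r["gene1"]) in ideal_vertices
        and (sample2, r["gene2"]) in ideal_vertices
    ]


# Toy data (hypothetical): one ideal component across samples A, B, and C.
ideal = [{("A", "g1"), ("B", "g4"), ("C", "g9")}]
rows = [
    {"gene1": "g1", "gene2": "g4"},
    {"gene1": "g2", "gene2": "g5"},
]
kept = filter_to_ideal(rows, "A", "B", ideal)
```

Each row of a gene matches table corresponds to an edge of the graph, so its two genes always lie in the same component; checking membership of both vertices is therefore enough to decide whether the row belongs to an ideal component.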

Four sets of data were used for testing—one set of simulated transcriptomes and three sets of real data from past RNA-seq studies. Two of the datasets are from studies of the grass tall fescue ( Lolium arundinaceum ), and one is from a study of bluehead wrasse ( Thalassoma bifasciatum ), a teleost fish [ 7 ].

Tall fescue transcriptomes

Tall fescue, like many grasses (e.g., bread wheat) is “polyploid” due to an ancestry of hybridization between related species with intervening doubling of chromosome numbers. Having three diploid ancestors, tall fescue is hexaploid with a genome size estimated at \(6x = 2C = 14.4 \text { Gb}\) , over twice as large as the human genome [ 8 ]. The grass has a total of 42 chromosomes consisting of three homeologous sets, each with seven pairs of homologous chromosomes. For this reason, many genes—perhaps most—are represented by two or three homeologous sets, each having one or two (or at the population level, potentially more than two) homologous alleles [ 9 ]. Such polyploids are very common in certain plant families, and also in parthenogenic (or otherwise unisexual) animals and represent a special challenge to distinguish homologous versus homeologous gene relationships from mRNA or even genomic DNA sequence data. The tall fescue plant sources of the RNA-seq samples all derive from a single cultivar (‘Kentucky 31’), which in turn derives from a single ecotype—that is, all samples are descended from plants collected at the same location [ 2 , 10 ]. The species is an obligate outcrosser, so each original plant represents a unique genotype. In the prior studies, the plants were divided and propagated as multiple clones, and the 16-sample dataset derives from multiple clones of each of four genotypes (plants). In some cases, clones were treated to eliminate the symbiotic fungus (endophyte) Epichloë coenophiala , and endophyte status ( \(+\) or −) is tracked in our analysis.

The RNA-seq reads were publicly available on NCBI’s Sequence Read Archive (SRA) and were assembled using the rnaSPAdes mode of version 3.15.5 of the SPAdes assembler [ 11 ]. We expected distances between samples from the same set of clones to be much smaller (ideally, zero) than distances between samples in different sets. The information for the samples used is summarized in Table 11 .

rnaSPAdes may identify some transcripts as isoforms (or “isotigs”) of the same gene. Table 11 shows that the number of transcripts was much larger than the number of genes for each sample, but analyzing the frequency with which genes had one or more transcripts revealed that overwhelmingly most genes had very few isoforms (see Fig.  6 ).

figure 6

A histogram showing the frequency of isoform counts for genes in the 16 tall fescue samples. Note that the y-axis uses a logarithmic scale

RNA-seq data of four other samples available on the SRA (Table 12 ) were also used only in a test of the effect of the parameter n on the number of large components and ideal components in the gene matches graph (“ Parameter tests ” section). These reads were likewise assembled into transcriptomes with rnaSPAdes 3.15.5.

Bluehead wrasse transcriptomes

RNA-seq data for the bluehead wrasse originated from a study of gene expression in two tissue types involved in functional sex change [ 7 ]. In bluehead wrasse, individuals can undergo sex change in response to social cues. Specifically, loss of the terminal phase (TP) male from a bluehead social group can cause females and smaller initial phase males to become TP males. The original study of Liu et al. utilized the sequences of RNA extracted from the gonads and brain (midbrain/forebrain) of 12 individuals captured from patch reefs near Key Largo, Florida. The latter tissue type was used because of its role in social decision making.

Like the tall fescue RNA-seq reads, the bluehead wrasse reads were available from the SRA. Each tissue sample from each individual has been assigned an accession in the NCBI BioSample database and a sample ID incorporating a numeric identifier for the individual and a letter, “G” or “F”, denoting tissue type “gonad” or “midbrain/forebrain”, respectively (Table 13 ). Each sample was associated with two SRA experiments, and, in turn, each experiment was associated with a single SRA run [ 7 ]. Each SRA run was associated with paired-end RNA-seq reads. Using the rnaSPAdes mode of SPAdes 3.15.5, we assembled all RNA-seq reads associated with each sample into a single transcriptome for that sample. Reads from different SRA experiments were provided as separate libraries to SPAdes. One SRA experiment, SRX1176335, belonging to BioSample SAMN04009766, was associated with some additional reads that were treated as unpaired reads from the same library as the others belonging to the experiment.

Simulated transcriptomes

We used the birth-death model implemented in the DendroPy Python library to generate a random phylogenetic tree with 16 extant taxa [ 12 ]. For the birth-death model, we used a birth rate of 1 and a death rate of 0.5; the simulation was allowed to continue until there were exactly 16 extant taxa. The taxa were labeled using the default scheme in DendroPy—i.e., a taxon’s label is simply “T” followed by the index of the taxon. The tree resulting from this simulation is shown in Fig.  7 .
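The stopping rule described above (simulate until exactly 16 extant taxa exist) can be illustrated with a small pure-Python sketch. It is not DendroPy's implementation, and it tracks only which lineages are alive, not the tree topology or branch lengths. The function name and the restart-on-extinction behavior are assumptions made for illustration.

```python
import random

def simulate_birth_death(birth_rate, death_rate, num_extant, rng):
    """Return (elapsed_time, lineage_ids) once num_extant lineages are alive."""
    while True:  # restart if every lineage dies out (assumed behavior)
        alive = [0]
        next_id = 1
        elapsed = 0.0
        while 0 < len(alive) < num_extant:
            # Waiting time to the next event among all living lineages.
            total_rate = len(alive) * (birth_rate + death_rate)
            elapsed += rng.expovariate(total_rate)
            chosen = rng.randrange(len(alive))
            if rng.random() < birth_rate / (birth_rate + death_rate):
                # Speciation: the parent is replaced by two child lineages.
                alive[chosen] = next_id
                alive.append(next_id + 1)
                next_id += 2
            else:
                # Extinction of the chosen lineage.
                alive.pop(chosen)
        if len(alive) == num_extant:
            return elapsed, alive

# Rates taken from the text: birth rate 1, death rate 0.5, 16 extant taxa.
elapsed, taxa = simulate_birth_death(1.0, 0.5, 16, random.Random(0))
```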

Using the same library, we generated random root state sequences for 50000 simulated transcripts. Transcript lengths were drawn randomly from the frequency distribution of transcript lengths for the 16 tall fescue transcriptomes—that is, the probability of choosing a transcript length was proportional to the number of transcripts with that length among the 16 tall fescue transcriptomes. For each position in a transcript, the base at that position was selected uniformly at random from the set of four DNA bases. (This is the default behavior in DendroPy’s nucleotide character evolution model.) The count of transcripts, 50000, was selected based on the results of the tests determining the effects of the parameter n on the number of ideal components, described in “ Parameter tests ” section.
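A minimal sketch of the root-sequence sampling just described, written with the Python standard library rather than DendroPy. The observed lengths below are toy values, and the function name is hypothetical.

```python
import random
from collections import Counter

def random_root_transcripts(observed_lengths, count, rng):
    """Draw `count` random DNA sequences whose lengths follow the observed frequencies."""
    freq = Counter(observed_lengths)
    lengths = list(freq)
    weights = [freq[length] for length in lengths]
    # A length is chosen with probability proportional to how often it was observed.
    chosen = rng.choices(lengths, weights=weights, k=count)
    # Each base is drawn uniformly at random from the four DNA bases.
    return ["".join(rng.choice("ACGT") for _ in range(length)) for length in chosen]

# Toy input: the length 200 is twice as likely to be drawn as 350 or 1200.
roots = random_root_transcripts([200, 200, 350, 1200], count=5, rng=random.Random(1))
```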

We used the HKY85 model with an evolution rate of 0.01 to simulate evolution of these base transcripts over the previously generated phylogenetic tree. The value 0.01 was selected after it was determined that the initially selected value 0.1 was too high for BLAST to be able to identify orthologs. We obtained 50000 sets of orthologous transcripts, each containing one transcript per extant taxon.

figure 7

A tree showing the “ground-truth” phylogeny for the 16 simulated transcriptomes

Tests performed

For all tests described in the following sections, the parameter N (the number of top HSPs to select for each query gene ID after the initial BLASTn searches in both directions) and the BLASTn e -value cutoff were fixed. The settings for these parameters were selected at the outset of testing. N was set to 1 to avoid matching non-orthologous genes, and the e -value cutoff was fixed at \(10^{-99}\) to ensure only homologous sequences were reported by BLASTn.
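To make the roles of N and the e-value cutoff concrete, the sketch below applies both to a list of HSP records. It is an illustration only: the record fields (`query_gene`, `evalue`, `bitscore`) are hypothetical, ranking by bit score is an assumption, and in practice the cutoff would normally be passed to BLASTn itself rather than applied afterwards.

```python
from collections import defaultdict

def top_hsps(hsps, top_n=1, evalue_cutoff=1e-99):
    """Keep the top_n highest-scoring HSPs per query gene among those passing the cutoff."""
    by_query = defaultdict(list)
    for hsp in hsps:
        if hsp["evalue"] <= evalue_cutoff:
            by_query[hsp["query_gene"]].append(hsp)
    return {
        gene: sorted(hits, key=lambda h: h["bitscore"], reverse=True)[:top_n]
        for gene, hits in by_query.items()
    }
```

With `top_n=1`, each query gene keeps at most one hit, which is the setting used to avoid matching non-orthologous genes.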

Parameter tests

A number of tests were performed to determine the effects of certain parameters on the gene matches graph. Specifically, we tested the effects of the parameter n (the number of genes selected) and the number of samples s on the number of large components and the number of ideal components. To accomplish this, we ran RNA-clique for various values of these parameters. For n , this was accomplished by directly setting the value of this parameter at the beginning of each run of RNA-clique. For s , we ran RNA-clique with subsets of samples of various sizes. In all tests, after each run of RNA-clique, the numbers of ideal components and large components in the gene matches graph were recorded.

For both the four-sample set and the 16-sample set, we tested the effect of varying parameter n , whereby we select the top n genes based on k -mer coverage (“ Assembling transcriptomes and selecting top genes ” section). We reasoned that genes with lower k -mer coverage are less likely to form ideal components, so that the number of ideal components should plateau at higher values of n . Greatly exceeding the number of genes required to reach that plateau would contribute to computation time with little or no gain of usable data for the subsequent distance comparisons. For the set of four tall fescue samples (Table 12 ), we ran RNA-clique with settings of the parameter n varying from 1000 to 306329 (the maximum number of genes among the four samples) in steps of 1000. For the set of 16 tall fescue samples (denoted \({\mathcal {F}}_{16}\) in this section; Table 11 ), we ran RNA-clique with a different sequence of parameter settings for n ; the settings in this sequence are the x -axis coordinates of the points in Fig. 9 . We used this sequence for the set of 16 tall fescue samples because the sequence increases exponentially, has easily readable values, and has many fewer elements than the sequence used for the set of four tall fescue samples. The second of these properties was important to capture the relationship between n and the number of components of each type for small values of n , and the last property was important for saving time since running RNA-clique requires more time for larger sets of samples. For both sets of samples, and for each setting of n , the numbers of ideal components and large components in the gene matches graph resulting from running RNA-clique with that setting were recorded, and these pairs of values were plotted to illustrate the relationships between the variables.

For the set of 16 tall fescue samples, we also tested the effect of the number of samples (i.e., the parameter s ) on the counts of each type of component in the resulting gene matches graph by running RNA-clique with subsets of various size. Of course, for \(0< s < 16\) , we have more than one subset \({\mathcal {S}} \subset {\mathcal {F}}_{16}\) such that \(|{\mathcal {S}}| = s\) (that is, the number of elements in S is s ), and, moreover, for \(0< s < 15\) , there exist \({\mathcal {S}} \subset {\mathcal {F}}_{16}\) and \({\mathcal {T}} \subset {\mathcal {F}}_{16}\) such that \(|{\mathcal {S}}| = |{\mathcal {T}}| - 1 = s\) and \({\mathcal {S}} \not \subset {\mathcal {T}}\) . Hence, testing the effect of s on the component counts by independently selecting a random subset of size s from \({\mathcal {F}}_{16}\) for each value of s tested could be a flawed approach.

Instead of independently selecting random subsets of size s for each value of s , we first selected a permutation of the elements of \({\mathcal {F}}_{16}\) . We then used size s prefixes of the permutation (that is, the first s elements of the permutation) as our subsets of size s . Using such prefixes ensured that each subset tested was a superset of the last; that is, the subset used for \(s + 1\) was always a superset of the subset used for s . We used this prefix approach for our first set of sample count tests. Specifically, we applied the prefix approach for a permutation in which samples were sorted by genotype and a permutation in which samples were interleaved by genotype. For each of these tests, we used \(n = 50000\) ; the selection of this value for n was informed by the results of our tests with \({\mathcal {F}}_{16}\) observing the effect of n on component counts. For each prefix of both permutations, we ran RNA-clique, and, again, the numbers of large and ideal components were recorded. The purpose of the genotype-interleaved and genotype-ordered tests was to allow us to see whether the ideal component count drops more dramatically when a sample with a new genotype is added.
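The two permutations and their prefixes can be sketched as follows. The genotype and sample labels are placeholders rather than the identifiers used in the study, and the helper names are hypothetical.

```python
from itertools import zip_longest

def genotype_ordered(samples_by_genotype):
    """All samples of the first genotype, then all of the second, and so on."""
    return [s for group in samples_by_genotype.values() for s in group]

def genotype_interleaved(samples_by_genotype):
    """Cycle through the genotypes, taking one sample from each in turn."""
    rounds = zip_longest(*samples_by_genotype.values())
    return [s for rnd in rounds for s in rnd if s is not None]

def prefixes(permutation):
    """Yield nested subsets: the first 1, 2, ..., len(permutation) elements."""
    for size in range(1, len(permutation) + 1):
        yield permutation[:size]

# Placeholder labels: two genotypes with two clones each.
toy = {"G1": ["G1-a", "G1-b"], "G2": ["G2-a", "G2-b"]}
ordered_prefixes = list(prefixes(genotype_ordered(toy)))
interleaved_prefixes = list(prefixes(genotype_interleaved(toy)))
```

In the ordered permutation a new genotype first appears only after all clones of the previous genotype have been added, whereas in the interleaved permutation every genotype is introduced within the first few prefixes.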

Prefix tests cannot address the problem that there are many possible subsets of size s from \({\mathcal {F}}_{16}\) , and, hence, they cannot fully capture the relationship between number of samples and component counts. To address this shortcoming, subsets of \({\mathcal {F}}_{16}\) were sampled using a “fair” strategy that tries subsets selected uniformly at random from subsets of a specific size and tries to spend the same amount of time on each size (i.e., each value of s ). Since computing the gene matches graph generally takes more time for larger values of s , the fair strategy can initially try more subsets for smaller values of s . Since the number of combinations \(\binom{16}{s}\) is increasing up to \(s = 8\) , this trend would not continue indefinitely; we would eventually exhaust all subsets for smaller values of s . For each subset \({\mathcal {S}}\) tried, we also varied values of n , but only the data for the case where \(n = 50000\) are reported and discussed here. For each subset of size s and each value of n , we ran RNA-clique and recorded the number of large and ideal components. For each subset of size s , we plotted the number of large components and ideal components to observe the relationship between s and the number of each kind of component. Using this fair sample count approach, we tested a total of 606 subsets of varying sizes.
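One way such a fair strategy could be scheduled is sketched below. The text specifies only that subsets are drawn uniformly at random and that roughly equal time is spent on each size, so the rest is an assumption: the "always serve the size with the least time spent" rule, the overall budget, and the `cost` callback (a stand-in for timing a real RNA-clique run) are all hypothetical.

```python
import math
import random

def fair_subset_schedule(items, sizes, cost, budget, rng):
    """Pick random subsets so that accumulated cost stays balanced across sizes."""
    spent = {s: 0.0 for s in sizes}
    tried = {s: set() for s in sizes}
    while True:
        # Sizes that still have untested subsets left.
        open_sizes = [s for s in sizes if len(tried[s]) < math.comb(len(items), s)]
        if not open_sizes or sum(spent.values()) >= budget:
            return tried
        size = min(open_sizes, key=lambda s: spent[s])  # least time spent so far
        subset = frozenset(rng.sample(items, size))
        if subset in tried[size]:
            continue  # already tested; draw again
        tried[size].add(subset)
        spent[size] += cost(subset)  # stand-in for the measured running time

# Toy run: six items, with a cost that grows quadratically in subset size.
schedule = fair_subset_schedule(
    items=list("ABCDEF"), sizes=[2, 3, 4, 5],
    cost=lambda subset: len(subset) ** 2, budget=200, rng=random.Random(0),
)
```

Because cheaper sizes accumulate cost more slowly, they receive more draws until their subsets run out, which mirrors the behavior described in the text.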

Distance tests

For the set of 16 tall fescue samples, the set of 24 bluehead wrasse samples, and the set of 16 simulated transcriptomes, pairwise distance matrices were estimated. In all tests, we set the parameter \(n = 50000\) . We visualized the distance matrices as heatmaps and principal coordinates analysis (PCoA) plots, and phylogenetic analysis employed the neighbor-joining algorithm implemented in Biopython’s Phylo module [ 13 ].

Distance tests with Cnidaria

The distance tests for the set of 16 tall fescue samples and the set of 24 bluehead wrasse samples were repeated using the existing method Cnidaria instead of RNA-clique. Although Cnidaria can use either raw RNA-seq data or assembled transcriptomes, the distance tests were only performed using the assembled transcriptome mode. The distance test for the set of bluehead wrasse samples was also repeated using a hybrid approach in which the graph-based filtering of RNA-clique was first used to select those genes with orthologs in all samples, and the resulting orthologs were provided as input to Cnidaria.

Resource usage tests

We measured the time and memory usage of both RNA-clique and Cnidaria for varying values of n , s , and j , the number of parallel jobs, using the set of 16 tall fescue samples. Because Cnidaria may be executed on either raw RNA-seq reads or assembled transcriptomes, we tested both configurations. We also calculated the resource usage for assembling the 16 tall fescue sample transcriptomes; a fair comparison between Cnidaria in RNA-seq read mode with either method in transcriptome mode should account for time needed to assemble reads into transcriptomes. Since resource usage depends on the quantity of input data, the top n genes were selected at the beginning of both the RNA-clique and transcriptome-based Cnidaria tests. Although selection of the top n genes is not part of the original Cnidaria method, it was necessary to perform this step for Cnidaria to ensure a fair comparison. Since selection of the top n transcripts was necessary for both RNA-clique and one of the Cnidaria modes, we measured the selection step separately.

Time usage of a program was measured as the total wall-clock time elapsed during execution of the program. Memory usage was measured as the maximum sum resident set size (RSS) of the program’s process tree during execution. The RSS measures only virtual memory of the process that occupies space in RAM. The sum RSS for the process tree was polled every 0.1 s using the procpath utility.
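The memory metric is the maximum over polls of the summed RSS, which is not the same as the sum of each process's own peak. A minimal sketch, assuming each poll is represented as a hypothetical mapping from process ID to RSS in bytes (this is not procpath's data model):

```python
def peak_sum_rss(snapshots):
    """Largest total RSS over all polls; each snapshot maps PID -> RSS in bytes."""
    return max(sum(rss_by_pid.values()) for rss_by_pid in snapshots)

# Toy polls of a two-process tree taken at successive instants.
polls = [
    {100: 500, 101: 100},
    {100: 200, 101: 300},
]
peak = peak_sum_rss(polls)
```

In the toy data the two processes peak at different polls, so the tree-level maximum is lower than the sum of the individual peaks.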

Tests of resource usage for varying values of n used the full set of 16 tall fescue samples and set n to the same set of values used for the parameter n tests of the 16 tall fescue samples described in “ Parameter tests ” section. Since the top n genes cannot be computed for the unassembled RNA-seq reads, we did not run Cnidaria in RNA-seq mode for the parameter n resource usage tests. Tests of resource usage for varying values of s set \(n = 50000\) and used prefixes of size 4 to 16 of a random permutation of the set of 16 tall fescue samples—this strategy was borrowed from the prefix tests in the parameter tests described in “ Parameter tests ” section.

Both RNA-clique and Cnidaria can benefit from parallelism by performing computation in multiple threads or processes. RNA-clique can select top genes, build BLAST databases and execute BLASTn searches in parallel. Cnidaria can build its Jellyfish k -mer databases using multiple threads and can also split its data into multiple “pieces” which may be analyzed in parallel [ 14 ]. For the tests of resource usage as n and s varied, no parallelism was utilized. We separately tested the effect of the number of parallel jobs j (i.e., threads or processes) on resource usage for both methods. In these parallelism tests, the full set of 16 tall fescue samples was used with the fixed parameter setting \(n = 50000\) . The number of parallel jobs was varied from 1 to 16.

Resource usage tests for assembly were performed with SPAdes (version 3.15.5). SPAdes was allowed to allocate up to 120 GB of memory (though no assembly required that amount of memory). Although assembly can benefit from parallelism by running multiple assemblies in parallel or increasing the number of threads to use with SPAdes, neither option was utilized; only a single assembly was run at a time with one thread.

All tests assessing resource usage were performed on a computer with an AMD Ryzen 9 3950X CPU @ 2.2 GHz. The CPU had 16 physical cores, and frequency boosting up to 4.761 GHz was enabled. The computer had 117 GiB of RAM, and all data were read from and written to a PCIe 4.0 NVMe drive.

figure 8

Large component and ideal component counts in the gene matches graph as the parameter n changes for the set of four tall fescue samples

figure 9

Large component and ideal component counts in the gene matches graph as the parameter n changes for the set of 16 tall fescue samples

Plots displaying gene matches graph component counts for varying values of n in the set of four tall fescue samples and the set of 16 tall fescue samples are shown in Figs.  8 and 9 , respectively. Counts for both component types almost always increased with n . The rate of increase in ideal components increased for small values of n but decreased for large values of n until the counts of ideal components leveled off.

figure 10

Large components and ideal components for prefixes of varying size s from a permutation of the 16 tall fescue samples in which samples are interleaved by genotype. Marker shapes denote the kind of component counted. Colors indicate the genotype of the last sample in the prefix

figure 11

Large components and ideal components for prefixes of varying size s from a permutation of the 16 tall fescue samples in which samples are ordered by genotype. Marker shapes denote the kind of component counted. Colors indicate the genotype of the last sample in the prefix

For our genotype-ordered permutation, we found that adding a sample of a genotype not already present resulted in a slightly greater decrease in ideal components than adding a sample with a genotype already present (Figs.  10 and 11 ).

figure 12

Large component and ideal component counts for randomly selected subsets of size s . The opacity of each point shown for s samples is inversely proportional to the number of subsets of size s tested

Figure  12 shows component counts for many randomly selected subsets of each size s from the set of 16 tall fescue samples. The variances in both component types decreased as s increased. (Note that there were fewer results for larger values of s , both because \(\binom{16}{s}\) (16 choose s ) is decreasing for \(s > 8\) and because tests become more time consuming as s increases, requiring the “fair” strategy to attempt fewer tests for large s .)

figure 13

Heatmap showing distance between samples in the set of 16 tall fescue samples. A scale mapping colors to distance values is shown on the right, and each cell of the heatmap is annotated with its distance expressed in ten thousandths. Note that no diagonal is shown for this matrix

The heatmap in Fig.  13 visualizes the distance matrix obtained for the set of 16 fescue samples. The samples are ordered by genotype and endophyte status on both axes. Distances measured ranged from \(0.0063\) to \(0.0092\) between samples.

Figure  1 visualizes the distance matrix for the 16 tall fescue samples using PCoA, in which samples of the same genotype formed clusters. Generally, the distance between two samples of the same genotype was less than the distance between two samples of different genotypes. Although three samples each from two of the genotypes either possessed or lacked endophyte, little or no effect of endophyte was observed in the PCoA plot. (No additional separation was evident in a 3-dimensional PCoA, not shown.)

figure 14

Heatmap showing distance between samples in the set of 24 bluehead wrasse samples. A scale mapping colors to distance values is shown on the right, and each cell of the heatmap is annotated with its distance expressed in ten thousandths. Note that no diagonal is shown for this matrix

Figure 14 is a heatmap visualizing the distance matrix for the set of 24 bluehead wrasse samples. The samples are ordered first by individual and then by genotype. Distances among the bluehead wrasse samples ranged from 0.0026 to 0.0056. For most samples, the closest sample was the other sample from the same individual. The exceptions were the individual 52 and individual 114 samples. The individual 52 forebrain was closest to the individual 114 gonad, and vice versa. Likewise, the individual 52 gonad was closest to the individual 114 forebrain, and vice versa. This stark result suggested that our method detected sample labeling errors.

figure 15

PCoA plot for the distance matrix of the 24 bluehead wrasse samples. Each point represents a sample, and color indicates the individual to which a sample was assigned in the SRA

The PCoA plot in Fig. 15 also visualizes the bluehead wrasse distance matrix. Although most samples were much closer to the other sample from the same individual than they were to any other sample, both individual 52 samples were closest to individual 114 samples, and both individual 114 samples were closest to individual 52 samples.

In the simulation study with 16 sets of sequences, the phylogenetic tree inferred from the calculated genetic distance matrix was topologically identical to the ground-truth tree in Fig.  7 .

figure 16

Heatmap showing distances computed by Cnidaria for the set of 16 tall fescue samples. A scale mapping colors to distance values is shown on the right, and each cell of the heatmap is annotated with its distance expressed in hundredths

Figure 16 visualizes the distance matrix computed with Cnidaria for the set of 16 tall fescue samples. Distances ranged from 0.32 to 0.56. Although the range differed from that for the distances computed using RNA-clique (Fig. 13 ), the two distance matrices showed a similar pattern. The distances between samples of the same genotype were lower than those between samples of different genotype in both matrices.

figure 17

PCoA plot for the distance matrix computed with Cnidaria for the 16 tall fescue samples. Color and shape indicate genotype, and fill indicates endophyte status

Figure 17 is a PCoA plot created from the matrix in Fig. 16 . As in the PCoA plot for the distance matrix computed using RNA-clique (Fig. 1 ), the samples clustered according to genotype, but the CTE27 and CTE46 clusters showed greater spread in the PCoA plot for the Cnidaria distance matrix.

figure 18

Heatmap showing distances computed with Cnidaria for the set of bluehead wrasse samples. A scale is shown to the right, and cells are annotated with distance values expressed in hundredths

The heatmap in Fig. 18 visualizes the distance matrix calculated by Cnidaria for the set of 24 bluehead wrasse samples. Unlike the samples in Fig. 14 , those in Fig. 18 are ordered first by tissue and second by individual. Distances ranged from 0.32 to 0.62. Distances between samples of the same tissue type were generally estimated to be smaller than those between samples of different tissue type. Although the lowest distances were not between samples from the same individual (as they were in Fig. 14 ), the values on the diagonal of the upper-right quadrant of the matrix (the submatrix consisting of distances between samples of different tissue type) showed that distances between samples from the same individual tended to be lower than distances between other pairs of samples from different tissue types.

figure 19

PCoA plot for the distance matrix computed with Cnidaria for the 24 bluehead wrasse samples. Color and shape denote tissue type

Figure 19 is a PCoA plot for the Cnidaria bluehead wrasse distance matrix. All forebrain/midbrain samples formed a cluster, but the gonad samples were apparently spread out into multiple small clusters along the second principal component axis. Nevertheless, the gonad samples were near each other on the first axis, and all gonad samples were distant from the forebrain/midbrain cluster.

figure 20

Heatmap showing distances computed with Cnidaria for the set of bluehead wrasse samples, after using RNA-clique to filter transcripts so that only genes in ideal components are included. A scale is shown to the right, and cells are annotated with distance values expressed in hundredths

figure 21

PCoA plot for the distance matrix computed with Cnidaria for the 24 bluehead wrasse samples, after using RNA-clique to filter transcripts so that only genes in ideal components are included. Color denotes tissue type

The heatmap in Fig. 20 visualizes the distance matrix obtained with the combined RNA-clique and Cnidaria approach (using RNA-clique to select genes with orthologs in all samples) for the set of 24 bluehead wrasse samples. As in Fig. 18 , samples were sorted by tissue type and individual. Samples of the same tissue type were typically less distant than samples of different tissue types, but the difference between tissue types was less extreme than that observed in Fig. 18 . Moreover, for any given sample, the best match was often the other sample from the same individual. Figure 21 is a PCoA plot for the distance matrix computed for the 24 bluehead wrasse samples using the hybrid approach. Although clusters were denser in Fig. 19 than in Fig. 21 , there nevertheless remained a clear separation between forebrain/midbrain and gonad samples in the latter plot.

figure 22

Execution times for running parts of various RNA-seq to distance matrix pipelines with varying numbers of samples and one parallel job. “Selection” is the script that selects the top \(n = 50000\) genes from each of the transcriptomes, which was executed before RNA-clique or Cnidaria in its assembled mode

Tests of the effect of sample count ( s ; Fig. 22 ) showed that, when only one parallel job was used, transcriptome assembly with SPAdes was the most time-consuming process in any of the pipelines for obtaining genetic distance matrices from RNA-seq data. The “Selection” process represented the selection of top n genes by k -mer coverage (“ Assembling transcriptomes and selecting top genes ” section), which was used in both the RNA-clique and assembled-mode Cnidaria pipelines. Times shown for RNA-clique and assembled-mode Cnidaria did not include the selection time. RNA-clique was the second or third most time-consuming process, depending on s . RNA-clique’s running time was approximately quadratic in s for the values of s tested; all other programs were roughly linear in s . Applying quadratic least-squares regression to the running times for RNA-clique produced a model ( \(r^2 = 0.9984\) ) of RNA-clique’s running time in seconds as a function of s , \(t_{\text {R}}(s) = 3263.683s^2 + 10541.403s + 8169.31\) . Likewise, applying linear least-squares regression to the running times for Cnidaria in assembled mode produced a model ( \(r^2 = 0.9995\) ) of Cnidaria’s running time in seconds as a function of s , \(t_{\text {C}}(s) = 414.866s + 672.287\) .
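For readers who want to reproduce this kind of fit, the sketch below performs polynomial least-squares regression and computes \(r^2\) using only the Python standard library. It is a generic illustration run on synthetic data generated from a known quadratic; it does not use the study's measurements, and the function names are hypothetical. In practice a routine such as NumPy's `polyfit` would typically be used instead.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations; returns [c0, c1, ...]."""
    m = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    rows = [
        [sum(x ** (i + j) for x in xs) for j in range(m)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(m)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(m):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][m] / rows[i][i] for i in range(m)]

def r_squared(xs, ys, coeffs):
    """Coefficient of determination for the fitted polynomial."""
    predicted = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Synthetic example: sample counts 4..16 with a made-up quadratic response.
s_values = list(range(4, 17))
times = [2 * s * s + 3 * s + 5 for s in s_values]
coeffs = fit_polynomial(s_values, times, degree=2)  # constant, linear, quadratic
fit_quality = r_squared(s_values, times, coeffs)
```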

figure 23

Maximum RSS for running parts of various pipelines with varying numbers of samples and one parallel job

Maximum RSS (memory usage) for varying values of s is shown in Fig. 23 . Although maximum RSS values for SPAdes assembly were recorded, the values were not included in the plot because they were much higher (as large as 14.66 GiB) than those for the other programs. Both modes of Cnidaria had a maximum RSS of 3.46 GiB, independent of the value of s . The selection process maximum RSS increased in steps due to differences in transcriptome size among the samples but never exceeded 135.75 MiB. Although memory usage for RNA-clique was lower than that for Cnidaria for \(s < 16\) , the maximum RSS of RNA-clique scaled roughly quadratically with s . Applying quadratic least-squares regression to the maximum RSS of RNA-clique produced a model ( \(r^2 = 0.9999\) ) of RNA-clique’s memory usage in MiB as a function of s , \(m_{\text {R}}(s) = 477.319s^2 + 1647.475s + 1480.589\) .

figure 24

Execution times for running parts of the RNA-clique and assembled-mode Cnidaria pipelines with varying values for n , the number of top genes to select

Figure 24 shows the execution times of the selection process, RNA-clique, and Cnidaria for various settings of the parameter controlling the number of top genes to select by k -mer coverage, n . Selection required very little time—always less than 150 s. The rate of change in running times in Fig. 24 decreased with n , causing the running times to level off.

figure 25

Maximum RSS for parts of the RNA-clique and assembled-mode Cnidaria pipelines with varying values for n , the number of top genes to select

Figure 25 shows the maximum RSS for the selection process, RNA-clique, and Cnidaria for varying values of n . As in the results measuring the effect of the number of samples s on maximum RSS, Cnidaria used no more than 3.46 GiB, regardless of parameter setting. The selection process maximum RSS increased slightly with n . The difference in memory usage for \(n = 226633\) (the maximum setting of n ) and for \(n = 1000\) was only 9.5 MiB, a \(7\%\) increase. The maximum RSS for RNA-clique likewise increased with n (and was generally much higher than the memory usage for selection), but the rate of change in maximum RSS for RNA-clique also decreased with n .

figure 26

Execution times for parts of various RNA-seq to distance matrix pipelines with varying numbers of parallel jobs

Figure 26 shows results of the tests of the effect of parallelism (number of parallel jobs) on running times of the selection process, Cnidaria (both raw and assembled mode), and RNA-clique. All steps saw much improvement in running time with additional parallel jobs, especially RNA-clique, for which the duration decreased by 5.49 hours ( \(88.3\%\) ).
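As a rough check on what this reduction implies, and assuming it compares the runs with 1 and 16 parallel jobs (the extremes of the range tested), the two running times and the speedup can be inferred from the reported figures. The symbols \(T_1\) and \(T_{16}\) are introduced here only for this calculation, and the values are derived estimates rather than reported measurements.

\[
T_1 \approx \frac{5.49\ \text{h}}{0.883} \approx 6.22\ \text{h}, \qquad
T_{16} \approx T_1 - 5.49\ \text{h} \approx 0.73\ \text{h}, \qquad
\frac{T_1}{T_{16}} = \frac{1}{1 - 0.883} \approx 8.5 .
\]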

figure 27

Maximum RSS for selection of top 50,000 genes with varying numbers of parallel jobs

For RNA-clique and Cnidaria, the maximum RSS increased very little (less than \(0.3 \%\) ) as the number of parallel jobs increased. The memory needed by the selection process increased much more (around \(1054 \%\) ) and increased roughly linearly with the number of parallel jobs. Only maximum RSS values for the selection process were included in Fig. 27 .

Results of the distance tests on plant, animal, and simulated testbeds suggest that the method proposed, RNA-clique, gives sufficiently accurate pairwise distances to distinguish RNA-seq samples according to genotype or individual. Moreover, results of the parameter tests suggest that, for sufficiently similar individuals, enough genes were retained in ideal components on which to base the genetic comparisons. In the tall fescue 16-sample testbed, selecting the top \(50000\) isotig sets by k -mer coverage gave more than \(5000\) ideal components on which to base the distance calculations, and even with a very narrow range of inferred distances, from approximately \(0.65\%\) to \(0.9\%\) , samples from each genotype clearly clustered in a 2D PCoA plot. Likewise, for the bluehead wrasse 24-sample testbed, the samples clustered by individual. Comparisons with an alternative method, Cnidaria, favor RNA-clique. Although Cnidaria may be more scalable than RNA-clique, results from RNA-clique appear more reliable.

The PCoA plot for the 16 tall fescue samples (shown in Fig.  1 ) shows four distant and non-overlapping clusters of individuals, one for each genotype, and the heatmap confirms that the distances between individuals of the same genotype are always relatively low compared to distances between individuals of different genotypes (Figure S1 ). Nevertheless, RNA-clique detects noise in the form of small differences for each pair of individuals with the same genotype. Although plants with the same genotype should be clones, there are no two individuals for which the similarity is computed to be exactly 1. Of course, it is possible some detected differences between clones reflect actual mutations, but differences may also stem from various sources of error. One class of error that could affect the accuracy of the distances is sequencing error. To understand the effect of a sequencing error on the calculated distance, suppose we have a pair of transcripts, \(t_1\) and \(t_2\) , in one of the filtered gene matches tables, and, due to a sequencing error, \(t_1\) has an erroneous base \(b'\) where it should have b in the aligned region. Also, let c represent the corresponding base in \(t_2\) . (We assume there is no sequencing error at that position in \(t_2\) .) If \(b = c\) , then the erroneous base will appear as a spurious mismatch (a “false positive” difference). If instead \(b \ne c\) and \(b' = c\) , the erroneous base will appear as a spurious identity (a “false negative” difference). Finally, if \(b \ne c\) and \(b' \ne c\) , the erroneous base has no effect for that pair of transcripts: RNA-clique correctly counts it as a mismatch (a “true positive” difference).
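The three cases in this argument can be written as a small decision function. This is an illustrative sketch of the case analysis only, not part of RNA-clique; the function name and return strings are hypothetical.

```python
def classify_sequencing_error(b, b_err, c):
    """Effect of reading b_err instead of the true base b, compared against base c."""
    if b_err == b:
        raise ValueError("b_err must differ from the true base b")
    if b == c:
        # The true bases agree, so the error creates a difference that is not real.
        return "spurious mismatch (false positive)"
    if b_err == c:
        # The true bases differ, but the error hides the difference.
        return "spurious identity (false negative)"
    # The true bases differ and still differ after the error.
    return "no effect (true positive)"

# Example: true base A in t1 is misread as G, and t2 has A at that position.
effect = classify_sequencing_error("A", "G", "A")
```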

Since the tall fescue samples are not haploid, homeologous transcripts may be a source of false differences. Specifically, if a genotype is heterozygous for some gene, but different alleles are captured in the transcriptomes of different clones, there is a risk that a transcript in one clone may erroneously be compared with a transcript that is not its true closest match in another clone. This kind of error would inflate the computed distances. Furthermore, even if all alleles are captured in the RNA-seq reads for all clones, there is a risk that the assembler may assemble reads belonging to different homeologs into a single isotig. If this happens inconsistently across different samples, the assembled transcripts for one clone may differ from those of another, and these differences could contribute to the computed distance between the clones. Such an assembly error could result in either overestimation or underestimation of distances.

The extent to which each of these factors contributes to the differences observed between samples of the same genotype may be explored in future research, and future refinements to RNA-clique may incorporate strategies for mitigating some factors. For example, sequencing and assembly errors may be detectable by consulting the original reads. Sequencing errors may appear as low-quality bases, and assembly errors could be detected by determining whether a detected difference between isotigs can be accounted for by an alternative assembly for one or both of the isotigs. In either case, differences identified as potentially spurious may be excluded from the distance calculation. Such refinements may be especially useful for very small or especially complex datasets. Although certain factors may lead to overestimation of distance in some circumstances, the results indicate that RNA-clique is effective at unambiguously grouping samples by genotype. The results of the tests with the set of 16 tall fescue samples also show that analyzing multiple samples per genotype is especially helpful for genotyping despite non-zero distances among clones since such distances are smaller than those between samples with different genotypes.

The results for the distance tests with the set of 24 bluehead wrasse samples show that RNA-clique can determine pairs of samples that belong to the same individual for at least 10 of the 12 individuals (20 of 24 samples). The method ostensibly gives some incorrect distances for individuals 52 and 114, but since RNA-clique identifies two pairs of closely related samples, both with one sample from each of the two individuals, we believe the error is likely caused by incorrect labeling of the samples. The labels for two samples of the same tissue type from individuals 52 and 114 may have been swapped in the SRA. That the swap is also evident in the results from Cnidaria (the upper-right quadrants of Figs. 18 and 20) suggests that the apparent mismatch is not a problem with RNA-clique. Furthermore, the results suggest that RNA-clique is a useful tool for verifying that RNA samples are correctly attributed to source individuals.

A comparison between the results obtained from RNA-clique and those obtained from Cnidaria shows that RNA-clique is as reliable as or more reliable than Cnidaria, depending on the dataset. Results obtained by the two methods for the set of 16 tall fescue samples are very similar (though the scales of the distances are different). Nevertheless, the CTE27 and CTE46 clusters in the PCoA plot of the Cnidaria results (Fig. 17) are less dense than those in the corresponding plot of the RNA-clique results. Since we expect that samples of the same genotype should be identical, and, thus, should have no distance, this difference in the two plots may indicate that RNA-clique gives more accurate distances for these genotypes than does Cnidaria. In contrast, results obtained with the two methods for the set of 24 bluehead wrasse samples are markedly different. Almost all samples in the PCoA plot for RNA-clique (Fig. 15) form two-sample clusters according to individual as expected, but for Cnidaria, samples instead cluster according to tissue type (Fig. 19).

The Cnidaria method fails to identify the same genotypes in the bluehead wrasse dataset but succeeds with the tall fescue dataset. We considered as a possibility that the different tissues in the fish expressed sufficiently different sets of genes that most k-mers were specific to one or the other tissue. However, applying our ideal components strategy, which is meant to filter for true orthologs, does not qualitatively change the outcome. Another possibility is that, despite filtering for orthologs, the mRNA structures are sufficiently different due to, for example, alternative splicing [15, 16]. An alternatively spliced intron would lead to a number of unique k-mers comparable to the k-mer length, and those may dominate the distance calculation. In contrast, the distance used in RNA-clique is designed to avoid any effect of such differences in mRNA structure, and, perhaps for this reason, succeeds with the fish RNA-seq testbed.
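The k-mer argument can be made concrete with a toy example. The sketch below is not Cnidaria's implementation; the sequences and the helper names `kmers` and `splice_specific_kmers` are made up for illustration. It compares a transcript that retains an intron with one that has it spliced out and collects the k-mers found in only one of the two. The spliced form can contribute at most k − 1 unique k-mers (those spanning the exon-exon junction), which is the "comparable to the k-mer length" effect described above.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def splice_specific_kmers(exon1, intron, exon2, k):
    """Return k-mers unique to the spliced and to the intron-retaining isoform."""
    retained = exon1 + intron + exon2   # isoform that keeps the intron
    spliced = exon1 + exon2             # isoform with the intron removed
    only_spliced = kmers(spliced, k) - kmers(retained, k)
    only_retained = kmers(retained, k) - kmers(spliced, k)
    return only_spliced, only_retained


if __name__ == "__main__":
    # Toy sequences chosen for illustration only.
    exon1, intron, exon2 = "ACGTTGCAAGCT", "GTAAGTCCCTTTTAG", "GGATCCATGACA"
    k = 5
    only_spliced, only_retained = splice_specific_kmers(exon1, intron, exon2, k)
    print(len(only_spliced), "k-mers unique to the spliced isoform (at most", k - 1, ")")
    print(len(only_retained), "k-mers unique to the intron-retaining isoform")
```

Even though the two isoforms share every exonic base, a k-mer profile registers them as different, whereas a distance based on aligned exonic positions would not.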

Tests assessing the effect of parameter n, the number of top genes selected at the beginning of our method, on the number of ideal components in the gene matches graph reveal that there are diminishing returns for selecting more genes past a certain point (for the set of four tall fescue samples, we judge around \(n = 20000\)). For the set of 16 tall fescue samples, the difference between the count of ideal components at \(n = 50000\) and at the maximum value for n, \(n = 226633\), was only \(216\); the increase in ideal components was only approximately \(3.5\%\). Therefore, for that study we judge \(50000\) genes to be adequate for the analysis, and this represents a substantial savings in time compared to exhaustive analysis.

Still, it is apparent that the extent to which we benefit (in terms of ideal component count) from selecting more genes depends on the number and kinds of samples we have, among other factors. The ideal component count increases little past \(n = 50000\) for the set of 16 tall fescue samples, but there is still much that can be gained from selecting more than \(50000\) genes in the set of four tall fescue samples. Future work may focus on modeling relationships between the ideal component count and the parameters n and s. Such a model might be useful for selecting appropriate values of n for new data if we can extrapolate predicted ideal component counts for large values of n from counts for smaller values of n for which the gene matches graph is faster to build.

Tests assessing the effect of the parameter s on the component counts show that although we obtain fewer ideal components on average as we increase s for a given value of n, we typically lose fewer components with each successive sample. Of course, some individuals in the set of 16 tall fescue samples are expected to be much more closely related than others, and the genotype-interleaved tests suggest that the similarity of a newly added sample to those previously considered can affect the decrease in ideal components. As we might expect, sufficiently dissimilar samples can cause the component count to drop to zero; we observed this with simulated data when we used a mutation rate of \(0.1\) (data not shown) instead of the rate of \(0.01\) we used for the tests described here. For very distantly related pairs of samples, there may be no BLAST hits at all; if such a pair is present among the set of samples, the gene matches graph will have no ideal components. For other sets of samples, there may be BLAST hits for every pair, but there may still be insufficient hits to form an ideal component. The effect of the samples’ similarity on the number of ideal components we obtain is a possible topic of future research that could be explored with additional simulated data. Specifically, observing how the number of ideal components we obtain varies as we change mutation rate may provide some insight into the relationship between similarity and ideal component count.

Results from the resource usage tests show that Cnidaria scales better than RNA-clique in terms of memory and time requirements, but RNA-clique’s resource usage is nevertheless sufficiently small to make it a practical method for handling moderately large sets of samples. Extrapolation with the regression models of running time and memory usage for RNA-clique (\(t_{\text{R}}\) and \(m_{\text{R}}\), respectively; “Resource usage tests” section) predicts that the computer used for the resource usage tests should be able to run RNA-clique with sets containing as many as 94 samples (\(m_{\text{R}}(94) \le 117 \times 2^{10} < m_{\text{R}}(95)\)), which would take 9.21 days with a single parallel job, or 25.85 hours with 16 parallel jobs. Given enough memory, RNA-clique should be able to handle in one week sets of up to 82 samples with one parallel job (\(t_{\text{R}}(82) \le 60^2 \times 24 \times 7 < t_{\text{R}}(83)\)) or up to 239 samples with 16 parallel jobs (\((1 - 0.883) \times t_{\text{R}}(239) \le 60^2 \times 24 \times 7 < (1 - 0.883) \times t_{\text{R}}(240)\)). To run RNA-clique with 82 samples would require 87.88 GiB, and to run RNA-clique with 239 samples would require 741.862 GiB. In contrast, Cnidaria should be able to handle very large sets of samples. The model for Cnidaria’s time usage (\(t_{\text{C}}\); “Resource usage tests” section) suggests that Cnidaria should be able to handle in one week sets of up to 8747 samples with one parallel job (\(t_{\text{C}}(8747) \le 60^2 \times 24 \times 7 < t_{\text{C}}(8748)\)) or up to 15884 samples with 16 parallel jobs (\((1 - 0.449) \times t_{\text{C}}(15884) \le 60^2 \times 24 \times 7 < (1 - 0.449) \times t_{\text{C}}(15885)\)).
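The arithmetic behind these figures can be checked with a few lines of Python. This sketch does not reproduce the regression models \(t_{\text{R}}\) and \(t_{\text{C}}\), which are defined elsewhere in the paper; it only applies the factors quoted above, reading 0.883 as the fractional reduction in RNA-clique's running time with 16 parallel jobs. The function names are hypothetical.

```python
# One week in seconds: the time budget used in the inequalities above.
SECONDS_PER_WEEK = 60**2 * 24 * 7


def parallel_time(serial_time, reduction):
    """Scale a single-job running time by the fractional reduction from parallelism."""
    return (1.0 - reduction) * serial_time


def serial_budget(parallel_budget, reduction):
    """Largest single-job running time that still fits a parallel wall-clock budget."""
    return parallel_budget / (1.0 - reduction)


if __name__ == "__main__":
    # 9.21 days with one job, reduced by 88.3% with 16 jobs (values from the text).
    hours = parallel_time(9.21 * 24, 0.883)
    print(f"16 jobs: about {hours:.2f} hours")
    # Single-job running time that fits into one week when 16 jobs are used.
    days = serial_budget(SECONDS_PER_WEEK, 0.883) / (60**2 * 24)
    print(f"one-week budget with 16 jobs covers about {days:.1f} single-job days")
```

The first result agrees with the reported 25.85 hours to within rounding, since 9.21 days is itself a rounded value.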

Since RNA-clique appears to give more accurate results than Cnidaria, we believe RNA-clique should be the preferred method despite the latter method’s superior scalability. Still, the sources of error in Cnidaria’s distance matrix for the bluehead wrasse data are not fully known. Future work could focus on identifying these sources of error with the goal of improving the method or determining on which datasets Cnidaria can be used reliably.

Future work

In addition to the possible future directions mentioned above, we would also like to further test our method using more synthetic data designed to simulate a wider range of scenarios. Since many commonly studied organisms are diploid or polyploid, we are especially interested in simulating hybridization of closely related taxa to investigate the effect that the presence of homeologs has on the accuracy of the calculated distances and correct matching of orthologs.

Although we think using simulated data would allow us to study more precisely how the number of samples s and samples’ relatedness affect ideal component count, we also plan to test this approach on data for larger—and perhaps more diverse—sets of organisms. Such tests may better inform us of the practical limitations of the method proposed.

Finally, we would like to explore the mathematical properties of the distances we compute and possibly refine our method based on our findings. Although we often describe the quantities we compute for each pair of samples as “distances”, we have not proven that our distance, as a function of a pair of transcriptomes, satisfies all properties one expects to hold for a distance metric. In particular, we believe the distance we compute may not necessarily be symmetric; i.e., computing the distance between sample A and sample B may not give the same result as computing the distance between sample B and sample A. We also have not proved that the triangle inequality holds; we do not know that the sum of the distances from A to B and from B to C is never less than the distance from A to C. We have yet to observe a counterexample for either property, but we have so far only tested RNA-clique on “realistic” data that may not be likely to explore cases in which these properties would be violated.
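A computed distance matrix can be screened empirically for counterexamples to both properties. The following is a minimal sketch, assuming the pairwise distances are available as a square list of lists in which `dist[i][j]` is the distance computed from sample i to sample j; the function name and the tolerance are illustrative choices, not part of RNA-clique. Finding no violations on a given dataset does not prove that the properties hold in general.

```python
from itertools import permutations


def check_metric_properties(dist, tol=1e-12):
    """Return (asymmetric_pairs, triangle_violations) for a square distance matrix.

    asymmetric_pairs    -- pairs (i, j), i < j, with dist[i][j] != dist[j][i]
    triangle_violations -- triples (a, b, c) with dist[a][c] > dist[a][b] + dist[b][c]
    """
    n = len(dist)
    asymmetric_pairs = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if abs(dist[i][j] - dist[j][i]) > tol
    ]
    triangle_violations = [
        (a, b, c)
        for a, b, c in permutations(range(n), 3)
        if dist[a][c] > dist[a][b] + dist[b][c] + tol
    ]
    return asymmetric_pairs, triangle_violations


if __name__ == "__main__":
    # Made-up matrix: going 0 -> 1 -> 2 is shorter than going 0 -> 2 directly.
    example = [
        [0.0, 1.0, 5.0],
        [1.0, 0.0, 1.0],
        [5.0, 1.0, 0.0],
    ]
    print(check_metric_properties(example))
```

The triangle check examines every ordered triple, so its cost grows cubically with the number of samples, which is negligible for sample sets of the sizes discussed here.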

Despite the aggressive filtering applied throughout the proposed method and the inherent limitations of considering only transcribed sequences, we find the approach described in this paper satisfactorily measures differences among closely related individuals in tests with both real and simulated data. Although the amount of data remaining after filtering depends on the number of samples used and the relatedness of those samples, the filtering process retains enough data to get useful pairwise distances for the testbed examples, provided that we set the parameter n sufficiently high.

The method has been tested on a hexaploid grass, a vertebrate animal, and simulated data with satisfactory results that suggest RNA-clique may be equipped to handle other organisms of practical interest that possess similarly complex genomes, including humans and many other animals. The method is not without some limitations. Applying RNA-clique to simulated data generated using a high mutation rate (data not shown) revealed that samples may be too distantly related to compare with this method. Likewise, there may be some datasets where samples are too closely related to distinguish above the noise. Comparisons of time and memory usage for RNA-clique versus Cnidaria suggest that the latter may sometimes be preferable for very large sets of samples, with the caveat that Cnidaria may not produce results that are as accurate, depending on the nature of the sample sets. Therefore, if the dataset is too large for RNA-clique, it may be a useful strategy to check results of Cnidaria against results of RNA-clique on a subset of samples.

Although further work is required to determine how distantly or closely related the samples may be in order for RNA-clique to be practical, we nevertheless think that the results of our tests indicate the method proposed here is useful for generating pairwise distance matrices based on multiple RNA-seq datasets for a wide range of organisms and experiments.

Data availability

The tall fescue RNA-seq data analyzed during the current study are available from the NCBI Sequence Read Archive at https://www.ncbi.nlm.nih.gov/sra using the accessions provided in Tables 11 and 12. The bluehead wrasse data are likewise available from the Sequence Read Archive and may be found using the BioSample accessions provided in Table 13. The simulated transcriptomes analyzed during the current study are available from the corresponding author on reasonable request.

References

López A, Carreras C, Pascual M, Pegueroles C. Evaluating restriction enzyme selection for reduced representation sequencing in conservation genomics. Mol Ecol Resour 2023.

Dinkins RD, Nagabhyru P, Young CA, West CP, Schardl CL. Transcriptome analysis and differential expression in tall fescue harboring different endophyte strains in response to water deficit. Plant Genome. 2019;12(2): 180071.

Aflitos SA, Severing E, Sanchez-Perez G, Peters S, de Jong H, de Ridder D. Cnidaria: fast, reference-free clustering of raw and assembled genome and transcriptome NGS data. BMC Bioinf. 2015;16(1):1–10.

Tatusov RL, Koonin EV, Lipman DJ. A genomic perspective on protein families. Science. 1997;278(5338):631–7.

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10.

Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinf. 2009;10:1–9.

Liu H, Lamm MS, Rutherford K, Black MA, Godwin JR, Gemmell NJ. Large-scale transcriptome sequencing reveals novel expression patterns for key sex-related genes in a sex-changing fish. Biol Sex Differ. 2015;6:1–20.

Arumuganathan K, Tallury S, Fraser M, Bruneau A, Qu R. Nuclear DNA content of thirteen turfgrass species by flow cytometry. Crop Sci. 1999;39(5):1518–21.

Humphreys M, Thomas HM, Morgan W, Meredith M, Harper J, Thomas H, et al. Discriminating the ancestral progenitors of hexaploid Festuca arundinacea using genomic in situ hybridization. Heredity. 1995;75(2):171–4.

Dinkins RD, Nagabhyru P, Graham MA, Boykin D, Schardl CL. Transcriptome response of Lolium arundinaceum to its fungal endophyte Epichloë coenophiala. New Phytol. 2017;213(1):324–37.

Bushmanova E, Antipov D, Lapidus A, Prjibelski AD. rnaSPAdes: a de novo transcriptome assembler and its application to RNA-Seq data. GigaScience. 2019;8(9):giz100. https://doi.org/10.1093/gigascience/giz100

Sukumaran J, Holder MT. DendroPy: a Python library for phylogenetic computing. Bioinformatics. 2010;26(12):1569–71.

Cock PJA, Antao T, Chang JT, Chapman BA, Cox CJ, Dalke A, et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics. 2009;25(11):1422–3. https://doi.org/10.1093/bioinformatics/btp163

Marçais G, Kingsford C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 2011;27(6):764–70. https://doi.org/10.1093/bioinformatics/btr011

Marasco LE, Kornblihtt AR. The physiology of alternative splicing. Nat Rev Mol Cell Biol. 2023;24(4):242–54.

Gómez-Redondo I, Planells B, Navarrete P, Gutiérrez-Adán A. Role of alternative splicing in sex determination in vertebrates. Sex Dev. 2021;15(5–6):381–91.

Acknowledgements

We acknowledge Dr. Padmaja Nagabhyru for providing advice and suggesting and providing the tall fescue datasets used in this work.

This work was supported by U.S. National Science Foundation grant DEB 2030225, U.S. Department of Agriculture National Institute of Food and Agriculture Multi-state project 7003566, and the Harry E. Wheeler endowment to the University of Kentucky.

Author information

Authors and affiliations.

Department of Computer Science, University of Kentucky, 329 Rose St, Lexington, KY, 40508, USA

Andrew C. Tapia, Jerzy W. Jaromczyk & Neil Moore

Department of Plant Pathology, University of Kentucky, 1405 Veterans Dr, Lexington, KY, 40546, USA

Christopher L. Schardl

Contributions

ACT, JWJ, NM, and CLS contributed to the conceptualization of the methods used and reviewed and edited the manuscript. ACT prepared the manuscript and wrote new software used.

Corresponding author

Correspondence to Andrew C. Tapia.

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary file 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Tapia, A.C., Jaromczyk, J.W., Moore, N. et al. RNA-clique: a method for computing genetic distances from RNA-seq data. BMC Bioinformatics 25, 205 (2024). https://doi.org/10.1186/s12859-024-05811-9

Received: 30 November 2023

Accepted: 15 May 2024

Published: 04 June 2024

DOI: https://doi.org/10.1186/s12859-024-05811-9

  • Genetic distance
  • Graph algorithms
  • Phylogenetics

BMC Bioinformatics

ISSN: 1471-2105

What is inbound marketing?

Inbound marketing is a methodology to attract loyal customers to your business by aligning with your target audience's needs. Creating tailored marketing experiences through valuable content is the core of an inbound marketing strategy that helps you drive customer engagement and growth.

Inbound Marketing Overview

The inbound methodology is the strategic method of growing your organization by building meaningful, lasting relationships with consumers, prospects, and customers as opposed to interrupting them with traditional advertising methods. It’s about valuing and empowering these people to reach their goals at any stage in their journey with you.

Why? Because when your customers succeed, you succeed.

The inbound methodology can be applied in three ways:

  • Attract : drawing in the right people with valuable content and conversations that set you up as a trusted advisor.
  • Engage : presenting insights and solutions that align with their pain points and goals so they are more likely to buy from you. 
  • Delight : offering help and support to empower your customers to find success with your product. 

Why is inbound marketing important?

When customers find success and share that success with others, it attracts new prospects to your organization, creating a self-sustaining loop. This is how your organization builds momentum, and this is why the inbound methodology serves as a strong foundation for your flywheel.

[Figure: attract, engage, delight flywheel, with growth at the center and strangers, prospects, customers, and promoters around the outside.]

To reach and engage with that target audience effectively, you need to shift your business focus toward inbound marketing strategies. From HubSpot Co-Founder Brian Halligan’s perspective, "[If] you create all this content, and it's rich content — it's informative, it [will] pull people in…so people fall in love with your brand." By using social media, email marketing, blogging, and a truly exceptional website, you can create valuable, informative, and engaging content that pulls people in and cultivates a genuine connection with your brand. Embracing the inbound methodology serves as a strong foundation for building momentum to attract new prospects and ultimately drive business growth.

How does inbound marketing work?

Inbound marketing strategies will help you effectively market to your target audience the inbound way. Keep your flywheel spinning and help your business grow better.

Attracting Strategies

Inbound marketing strategies that attract your target audience and customer personas are tied to content creation and development.

To reach your audience, start by creating and publishing content — such as blog articles, content offers, and social media — that offers value. Examples include guides on how to use your products, information about how your solution can solve their challenges, customer testimonials, and details about promotions or discounts.

To attract your audience members on a deeper level through inbound marketing, optimize all this content with an SEO strategy. Target specific keywords and phrases related to your products or services, the challenges you solve for customers, and the ways you help customers.

This SEO strategy will allow your content and information to organically appear on the search engine results page (SERP) when people search for this information. These folks are your target audience, and likely the right customers for your business.

Engaging Strategies

When using inbound strategies to engage your audience, make sure you’re communicating and dealing with leads and customers in a way that makes them want to build long-term relationships with you. When using these engagement strategies, inject information about the value your business will give them.

Specific engagement strategies may include how you handle and manage your inbound sales calls. Focus on how customer service representatives handle calls from interested people and prospects. Additionally, be sure you’re always solution selling rather than product selling. This helps make sure all deals end in mutually beneficial agreements for customers and your business; that is, you offer value to your right-fit customers.

Delighting Strategies

Inbound strategies that delight make sure customers are happy, satisfied, and supported long after they buy. These strategies involve your team members becoming advisors and experts who can assist customers at any time.

Incorporating thoughtful, well-timed chatbots and surveys to help, support, and request feedback is a great way to delight your customers. Bots and surveys should be shared at specific points throughout the customer’s journey to make sure they are relevant and valuable.

For example, chatbots may help current customers set up a new technique or tactic you've started offering that they’d like to take advantage of. Additionally, a satisfaction survey may be sent out six months after customers buy your product or service to get their feedback and review ideas for improvement.

Social media listening is another important strategy when it comes to delighting customers. Social media followers may use one of your profiles to give feedback, ask questions, or share their experience with your products or services. Show that you hear and care by responding to these interactions with information that helps, supports, and encourages followers.

Lastly, the mark of an inbound strategy focused on delighting customers is one that helps and supports customers in any situation, whether your business gets any value out of it or not. Remember, a delighted customer becomes a brand advocate and promoter, so handle all interactions, both big and small, with care.

Get Started With Your Inbound Marketing Strategy

As an inbound marketer, your goal is to attract new prospects to your company, engage with them at scale, and delight them individually.

You also partner with your sales and services teams to keep the flywheel spinning effectively and help the business grow. It's a big job, but the inbound methodology and Marketing Hub have you covered.

Attract Tools
  • Ads
  • Video
  • Blogging
  • Social media
  • Content strategy

Engage Tools
  • Lead flows
  • Email marketing
  • Lead management
  • Conversational bots
  • Marketing automation

Delight Tools
  • Smart content
  • Email marketing
  • Conversations inbox
  • Attribution reporting
  • Marketing automation

You don’t want just anyone coming to your website. You want people who are most likely to become leads and, ultimately, happy customers. How do you get them there? You attract more of the right customers with relevant content at the right time.

Use the content strategy tool to build your authority in search and rank for the topics that matter the most to your prospects. Publish your blog post or video content across social networks using social media tools. Create ads to increase awareness of your brand with your target audience. Throughout each stage, reporting and analytics will help you know what’s working and where you need to improve.

Use HubSpot Conversations to create lasting relationships with prospects on the channels they prefer: email, bots, live chat, or messaging apps. Use conversion tools such as CTAs, forms, and lead flows to capture the information of prospects visiting your site.

Use all the prospect and customer information in the CRM to personalize the website experience using smart content, and the entire buyer’s journey using email and workflows. Create brand loyalty by targeting specific audiences with your social content or ads. Connect your favorite tools to HubSpot to fit the unique needs of your business.

Use email and marketing automation in conjunction with HubSpot Conversations to deliver the right information to the right person at the right time, every time. Use the Conversations inbox to align with your sales and service team members to create contextual conversations with the people you do business with. Create memorable content your prospects can share with their friends and family by using a variety of content formats that your prospects prefer, such as video.

By combining the inbound methodology with HubSpot software, you’ll grow your business and get customers to buy more, stay with you longer, refer their friends, and tell the world they love you.

Learn Inbound Marketing

Sign up for a free HubSpot Academy course to learn inbound marketing, access free tools to try inbound yourself, and get certified. Grow your business and your career with the inbound methodology.

Gravity data fusion using wavelet transform and window weighting: a case study in the Ross Sea of Antarctica

  • Song, Haibin
  • Bai, Yongliang
  • Yan, Quanshu

Satellite gravity anomaly data are characterized by wide coverage and high overall normalized quality, and these data can be used in large-scale regional structural research. However, detailed information on local areas is often lost after smoothing. High-resolution ship-borne gravity anomaly data can better identify fault zones and block boundaries at key locations, compensating for low-resolution satellite gravity data. In this study, gravity data derived from multiple techniques are combined using wavelet transforms: fusion rules for the high- and low-frequency wavelet coefficients are established, and the complementary use and effective fusion of the multi-source gravity data are realized. From a large amount of ship-borne data collected in the Ross Sea of Antarctica, 1434 valid survey lines with a total length of 98,204 km are obtained in the study area. After adjustment, the root mean square of the crossover errors is \(\pm 1.92 \times 10^{-5}\) m/s\(^2\). Different wavelet functions and decomposition levels are used, and the concept of window weighting is introduced to further fuse the useful information of the two data types. Thus, higher-resolution data are obtained with fewer errors. When all line data are fused, the minimum RMS difference between the optimal fusion result and the ship measurement data is \(1.64 \times 10^{-5}\) m/s\(^2\), an improvement in accuracy of \(1.66 \times 10^{-5}\) m/s\(^2\). When 80% of the data are used for fusion and the remaining 20% for validation, a considerable portion of the validation lines lie in areas that the lines used for fusion do not cover, yet the method still effectively improves the accuracy of the fused data. This method can be applied to most gravity data.
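The general idea of wavelet-domain fusion can be illustrated on two co-located one-dimensional profiles. The sketch below is not the method of the study: it assumes a single-level Haar transform, takes the low-frequency (approximation) coefficients from the satellite profile, and blends the high-frequency (detail) coefficients using a single hypothetical `weight` parameter that stands in for the window weighting mentioned in the abstract. The study's actual wavelet functions, decomposition levels, and fusion rules are not given here.

```python
import math


def haar_decompose(signal):
    """One-level Haar transform: return (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_reconstruct(approx, detail):
    """Invert haar_decompose."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def fuse(satellite, ship, weight=1.0):
    """Fuse two equal-length profiles sampled at the same points.

    Low-frequency content comes from the satellite profile; high-frequency
    detail is a blend controlled by `weight` (1.0 = ship detail only,
    0.0 = satellite detail only). `weight` is an illustrative assumption.
    """
    sat_approx, sat_detail = haar_decompose(satellite)
    _, ship_detail = haar_decompose(ship)
    fused_detail = [
        weight * sd + (1.0 - weight) * td
        for sd, td in zip(ship_detail, sat_detail)
    ]
    return haar_reconstruct(sat_approx, fused_detail)


if __name__ == "__main__":
    satellite = [10.0, 10.0, 12.0, 12.0]   # smooth, made-up values
    ship = [9.0, 11.0, 11.5, 12.5]         # same trend with local detail
    print(fuse(satellite, ship))
```

In practice a multi-level decomposition with a smoother wavelet would be more typical, and real ship tracks would first need to be interpolated onto the satellite grid.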

  • Gravity data fusion;
  • Wavelet transform;
  • Window weighting;
  • Wavelet decomposition

IMAGES

  1. The Ivey Case Study Method

    a case study methodology is useful in

  2. a case study research methodology is useful in

    a case study methodology is useful in

  3. How to Create a Case Study + 14 Case Study Templates

    a case study methodology is useful in

  4. How to Write a Case Study

    a case study methodology is useful in

  5. a case study research methodology is useful in

  6. parts of a case study analysis

VIDEO

  1. Case Study Methodology

  2. Case Study Methodology

  3. Case Study || Research Methodology || Part 11

  4. CASE STUDIES @ IBS HYDERABAD

  5. Episode 27: IIDS Webinar on Politics, Economy and Public Policy

  6. Case Study methodology #education #viral #anime #newvideo #love #kdrama #livestream #instagram

COMMENTS

  1. Case Study

    Definition: A case study is a research method that involves an in-depth examination and analysis of a particular phenomenon or case, such as an individual, organization, community, event, or situation. It is a qualitative research approach that aims to provide a detailed and comprehensive understanding of the case being studied.

  2. Case Study Methods and Examples

    The purpose of case study research is twofold: (1) to provide descriptive information and (2) to suggest theoretical relevance. Rich description enables an in-depth or sharpened understanding of the case. It is unique given one characteristic: case studies draw from more than one data source. Case studies are inherently multimodal or mixed ...

  3. Case Study Methodology of Qualitative Research: Key Attributes and

    A case study is one of the most commonly used methodologies of social research. This article attempts to look into the various dimensions of a case study research strategy, the different epistemological strands which determine the particular case study type and approach adopted in the field, discusses the factors which can enhance the effectiveness of a case study research, and the debate ...

  4. What is a Case Study?

    A case study protocol outlines the procedures and general rules to be followed during the case study. This includes the data collection methods to be used, the sources of data, and the procedures for analysis. Having a detailed case study protocol ensures consistency and reliability in the study.

  5. 5 Benefits of the Case Study Method

    Through the case method, you can "try on" roles you may not have considered and feel more prepared to change or advance your career. 5. Build Your Self-Confidence. Finally, learning through the case study method can build your confidence. Each time you assume a business leader's perspective, aim to solve a new challenge, and express and ...

  6. What the Case Study Method Really Teaches

    What the Case Study Method Really Teaches. Summary. It's been 100 years since Harvard Business School began using the case study method. Beyond teaching specific subject matter, the case study ...

  7. Case Study Method: A Step-by-Step Guide for Business Researchers

    Some famous books about case study methodology (Merriam, 2002; Stake, 1995; Yin, 2011) provide useful details on case study research but they emphasize more on theory as compared to practice, and most of them do not provide the basic knowledge of case study conduct for beginners (Hancock & Algozzine, 2016). This article is an attempt to bridge ...

  8. Continuing to enhance the quality of case study methodology in health

    Purpose of case study methodology. Case study methodology is often used to develop an in-depth, holistic understanding of a specific phenomenon within a specified context. 11 It focuses on studying one or multiple cases over time and uses an in-depth analysis of multiple information sources. 16,17 It is ideal for situations including, but not limited to, exploring under-researched and real ...

  9. The case study approach

    The case study approach is particularly useful to employ when there is a need to obtain an in-depth appreciation of an issue, event or phenomenon of interest, in its natural real-life context. Our aim in writing this piece is to provide insights into when to consider employing this approach and an overview of key methodological considerations ...

  10. Case Study

    A case study is a detailed study of a specific subject, such as a person, group, place, event, organisation, or phenomenon. Case studies are commonly used in social, educational, clinical, and business research. A case study research design usually involves qualitative methods, but quantitative methods are sometimes also used.

  11. Case Study Methodology

    Case study methodology has as a central purpose to study a bounded system, an individual, whether that individual is a person, an institution, or a group, such as a school class. The purpose is to provide an in-depth understanding of a case. There can be different kinds of case studies, including intrinsic, instrumental, or multiple cases.

  12. (PDF) Qualitative Case Study Methodology: Study Design and

    McMaster University, West Hamilton, Ontario, Canada. Qualitative case study methodology provides tools for researchers to study complex phenomena within their contexts. When the approach is ...

  13. Case Study Method: A Step-by-Step Guide for Business Researchers

    decisions regarding case study design (Yazan, 2015). Some famous books about case study methodology (Merriam, 2002; Stake, 1995; Yin, 2011) provide useful details on case study research but they emphasize more on theory as compared to practice, and most of them do not provide the basic knowledge of case study conduct for beginners (Hancock

  14. Case Study

    The case study method is very useful in examining research questions related to real life (Yin, 2004). Case study research has some unique features compared to other methods. Only the case study can provide an in-depth understanding of real-life situations (Hayes, 2000).

  15. What is a case study?

    Case study is a research methodology, typically seen in social and life sciences. There is no one definition of case study research.1 However, very simply… 'a case study can be defined as an intensive study about a person, a group of people or a unit, which is aimed to generalize over several units'.1 A case study has also been described as an intensive, systematic investigation of a ...

  16. Case Study: Definition, Examples, Types, and How to Write

    Intrinsic case studies are useful for learning about unique cases. Instrumental case studies help look at an individual to learn more about a broader issue. A collective case study can be useful for looking at several cases simultaneously. ... Interviews: Interviews are one of the most important methods for gathering information in case studies ...

  17. Case Study Research Method in Psychology

    Case studies are in-depth investigations of a person, group, event, or community. Typically, data is gathered from various sources using several methods (e.g., observations & interviews). The case study research method originated in clinical medicine (the case history, i.e., the patient's personal history). In psychology, case studies are ...

  18. Distinguishing case study as a research method from case reports as a

    Because case study research is in-depth and intensive, there have been efforts to simplify the method or select useful components of cases for focused analysis. Micro-case study is a term that is occasionally used to describe research on micro-level cases . These are cases that occur in a brief time frame, occur in a confined setting, and are ...

  19. The "case" for case studies: why we need high-quality examples of

    This commentary highlights the value of case study methods commonly used in law and business schools as a source of "thick" (i.e., context-rich) description and a teaching tool for global implementation researchers. ... If the evaluation results indicate that the case study creation process produces useful products that enhance learning to ...

  20. What Is Data Analysis? (With Examples)

    Written by Coursera Staff • Updated on Apr 19, 2024. Data analysis is the practice of working with data to glean useful information, which can then be used to make informed decisions. "It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts," Sherlock ...

  21. The case study approach

    The case study approach is particularly useful to employ when there is a need to obtain an in-depth appreciation of an issue, event or phenomenon of interest, in its natural real-life context. ... Brazier and colleagues used a mixed-methods case study approach to investigate the impact of a cancer care programme. Here, quantitative measures ...

  22. Toward Developing a Framework for Conducting Case Study Research

    Representing a framework which entails all the steps we mentioned above seemed useful. By reviewing almost all the literature on case study methodology, in this section, we try to develop a pervasive framework, in order to help researchers using the case study methodology in their research.

  23. Five Misunderstandings About Case-Study Research

    This article examines five common misunderstandings about case-study research: (a) theoretical knowledge is more valuable than practical knowledge; (b) one cannot generalize from a single case, therefore, the single-case study cannot contribute to scientific development; (c) the case study is most useful for generating hypotheses, whereas other methods are more suitable for hypotheses testing ...

  24. Tattoos as a risk factor for malignant lymphoma: a population-based

    Methods Study design. We performed a population-based case-control study, nested within the total Swedish population. ... Although case reports are useful to highlight new research questions and generate hypotheses for future research, they suffer from inherent limitations that hinder causal inference, such as selective reporting and the lack ...

  25. Current status and ongoing needs for the teaching and assessment of

    In this study, we used a convergent parallel mixed-methods design within a pragmatic constructivist case study approach . We ... Case-based learning and simulations were seen as the most useful methods for teaching CR, and clinical and oral examinations were favoured for the assessment of CR. The preferred format for a TTT-course was blended ...

  26. Experiences of medical students and faculty regarding the use of long

    The long case is used to assess medical students' proficiency in performing clinical tasks. As a formative assessment, the purpose is to offer feedback on performance, aiming to enhance and expedite clinical learning. The long case stands out as one of the primary formative assessment methods for clinical clerkship in low-resource settings but has received little attention in the literature.

  27. RNA-clique: a method for computing genetic distances from RNA-seq data

    Background Although RNA-seq data are traditionally used for quantifying gene expression levels, the same data could be useful in an integrated approach to compute genetic distances as well. Challenges to using mRNA sequences for computing genetic distances include the relatively high conservation of coding sequences and the presence of paralogous and, in some species, homeologous genes ...

  28. What Is Inbound Marketing?

    The inbound methodology is the strategic method of growing your organization by building meaningful, lasting relationships with consumers, prospects, and customers as opposed to interrupting them with traditional advertising methods. It's about valuing and empowering these people to reach their goals at any stage in their journey with you.

  29. Gravity data fusion using wavelet transform and window weighting: a

    Satellite gravity anomaly data are characterized by wide coverage and high overall normalized quality, and these data can be used in large-scale regional structural research. However, detailed information on local areas is often missing after smoothing. High-resolution ship-borne gravity anomaly data can better identify fault zones and block boundaries at key locations, compensating for low ...