What Is Primary Data and Secondary Data in Research Methodology

When it comes to research methodology, primary data and secondary data are essential components of the process. But what exactly are primary data and secondary data in research methodology?

Primary data is information collected through direct observation or experimentation, while secondary data is existing knowledge obtained from sources such as books, reports, and surveys. Understanding how to collect both primary and secondary data can be a challenge for R&D teams looking for insights into their projects.

In this blog post, we will explore what exactly these two types of research entail, how they should be collected in order to get the best results possible, how to analyze your findings, and how to apply those results to your project.

By understanding more about what is primary data and secondary data in research methodology, you can ensure that any decisions made regarding an innovation project are well-informed ones!

Table of Contents

  • What is Primary Data?
  • Types of Primary Data
  • Advantages of Primary Data
  • Disadvantages of Primary Data
  • How to Collect Primary and Secondary Data
  • Methods for Collecting Primary and Secondary Data
  • Challenges in Collecting Primary and Secondary Data
  • Tips for Collecting Reliable Primary and Secondary Data
  • Analyzing Primary and Secondary Research Results
  • Challenges in Analyzing Research Results

What is Primary Data?

Primary data is information that has been collected directly from its original source. It is original and unique to the research project or study being conducted, as opposed to secondary data which has already been gathered and published by someone else.

Primary data can be collected through a variety of methods such as surveys, interviews, focus groups, observations, experiments, and more.

This type of data can be qualitative or quantitative in nature and provides insight into a particular issue or problem being studied. It is often used in research projects to gain an understanding of people’s opinions, behaviors, attitudes, and preferences on various topics.

The types of primary data depend on the method used for collecting it. Common types include survey responses (quantitative or qualitative), interview transcripts (qualitative), observation notes (qualitative), and experiment results (quantitative).

Other examples include photographs taken during fieldwork trips or video recordings made during interviews with participants in a study.

Using primary data offers several advantages over relying solely on secondary sources when conducting research.

First off, it allows researchers to collect their own unique set of information that may not have been available before. This gives them greater control over what they are studying as well as how they interpret their findings.

Additionally, primary sources tend to provide more accurate results, since there are fewer opportunities for errors introduced by someone else's bias or misinterpretation.

Lastly, using primary sources also helps ensure that any potential ethical issues related to collecting personal information are addressed prior to the beginning of the project – something which isn’t always possible with secondary sources!

Despite all these benefits associated with using primary sources, there are some drawbacks too.

One major disadvantage is cost: primary data collection can become quite expensive, especially if it is poorly planned or executed.

Another downside relates to accuracy. If too little time goes into verifying each data source, mistakes may occur more frequently, resulting in unreliable conclusions.

Key Takeaway: Primary data is a valuable source of information for research as it allows researchers to collect their own unique set of information that may not have been available before.

What is primary data and secondary data in research methodology?

Primary data can be gathered through surveys, interviews, focus groups, and experiments. It provides an accurate picture of the subject being studied since it has not been altered or influenced by other sources.

Secondary data is information that has already been collected and stored in a database. Examples of secondary data include census records, government statistics, published journal articles, and public opinion polls.

Secondary data can provide valuable insights into the topic being studied but may not always be up-to-date or reliable due to its age or source material.

There are several methods available for collecting primary and secondary data including surveys, interviews, focus groups, and experiments as well as online resources such as databases and archives.

Surveys are one of the most common methods used to collect primary data. They involve asking a set of questions to a group of people who have agreed to participate in the survey process.

Interviews are another popular method used to gather primary information. They involve having an interviewer ask questions face-to-face with participants who have agreed to take part in the interview process.

Focus groups allow researchers to gain insight into specific topics by gathering small groups of individuals who share similar interests or experiences, so that their opinions can be discussed openly during a moderated session.

Experiments are often used when conducting scientific research. They involve manipulating variables within controlled conditions while measuring results over time.

Online resources such as databases and archives offer access to large amounts of existing secondary information which can then be analyzed further if needed.

One challenge associated with collecting both primary and secondary data is obtaining accurate responses from participants.

Another issue could arise if there’s too much bias present within certain types of datasets (e.g., political opinion polls), which makes it difficult for researchers to accurately interpret results.

Additionally, privacy concerns may arise depending on the nature of the personal details required while conducting research (e.g., medical studies).

How can you ensure reliable results when collecting both primary and secondary datasets?

First, make sure your sample size is large enough.

Secondly, try to avoid using biased sources like political opinion polls.

Third, check all relevant privacy laws prior to starting any project involving the collection of personal details.

Lastly, double-check the accuracy and validity of all your findings before drawing final conclusions.
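
As a rough illustration of the sample-size point, Cochran's formula gives the minimum number of respondents needed to estimate a proportion at a given confidence level and margin of error. The function name and defaults below are illustrative choices, not something from the original article:

```python
import math

def sample_size(confidence_z: float = 1.96, margin_of_error: float = 0.05,
                proportion: float = 0.5) -> int:
    """Minimum sample size for estimating a proportion (Cochran's formula).

    confidence_z    z-score for the confidence level (1.96 ~ 95%)
    margin_of_error acceptable error, e.g. 0.05 for +/-5 percentage points
    proportion      expected proportion; 0.5 is the most conservative choice
    """
    n = (confidence_z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    return math.ceil(n)

# 95% confidence, +/-5% margin, conservative p = 0.5
print(sample_size())  # → 385
```

Note that this figure applies to large populations; for small populations a finite-population correction would reduce it.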

Key Takeaway: Collecting reliable primary and secondary data for research projects requires careful consideration of various factors. Researchers should ensure an adequate sample size, avoid biased sources, check relevant privacy laws, and double-check accuracy before drawing conclusions.

The first step in analyzing primary and secondary research results is to identify the key points from each study. This includes understanding what was studied, who participated in the study, how it was conducted, and any other relevant information about the study’s methodology.

Once this information has been gathered, it can be used to draw conclusions about the findings. Additionally, researchers should compare their own findings with those of other studies on similar topics to gain a more comprehensive understanding of their topic area.

Analyzing primary and secondary research results can be challenging due to differences in sample size or methodology across studies.

It is also difficult to determine which findings are reliable since some studies may have methodological flaws that could affect their accuracy or validity.

Additionally, interpreting qualitative data can be especially challenging since there is often no clear-cut answer when examining subjective responses from participants in a survey or interview setting.

Finally, researchers must take care not to make assumptions based on limited evidence as this could lead them astray from accurate interpretations of their results.

What Is Primary Data and Secondary Data in Research Methodology

Primary data is collected through surveys, interviews, experiments, or observations while secondary data is obtained from existing sources such as books, journals, newspapers, and websites. Collecting both types of data requires careful planning and execution to ensure accuracy and reliability.

Analyzing the results of primary and secondary research can help identify trends in the industry that could be used to inform decisions or strategies for innovation teams.

Are you an R&D or innovation team looking for a solution to help centralize data sources and provide rapid time to insights? Look no further than Cypris. Our platform is designed specifically for teams like yours, providing easy access to primary and secondary data research so that your team can make the most informed decisions possible.

With our streamlined approach, there’s never been a better way to maximize efficiency in the pursuit of groundbreaking ideas!

Dissertations 4: Methodology: Methods

  • Introduction & Philosophy
  • Methodology

Primary & Secondary Sources, Primary & Secondary Data

When describing your research methods, you can start by stating what kind of secondary and, if applicable, primary sources you used in your research. Explain why you chose such sources, how well they served your research, and identify possible issues encountered using these sources.  

Definitions  

There is some confusion on the use of the terms primary and secondary sources, and primary and secondary data. The confusion is also due to disciplinary differences (Lombard 2010). Whilst you are advised to consult the research methods literature in your field, we can generalise as follows:  

Secondary sources 

Secondary sources normally include the literature (books and articles) with the experts' findings, analysis and discussions on a certain topic (Cottrell, 2014, p123). Secondary sources often interpret primary sources.  

Primary sources 

Primary sources are "first-hand" information such as raw data, statistics, interviews, surveys, law statutes and law cases. Even literary texts, pictures and films can be primary sources if they are the object of research (rather than, for example, documentaries reporting on something else, in which case they would be secondary sources). The distinction between primary and secondary sources sometimes lies on the use you make of them (Cottrell, 2014, p123). 

Primary data 

Primary data are data (primary sources) you directly obtained through your empirical work (Saunders, Lewis and Thornhill 2015, p316). 

Secondary data 

Secondary data are data (primary sources) that were originally collected by someone else (Saunders, Lewis and Thornhill 2015, p316).   

Comparison between primary and secondary data   

Use  

Virtually all research will use secondary sources, at least as background information. 

Often, especially at the postgraduate level, it will also use primary sources - secondary and/or primary data. The engagement with primary sources is generally appreciated, as less reliant on others' interpretations, and closer to 'facts'. 

The use of primary data, as opposed to secondary data, demonstrates the researcher's effort to do empirical work and find evidence to answer her specific research question and fulfil her specific research objectives. Thus, primary data contribute to the originality of the research.

Ultimately, you should state in this section of the methodology: 

What sources and data you are using and why (how are they going to help you answer the research question and/or test the hypothesis).

If using primary data, why you employed certain strategies to collect them. 

What the advantages and disadvantages of your strategies to collect the data are (also refer to the research in your field and the research methods literature).

Quantitative, Qualitative & Mixed Methods

The methodology chapter should reference your use of quantitative research, qualitative research and/or mixed methods. The following is a description of each along with their advantages and disadvantages. 

Quantitative research 

Quantitative research uses numerical data (quantities) deriving, for example, from experiments, closed questions in surveys, questionnaires, structured interviews or published data sets (Cottrell, 2014, p93). It normally processes and analyses this data using quantitative analysis techniques like tables, graphs and statistics to explore, present and examine relationships and trends within the data (Saunders, Lewis and Thornhill, 2015, p496). 
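
To make the quantitative description concrete, here is a minimal sketch of the kind of descriptive statistics such research typically reports; the satisfaction scores are invented for illustration:

```python
import statistics

# Hypothetical closed-question survey: satisfaction scores on a 1-5 scale
scores = [4, 5, 3, 4, 2, 5, 4, 3, 4, 5]

mean = statistics.mean(scores)     # central tendency
spread = statistics.stdev(scores)  # sample standard deviation
mode = statistics.mode(scores)     # most common response

print(f"mean={mean:.2f}, stdev={spread:.2f}, mode={mode}")  # mean=3.90, stdev=0.99, mode=4
```

In a real study these summaries would sit alongside the tables, graphs and inferential statistics mentioned above.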

Qualitative research  

Qualitative research is generally undertaken to study human behaviour and psyche. It uses methods like in-depth case studies, open-ended survey questions, unstructured interviews, focus groups, or unstructured observations (Cottrell, 2014, p93). The nature of the data is subjective, and also the analysis of the researcher involves a degree of subjective interpretation. Subjectivity can be controlled for in the research design, or has to be acknowledged as a feature of the research. Subject-specific books on (qualitative) research methods offer guidance on such research designs.  

Mixed methods 

Mixed-method approaches combine both qualitative and quantitative methods, and therefore combine the strengths of both types of research. Mixed methods have gained popularity in recent years.  

When undertaking mixed-methods research you can collect the qualitative and quantitative data either concurrently or sequentially. If sequentially, you can for example, start with a few semi-structured interviews, providing qualitative insights, and then design a questionnaire to obtain quantitative evidence that your qualitative findings can also apply to a wider population (Specht, 2019, p138). 

Ultimately, your methodology chapter should state: 

Whether you used quantitative research, qualitative research or mixed methods. 

Why you chose such methods (and refer to research method sources). 

Why you rejected other methods. 

How well the method served your research. 

The problems or limitations you encountered. 

Doug Specht, Senior Lecturer at the Westminster School of Media and Communication, explains mixed methods research in the following video:

LinkedIn Learning Video on Academic Research Foundations: Quantitative

The video covers the characteristics of quantitative research and explains how to approach different parts of the research process, such as creating a solid research question and developing a literature review. It goes over the elements of a study, explains how to collect and analyze data, and shows how to present your data in written and numeric form.

Some Types of Methods

There are several methods you can use to get primary data. To reiterate, the choice of the methods should depend on your research question/hypothesis. 

Whatever methods you will use, you will need to consider: 

why did you choose one technique over another? What were the advantages and disadvantages of the technique you chose? 

what was the size of your sample? Who made up your sample? How did you select your sample population? Why did you choose that particular sampling strategy?

ethical considerations (see also tab...)  

safety considerations  

validity  

feasibility  

recording  

procedure of the research (see box procedural method...).  

Check Stella Cottrell's book  Dissertations and Project Reports: A Step by Step Guide  for some succinct yet comprehensive information on most methods (the following account draws mostly on her work). Check a research methods book in your discipline for more specific guidance.  

Experiments 

Experiments are useful to investigate cause and effect, when the variables can be tightly controlled. They can test a theory or hypothesis in controlled conditions. Experiments do not prove or disprove a hypothesis; instead, they support or fail to support it. When using the empirical and inductive method it is not possible to achieve conclusive results: the results may only be valid until falsified by other experiments and observations.

Observations 

Observational methods are useful for in-depth analyses of behaviours in people, animals, organisations, events or phenomena. They can test a theory or products in real life or simulated settings. They are generally a qualitative research method.

Questionnaires and surveys 

Questionnaires and surveys are useful to gain opinions, attitudes, preferences, understandings on certain matters. They can provide quantitative data that can be collated systematically; qualitative data, if they include opportunities for open-ended responses; or both qualitative and quantitative elements. 

Interviews  

Interviews are useful to gain rich, qualitative information about individuals' experiences, attitudes or perspectives. With interviews you can follow up immediately on responses for clarification or further details. There are three main types of interviews: structured (following a strict pattern of questions, which expect short answers), semi-structured (following a list of questions, with the opportunity to follow up the answers with improvised questions), and unstructured (following a short list of broad questions, where the respondent can lead more the conversation) (Specht, 2019, p142). 

This short video on qualitative interviews discusses best practices and covers qualitative interview design, preparation and data collection methods. 

Focus groups   

In this case, a group of people (normally 4-12) is gathered for an interview in which the interviewer poses questions to the group of participants. Group interactions and discussions can be highly productive, but the researcher has to beware of the group effect, whereby certain participants and views dominate the interview (Saunders, Lewis and Thornhill 2015, p419). The researcher can try to minimise this by encouraging involvement of all participants and promoting a multiplicity of views.

This video focuses on strategies for conducting research using focus groups.  

Check out the guidance on online focus groups by Aliaksandr Herasimenka, which is attached at the bottom of this text box. 

Case study 

Case studies are often a convenient way to narrow the focus of your research by studying how a theory or literature fares with regard to a specific person, group, organisation, event or other type of entity or phenomenon you identify. Case studies can be researched using other methods, including those described in this section. Case studies give in-depth insights on the particular reality that has been examined, but may not be representative of what happens in general, they may not be generalisable, and may not be relevant to other contexts. These limitations have to be acknowledged by the researcher.     

Content analysis 

Content analysis consists of the study of words or images within a text. In its broad definition, texts include books, articles, essays, historical documents, speeches, conversations, advertising, interviews, social media posts, films, theatre, paintings or other visuals. Content analysis can be quantitative (e.g. word frequency) or qualitative (e.g. analysing intention and implications of the communication). It can detect propaganda, identify intentions of writers, and can reveal differences in types of communication (Specht, 2019, p146).
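
A quantitative word-frequency count, the simplest form of content analysis mentioned above, can be sketched as follows; the transcript text and the short stopword list are invented for illustration:

```python
from collections import Counter
import re

# Hypothetical interview transcript excerpt
text = """The service was slow, but the staff were friendly.
Friendly staff made the slow service easier to forgive."""

# Lowercase, keep only word-like tokens, drop uninformative stopwords
words = re.findall(r"[a-z']+", text.lower())
stopwords = {"the", "was", "but", "were", "to", "made"}
counts = Counter(w for w in words if w not in stopwords)

print(counts.most_common(3))  # → [('service', 2), ('slow', 2), ('staff', 2)]
```

In practice a fuller stopword list and stemming or lemmatisation (e.g. via NLTK or spaCy) would precede the count.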

Extra links and resources:  

Research Methods  

A clear and comprehensive overview of research methods by Emerald Publishing. It includes: crowdsourcing as a research tool; mixed methods research; case study; discourse analysis; grounded theory; repertory grid; ethnographic method and participant observation; interviews; focus group; action research; analysis of qualitative data; survey design; questionnaires; statistics; experiments; empirical research; literature review; secondary data and archival materials; data collection.

Doing your dissertation during the COVID-19 pandemic  

Resources providing guidance on doing dissertation research during the pandemic: online research methods; secondary data sources; webinars, conferences and podcasts.

  • Virtual Focus Groups: guidance on managing virtual focus groups

5 Minute Methods Videos

The following are a series of useful videos that introduce research methods in five minutes. These resources have been produced by lecturers and students with the University of Westminster's School of Media and Communication. 

Case Study Research

Research Ethics

Quantitative Content Analysis 

Sequential Analysis 

Qualitative Content Analysis 

Thematic Analysis 

Social Media Research 

Mixed Method Research 

Procedural Method

In this part, provide an accurate, detailed account of the methods and procedures that were used in the study or the experiment (if applicable!). 

Include specifics about participants, sample, materials, design and methods. 

If the research involves human subjects, then include a detailed description of who and how many participated along with how the participants were selected.  

Describe all materials used for the study, including equipment, written materials and testing instruments. 

Identify the study's design and any variables or controls employed. 

Write out the steps in the order that they were completed. 

Indicate what participants were asked to do, how measurements were taken and any calculations made to raw data collected. 

Specify statistical techniques applied to the data to reach your conclusions. 

Provide evidence that you incorporated rigor into your research. This is the quality of being thorough and accurate and considers the logic behind your research design. 

Highlight any drawbacks that may have limited your ability to conduct your research thoroughly. 

You have to provide details to allow others to replicate the experiment and/or verify the data, to test the validity of the research. 

Bibliography

Cottrell, S. (2014). Dissertations and project reports: a step by step guide. Hampshire, England: Palgrave Macmillan.

Lombard, E. (2010). Primary and secondary sources.  The Journal of Academic Librarianship , 36(3), 250-253

Saunders, M.N.K., Lewis, P. and Thornhill, A. (2015).  Research Methods for Business Students.  New York: Pearson Education. 

Specht, D. (2019).  The Media And Communications Study Skills Student Guide . London: University of Westminster Press.  

  • Last Updated: Sep 14, 2022 12:58 PM
  • URL: https://libguides.westminster.ac.uk/methodology-for-dissertations

Primary vs Secondary Research – What’s the Difference?

Find out how primary and secondary research are different from each other, and how you can use them both in your own research program.

Primary vs secondary research: in a nutshell

The essential difference between primary and secondary research lies in who collects the data.

  • Primary research definition

When you conduct primary research, you’re collecting data by doing your own surveys or observations.

  • Secondary research definition

In secondary research, you’re looking at existing data from other researchers, such as academic journals, government agencies or national statistics.

When to use primary vs secondary research

Primary research and secondary research both offer value in helping you gather information.

Each research method can be used alone to good effect. But when you combine the two research methods, you have the ingredients for a highly effective market research strategy. Most research combines some element of both primary methods and secondary source consultation.

So assuming you’re planning to do both primary and secondary research – which comes first? Counterintuitive as it sounds, it’s more usual to start your research process with secondary research, then move on to primary research.

Secondary research can prepare you for collecting your own data in a primary research project. It can give you a broad overview of your research area, identify influences and trends, and may give you ideas and avenues to explore that you hadn’t previously considered.

Given that secondary research can be done quickly and inexpensively, it makes sense to start your primary research process with some kind of secondary research. Even if you’re expecting to find out what you need to know from a survey of your target market, taking a small amount of time to gather information from secondary sources is worth doing.

Types of market research

Primary research

Primary market research is original research carried out when a company needs timely, specific data about something that affects its success or potential longevity.

Primary research data collection might be carried out in-house by a business analyst or market research team within the company, or it may be outsourced to a specialist provider, such as an agency or consultancy. While outsourcing primary research involves a greater upfront expense, it’s less time consuming and can bring added benefits such as researcher expertise and a ‘fresh eyes’ perspective that avoids the risk of bias and partiality affecting the research data.

Primary research gives you recent data from known primary sources about the particular topic you care about, but it does take a little time to collect that data from scratch, rather than finding secondary data via an internet search or library visit.

Primary research involves two forms of data collection:

  • Exploratory research This type of primary research is carried out to determine the nature of a problem that hasn’t yet been clearly defined. For example, a supermarket wants to improve its poor customer service and needs to understand the key drivers behind the customer experience issues. It might do this by interviewing employees and customers, or by running a survey program or focus groups.
  • Conclusive research This form of primary research is carried out to solve a problem that the exploratory research – or other forms of primary data – has identified. For example, say the supermarket’s exploratory research found that employees weren’t happy. Conclusive research went deeper, revealing that the manager was rude, unreasonable, and difficult, making the employees unhappy and resulting in a poor employee experience which in turn led to less than excellent customer service. Thanks to the company’s choice to conduct primary research, a new manager was brought in, employees were happier and customer service improved.

Examples of primary research

All of the following are forms of primary research data.

  • Customer satisfaction survey results
  • Employee experience pulse survey results
  • NPS rating scores from your customers
  • A field researcher’s notes
  • Data from weather stations in a local area
  • Recordings made during focus groups

Primary research methods

There are a number of primary research methods to choose from, and they are already familiar to most people. The ones you choose will depend on your budget, your time constraints, your research goals and whether you’re looking for quantitative or qualitative data.

Surveys

A survey can be carried out online, offline, face to face or via other media such as phone or SMS. It’s relatively cheap to do, since participants can self-administer the questionnaire in most cases. You can automate much of the process if you invest in good quality survey software.

Interviews

Primary research interviews can be carried out face to face, over the phone or via video calling. They’re more time-consuming than surveys, and they require the time and expense of a skilled interviewer and a dedicated room, phone line or video calling setup. However, a personal interview can provide a very rich primary source of data based not only on the participant’s answers but also on the observations of the interviewer.

Focus groups

A focus group is an interview with multiple participants at the same time. It often takes the form of a discussion moderated by the researcher. As well as taking less time and resources than a series of one-to-one interviews, a focus group can benefit from the interactions between participants which bring out more ideas and opinions. However this can also lead to conversations going off on a tangent, which the moderator must be able to skilfully avoid by guiding the group back to the relevant topic.

Secondary research

Secondary research is research that has already been done by someone else prior to your own research study.

Secondary research is generally the best place to start any research project as it will reveal whether someone has already researched the same topic you’re interested in, or a similar topic that helps lay some of the groundwork for your research project.

Secondary research examples

Even if your preliminary secondary research doesn’t turn up a study similar to your own research goals, it will still give you a stronger knowledge base that you can use to strengthen and refine your research hypothesis. You may even find some gaps in the market you didn’t know about before.

The scope of secondary research resources is extremely broad. Here are just a few of the places you might look for relevant information.

Books and magazines

A public library can turn up a wealth of data in the form of books and magazines – and it doesn’t cost a penny to consult them.

Market research reports

Secondary research from professional research agencies can be highly valuable, as you can be confident the data collection methods and data analysis will be sound.

Scholarly journals, often available in reference libraries

Peer-reviewed journals have been examined by experts from the relevant educational institutions, meaning there has been an extra layer of oversight and careful consideration of the data points before publication.

Government reports and studies

Public domain data, such as census data, can provide relevant information for your research project, not least in choosing the appropriate research population for a primary research method. If the information you need isn’t readily available, try contacting the relevant government agencies.

White papers

Businesses often produce white papers as a means of showcasing their expertise and value in their field. White papers can be helpful in secondary research methods, although they may not be as carefully vetted as academic papers or public records.

Trade or industry associations

Associations may have secondary data that goes back a long way and offers a general overview of a particular industry. This data collected over time can be very helpful in laying the foundations of your particular research project.

Private company data

Some businesses may offer their company data to those conducting research in return for fees or with explicit permission. However, if a business has data that's closely relevant to yours, it's likely a competitor and may flatly refuse your request.


Examples of secondary research data

These are all forms of secondary research data in action:

  • A newspaper report quoting statistics sourced by a journalist
  • Facts from primary research articles quoted during a debate club meeting
  • A blog post discussing new national figures on the economy
  • A company consulting previous research published by a competitor

Secondary research methods

Literature reviews

A core part of the secondary research process, involving data collection and constructing an argument around multiple sources. A literature review involves gathering information from a wide range of secondary sources on one topic and summarizing them in a report or in the introduction to primary research data.

Content analysis

This systematic approach is widely used in social science disciplines. It uses codes for themes, tropes or key phrases which are tallied up according to how often they occur in the secondary data. The results help researchers to draw conclusions from qualitative data.
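The tallying step at the heart of content analysis can be sketched in a few lines of Python. This is a minimal illustration only; the codes and review snippets below are hypothetical:

```python
from collections import Counter

# Hypothetical coding scheme: each code is a keyword to tally.
CODES = ["price", "quality", "delivery"]

def tally_codes(documents, codes=CODES):
    """Count how often each code (keyword) occurs across the documents."""
    counts = Counter()
    for doc in documents:
        text = doc.lower()
        for code in codes:
            counts[code] += text.count(code)
    return counts

reviews = [
    "Great quality, but the price is high.",
    "Delivery was slow; price was fair.",
]
print(tally_codes(reviews))  # "price" occurs twice, the others once each
```

Real content analysis uses a richer codebook (themes, not bare keywords) and human judgment to assign codes, but the frequency-counting step works the same way.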

Data analysis using digital tools

You can analyze large volumes of data using software that can recognize and categorize natural language. More advanced tools will even be able to identify relationships and semantic connections within the secondary research materials.


Comparing primary vs secondary research

We’ve established that both primary research and secondary research have benefits for your business, and that there are major differences in terms of the research process, the cost, the research skills involved and the types of data gathered. But is one of them better than the other?

The answer largely depends on your situation. Whether primary or secondary research wins out in your specific case depends on the particular topic you’re interested in and the resources you have available. The positive aspects of one method might be enough to sway you, or the drawbacks – such as a lack of credible evidence already published, as might be the case in very fast-moving industries – might make one method totally unsuitable.

Here’s an at-a-glance look at the features and characteristics of primary vs secondary research, illustrating some of the key differences between them.

What are the pros and cons of primary research?

Primary research provides original data and allows you to pinpoint the issues you’re interested in and collect data from your target market – with all the effort that entails.

Benefits of primary research:

  • Tells you what you need to know, nothing irrelevant
  • Yours exclusively – once acquired, you may be able to sell primary data or use it for marketing
  • Teaches you more about your business
  • Can help foster new working relationships and connections between silos
  • Primary research methods can provide upskilling opportunities – employees gain new research skills

Limitations of primary research:

  • Lacks context from other research on related subjects
  • Can be expensive
  • Results aren’t ready to use until the project is complete
  • Any mistakes you make in research design or implementation could compromise your data quality
  • May not have lasting relevance – although it could fulfill a benchmarking function if things change

What are the pros and cons of secondary research?

Secondary research relies on secondary sources, which can be both an advantage and a drawback. After all, other people are doing the work, but they’re also setting the research parameters.

Benefits of secondary research:

  • It’s often low cost or even free to access in the public domain
  • Supplies a knowledge base for researchers to learn from
  • Data is complete, has been analyzed and checked, saving you time and costs
  • It’s ready to use as soon as you acquire it

Limitations of secondary research

  • May not provide enough specific information
  • Conducting a literature review in a well-researched subject area can become overwhelming
  • No added value from publishing or re-selling your research data
  • Results are inconclusive – you’ll only ever be interpreting data from another organization’s experience, not your own
  • Details of the research methodology are unknown
  • May be out of date – always check carefully when the original research was conducted


Afr J Emerg Med, v.10(Suppl 2); 2020

Acquiring data in medical research: A research primer for low- and middle-income countries

Vicken Totten

a Kaweah Delta Health Care District (KDHCD), KDHCD Department of Emergency Medicine, Visalia, CA, USA

Erin L. Simon

b Cleveland Clinic Akron General, Department of Emergency Medicine, Akron, OH, USA

Mohammad Jalili

c Department of Emergency Medicine, Tehran University of Medical Sciences, Tehran, Iran

Hendry R. Sawe

d Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania

Without data, there is no new knowledge generated. There may be interesting speculation, new paradigms or theories, but without data gathered from the universe, as representative of the truth in the universe as possible, there will be no new knowledge. Therefore, it is important to become excellent at collecting, collating and correctly interpreting data. Pre-existing and new data sources are discussed; variables are discussed, and sampling methods are covered. The importance of a detailed protocol and research manual is emphasized. Data collectors and data collection forms, both electronic and paper-based, are discussed. Protecting subject privacy must be balanced against ensuring appropriate data retention.

African relevance

  • To get good-quality information, you first need good-quality data.
  • Data collection systematically and reproducibly gathers and measures variables to answer research questions.
  • Good data is the result of a well-thought-out study protocol.

The International Federation for Emergency Medicine global health research primer

This paper forms part 9 of a series of ‘how to’ papers, commissioned by the International Federation for Emergency Medicine. It describes data sources, variables, sampling methods, data collection and the value of a clear data protocol. We have also included additional tips and pitfalls that are relevant to emergency medicine researchers.

Data collection is the process of systematically and reproducibly gathering and measuring variables in order to answer research questions, test hypotheses, or evaluate outcomes.

Data is not information. To get good quality information you first need good quality data; then you must curate, analyse and interpret it. Data is composed of variables. Data collection begins with determining which variables are required, followed by the selection of a sample from a certain population. After that, a data collection tool is used to collect the variables from the selected sample, which is then converted into a data spreadsheet or database. The analysis is done on the database.

Sometimes you gather data yourself. Sometimes you analyse data others collected for different purposes. Ideally, you collect a universal sample, that is, 100%. In real life, you get a limited sample. Preferably, it will be a truly random sample with enough power to answer your question. Unfortunately, you may have to settle for consecutive or convenience sampling. Ideally, your data collectors would be blinded to the outcome of interest, to prevent bias. However, real life is full of biases. Imperfect data may be better than no data; you can often get useful information from imperfect data. Remember, perfect is the enemy of good.

Why is good data important?

Acquiring data is the most important step in a research study. The best design with bad data is useless. Bad design produces bad data. The most sophisticated analysis cannot be performed without data; analysing bad data produces erroneous results. Analysis can never be better than the quality of the data on which it was run. Good data has integrity. Data integrity is paramount to learning “Truth in the Universe”. Good data is as complete and as clean as you can reasonably make it. Clean data ‘has integrity’ when the variables capture as much relevant information as possible, and in the same way for each subject.

Some information is very hard to get. You may have to use proxy variables for what you really want to know. A proxy variable is a variable that is not in itself directly relevant, but that serves in place of an unobservable or immeasurable variable. In order for a variable to be a good proxy, it must have a close correlation, not necessarily linear, with the variable of interest. For example, a medication list might serve as a proxy for the presence of a specific illness.

Consequences of bad data include an inability to answer the research question; inability to replicate or validate the study; distorted findings and wasted resources; compromised knowledge and even harm to subjects.

Ensure data quality

Good data is a result of a well-thought-out study protocol, which is the written plan for the study. Good planning is the most cost-effective way to ensure data integrity. Good planning is documented by a thorough and detailed protocol, with a comprehensive procedures manual. Poorly written manuals risk incomplete or inconsistent collection of data, in other words, ‘bad data’. The manual should include rigorous, step-by-step instructions on how to administer tests or collect the data. It should cover the ‘who’ (the subject and the researcher); the ‘when’ (the timing), the ‘how’ (methods), and the ‘what’ (a complete listing of variables to be collected). There should also be an identified mechanism to document any changes in procedures that may evolve over the course of the investigation. The study design should be reproducible: so that the protocol can be followed by any other researcher. All data needs to be gathered in the same way. Test (trial-run) your manual before you start your study. If data is collected by several people, make sure there is a sufficient degree of inter-rater reliability.

To get good data, your sample needs to be representative of the population. For others to apply your results, you need to characterize your population, so others can decide if your conclusions are relevant to their population (see Sampling section, below).

Data integrity demands you supervise your study, making sure it is complete and accurate. You may wish to do interim analyses. Keep copies! Keep both the raw data and the data sheets, for the length of time required by law or by Good Research Practice in your country. This will protect you from accusations of falsification of data.

In real life, you may have to deal with any number of sampling and data collection biases. Some of these biases can be measured statistically. Regardless, all the limitations you can think of should be written in your limitations section. The best design you can practically use gives you the best data you can reasonably get. Remember, “you cannot fix with statistics what you fouled up by design.”

Before you acquire your first datum, consider: Do you have a developed protocol and a research manual? Have you sought Ethics Board approval? Do you have an informed consent? Do you have a plan to protect the subject's confidentiality? Do you have a plan for data analysis? Where will you safely store and protect the data? If you have collaborators, have you established, in writing, who owns the data, and who has the right to analyse and publish it?

Types of data: qualitative vs. quantitative data

Numerical data is generally called quantitative; if in words or sentences, it is qualitative. Medical research historically has focused on quantitative methods. Generally, quantitative research is cheaper, easier to gather and easier to analyse. For purposes of this chapter, we will focus on quantitative research.

Qualitative research is about words, sentences, sounds, feeling, emotions, colours and other elements that are non-quantifiable. It requires human intellect to extract themes from the sentences, evaluate the fit of the data to the themes, and to draw the implications of the themes. Primary sources for qualitative data include open ended surveys, interviews, and public meetings. Qualitative research is more common in politics and the social sciences, and will not be further discussed here, except to refer you to other sources.

Quantitative research can include questionnaires with closed-ended questions (open-ended questions belong in qualitative research). The data is transformed into numbers and analysed with parametric and non-parametric statistical tests. In general, you will derive a mean, mode and median; you will calculate probabilities and compute correlations and regressions in order to draw conclusions.

Sources of data: primary vs secondary data

To answer a research question, there are many potential sources of data. Two main categories are primary data and secondary data. Primary data is newly collected data; it can be gathered directly from people's responses (surveys), or from their biometrics (blood pressure, weight, blood tests, etc.). It is still considered primary data if you gather data that was collected for other (medical) purposes by extracting the data from medical records. Medical records can be a rich source of data, but data extraction by hand takes a lot of time.

Secondary data already exists; it has already been published or compiled. There are extant local, regional, national and international databases such as Trauma Registries, Disease-specific Registries, Public Health Data, government statistics, and World Health Organization data. Locally, your hospital or clinic may already keep statistics on any number of topics. Combining information from disparate databases may sometimes yield interesting results. For example, in the US, the Centers for Disease Control and Prevention keeps databases of reportable diseases, accidents, causes of death and much more. The US Geographic Survey reports the average elevation of American cities. Combining the two databases revealed that, even when gun ownership, drug and alcohol use were statistically controlled for, there was a linear correlation between altitude and suicide rates [2]. Reno et al. reviewed the existing medical literature (also secondary data), confirmed the correlation, and concluded that the mechanisms have yet to be elucidated [3].

Collecting good data is often the hardest part of research. Ideally, you would want to collect 100% of the data (universal sampling to reflect the target population). One example would be ‘all elderly persons with gout’. In real life, you have access to only a subset of the target population (the accessible population). Further, in your study you will be limited to a subset of the accessible population (the study population). Again, in the ideal world, that limited sample would be truly random, and have enough power to answer your question. You can find free random number generators online. In real life, you may have to settle for consecutive or convenience sampling. Of the two, consecutive sampling has less bias. Sometimes it is important to balance your groups. You may have 2 or 3 treatments (or interventions) and want an equal number of each kind. So, you create blocks of a few times the number of treatments and randomize within each block. Each time a block is filled, you are assured that you have the right balance of subjects. Blocks are often in groups of six, eight or 12. This is called balanced allocation.
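The block-randomization idea behind balanced allocation can be sketched in Python. This is a minimal illustration under stated assumptions: the function name, treatment labels and block factor are made up for the example.

```python
import random

def block_randomize(n_subjects, treatments=("A", "B"), block_factor=3):
    """Assign subjects to treatments in shuffled blocks so groups stay balanced.

    Each block holds block_factor copies of every treatment, shuffled;
    after every completed block the allocation is exactly balanced.
    """
    assignments = []
    while len(assignments) < n_subjects:
        block = list(treatments) * block_factor  # e.g. a block of six
        random.shuffle(block)                    # randomize within the block
        assignments.extend(block)
    return assignments[:n_subjects]

# With 12 subjects and blocks of six, each treatment is assigned exactly 6 times.
alloc = block_randomize(12)
```

If the subject count is not a multiple of the block size, the final, partial block can leave a small imbalance, which is why block sizes are kept small relative to the sample.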

If you must get only a convenience sample – for example because you only have a single data gatherer and can get data only when that person is available – you should, at a minimum, try to get some simple demographics from times when the data gatherer is not available, to see if subjects at that other time are systematically different. For example, if you are looking at injuries, people who are injured when drinking on a Friday night might be systematically different from people who are injured on their way to work on a Monday morning. If you can only collect injury data in the morning, your results will be biased.

Variables are the bits of data you collect. They change from subject to subject and describe the subject numerically. Age (or year of birth); gender; ethnic group or tribe; and geographic location are commonly called simple demographic variables and should be collected and reported for most populations.

Continuous variables are quantified on a continuous scale, such as body weight. Discrete variables use a scale whose units are limited to integers (such as the number of cigarettes smoked per day). Discrete variables with a large number of possible values can resemble continuous variables in statistical analysis and are equivalent for the purpose of designing measurements. A good general rule is to prefer continuous variables, because they contain more information and improve statistical efficiency (more study power and a smaller sample size).

Categorical variables are those not suitable for quantification. They are often measured by classifying them into categories. If there are two possible values (dead or alive), they are dichotomous. If there are more than two categories, they can be classified according to the type of information they provide (polytomous).

Research variables are either predictor (independent) or outcome (dependent) variables. The predictor variables might include such things as “Diabetes, Yes/No”, “Age over 65 — Yes/No”, and “diagnosis of hypertension” (again, Yes/No). The respective outcome might be “lower limb amputation” or “death within 10 years”. Your question might have been, “How much additional risk of amputation does a diagnosis of hypertension add in a person with diabetes?”

Before analysis, variables are coded into numbers and entered into a database. Your Research Manual should describe how to code all the data. When the variables are binary (male/female; alive/dead), coding them as “0” and “1” makes analysing the data much easier (“1” versus “2” makes it harder). The easiest variables for computers to analyse are binary, in other words “0” or “1”. Such variables are Yes/No; True/False; Male/Female; 65 or over / under 65, etc. The next easiest are ordinal integers: 1, 2, 3, etc. You might create ordinal numbers from categories (0–9; 10–19; 20–29 years of age, etc.), but in order to be ordinal, they require an obvious sequence. Categorical variables do not have an intrinsic order. “Green”, “Brown” and “Orange” are non-ordinal, categorical variables. It is possible to transform categorical variables into binary variables by making columns where only one of the answers is marked with a “1” (if that variable is present) and all the others are marked “0”. The form of the variables and their distribution will determine the type of statistical analysis possible. Data which must be transformed or cleaned is more prone to error in the cleaning or transformation process.
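The transformation of a categorical variable into binary indicator columns can be sketched as follows. The eye-colour values echo the example above; the helper name is illustrative only:

```python
def one_hot(values):
    """Transform a categorical variable into binary (0/1) indicator columns.

    Returns one column per category; each subject has a 1 in exactly one
    column and 0 in all the others.
    """
    categories = sorted(set(values))
    return {
        cat: [1 if v == cat else 0 for v in values]
        for cat in categories
    }

eye_colour = ["Green", "Brown", "Brown", "Orange"]
coded = one_hot(eye_colour)
# coded["Brown"] is [0, 1, 1, 0]: a 1 wherever the subject's value is Brown.
```

This is the same transformation statistical packages apply automatically when you declare a variable as categorical, so hand-coding it is usually only needed for plain spreadsheets.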

There are alternative ways to get similar information. For example, if you wanted to know the HIV status of each of your subjects, you could either test each one, or you could ask them. The tests cost more; however, they are less likely to give biased results. How you gather each variable will depend on your resources and will inform the limitations of your study.

Precision of a variable is the degree to which it is reproducible with nearly the same value each time it is measured. Precision has a very important influence on the power of a study. The more precise a measurement, the greater the statistical power of a given sample size to estimate mean values and test your hypotheses. In order to minimize random error in your data, and increase the precision of measurements, you should standardize your measurement methods; train your observers; refine any instruments you may use (such as calibrating instruments); automate instruments when possible (automated blood pressure cuff instead of manual); and repeat your measurements.
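As a numeric illustration of why repeating measurements helps, the standard error of the mean of n repeated measurements shrinks by a factor of the square root of n. The blood-pressure readings below are invented for the example:

```python
import math
import statistics

# Hypothetical repeated blood-pressure readings (mmHg) on one subject.
readings = [118, 122, 121, 119, 120]

mean = statistics.mean(readings)       # best estimate of the true value
sd = statistics.stdev(readings)        # spread of individual readings
sem = sd / math.sqrt(len(readings))    # precision of the averaged value

# Averaging five repeats makes the estimate sqrt(5) times more precise
# than relying on any single reading.
```

The same logic motivates calibrating instruments and standardizing methods: both shrink the spread (sd) itself, which improves precision at no extra sample-size cost.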

Accuracy of the variable is the degree to which it actually represents what it is intended to (Truth in the Universe). This influences the validity of the study. Accuracy is affected by systematic error (bias). The greater the error, the less accurate the variable. Three common biases are: observer bias (how the measurement is reported); instrument bias (faulty function of an instrument); and subject bias (bad reporting or recall of the measurement by the study subject).

Validity is the degree to which a measurement represents the phenomenon of interest. When validating an abstract concept, search the literature or consult with experts so you can find an already validated data collection instrument (such as a questionnaire). This allows your results to be comparable to prior studies in the same area and strengthens your study methods.

Research manual

Simple research with limited resources does not need a research manual, just a protocol. Nor is there much need if the primary investigator is the only data gatherer and analyser. However, if several persons gather data, it is important that the data be gathered the same way each time.

Prevention is the most cost-effective activity that will ensure the integrity of data collection. A detailed and comprehensive research manual will standardize data collection. Poorly written manuals are vague and ambiguous.

The research manual is based on your protocol. The manual should spell out every step of the data collection process. It should include the name of each variable and specific details about how each variable should be collected. Contingencies should be written down. For example: “If the patient does not have a left arm, the blood pressure may be taken on the right arm. If the patient has no arms, leg blood pressures may be recorded, but put an ‘*’ beside the reading.” The manual should also include every step of the coding process. The coding manual should describe the name of each variable, and how it should be coded. Both the coder and the statistician will want to refer to that section. The coding section should describe how each variable will be entered into the database. Test the manual to make sure everyone understands it the same way.

Think about various ways a plan can go wrong. Write them down, with preferred solutions. There will always be unexpected changes. They should be added into the manual on a continuing basis. An on-going section where questions, problems and their solutions are all recorded will increase the integrity of your research.

Data collection methods

Before you start data collection, you need to ask yourself what data you are going to collect and how you are going to collect them. Which data, and the amount of data to be collected needs to be defined clearly. Different people (including several data collectors) should have a similar understanding of each variable and how it is measured. Otherwise, the data cannot be relied on. Furthermore, the decision to collect a piece of data needs to be justified. The amount of data collected for the study should be sufficient. A common mistake is to collect too much data without actually knowing what will be done with it. Researchers should identify essential data elements and eliminate those that may seem interesting but are not central to the study hypothesis. Collection of the latter type of data places an unnecessary burden on both the study participants and data collectors.

Different data collection approaches which are commonly used in the conduct of clinical research include questionnaire surveys, patient self-reported data, proxy/informant information, hospital and ambulatory medical records, as well as the collection and analysis of biologic samples. Each of these methods has its own advantages and disadvantages.

Surveys are conducted through administration of standardized or home-grown questionnaires, where participants are asked to respond to a set of questions as yes/no, or perhaps on a Likert type scale. Sometimes open-ended responses are elicited.

Medical records can be important sources of high-quality data and may be used either as the only source of data, or as a complement to information collected through other instruments. Unfortunately, due to the non-standardized nature of data collection, information contained in the medical records may be conflicting or of questionable accuracy. Moreover, the extent of documentation by different providers can vary significantly. These issues can make the construction or use of key study variables very difficult.

Collection of biological materials from study participants, as well as the use of various imaging modalities, is increasingly common in clinical research. These procedures need to be performed under standardized conditions, and their ethical implications should be considered.

Data collection tool

You may need to collect information on paper. If you do, it is useful to have the actual code which should be entered into the computerized database written on the forms themselves (as well as in the manual). If you have access to an electronic database such as REDCap (a web-based application developed by Vanderbilt University to capture data for clinical research and create databases and projects) [4], you can enter the data directly as you get them (male; female) and the database will automatically convert the data into code. This reduces transcribing errors. Another common electronic tool is Excel, which can also be used to manipulate the data. In spite of the advantages of recording data electronically, such as directly into REDCap or Excel, there are advantages to collecting and keeping the original data on paper. Paper data collection forms can be saved for audit or quality control. Furthermore, paper records cannot be remotely hacked. Moreover, if the anonymous electronic database is compromised or corrupted, you can re-create your database.

Data collectors

Good data collectors are worth gold. If they are thorough and ethical, you will get great data. If not, your data may be unusable. Make sure they understand research ethics, the need for protection of human subjects, and the privacy of data. Ideally, your data collectors would be blinded to the outcome of interest, to prevent bias. It is acceptable to blind data collectors to the research question, but they need to understand that collecting every variable the same way for each subject is essential to data integrity.

Data gatherers should be trained in advance of collecting any data. They need to understand informed consent and have the time to explain a study to the satisfaction of the subjects. The importance of conducting a dry run in an attempt to anticipate and address issues that can arise during data collection cannot be over-stated. It would even be worthwhile to pilot the research manual, to learn if everyone understands it the same way.

Data storage

Data collection, done right, protects the confidentiality of the subject as well as the data. Data must also be stored safely and securely. It is reasonable to back up your data in a different, secure location. You do not want to go to all the trouble of creating a protocol and collecting your data, only to lose it, or have no way to analyse it!

There are many reasons to keep your data safe and secure. Obviously, you do not want to lose your data. You may wish to use the data again. For example, you may wish to combine it with other data for a different study. An additional reason is that you do not want your subjects to risk a ‘loss of privacy’. Still another reason is that institutions and governments may require you to store data for a specified number of years. Know how long you must keep your data. Keep it in a locked cabinet in a secure room, or behind an institutional firewall.

Furthermore, if you keep a cipher , that is, a connector between a subject and their study number, keep that cipher separate from the research data. That way, even if someone learns that subject 302 has an embarrassing condition, they will not know who subject 302 really is.

These days, almost everyone has access to computers and programs, locally or ‘in the cloud’. For statistical analysis, you will need to have your data in electronic form. If you started with paper, consider double entry (two data extractors for each record, then compare the two) for greater accuracy.
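A minimal sketch of the double-entry comparison, assuming each record has been transcribed into a dictionary of coded variables (all names and values here are hypothetical):

```python
def compare_double_entry(entry1, entry2):
    """Flag discrepancies between two independent transcriptions of the
    same records. Returns (record_index, field, value1, value2) tuples."""
    discrepancies = []
    for i, (rec1, rec2) in enumerate(zip(entry1, entry2)):
        for field in rec1:
            if rec1[field] != rec2.get(field):
                discrepancies.append((i, field, rec1[field], rec2.get(field)))
    return discrepancies

extractor_a = [{"age": 34, "sex": 1}, {"age": 57, "sex": 0}]
extractor_b = [{"age": 34, "sex": 1}, {"age": 75, "sex": 0}]  # typo: 57 -> 75
print(compare_double_entry(extractor_a, extractor_b))
# -> [(1, 'age', 57, 75)]
```

Each flagged discrepancy is then resolved by going back to the original paper form, which is one more reason to retain the paper records.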

Tips on this topic and pitfalls to avoid

Hazard: no research manual

  • No identified mechanism to document changes in procedures that may evolve over the course of the investigation
  • Vague description of data collection instruments in lieu of rigorous step-by-step instructions on administering tests
  • Only a partial listing of variables to be collected
  • Forgetting to put instructions on the data collection sheet about how to code the data when transferring to an electronic medium

Hazard: no assistant training

  • Failure to adequately train data collectors
  • Failure to do a dry run / failure to try enrolling a mock subject
  • Uncertainty about when, how and who should review gathered data

Hazard: failure to understand data management

  • Data should be easy to understand, and the protocol good enough that another researcher can repeat the study
  • Data audit: keep raw data and collected data
  • Failure to keep backups

Annotated bibliography

  • 1. RCR Data Acquisition and Management. This online book is pretty comprehensive. http://ccnmtl.columbia.edu/projects/rcr/rcr_data/foundation/ (Accessed 2019 June 23)
  • 2. Qualitative research – Wikipedia: en.wikipedia.org/wiki/Qualitative_research (Accessed 2019 June 23) – this is a good overview with references so you can delve deeper if you wish.
  • 3. Qualitative Research: Definition, Types, Methods and Examples: https://www.questionpro.com/blog/qualitative-research-methods/ (Accessed 2019 June 23)
  • 4. Qualitative Research Methods: A Data Collector's Field Guide: https://course.ccs.neu.edu/is4800sp12/resources/qualmethods.pdf (Accessed 2019 June 23) – another on-line resource about data collection.

Additional reading about statistical variables

  • 1. Types of Variables in Statistics and Research: A List of Common and Uncommon Types of Variables. https://www.statisticshowto.datasciencecentral.com/probability-and-statistics/types-of-variables/
  • 2. Research Variables: Dependent, Independent, Control, Extraneous & Moderator. https://study.com/academy/lesson/research-variables-dependent-independent-control-extraneous-moderator.html
  • 3. Knatterud GL, Rockhold FW, George SL, Barton FB, Davis CE, Fairweather WR, Honohan T, Mowery R, O'Neill R. (1998). Guidelines for quality assurance in multicenter trials: a position paper. Controlled Clinical Trials, 19:477–493.
  • 4. Whitney CW, Lind BK, Wahl PW. (1998). Quality assurance and quality control in longitudinal studies. Epidemiologic Reviews, 20[1]:71–80.

Additional relevant information to consider

Consider who owns the data before and after collection (this brings up questions of consent, privacy, sponsorship and data-sharing, most of which are beyond the scope of this paper).

Authors' contribution

Authors contributed as follows to the conception or design of the work; the acquisition, analysis, or interpretation of data for the work; and drafting the work or revising it critically for important intellectual content: ES contributed 70%; VT, MJ and HS contributed 10% each. All authors approved the version to be published and agreed to be accountable for all aspects of the work.

Declaration of competing interest

The authors declared no conflicts of interest.

Primary Data: Definition, Examples & Collection Methods

Introduction

This article covers:

  • What is meant by primary data
  • What is the difference between primary and secondary data
  • What are examples of primary data
  • Primary data collection methods
  • Advantages of primary data collection
  • Disadvantages of primary data collection
  • Ethical considerations for primary data

Understanding the type of data being analyzed is crucial for drawing accurate conclusions in qualitative research. Collecting primary data directly from the source offers unique insights that can benefit researchers in various fields.

This article provides a comprehensive guide on primary data, illustrating its definition, how it stands apart from secondary data, pertinent examples, and the common methods employed in the primary data collection process. Additionally, we will explore the advantages and disadvantages associated with primary data acquisition.

Primary data refers to information that is collected firsthand by the researcher for a specific research purpose. Unlike secondary data, which is already available and has been collected for some other objective, primary data is raw and unprocessed, offering fresh insights directly related to the research question at hand. This type of data is gathered through various methods such as surveys, interviews, experiments, and observations, allowing researchers to obtain tailored and precise information.

The main characteristic of primary data is its relevance to the specific study. Since it is collected with the research objectives and questions in mind, it directly addresses the issues or hypotheses under investigation. This direct connection enhances the validity and accuracy of the research findings, as the data is neither diluted by extraneous material nor missing information relevant to the research question.

Moreover, primary data provides the most current information available, making it especially valuable in fast-changing fields or situations where timely data is crucial. By analyzing primary data, researchers can draw unique conclusions and develop original insights that contribute significantly to their field of study.

Understanding the distinction between primary and secondary data is fundamental in the realm of research, as it influences the research design, methodology, and analysis. Primary data is information collected firsthand for a specific research purpose. It is original and unprocessed, providing new insights directly relevant to the researcher's questions or objectives. Common methods of collecting data from primary sources include observations, surveys, interviews, and experiments, each allowing the researcher to gather specific, targeted information.

Conversely, secondary data refers to information that was collected by someone else for a different purpose and is subsequently used by a researcher for a new study. This data can come from sources such as academic journals, government reports, historical records, or previous research studies. While secondary data is invaluable for providing context, background, and supporting evidence, it may not be as precisely tailored to the specific research questions as primary data.

The key differences between these two types of data also extend to their advantages and disadvantages concerning accessibility, cost, and time. Primary data is typically more time-consuming and expensive to collect but offers specificity and relevance that is unmatched by secondary data. On the other hand, secondary data is usually more accessible and less costly, as it leverages existing information, although it may not align perfectly with the current research needs and might be outdated or less specific.

In terms of accuracy and reliability, primary data allows for greater control over the quality and methodology of the data collected, reflecting the current scenario accurately. However, secondary data's reliability depends on the original data collection's accuracy and the context in which it was gathered, which might not be fully verifiable by the new researcher.

Synthesizing primary and secondary data

While primary and secondary data each have distinct roles in research, synthesizing both types can provide a more comprehensive understanding of the research topic. Integrating primary data with secondary data allows researchers to contextualize their firsthand findings within the broader literature and existing knowledge.

This approach can enhance the depth and relevance of the research, providing a more nuanced analysis that leverages the detailed, current insights of primary data alongside the extensive, contextual background of secondary data.

For example, primary data might offer detailed consumer behavior insights, which researchers can then compare with broader market trends or historical data from secondary sources. This synthesis can reveal patterns, corroborate findings, or identify anomalies, enriching the research's analytical value and implications.

Ultimately, combining primary and secondary data helps build a robust research framework, enabling a more informed and comprehensive exploration of the research question.

Primary data collection is a cornerstone of research in the social sciences, providing firsthand insights that are crucial for understanding complex human behaviors and societal structures. This direct approach to data gathering allows researchers to uncover rich, context-specific information.

The following subsections highlight examples of primary data across various social science disciplines, showcasing the versatility and depth of these research methods.

Economic behaviors in market research

Market research within economics often relies on primary data to understand consumer preferences, spending habits, and decision-making processes. For instance, a study may collect primary data through surveys or interviews to gauge consumer reactions to a new product or service.

This information can reveal economic behaviors, such as price sensitivity and brand loyalty, offering valuable insights for businesses and policymakers.

Voting patterns in political science

In political science, researchers collect primary data to analyze voting patterns and political engagement. Through exit polls and surveys conducted during elections, researchers can obtain firsthand accounts of voter preferences and motivations.

This data is pivotal in understanding the dynamics of electoral politics, voter turnout, and the influence of campaign strategies on public opinion.

Cultural practices in anthropology

Anthropologists gather primary data to explore cultural practices and beliefs, often through ethnographic studies. By immersing themselves in a community, researchers can directly observe rituals, social interactions, and traditions.

For example, a study might focus on marriage ceremonies, food customs, or religious practices within a particular culture, providing in-depth insights into the community's way of life.

Social interactions in sociology

Sociologists utilize primary data to investigate the intricacies of social interactions and societal structures. Observational studies, for instance, can reveal how individuals behave in group settings, how social norms are enforced, and how social hierarchies influence behavior.

By analyzing these interactions within settings like schools, workplaces, or public spaces, sociologists can uncover patterns and dynamics that shape social life.

Primary data collection is an integral aspect of research, enabling investigators to gather fresh, relevant data directly related to their study objectives. This direct engagement provides rich, nuanced insights that are critical for in-depth analysis. Selecting the appropriate data collection method is pivotal, as it influences the study's overall design, data quality, and conclusiveness.

Below are some of the different types of primary data utilized across various research disciplines, each offering unique benefits and suited to different research needs.

Surveys

In-person and online surveys collect data from a large audience efficiently. By utilizing structured questionnaires, researchers can gather data on a wide range of topics, such as attitudes, preferences, behaviors, or factual information.

Surveys can be distributed through various channels, including online platforms, phone, mail, or in-person, allowing for flexibility in reaching diverse populations.

Interviews

Interviews provide an in-depth look into the respondents' perspectives, experiences, or opinions. They can range from highly structured formats to open-ended, conversational styles, depending on the research goals.

Interviews are particularly valuable for exploring complex issues, understanding personal narratives, and gaining detailed insights that are not easily captured through other methods.

Focus groups

Focus groups involve guided discussions with a small group of participants, allowing researchers to explore collective views, uncover trends in perceptions, and stimulate debate on a specific topic.

This method is particularly useful for generating rich qualitative data, understanding group dynamics, and identifying variations in opinions across different demographic groups.

Observations

Observational research involves systematically watching and recording behaviors and interactions in their natural context. It can be conducted in various settings, such as schools, workplaces, or public areas, providing authentic insights into real-world behaviors.

The observation method can be either participant, where the observer is involved in the activities, or non-participant, where the researcher observes without interaction.

Experiments

Experiments are a fundamental method in scientific research, allowing researchers to control variables and measure effects accurately.

By manipulating certain factors and observing the outcomes, experiments can establish causal relationships, providing a robust basis for testing hypotheses and drawing conclusions.

Case studies

Case studies offer an in-depth examination of a particular instance or phenomenon, often involving a comprehensive analysis of individuals, organizations, events, or other entities.

This method is particularly suited to exploring new or complex issues, providing detailed contextual analysis, and uncovering underlying mechanisms or principles.

Ethnography

As a key method in anthropology, ethnography involves extended observation of a community or culture, often through fieldwork. Researchers immerse themselves in the environment, participating in and observing daily life to gain a deep understanding of social practices, norms, and values.

Ethnography is invaluable for exploring cultural phenomena, understanding community dynamics, and providing nuanced interpretations of social behavior.

Primary data collection is a fundamental aspect of research, offering distinct advantages that enhance the quality and relevance of study findings. By gathering high-quality primary data firsthand, researchers can obtain specific, up-to-date information that directly addresses their research questions or hypotheses. This section explores four key advantages of primary data collection, highlighting how it contributes to robust and insightful research outcomes.

Specificity

One of the most significant advantages of primary data collection is its specificity. Data gathered firsthand is tailored specifically to the research question or hypothesis, ensuring that the information is directly relevant and applicable to the study's objectives. This level of specificity enhances the precision of the research, allowing for a more targeted analysis and reducing the likelihood of extraneous variables influencing the results.

Currency

Primary data collection offers the advantage of currency, providing the most recent information available. This is particularly crucial in fields where data change rapidly, such as market trends, technological advancements, or social dynamics. By accessing current data, researchers can draw conclusions that are timely and reflective of the present context, adding significant value and relevance to their findings.

Control over data quality

When collecting primary data, researchers have direct control over the data quality. They can design the data collection process, choose the sample, and implement quality assurance measures to ensure valid and reliable data. This direct involvement allows researchers to address potential biases, minimize errors, and adjust methodologies as needed, ensuring that the data is accurate and representative of the population under study.

Exclusive insights

Gathering primary data provides exclusive insights that might not be available through secondary sources. By collecting unique data sets, researchers can explore uncharted territories, generate new theories, and contribute original findings to their field. This exclusivity not only advances academic knowledge but also offers competitive advantages in applied settings, such as business or policy development, where novel insights can lead to innovative solutions and strategic advancements.

While primary data collection offers numerous benefits, it also comes with distinct disadvantages that researchers must consider. These drawbacks can impact the feasibility, reliability, and overall outcome of a study. Understanding these limitations is crucial for researchers to design effective and comprehensive research methodologies. Below, we explore four significant disadvantages of primary data collection.

Time-consuming process

Primary data collection often requires a significant investment of time. From designing the data collection tools and protocols to actually gathering the data and analyzing results, each step can take a long time to carry out. For instance, conducting in-depth interviews, surveys, or extensive observations demands considerable time for both preparation and execution. This extended timeline can be a significant hurdle, especially in fields where timely data is crucial.

High costs

The financial implications of primary data collection can be substantial. Resources are needed for various stages of the process, including material creation, data gathering, personnel, and data analysis. For example, organizing focus groups or conducting large-scale surveys involves logistical expenses, compensation for participants, and possibly travel costs. Such financial requirements can limit the scope of the research or even render it unfeasible for underfunded projects.

Limited scope

Primary data collection is typically focused on a specific research question or context, which may limit the breadth of the data. While this specificity provides detailed insights into the chosen area of study, it may not offer a comprehensive overview of the subject. For example, a case study provides in-depth data about a particular case, but its findings may not be generalizable to other contexts or populations, limiting the scope of the research conclusions.

Potential for data collection bias

The process of collecting primary data is susceptible to various biases, which can compromise the data's accuracy and reliability. Researcher bias, selection bias, or response bias can skew results, leading to misleading conclusions. For instance, the presence of an observer might influence participants' behavior, or poorly designed survey questions might lead to ambiguous or skewed responses. Mitigating these biases requires meticulous planning and execution, but some level of bias is often inevitable.

Ethical considerations are paramount in the realm of primary data collection, ensuring the respect and dignity of participants are maintained while preserving the integrity of the research process. Researchers are obligated to adhere to ethical standards that promote trust, accountability, and scientific excellence. This section delves into key ethical principles that must be considered when collecting primary data.

Informed consent

Informed consent is the cornerstone of ethical research. Participants must be fully informed about the study's purpose, procedures, potential risks, and benefits, as well as their right to withdraw at any time without penalty. This information should be communicated in a clear, understandable manner, ensuring participants can make an informed decision about their involvement. Documented consent, whether written or verbal, is essential to demonstrate that participants have agreed to partake in the study voluntarily, understanding all its aspects.

Confidentiality and privacy

Protecting participants' confidentiality and privacy is crucial to uphold their rights and the data's integrity. Researchers must implement measures to ensure that personal information is securely stored and only accessible to authorized team members. Data should be anonymized or de-identified to prevent the identification of individual participants in reports or publications. Researchers must also be transparent about any data sharing plans and obtain consent for such activities, ensuring participants are aware of who might access their information and for what purposes.

Data integrity and reporting

Maintaining data integrity is fundamental to ethical research practices. Researchers are responsible for collecting, analyzing, and presenting data accurately and transparently, without fabrication, falsification, or inappropriate data manipulation. Reporting should be honest and comprehensive, reflecting all relevant findings, including any that contradict the research hypotheses. Researchers should also disclose any conflicts of interest that might influence the study's outcomes, maintaining transparency throughout the research process.

Minimizing harm

Research should be designed and conducted in a way that minimizes any potential harm to participants. This includes considering physical, psychological, emotional, and social risks. Researchers must take steps to reduce any discomfort or adverse effects, providing support or referrals if participants experience distress. Ethical research also involves selecting appropriate methodologies that align with the study's objectives while safeguarding participants' well-being, ensuring that the research's potential benefits justify any risks involved.

Benedictine University Library

Public Health Research Guide: Primary & Secondary Data Definitions

Primary Data: Data generated by the researcher himself or herself, through surveys, interviews, or experiments specially designed to understand and solve the research problem at hand.

Secondary Data: Existing data generated by large institutions, such as government agencies and healthcare facilities, as part of organizational record keeping. The relevant data is then extracted from these varied data files.

Supplementary Data: The Obama Administration determined that research conducted with federal public funds should be made freely available to the public, and that data management plans should be in place to store and preserve the data long-term. These data sets are published as supplementary materials in the journal literature, and can be downloaded and manipulated for further research.

NOTE: Even though the original research is a primary source, the supplementary files, once downloaded and reused by others, become a secondary source.

Quantitative & Qualitative Research Methods

Quantitative Research Definition: Data that can be measured and quantified; analysis typically relies on descriptive statistics.

Read:  Introduction to Quantitative Methods

Qualitative Research Definition: Data that is not numerical and hence cannot be directly quantified. It captures other characteristics through methods such as interviews, observation, and focus groups, and is sometimes termed "categorical statistics".

Read:  Qualitative methods in public health

Mixed methods research: Research in which quantitative and qualitative methods are combined.

Qualitative Research Methods:

Source: https://managementhelp.org/evaluation/program-evaluation-guide.htm#anchor1585345

Research Method

Secondary Data – Types, Methods and Examples

Secondary Data

Definition:

Secondary data refers to information that has been collected, processed, and published by someone else, rather than the researcher gathering the data firsthand. This can include data from sources such as government publications, academic journals, market research reports, and other existing datasets.

Secondary Data Types

Types of secondary data are as follows:

  • Published data: Published data refers to data that has been published in books, magazines, newspapers, and other print media. Examples include statistical reports, market research reports, and scholarly articles.
  • Government data: Government data refers to data collected by government agencies and departments. This can include data on demographics, economic trends, crime rates, and health statistics.
  • Commercial data: Commercial data is data collected by businesses for their own purposes. This can include sales data, customer feedback, and market research data.
  • Academic data: Academic data refers to data collected by researchers for academic purposes. This can include data from experiments, surveys, and observational studies.
  • Online data: Online data refers to data that is available on the internet. This can include social media posts, website analytics, and online customer reviews.
  • Organizational data: Organizational data is data collected by businesses or organizations for their own purposes. This can include data on employee performance, financial records, and customer satisfaction.
  • Historical data : Historical data refers to data that was collected in the past and is still available for research purposes. This can include census data, historical documents, and archival records.
  • International data: International data refers to data collected from other countries for research purposes. This can include data on international trade, health statistics, and demographic trends.
  • Public data : Public data refers to data that is available to the general public. This can include data from government agencies, non-profit organizations, and other sources.
  • Private data: Private data refers to data that is not available to the general public. This can include confidential business data, personal medical records, and financial data.
  • Big data: Big data refers to large, complex datasets that are difficult to manage and analyze using traditional data processing methods. This can include social media data, sensor data, and other types of data generated by digital devices.

Secondary Data Collection Methods

Secondary Data Collection Methods are as follows:

  • Published sources: Researchers can gather secondary data from published sources such as books, journals, reports, and newspapers. These sources often provide comprehensive information on a variety of topics.
  • Online sources: With the growth of the internet, researchers can now access a vast amount of secondary data online. This includes websites, databases, and online archives.
  • Government sources : Government agencies often collect and publish a wide range of secondary data on topics such as demographics, crime rates, and health statistics. Researchers can obtain this data through government websites, publications, or data portals.
  • Commercial sources: Businesses often collect and analyze data for marketing research or customer profiling. Researchers can obtain this data through commercial data providers or by purchasing market research reports.
  • Academic sources: Researchers can also obtain secondary data from academic sources such as published research studies, academic journals, and dissertations.
  • Personal contacts: Researchers can also obtain secondary data from personal contacts, such as experts in a particular field or individuals with specialized knowledge.
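
Whichever source is used, secondary data usually arrives as a downloaded file. A minimal Python sketch (the column names and figures are hypothetical) that parses such a CSV extract into records ready for analysis:

```python
import csv
import io

def load_table(csv_text, numeric_fields=()):
    """Parse CSV text from a downloaded secondary dataset into a list
    of dicts, converting the named fields to floats for analysis."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for field in numeric_fields:
            row[field] = float(row[field])
        rows.append(row)
    return rows

# Hypothetical extract from a government statistics portal:
sample = "region,population\nNorth,120000\nSouth,95500\n"
regions = load_table(sample, numeric_fields=["population"])
```

Reading into plain dicts keeps the original values visible while making the numeric fields usable for the analysis methods described below.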

Secondary Data Formats

Secondary data can come in various formats depending on the source from which it is obtained. Here are some common formats of secondary data:

  • Numeric Data: Numeric data is often in the form of statistics and numerical figures that have been compiled and reported by organizations such as government agencies, research institutions, and commercial enterprises. This can include data such as population figures, GDP, sales figures, and market share.
  • Textual Data: Textual data is often in the form of written documents, such as reports, articles, and books. This can include qualitative data such as descriptions, opinions, and narratives.
  • Audiovisual Data : Audiovisual data is often in the form of recordings, videos, and photographs. This can include data such as interviews, focus group discussions, and other types of qualitative data.
  • Geospatial Data: Geospatial data is often in the form of maps, satellite images, and geographic information systems (GIS) data. This can include data such as demographic information, land use patterns, and transportation networks.
  • Transactional Data : Transactional data is often in the form of digital records of financial and business transactions. This can include data such as purchase histories, customer behavior, and financial transactions.
  • Social Media Data: Social media data is often in the form of user-generated content from social media platforms such as Facebook, Twitter, and Instagram. This can include data such as user demographics, content trends, and sentiment analysis.

Secondary Data Analysis Methods

Secondary data analysis involves the use of pre-existing data for research purposes. Here are some common methods of secondary data analysis:

  • Descriptive Analysis: This method involves describing the characteristics of a dataset, such as the mean, standard deviation, and range of the data. Descriptive analysis can be used to summarize data and provide an overview of trends.
  • Inferential Analysis: This method involves making inferences and drawing conclusions about a population based on a sample of data. Inferential analysis can be used to test hypotheses and determine the statistical significance of relationships between variables.
  • Content Analysis: This method involves analyzing textual or visual data to identify patterns and themes. Content analysis can be used to study the content of documents, media coverage, and social media posts.
  • Time-Series Analysis : This method involves analyzing data over time to identify trends and patterns. Time-series analysis can be used to study economic trends, climate change, and other phenomena that change over time.
  • Spatial Analysis : This method involves analyzing data in relation to geographic location. Spatial analysis can be used to study patterns of disease spread, land use patterns, and the effects of environmental factors on health outcomes.
  • Meta-Analysis: This method involves combining data from multiple studies to draw conclusions about a particular phenomenon. Meta-analysis can be used to synthesize the results of previous research and provide a more comprehensive understanding of a particular topic.
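
The descriptive step is easy to make concrete. A sketch using Python's standard statistics module (the sample values are made up) that summarizes a numeric series by the measures named above: mean, standard deviation, and range.

```python
import statistics

def describe(values):
    """Descriptive summary of a numeric data series:
    mean, sample standard deviation, and range (min, max)."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "range": (min(values), max(values)),
    }

summary = describe([2, 4, 4, 4, 5, 5, 7, 9])  # hypothetical observations
```

Summaries like this give a quick overview of a secondary dataset before any inferential analysis is attempted.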

Secondary Data Gathering Guide

Here are some steps to follow when gathering secondary data:

  • Define your research question: Start by defining your research question and identifying the specific information you need to answer it. This will help you identify the type of secondary data you need and where to find it.
  • Identify relevant sources: Identify potential sources of secondary data, including published sources, online databases, government sources, and commercial data providers. Consider the reliability and validity of each source.
  • Evaluate the quality of the data: Evaluate the quality and reliability of the data you plan to use. Consider the data collection methods, sample size, and potential biases. Make sure the data is relevant to your research question and is suitable for the type of analysis you plan to conduct.
  • Collect the data: Collect the relevant data from the identified sources. Use a consistent method to record and organize the data to make analysis easier.
  • Validate the data: Validate the data to ensure that it is accurate and reliable. Check for inconsistencies, missing data, and errors. Address any issues before analyzing the data.
  • Analyze the data: Analyze the data using appropriate statistical and analytical methods. Use descriptive and inferential statistics to summarize and draw conclusions from the data.
  • Interpret the results: Interpret the results of your analysis and draw conclusions based on the data. Make sure your conclusions are supported by the data and are relevant to your research question.
  • Communicate the findings: Communicate your findings clearly and concisely. Use appropriate visual aids such as graphs and charts to help explain your results.
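The "validate the data" step above can be sketched in code. The following Python snippet uses an invented enrollment dataset and checks chosen purely for illustration; it flags missing values, duplicates, and implausible entries before analysis begins:

```python
# Hypothetical secondary dataset: enrollment figures pulled from a report.
records = [
    {"year": 2020, "enrollment": 1200},
    {"year": 2021, "enrollment": None},  # missing value
    {"year": 2021, "enrollment": 1350},  # duplicate year
    {"year": 2022, "enrollment": -50},   # implausible value
]

def validate(rows):
    """Return (index, issue) pairs for rows that fail basic quality checks."""
    issues = []
    seen_years = set()
    for i, row in enumerate(rows):
        if row["enrollment"] is None:
            issues.append((i, "missing enrollment"))
        elif row["enrollment"] < 0:
            issues.append((i, "implausible enrollment"))
        if row["year"] in seen_years:
            issues.append((i, "duplicate year"))
        seen_years.add(row["year"])
    return issues

problems = validate(records)  # flags rows 1, 2, and 3
```

Each flagged issue should be resolved, or the affected row excluded, before moving on to the analysis step.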

Examples of Secondary Data

Here are some examples of secondary data from different fields:

  • Healthcare: Hospital records, medical journals, clinical trial data, and disease registries are examples of secondary data sources in healthcare. These sources can provide researchers with information on patient demographics, disease prevalence, and treatment outcomes.
  • Marketing: Market research reports, customer surveys, and sales data are examples of secondary data sources in marketing. These sources can provide marketers with information on consumer preferences, market trends, and competitor activity.
  • Education: Student test scores, graduation rates, and enrollment statistics are examples of secondary data sources in education. These sources can provide researchers with information on student achievement, teacher effectiveness, and educational disparities.
  • Finance: Stock market data, financial statements, and credit reports are examples of secondary data sources in finance. These sources can provide investors with information on market trends, company performance, and creditworthiness.
  • Social Science: Government statistics, census data, and survey data are examples of secondary data sources in social science. These sources can provide researchers with information on population demographics, social trends, and political attitudes.
  • Environmental Science: Climate data, remote sensing data, and ecological monitoring data are examples of secondary data sources in environmental science. These sources can provide researchers with information on weather patterns, land use, and biodiversity.

Purpose of Secondary Data

The purpose of secondary data is to provide researchers with information that has already been collected by others for other purposes. Secondary data can be used to support research questions, test hypotheses, and answer research objectives. Some of the key purposes of secondary data are:

  • To gain a better understanding of the research topic: Secondary data can be used to provide context and background information on a research topic. This can help researchers understand the historical and social context of their research and gain insights into relevant variables and relationships.
  • To save time and resources: Collecting new primary data can be time-consuming and expensive. Using existing secondary data sources can save researchers time and resources by providing access to pre-existing data that has already been collected and organized.
  • To provide comparative data: Secondary data can be used to compare and contrast findings across different studies or datasets. This can help researchers identify trends, patterns, and relationships that may not have been apparent from individual studies.
  • To support triangulation: Triangulation is the process of using multiple sources of data to confirm or refute research findings. Secondary data can be used to support triangulation by providing additional sources of data to support or refute primary research findings.
  • To supplement primary data: Secondary data can be used to supplement primary data by providing additional information or insights that were not captured by the primary research. This can help researchers gain a more complete understanding of the research topic and draw more robust conclusions.

When to use Secondary Data

Secondary data can be useful in a variety of research contexts, and there are several situations in which it may be appropriate to use secondary data. Some common situations in which secondary data may be used include:

  • When primary data collection is not feasible: Collecting primary data can be time-consuming and expensive, and in some cases, it may not be feasible to collect primary data. In these situations, secondary data can provide valuable insights and information.
  • When exploring a new research area: Secondary data can be a useful starting point for researchers who are exploring a new research area. Secondary data can provide context and background information on a research topic, and can help researchers identify key variables and relationships to explore further.
  • When comparing and contrasting research findings: Secondary data can be used to compare and contrast findings across different studies or datasets. This can help researchers identify trends, patterns, and relationships that may not have been apparent from individual studies.
  • When triangulating research findings: Triangulation is the process of using multiple sources of data to confirm or refute research findings. Secondary data can be used to support triangulation by providing additional sources of data to support or refute primary research findings.
  • When validating research findings: Secondary data can be used to validate primary research findings by providing additional sources of data that support or refute the primary findings.

Characteristics of Secondary Data

Secondary data have several characteristics that distinguish them from primary data. Here are some of the key characteristics of secondary data:

  • Non-reactive: Secondary data are non-reactive, meaning that they are not collected for the specific purpose of the research study. This means that the researcher has no control over the data collection process, and cannot influence how the data were collected.
  • Time-saving: Secondary data are pre-existing, meaning that they have already been collected and organized by someone else. This can save the researcher time and resources, as they do not need to collect the data themselves.
  • Wide-ranging: Secondary data sources can provide a wide range of information on a variety of topics. This can be useful for researchers who are exploring a new research area or seeking to compare and contrast research findings.
  • Less expensive: Secondary data are generally less expensive than primary data, as they do not require the researcher to incur the costs associated with data collection.
  • Potential for bias: Secondary data may be subject to biases that were present in the original data collection process. For example, data may have been collected using a biased sampling method, or the data may be incomplete or inaccurate.
  • Lack of control: The researcher has no control over the data collection process and cannot ensure that the data were collected using appropriate methods or measures.
  • Requires careful evaluation: Secondary data sources must be evaluated carefully to ensure that they are appropriate for the research question and analysis. This includes assessing the quality, reliability, and validity of the data sources.

Advantages of Secondary Data

There are several advantages to using secondary data in research, including:

  • Time-saving: Collecting primary data can be time-consuming and expensive. Secondary data can be accessed quickly and easily, which can save researchers time and resources.
  • Cost-effective: Secondary data are generally less expensive than primary data, as they do not require the researcher to incur the costs associated with data collection.
  • Large sample size: Secondary data sources often have larger sample sizes than primary data sources, which can increase the statistical power of the research.
  • Access to historical data: Secondary data sources can provide access to historical data, which can be useful for researchers who are studying trends over time.
  • Fewer ethical concerns: Secondary data are already in existence, so there are typically fewer ethical concerns related to collecting data from human subjects.
  • May be more objective: Secondary data may be more objective than primary data, as the data were not collected for the specific purpose of the research study.

Limitations of Secondary Data

While there are many advantages to using secondary data in research, there are also some limitations that should be considered. Some of the main limitations of secondary data include:

  • Lack of control over data quality: Researchers do not have control over the data collection process, which means they cannot ensure the accuracy or completeness of the data.
  • Limited availability: Secondary data may not be available for the specific research question or study design.
  • Lack of information on sampling and data collection methods: Researchers may not have access to information on the sampling and data collection methods used to gather the secondary data. This can make it difficult to evaluate the quality of the data.
  • Data may not be up-to-date: Secondary data may not be up-to-date or relevant to the current research question.
  • Data may be incomplete or inaccurate: Secondary data may be incomplete or inaccurate due to missing or incorrect data points, data entry errors, or other factors.
  • Biases in data collection: The data may have been collected using biased sampling or data collection methods, which can limit the validity of the data.
  • Lack of control over variables: Researchers have limited control over the variables that were measured in the original data collection process, which can limit the ability to draw conclusions about causality.

About the author


Muhammad Hassan

Researcher, Academic Writer, Web developer


  • Key Differences

Know the Differences & Comparisons

Difference Between Primary and Secondary Data


There are many differences between primary and secondary data, which are discussed in this article. The most important difference is that primary data is factual and original, whereas secondary data is essentially the analysis and interpretation of primary data. Primary data is collected with the aim of solving the problem at hand, while secondary data was collected for other purposes.

Definition of Primary Data

Primary data is data originated for the first time by the researcher through direct effort and experience, specifically for the purpose of addressing the research problem. It is also known as first-hand or raw data. Primary data collection is quite expensive, as the research is conducted by the organisation or agency itself, which requires resources such as investment and manpower. The data collection is under the direct control and supervision of the investigator.

The data can be collected through various methods such as surveys, observations, physical testing, mailed questionnaires, questionnaires completed and returned by enumerators, personal interviews, telephone interviews, focus groups, and case studies.

Definition of Secondary Data

Secondary data refers to second-hand information that has already been collected and recorded by someone other than the user, for a purpose not related to the current research problem. It is a readily available form of data, gathered from sources such as censuses, government publications, internal records of an organisation, reports, books, journal articles, and websites.

Secondary data offers several advantages: it is easily available and saves the researcher time and cost. However, there are disadvantages as well; because the data was gathered for purposes other than the problem at hand, its usefulness may be limited in areas such as relevance and accuracy.

Moreover, the objective and the method adopted for acquiring data may not be suitable to the current situation. Therefore, before using secondary data, these factors should be kept in mind.

Key Differences Between Primary and Secondary Data

The fundamental differences between primary and secondary data are discussed in the following points:

  • The term primary data refers to data originated by the researcher for the first time. Secondary data is already existing data, collected earlier by investigators, agencies, and organisations.
  • Primary data is real-time data, whereas secondary data relates to the past.
  • Primary data is collected to address the problem at hand, while secondary data is collected for purposes other than the problem at hand.
  • Primary data collection is a very involved process. On the other hand, secondary data collection is rapid and easy.
  • Primary data collection sources include surveys, observations, experiments, questionnaires, personal interviews, etc. In contrast, secondary data collection sources are government publications, websites, books, journal articles, internal records, etc.
  • Primary data collection requires a large amount of resources such as time, cost, and manpower. Conversely, secondary data is relatively inexpensive and quickly available.
  • Primary data is always specific to the researcher's needs, and the researcher controls the quality of the research. In contrast, secondary data is neither specific to the researcher's needs, nor does the researcher have control over the data quality.
  • Primary data is available in raw form, whereas secondary data is a refined form of primary data. It can also be said that secondary data is obtained when statistical methods are applied to primary data.
  • Data collected through primary sources is generally more reliable and accurate than data from secondary sources.


As can be seen from the above discussion, primary data is original and unique data, collected directly by the researcher from a source according to his or her requirements. Secondary data, by contrast, is easily accessible but not as pure, since it has already undergone statistical treatment.


Primary vs Secondary Research Methods: 15 Key Differences

busayo.longe

When carrying out a systematic investigation, you can choose to be directly involved in the data collection process or to rely on already acquired information. While the former is described as primary research, the latter is known as secondary research. 

The distinguishing factor between primary research and secondary research is the degree of involvement of the researcher in the data-gathering process. In this article, we'll detail other key differences between primary and secondary research, and also show you how to conduct primary research with Formplus.

What is Primary Research?

Primary research is a type of research that requires the researcher to participate directly in the data-gathering process. In primary research, the researcher does not depend on already existing data, rather he or she collects first-hand information which serves as research materials for the systematic investigation. 

This type of research gives the researcher absolute ownership of the data which is extremely important for businesses and organisations in fast-paced markets. These organisations utilise primary research to gather valuable information about consumer needs and preferences before launching a new product or service.  

Usually, primary research focuses on the specific needs of the research context. However, this type of research is expensive and time-consuming, and it usually requires skilled resources that may not be readily available, which is why many businesses outsource it to third-party market research companies.

What is Secondary Research?

Secondary research is a research approach in which the researcher relies solely on existing research materials rather than gathering data directly. This approach is less expensive and more time-efficient than primary research.

Data for secondary research can be accessed from the internet, archives, libraries, educational institutions, and organisational reports. However, the researcher must take extra care to ensure that the data is valid, as invalid data can have a negative impact on the research process and outcomes.

Differences Between Primary and Secondary Research

Primary research is a research approach that involves gathering data directly while secondary research is a research approach that involves relying on already existing data when carrying out a systematic investigation. 

This means that in primary research, the researcher is directly involved in the data collection and categorization process. In secondary research, on the other hand, the researcher simply depends on existing materials for the research without any need to collect raw information from the field. 

  • Sources of Data

Surveys, interviews, focus groups and observation techniques are common sources of data in primary research. In secondary research, the researcher collects existing research materials through a number of sources like the internet, libraries and archives.

These data collection methods require some sort of interaction with the research subjects in order to gather first-hand information that will be useful in the research. Many times, secondary sources are free to access, but some of them require you to pay an access fee before you can make use of the information.

  • Other Names

Secondary research is also known as desk research because it does not necessarily require the researcher to move from one place to another. Meanwhile, primary research is also referred to as a field research design because it requires the researcher to get totally involved with the data collection process.

In secondary research, researchers can easily access information from the comfort of their desk; especially when using the internet to source for research materials. In some cases, the researcher would need to co-exist with the research subjects for a specific period of time in order to get information for the research. 

  • Advantages of Primary Research over Secondary Research

Unlike secondary research, primary research gives the researcher 100% ownership of the research data which is extremely useful for organisations in highly competitive markets. Data from secondary research can be accessed by everyone and does not yield any specific benefits to organisations. 

Also, in primary research, the researcher can fully account for the authenticity of the data because he or she is an active participant in the data collection process. Because the researcher is not directly involved in gathering secondary research data, he or she cannot ascertain the authenticity of the research materials. 

  • Advantages of Secondary Research over Primary Research.

Unlike primary research that is expensive and time-consuming, secondary research can be completed in limited time and with limited resources. Since the research data already exists, the secondary researcher does not need to invest time or resources to gather first-hand information. 

Also, secondary research helps to prevent knowledge repetition by mapping out already existing research efforts and this helps the primary researcher to concentrate on exploring new areas of knowledge. Hence, it is important for every research effort to begin with secondary research. 

Common tools used to collect data in secondary research include bots, internet-enabled devices like laptops, smartphones and tablets. On the other hand, surveys, questionnaires and interviews are common data gathering tools in primary research.

Secondary research devices help researchers to access sources of secondary data like libraries, archives and peer-reviewed journals; without needing to go to the field.  Primary research tools help the researcher to access first-hand information about the characteristics, dispositions and behaviours of research subjects in line with the context of the systematic investigation.  

Primary research makes use of real-time information while secondary research makes use of past or already existing research materials. During primary research, the researcher is ultimately concerned with gathering first-hand information about the research subjects and contexts, while in secondary research, the researcher simply re-examines existing data.

Hence, the type of data used in secondary research is described as “past data” because it reflects past occurrences and only provides insights into dealing with present situations. The role of the secondary researcher is primarily to specify how this past data informs his or her current research.

  • Research Purpose

The purpose of primary research is to gather real-time data that will be useful in solving a specific problem. On the other hand, the purpose of secondary research is to gather existing research materials that may not directly address the problem at hand. 

The primary research process is carefully tailored towards the specific research problem from start to finish and this is why it relies on first-hand data. Secondary research is not tailored towards solving a specific problem rather, it provides general information that can prove useful for primary research. 

  • When to Conduct Primary and Secondary Research

Primary or field research is usually carried out when an individual or organization needs to gather recent data that is useful for a specific research context. When organisations need to gather information on the changing needs of target markets, they typically employ primary research methods. 

Secondary research, on the other hand, is used when the researcher needs to identify existing knowledge that can provide useful insight in research. With this information, the researcher can identify knowledge gaps which would form the core of his or her research efforts. 

  • Data Recency

Primary research relies on recent data for its systematic investigation because it addresses present situations. As earlier asserted, primary research efforts are ultimately tailored towards the needs of a specific research context from start to finish; hence, the primary researcher must gather real-time data in order to arrive at relevant research outcomes.

Secondary research, on the other hand, makes use of past data in an attempt to understand existing research efforts, identify knowledge gaps, and map out recent research to fill these knowledge gaps. Thus, findings from secondary research do not necessarily apply to specific research contexts.

  • Feasibility

Secondary research is more feasible than primary research. For example, it may be impractical for a company to attempt to observe the buying culture of every individual in its target market.

In this case, the researcher may have to depend on existing research findings that detail the buying culture of the target market. Alternatively, the researcher can use other sampling methods that would help him or her gather feedback from a section of the market. 

Examples of primary research data are a student thesis, market research, and first-person accounts of trauma survivors, while examples of secondary research data include newspapers, books, academic journals, and magazines.

Secondary research data often represent an aggregation of already existing information with little or no additions while primary data contains new information. Usually, primary research collects data from the original source unlike secondary research that relies on reported information. For example, a student who wants to write a thesis would need to either interact with the research subjects in their natural environment or carry out an experiment. 

  • Specificity

Primary research is more specific than secondary research because it is aimed at addressing issues peculiar to a business, organisation, or institution. On the other hand, secondary research does not cater to the specific needs of an organization.

For example, when carrying out a primary research on consumer satisfaction for a product, the entirety of the research process is tailored towards the product in question. In secondary research, however, the data collected may not be exactly what the researcher needs. 

In primary research, the researcher has 100% ownership and control over the data and he or she can choose to make such information available to others or not. This means that the primary researcher has absolute discretion over the research materials. 

In secondary research, however, the researcher does not own the data and as such, he or she does not have absolute discretion over it. Secondary research can aptly be described as a “free-for-all” situation because everyone can gain access to the data. 

  • Data Accuracy

Data gathered through primary research is more accurate than secondary research data. In primary research, the researcher is fully involved in the data collection process and he or she takes care to collect valid data that can be easily authenticated. 

The secondary researcher, on the other hand, has no control over the data and he or she cannot account for the validity of the research materials. For instance, there is a lot of inaccurate information on the internet which can affect research outcomes when used as the basis of a systematic investigation.  

Similarity between Primary and Secondary Research

Primary and secondary research both make use of quantitative and qualitative data. Quantitative data collection methods such as surveys and questionnaires are used to gather numerical data, while qualitative data collection methods like observation are used to gather descriptive data.

How to Conduct Primary Research with Formplus 

Primary research can be conducted with Formplus using a survey or questionnaire. Here is a step-by-step guide on how to go about this.

  • Sign into Formplus


With Formplus, you can create different types of surveys and questionnaires for primary research. Sign into your Formplus account to access the form builder where you can seamlessly add and modify different form fields for your primary research survey. 

Once you sign in, click on “create new form” to begin. 


In the builder page, you can specify your form title to be “Primary Research Survey” in the title box. Next, click on or drag your desired form fields into your survey form from the builder’s inputs section. 

  • Edit fields
  • Click on “Save”
  • Preview form. 
  • Form Customization


In the form customization section in the form builder, you can easily personalize your primary research survey by modifying its outlook to suit your needs. Formplus allows you to change your form theme, add background images and even change the font according to your needs. 

  • Multiple Sharing Options


With Formplus, you can easily share your primary research survey with respondents using the available multiple sharing options. You can use the direct social media sharing buttons to share your form link to your organization’s social media pages. 

You can send out your survey form as email invitations to your research subjects too. If you wish, you can share your form’s QR code or embed it in your organization’s website for easy access. 

Conclusion   

Many times, researchers combine primary and secondary data collection methods in order to arrive at the most valid outcomes at the end of a systematic investigation. Usually, they start off with secondary research to effectively map out a relevant scope for their research effort, before proceeding to conduct primary research. 

It is important to consider the strengths and weaknesses of secondary and primary research before opting for either method. More importantly, you should pay attention to the overall aim of your systematic investigation, as this is the fundamental determinant when choosing between primary and secondary research.


Survey Software & Market Research Solutions - Sawtooth Software


Primary vs Secondary Research: Differences, Methods, Sources, and More

Two images representing primary vs secondary research: woman holding a phone taking an online survey (primary research), and a stack of books bound with string (secondary research).


Primary vs Secondary Research – What’s the Difference?

In the search for knowledge and data to inform decisions, researchers and analysts rely on a blend of research sources. These sources are broadly categorized into primary and secondary research, each serving unique purposes and offering different insights into the subject matter at hand. But what exactly sets them apart?

Primary research is the process of gathering fresh data directly from its source. This approach offers real-time insights and information tailored to specific objectives set by stakeholders. Examples include surveys, interviews, and observational studies.

Secondary research, on the other hand, involves the analysis of existing data, most often collected and presented by others. This type of research is invaluable for understanding broader trends, providing context, or validating hypotheses. Common sources include scholarly articles, industry reports, and data compilations.

The crux of the difference lies in the origin of the information: primary research yields firsthand data which can be tailored to a specific business question, whilst secondary research synthesizes what's already out there. In essence, primary research listens directly to the voice of the subject, whereas secondary research hears it secondhand.

When to Use Primary and Secondary Research

Selecting the appropriate research method is pivotal and should be aligned with your research objectives. The choice between primary and secondary research is not merely procedural but strategic, influencing the depth and breadth of insights you can uncover.

Primary research shines when you need up-to-date, specific information directly relevant to your study. It's the go-to for fresh insights, understanding consumer behavior, or testing new theories. Its bespoke nature makes it indispensable for tailoring questions to get the exact answers you need.


Secondary research is your first step into the research world. It helps set the stage by offering a broad understanding of the topic. Before diving into costly primary research, secondary research can validate the need for further investigation or provide a solid background to build upon. It's especially useful for identifying trends, benchmarking, and situating your research within the existing body of knowledge.

Combining both methods can significantly enhance your research. Starting with secondary research lays the groundwork and narrows the focus, whilst subsequent primary research delves deep into specific areas of interest, providing a well-rounded, comprehensive understanding of the topic.

Primary vs Secondary Research Methods

In the landscape of market research, the methodologies employed can significantly influence the insights and conclusions drawn. Let's delve deeper into the various methods underpinning both primary and secondary research, shedding light on their unique applications and the distinct insights they offer.

Two women interviewing at a table. Represents primary research interviews.

Primary Research Methods:

  • Surveys: Surveys are a cornerstone of primary research, offering a quantitative approach to gathering data directly from the target audience. By employing structured questionnaires, researchers can collect a vast array of data ranging from customer preferences to behavioral patterns. This method is particularly valuable for acquiring statistically significant data that can inform decision-making processes and strategy development. Statistical approaches for analysing this data, such as key drivers analysis, MaxDiff or conjoint analysis, can further enhance the value of the collected data.
  • One-on-One Interviews: Interviews provide a qualitative depth to primary research, allowing for a nuanced exploration of participants' attitudes, experiences, and motivations. Conducted either face-to-face or remotely, interviews enable researchers to delve into the complexities of human behavior, offering rich insights that surveys alone may not uncover. This method is instrumental in exploring new areas of research or obtaining detailed information on specific topics.
  • Focus Groups: Focus groups bring together a small, diverse group of participants to discuss and provide feedback on a particular subject, product, or idea. This interactive setting fosters a dynamic exchange of ideas, revealing consumers' perceptions, experiences, and preferences. Focus groups are invaluable for testing concepts, exploring market trends, and understanding the factors that influence consumer decisions.
  • Ethnographic Studies: Ethnographic studies involve the systematic watching, recording, and analysis of behaviors and events in their natural setting. This method offers an unobtrusive way to gather authentic data on how people interact with products, services, or environments, providing insights that can lead to more user-centered design and marketing strategies.

The interior of a two story library with books lining the walls and study cubicles in the center of the room. Represents secondary research.

Secondary Research Methods:

  • Literature Reviews: Literature reviews involve the comprehensive examination of existing research and publications on a given topic. This method enables researchers to synthesize findings from a range of sources, providing a broad understanding of what is already known about a subject and identifying gaps in current knowledge.
  • Meta-Analysis: Meta-analysis is a statistical technique that combines the results of multiple studies to arrive at a comprehensive conclusion. This method is particularly useful in secondary research for aggregating findings across different studies, offering a more robust understanding of the evidence on a particular topic.
  • Content Analysis: Content analysis is a method for systematically analyzing texts, media, or other content to quantify patterns, themes, or biases. This approach allows researchers to assess the presence of certain words, concepts, or sentiments within a body of work, providing insights into trends, representations, and societal norms. This can be performed across a range of sources including social media, customer forums or review sites.
  • Historical Research: Historical research involves the study of past events, trends, and behaviors through the examination of relevant documents and records. This method can provide context and understanding of current trends and inform future predictions, offering a unique perspective that enriches secondary research.
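Content analysis in particular lends itself to simple automation. As a toy illustration (not any specific tool's API; the review texts are made up), counting target terms across a small corpus might look like:

```python
from collections import Counter
import re

def term_frequencies(texts, terms):
    """Count how often each target term appears across a corpus (case-insensitive)."""
    counts = Counter({term: 0 for term in terms})
    for text in texts:
        # Split each document into lowercase word tokens
        tokens = re.findall(r"[a-z']+", text.lower())
        for term in terms:
            counts[term] += tokens.count(term)
    return counts

# Hypothetical customer reviews, purely for illustration
reviews = [
    "Great battery life, but the battery charges slowly.",
    "Battery drains fast; screen is great though.",
]
print(term_frequencies(reviews, ["battery", "great", "screen"]))
# Counter({'battery': 3, 'great': 2, 'screen': 1})
```

Real content-analysis work layers stemming, phrase matching, and sentiment scoring on top of this kind of counting, but the core idea of quantifying patterns in text is the same.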

Each of these methods, whether primary or secondary, plays a crucial role in the mosaic of market research, offering distinct pathways to uncovering the insights necessary to drive informed decisions and strategies.

Primary vs Secondary Sources in Research

Both primary and secondary sources of research form the backbone of the insight generation process; when utilized in tandem, they provide the perfect stepping stone for generating real insights. Let’s explore how each category serves its unique purpose in the research ecosystem.

Primary Research Data Sources

Primary research data sources are the lifeblood of firsthand research, providing raw, unfiltered insights directly from the source. These include:

  • Customer Satisfaction Survey Results: Direct feedback from customers about their satisfaction with a product or service. This data is invaluable for identifying strengths to build on and areas for improvement and typically renews each month or quarter so that metrics can be tracked over time.
  • NPS Rating Scores from Customers: Net Promoter Score (NPS) provides a straightforward metric to gauge customer loyalty and satisfaction. This quantitative data can reveal much about customer sentiment and the likelihood of referrals.
  • Ad-hoc Surveys: Ad-hoc surveys can be about any topic which requires investigation; they are typically one-off surveys which zero in on one particular business objective. Ad-hoc projects are useful for situations such as investigating issues identified in other tracking surveys, new product development, ad testing, brand messaging, and many other kinds of projects.
  • A Field Researcher’s Notes: Detailed observations from fieldwork can offer nuanced insights into user behaviors, interactions, and environmental factors that influence those interactions. These notes are a goldmine for understanding the context and complexities of user experiences.
  • Recordings Made During Focus Groups: Audio or video recordings of focus group discussions capture the dynamics of conversation, including reactions, emotions, and the interplay of ideas. Analyzing these recordings can uncover nuanced consumer attitudes and perceptions that might not be evident in survey data alone.
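Of these sources, the NPS metric has a simple, well-known formula: the percentage of promoters (scores of 9–10) minus the percentage of detractors (scores of 0–6). A minimal sketch with made-up ratings:

```python
def nps(scores):
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return round(100 * (promoters - detractors) / len(scores))

# Hypothetical ratings: 5 promoters, 3 passives (7-8), 2 detractors
ratings = [10, 9, 9, 10, 9, 8, 7, 8, 5, 3]
print(nps(ratings))  # (5 - 2) / 10 => 30
```

Because the score nets promoters against detractors, it ranges from -100 (all detractors) to +100 (all promoters), which is what makes it a convenient metric to track over time.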

These primary data sources are characterized by their immediacy and specificity, offering a direct line to the subject of study. They enable researchers to gather data that is specifically tailored to their research objectives, providing a solid foundation for insightful analysis and strategic decision-making.

Secondary Research Data Sources

In contrast, secondary research data sources offer a broader perspective, compiling and synthesizing information from various origins. These sources include:

  • Books, Magazines, Scholarly Journals: Published works provide comprehensive overviews, detailed analyses, and theoretical frameworks that can inform research topics, offering depth and context that enriches primary data.
  • Market Research Reports: These reports aggregate data and analyses on industry trends, consumer behavior, and market dynamics, providing a macro-level view that can guide primary research directions and validate findings.
  • Government Reports: Official statistics and reports from government agencies offer authoritative data on a wide range of topics, from economic indicators to demographic trends, providing a reliable basis for secondary analysis.
  • White Papers, Private Company Data: White papers and reports from businesses and consultancies offer insights into industry-specific research, best practices, and market analyses. These sources can be invaluable for understanding the competitive landscape and identifying emerging trends.

Secondary data sources serve as a compass, guiding researchers through the vast landscape of information to identify relevant trends, benchmark against existing data, and build upon the foundation of existing knowledge. They can significantly expedite the research process by leveraging the collective wisdom and research efforts of others.

By adeptly navigating both primary and secondary sources, researchers can construct a well-rounded research project that combines the depth of firsthand data with the breadth of existing knowledge. This holistic approach ensures a comprehensive understanding of the research topic, fostering informed decisions and strategic insights.

Examples of Primary and Secondary Research in Marketing

In the realm of marketing, both primary and secondary research methods play critical roles in understanding market dynamics, consumer behavior, and competitive landscapes. By comparing examples across both methodologies, we can appreciate their unique contributions to strategic decision-making.

Example 1: New Product Development

Primary Research: Direct Consumer Feedback through Surveys and Focus Groups

  • Objective: To gauge consumer interest in a new product concept and identify preferred features.
  • Process: Surveys distributed to a target demographic to collect quantitative data on consumer preferences, and focus groups conducted to dive deeper into consumer attitudes and desires.
  • Insights: Direct insights into consumer needs, preferences for specific features, and willingness to pay. These insights help in refining product design and developing a targeted marketing strategy.

Secondary Research: Market Analysis Reports

  • Objective: To understand the existing market landscape, including competitor products and market trends.
  • Process: Analyzing published market analysis reports and industry studies to gather data on market size, growth trends, and competitive offerings.
  • Insights: Provides a broader understanding of the market, helping to position the new product strategically against competitors and align it with current trends.

Example 2: Brand Positioning

Primary Research: Brand Perception Analysis through Surveys

  • Objective: To understand how the brand is perceived by consumers and identify potential areas for repositioning.
  • Process: Conducting surveys that ask consumers to describe the brand in their own words, rate it against various attributes, and compare it to competitors.
  • Insights: Direct feedback on brand strengths and weaknesses from the consumer's perspective, offering actionable data for adjusting brand messaging and positioning.

Secondary Research: Social Media Sentiment Analysis

  • Objective: To analyze public sentiment towards the brand and its competitors.
  • Process: Utilizing software tools to analyze mentions, hashtags, and discussions related to the brand and its competitors across social media platforms.
  • Insights: Offers an overview of public perception and emerging trends in consumer sentiment, which can validate findings from primary research or highlight areas needing further investigation.

Example 3: Market Expansion Strategy

Primary Research: Consumer Demand Studies in New Markets

  • Objective: To assess demand and consumer preferences in a new geographic market.
  • Process: Conducting surveys and interviews with potential consumers in the target market to understand their needs, preferences, and cultural nuances.
  • Insights: Provides specific insights into the new market’s consumer behavior, preferences, and potential barriers to entry, guiding market entry strategies.

Secondary Research: Economic and Demographic Analysis

  • Objective: To evaluate the economic viability and demographic appeal of the new market.
  • Process: Reviewing existing economic reports, demographic data, and industry trends relevant to the target market.
  • Insights: Offers a macro view of the market's potential, including economic conditions, demographic trends, and consumer spending patterns, which can complement insights gained from primary research.

By leveraging both primary and secondary research, marketers can form a comprehensive understanding of their market, consumers, and competitors, facilitating informed decision-making and strategic planning. Each method brings its strengths to the table, with primary research offering direct consumer insights and secondary research providing a broader context within which to interpret those insights.

What Are the Pros and Cons of Primary and Secondary Research?

When it comes to market research, both primary and secondary research offer unique advantages and face certain limitations. Understanding these can help researchers and businesses make informed decisions on which approach to utilize for their specific needs. The points below highlight the pros and cons of each research type.

Navigating the Pros and Cons

  • Balance Your Research Needs: Consider starting with secondary research to gain a broad understanding of the subject matter, then delve into primary research for specific, targeted insights that are tailored to your precise needs.
  • Resource Allocation: Evaluate your budget, time, and resource availability. Primary research can offer more specific and actionable data but requires more resources. Secondary research is more accessible but may lack the specificity or recency you need.
  • Quality and Relevance: Assess the quality and relevance of available secondary sources before deciding if primary research is necessary. Sometimes, the existing data might suffice, especially for preliminary market understanding or trend analysis.
  • Combining Both for Comprehensive Insights: Often, the most effective research strategy involves a combination of both primary and secondary research. This approach allows for a more comprehensive understanding of the market, leveraging the broad perspective provided by secondary sources and the depth and specificity of primary data.


Research Methods In Psychology

Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.


Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

Research methods in psychology are systematic procedures used to observe, describe, predict, and explain behavior and mental processes. They include experiments, surveys, case studies, and naturalistic observations, ensuring data collection is objective and reliable to understand and explain psychological phenomena.


Hypotheses are statements predicting the results of a study, which can be verified or disproved by investigation.

There are four types of hypotheses:
  • Null Hypotheses (H0) – these predict that no difference will be found in the results between the conditions. Typically these are written ‘There will be no difference…’
  • Alternative Hypotheses (Ha or H1) – these predict that there will be a significant difference in the results between the two conditions. This is also known as the experimental hypothesis.
  • One-tailed (directional) hypotheses – these state the specific direction the researcher expects the results to move in, e.g. higher, lower, more, less. In a correlation study, the predicted direction of the correlation can be either positive or negative.
  • Two-tailed (non-directional) hypotheses – these state that a difference will be found between the conditions of the independent variable but do not state the direction of the difference or relationship. Typically these are written ‘There will be a difference…’

All research has an alternative hypothesis (either a one-tailed or two-tailed) and a corresponding null hypothesis.

Once the research is conducted and results are found, psychologists must accept one hypothesis and reject the other. 

So, if a difference is found, the Psychologist would accept the alternative hypothesis and reject the null.  The opposite applies if no difference is found.

Sampling techniques

Sampling is the process of selecting a representative group from the population under study.


A sample is the participants you select from a target population (the group you are interested in) to make generalizations about.

Representative means the extent to which a sample mirrors a researcher’s target population and reflects its characteristics.

Generalisability means the extent to which their findings can be applied to the larger population of which their sample was a part.

  • Volunteer sample: where participants pick themselves through newspaper adverts, noticeboards or online.
  • Opportunity sampling: also known as convenience sampling, uses people who are available at the time the study is carried out and willing to take part. It is based on convenience.
  • Random sampling: when every person in the target population has an equal chance of being selected. An example of random sampling would be picking names out of a hat.
  • Systematic sampling: when a system is used to select participants, picking every Nth person from all possible participants. N = the number of people in the research population / the number of people needed for the sample.
  • Stratified sampling: when you identify the subgroups and select participants in proportion to their occurrences.
  • Snowball sampling: when researchers find a few participants, and then ask them to find participants themselves, and so on.
  • Quota sampling: when researchers are told to ensure the sample fits certain quotas; for example, they might be told to find 90 participants, with 30 of them being unemployed.
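Two of these techniques translate directly into code. A minimal sketch of systematic and stratified sampling over a hypothetical population (the `employed` field and the 90/9 numbers are made up for illustration):

```python
import random

def systematic_sample(population, k):
    """Systematic sampling: pick every Nth person, where N = population size / sample size."""
    n = len(population) // k
    return population[::n][:k]

def stratified_sample(population, key, k, seed=0):
    """Stratified sampling: draw from each subgroup in proportion to its occurrence."""
    rng = random.Random(seed)
    groups = {}
    for person in population:
        groups.setdefault(key(person), []).append(person)
    sample = []
    for members in groups.values():
        # Each subgroup contributes its proportional share of the sample
        share = round(k * len(members) / len(population))
        sample.extend(rng.sample(members, share))
    return sample

# Hypothetical population of 90: 60 employed, 30 unemployed
people = [{"id": i, "employed": i % 3 != 0} for i in range(90)]
print(len(systematic_sample(people, 9)))  # every 10th person -> 9 selected
chosen = stratified_sample(people, lambda p: p["employed"], 9)
print(sum(p["employed"] for p in chosen))  # 6 employed (plus 3 unemployed)
```

Note that proportional rounding can leave the stratified sample slightly over or under the target size when subgroup shares don't divide evenly; production sampling code handles that remainder explicitly.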

Experiments always have an independent and a dependent variable.

  • The independent variable is the one the experimenter manipulates (the thing that changes between the conditions the participants are placed into). It is assumed to have a direct effect on the dependent variable.
  • The dependent variable is the thing being measured, or the results of the experiment.


Operationalization of variables means making them measurable/quantifiable. We must use operationalization to ensure that variables are in a form that can be easily tested.

For instance, we can’t really measure ‘happiness’, but we can measure how many times a person smiles within a two-hour period. 

By operationalizing variables, we make it easy for someone else to replicate our research. Remember, this is important because we can check if our findings are reliable.

Extraneous variables are all variables which are not the independent variable but could affect the results of the experiment.

It can be a natural characteristic of the participant, such as intelligence levels, gender, or age for example, or it could be a situational feature of the environment such as lighting or noise.

Demand characteristics are a type of extraneous variable that occurs when participants work out the aims of the research study and begin to behave in a certain way as a result.

For example, in Milgram’s research , critics argued that participants worked out that the shocks were not real and they administered them as they thought this was what was required of them. 

Extraneous variables must be controlled so that they do not affect (confound) the results.

Randomly allocating participants to their conditions or using a matched pairs experimental design can help to reduce participant variables. 

Situational variables are controlled by using standardized procedures, ensuring every participant in a given condition is treated in the same way.

Experimental Design

Experimental design refers to how participants are allocated to each condition of the independent variable, such as a control or experimental group.
  • Independent design (between-groups design): each participant is selected for only one group. With the independent design, the most common way of deciding which participants go into which group is by means of randomization.
  • Matched participants design: each participant is selected for only one group, but the participants in the two groups are matched for some relevant factor or factors (e.g. ability, sex, age).
  • Repeated measures design (within-groups design): each participant appears in both groups, so that there are exactly the same participants in each group.
  • The main problem with the repeated measures design is that there may well be order effects. Their experiences during the experiment may change the participants in various ways.
  • They may perform better when they appear in the second group because they have gained useful information about the experiment or about the task. On the other hand, they may perform less well on the second occasion because of tiredness or boredom.
  • Counterbalancing is the best way of preventing order effects from disrupting the findings of an experiment, and involves ensuring that each condition is equally likely to be used first and second by the participants.
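Counterbalancing can be sketched in a few lines: alternate which condition order each successive participant receives, so both orders occur equally often (a toy illustration with hypothetical participant IDs and two conditions, A and B):

```python
def counterbalance(participants):
    """Assign condition orders so A-then-B and B-then-A are each used equally often."""
    orders = [("A", "B"), ("B", "A")]
    # Alternate orders across participants: even indices get AB, odd indices get BA
    return {p: orders[i % 2] for i, p in enumerate(participants)}

schedule = counterbalance(["p1", "p2", "p3", "p4"])
print(schedule["p1"], schedule["p2"])  # ('A', 'B') ('B', 'A')
```

With an even number of participants, any order effect (practice, fatigue, boredom) is spread equally across the two conditions rather than systematically favouring one.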

If we wish to compare two groups with respect to a given independent variable, it is essential to make sure that the two groups do not differ in any other important way. 

Experimental Methods

All experimental methods involve an IV (independent variable) and a DV (dependent variable).

  • Field experiments are conducted in the everyday (natural) environment of the participants. The experimenter still manipulates the IV, but in a real-life setting. It may be possible to control extraneous variables, though such control is more difficult than in a lab experiment.
  • Natural experiments are when a naturally occurring IV is investigated that isn’t deliberately manipulated, it exists anyway. Participants are not randomly allocated, and the natural event may only occur rarely.

Case studies are in-depth investigations of a person, group, event, or community. They use information from a range of sources, such as from the person concerned and also from their family and friends.

Many techniques may be used such as interviews, psychological tests, observations and experiments. Case studies are generally longitudinal: in other words, they follow the individual or group over an extended period of time. 

Case studies are widely used in psychology and among the best-known ones carried out were by Sigmund Freud . He conducted very detailed investigations into the private lives of his patients in an attempt to both understand and help them overcome their illnesses.

Case studies provide rich qualitative data and have high levels of ecological validity. However, it is difficult to generalize from individual cases as each one has unique characteristics.

Correlational Studies

Correlation means association; it is a measure of the extent to which two variables are related. One of the variables can be regarded as the predictor variable with the other one as the outcome variable.

Correlational studies typically involve obtaining two different measures from a group of participants, and then assessing the degree of association between the measures. 

The predictor variable can be seen as occurring before the outcome variable in some sense. It is called the predictor variable, because it forms the basis for predicting the value of the outcome variable.

Relationships between variables can be displayed on a graph or as a numerical score called a correlation coefficient.

Figure: scatter plots illustrating positive, negative, and no correlation.

  • If an increase in one variable tends to be associated with an increase in the other, then this is known as a positive correlation .
  • If an increase in one variable tends to be associated with a decrease in the other, then this is known as a negative correlation .
  • A zero correlation occurs when there is no relationship between variables.

After looking at the scattergraph, if we want to be sure that a significant relationship does exist between the two variables, a statistical test of correlation can be conducted, such as Spearman’s rho.

The test will give us a score, called a correlation coefficient. This is a value between -1 and +1: the closer the score is to -1 or +1, the stronger the relationship between the variables. The coefficient can be positive, e.g. 0.63, or negative, e.g. -0.63.
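Spearman's rho is simply a Pearson correlation computed on the ranks of the data. A self-contained sketch, with made-up revision-hours and exam-score data for illustration:

```python
def rank(values):
    """Rank values from 1..n (tied values share the average of their ranks)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho = Pearson correlation computed on the ranks."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: hours spent revising vs exam score
hours_revised = [2, 4, 6, 8, 10]
exam_score = [50, 55, 70, 65, 90]
print(round(spearman_rho(hours_revised, exam_score), 2))  # 0.9 (strong positive)
```

A value of 0.9 here would indicate a strong positive relationship; whether it is statistically significant still depends on the sample size and a significance test.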

Figure: charts showing strong, weak, and perfect positive correlation; strong, weak, and perfect negative correlation; and no correlation.

A correlation between variables, however, does not automatically mean that the change in one variable is the cause of the change in the values of the other variable. A correlation only shows if there is a relationship between variables.

Correlation does not prove causation, as a third variable may be involved.


Interview Methods

Interviews are commonly divided into two types: structured and unstructured.

Structured interview: a fixed, predetermined set of questions is put to every participant in the same order and in the same way.

Responses are recorded on a questionnaire, and the researcher presets the order and wording of questions, and sometimes the range of alternative answers.

The interviewer stays within their role and maintains social distance from the interviewee.

Unstructured interview: there are no set questions, and the participant can raise whatever topics he/she feels are relevant and discuss them in their own way. Follow-up questions are posed in response to the participant’s answers.

Unstructured interviews are most useful in qualitative research to analyze attitudes and values.

Though they rarely provide a valid basis for generalization, their main advantage is that they enable the researcher to probe social actors’ subjective point of view. 

Questionnaire Method

Questionnaires can be thought of as a kind of written interview. They can be carried out face to face, by telephone, or post.

The choice of questions is important because of the need to avoid bias or ambiguity in the questions, ‘leading’ the respondent or causing offense.

  • Open questions are designed to encourage a full, meaningful answer using the subject’s own knowledge and feelings. They provide insights into feelings, opinions, and understanding. Example: “How do you feel about that situation?”
  • Closed questions can be answered with a simple “yes” or “no” or specific information, limiting the depth of response. They are useful for gathering specific facts or confirming details. Example: “Do you feel anxious in crowds?”

Other practical advantages of questionnaires are that they are cheaper than face-to-face interviews and can be used to contact many respondents scattered over a wide area relatively quickly.

Observations

There are different types of observation methods:
  • Covert observation is where the researcher doesn’t tell the participants they are being observed until after the study is complete. There could be ethical problems of deception and consent with this particular observation method.
  • Overt observation is where a researcher tells the participants they are being observed and what they are being observed for.
  • Controlled: behavior is observed under controlled laboratory conditions (e.g., Bandura’s Bobo doll study).
  • Natural: here, spontaneous behavior is recorded in a natural setting.
  • Participant: here, the observer has direct contact with the group of people they are observing. The researcher becomes a member of the group they are researching.
  • Non-participant (aka “fly on the wall”): the researcher does not have direct contact with the people being observed. The observation of participants’ behavior is from a distance.

Pilot Study

A pilot study is a small-scale preliminary study conducted to evaluate the feasibility of the key steps in a future, full-scale project.

A pilot study is an initial run-through of the procedures to be used in an investigation; it involves selecting a few people and trying out the study on them. It is possible to save time, and in some cases, money, by identifying any flaws in the procedures designed by the researcher.

A pilot study can help the researcher spot any ambiguities or confusion in the information given to participants, or problems with the task devised.

Sometimes the task is too hard, and the researcher may get a floor effect: none of the participants can score or complete the task, so all performances are low.

The opposite effect is a ceiling effect, when the task is so easy that all achieve virtually full marks or top performances and are “hitting the ceiling”.
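A quick pilot analysis can flag both effects. The sketch below is illustrative: the 80% cut-off and the "near floor/ceiling" bands (bottom and top 10% of the score range) are arbitrary choices, not standard thresholds.

```python
def pilot_effect_check(scores, max_score, threshold=0.8):
    """Flag suspected floor/ceiling effects in pilot-study scores.

    A floor effect is suspected when most participants score near zero;
    a ceiling effect when most score near the maximum. The 80% threshold
    and 10% bands are illustrative choices, not standard cut-offs.
    """
    n = len(scores)
    near_floor = sum(1 for s in scores if s <= 0.1 * max_score) / n
    near_ceiling = sum(1 for s in scores if s >= 0.9 * max_score) / n
    if near_floor >= threshold:
        return "floor effect: task may be too hard"
    if near_ceiling >= threshold:
        return "ceiling effect: task may be too easy"
    return "scores show usable spread"

print(pilot_effect_check([1, 0, 2, 1, 0], max_score=20))       # most scores near 0
print(pilot_effect_check([19, 20, 18, 20, 19], max_score=20))  # most scores near max
```

Either warning would prompt the researcher to adjust the task difficulty before running the full study.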

Research Design

In cross-sectional research, a researcher compares multiple segments of the population at the same time.

Sometimes, we want to see how people change over time, as in studies of human development and lifespan. Longitudinal research is a research design in which data-gathering is administered repeatedly over an extended period of time.

In cohort studies , the participants must share a common factor or characteristic such as age, demographic, or occupation. A cohort study is a type of longitudinal study in which researchers monitor and observe a chosen population over an extended period.

Triangulation means using more than one research method to improve the study’s validity.

Reliability

Reliability is a measure of consistency: if a particular measurement is repeated and the same result is obtained, it is described as reliable.

  • Test-retest reliability : assessing the same person on two different occasions, which shows the extent to which the test produces the same answers.
  • Inter-observer reliability : the extent to which there is an agreement between two or more observers.
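Test-retest reliability is usually quantified with a correlation coefficient between the two sets of scores. A minimal sketch using Pearson's r on hypothetical scores:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two sets of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores: the same test given to five people on two occasions.
first = [12, 15, 9, 20, 14]
second = [11, 16, 10, 19, 13]
r = pearson_r(first, second)
print(f"test-retest reliability r = {r:.2f}")  # r close to +1 suggests a reliable measure
```

The same function applied to two observers' tallies of the same behavior gives a simple index of inter-observer reliability.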

Meta-Analysis

A meta-analysis is a systematic review that involves identifying an aim and then searching for research studies that have addressed similar aims/hypotheses.

This is done by looking through various databases, and then decisions are made about what studies are to be included/excluded.

Strengths: Increases the conclusions’ validity, as they are based on a wider range of studies.

Weaknesses: Research designs in studies can vary, so they are not truly comparable.
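When the included studies are comparable, a meta-analysis typically pools effect sizes by weighting each study by the inverse of its variance, so more precise studies count for more. A sketch of the fixed-effect version of this (the effect sizes and variances below are hypothetical, and random-effects models are a common alternative):

```python
def fixed_effect_pool(effects, variances):
    """Inverse-variance (fixed-effect) pooled estimate across studies.

    Each study's effect size is weighted by 1/variance, so more precise
    studies contribute more to the pooled result.
    """
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_var = 1 / sum(weights)
    return pooled, pooled_var

# Three hypothetical studies: effect sizes (Cohen's d) and their variances.
effects = [0.30, 0.55, 0.42]
variances = [0.04, 0.09, 0.02]
pooled, var = fixed_effect_pool(effects, variances)
print(f"pooled effect = {pooled:.2f}, SE = {var ** 0.5:.2f}")
```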

Peer Review

A researcher submits an article to a journal. The choice of the journal may be determined by the journal’s audience or prestige.

The journal selects two or more appropriate experts (psychologists working in a similar field) to peer review the article without payment. The peer reviewers assess the methods and design used, the originality and validity of the findings, and the article’s content, structure, and language.

Feedback from the reviewers determines whether the article is accepted. The article may be: accepted as it is, accepted with revisions, sent back to the author to revise and resubmit, or rejected without the possibility of resubmission.

The editor makes the final decision whether to accept or reject the research report based on the reviewers’ comments and recommendations.

Peer review is important because it prevents faulty data from entering the public domain, provides a way of checking the validity of findings and the quality of the methodology, and is used to assess the research rating of university departments.

Peer review may be an ideal, whereas in practice there are many problems. For example, it slows publication down and may prevent unusual, new work from being published. Some reviewers might use it as an opportunity to prevent competing researchers from publishing work.

Some people doubt whether peer review can really prevent the publication of fraudulent research.

The advent of the internet means that more research and academic comment is being published without official peer review than before, though systems are evolving online that give everyone a chance to offer their opinions and police the quality of research.

Types of Data

  • Quantitative data is numerical data, e.g. reaction time or number of mistakes. It represents how much, how long, or how many there are of something. A tally of behavioral categories and closed questions in a questionnaire collect quantitative data.
  • Qualitative data is virtually any type of information that can be observed and recorded that is not numerical in nature and can be in the form of written or verbal communication. Open questions in questionnaires and accounts from observational studies collect qualitative data.
  • Primary data is first-hand data collected for the purpose of the investigation.
  • Secondary data is information that has been collected by someone other than the person who is conducting the research e.g. taken from journals, books or articles.

Validity means how well a piece of research actually measures what it sets out to, or how well it reflects the reality it claims to represent.

Validity is whether the observed effect is genuine and represents what is actually out there in the world.

  • Concurrent validity is the extent to which a psychological measure relates to an existing similar measure and obtains close results. For example, a new intelligence test compared to an established test.
  • Face validity : whether the test appears, ‘on the face of it’, to measure what it is supposed to. This is checked by ‘eyeballing’ the measure or by passing it to an expert.
  • Ecological validity is the extent to which findings from a research study can be generalized to other settings / real life.
  • Temporal validity is the extent to which findings from a research study can be generalized to other historical times.

Features of Science

  • Paradigm – A set of shared assumptions and agreed methods within a scientific discipline.
  • Paradigm shift – The result of the scientific revolution: a significant change in the dominant unifying theory within a scientific discipline.
  • Objectivity – When all sources of personal bias are minimised so as not to distort or influence the research process.
  • Empirical method – Scientific approaches that are based on the gathering of evidence through direct observation and experience.
  • Replicability – The extent to which scientific procedures and findings can be repeated by other researchers.
  • Falsifiability – The principle that a theory cannot be considered scientific unless it admits the possibility of being proved untrue.

Statistical Testing

A significant result is one where there is a low probability that chance factors were responsible for any observed difference, correlation, or association in the variables tested.

If our test is significant, we can reject our null hypothesis and accept our alternative hypothesis.

If our test is not significant, we retain our null hypothesis and reject our alternative hypothesis. A null hypothesis is a statement of no effect.

In Psychology, we use p < 0.05 (as it strikes a balance between making a type I and II error) but p < 0.01 is used in tests that could cause harm like introducing a new drug.

A type I error is when the null hypothesis is rejected when it should have been accepted (happens when a lenient significance level is used, an error of optimism).

A type II error is when the null hypothesis is accepted when it should have been rejected (happens when a stringent significance level is used, an error of pessimism).
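These ideas can be demonstrated by simulation: when the null hypothesis is actually true, roughly 5% of tests at p < 0.05 will still come out significant, and each of those is a Type I error. A sketch using a simple permutation test on hypothetical data (the sample sizes, seeds, and number of permutations are arbitrary choices):

```python
import random

def perm_test_p(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test p-value for a difference in means."""
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    combined = list(a) + list(b)
    rng = random.Random(seed)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(combined)
        pa, pb = combined[:len(a)], combined[len(a):]
        if abs(sum(pa) / len(pa) - sum(pb) / len(pb)) >= observed:
            count += 1
    return count / n_perm

# Simulate many experiments where the null hypothesis is TRUE (both groups
# drawn from the same distribution): about 5% should still come out
# "significant" at p < 0.05 — these are Type I errors.
data_rng = random.Random(1)
n_experiments = 200
false_positives = 0
for _ in range(n_experiments):
    a = [data_rng.gauss(0, 1) for _ in range(10)]
    b = [data_rng.gauss(0, 1) for _ in range(10)]
    if perm_test_p(a, b, n_perm=200) < 0.05:
        false_positives += 1
rate = false_positives / n_experiments
print(f"Type I error rate ≈ {rate:.2f}")
```

Lowering the threshold to p < 0.01 would reduce this false-positive rate, at the cost of more Type II errors.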

Ethical Issues

  • Informed consent is when participants are able to make an informed judgment about whether to take part. However, it may cause them to guess the aims of the study and change their behavior.
  • To deal with this, we can gain presumptive consent or ask participants to formally indicate their agreement to participate, but this may invalidate the purpose of the study, and it is not guaranteed that participants fully understand.
  • Deception should only be used when it is approved by an ethics committee, as it involves deliberately misleading or withholding information. Participants should be fully debriefed after the study but debriefing can’t turn the clock back.
  • All participants should be informed at the beginning that they have the right to withdraw if they ever feel distressed or uncomfortable.
  • Withdrawal can cause bias, as those who stay tend to be more obedient, and some may not withdraw because they have been given incentives or feel they are spoiling the study. Researchers can offer the right to withdraw data after participation.
  • Participants should all have protection from harm . The researcher should avoid risks greater than those experienced in everyday life and they should stop the study if any harm is suspected. However, the harm may not be apparent at the time of the study.
  • Confidentiality concerns the communication of personal information. Researchers should not record any names but use numbers or false names, though full anonymity may not be possible, as it is sometimes possible to work out who the participants were.



  • Open access
  • Published: 01 June 2024

Biomarkers for personalised prevention of chronic diseases: a common protocol for three rapid scoping reviews

E Plans-Beriso, C Babb-de-Villiers, D Petrova, C Barahona-López, P Diez-Echave, O R Hernández, N F Fernández-Martínez, H Turner, E García-Ovejero, O Craciun, P Fernández-Navarro, N Fernández-Larrea, E García-Esquinas, V Jiménez-Planet, V Moreno, F Rodríguez-Artalejo, M J Sánchez, M Pollan-Santamaria, L Blackburn, M Kroese & B Pérez-Gómez

Systematic Reviews, volume 13, Article number: 147 (2024)


Introduction

Personalised prevention aims to delay or avoid disease occurrence, progression, and recurrence through the adoption of targeted interventions that consider individual biological characteristics (including genetic data), environmental and behavioural characteristics, as well as the socio-cultural context. This protocol summarises the main features of a rapid scoping review to show the research landscape on biomarkers, or combinations of biomarkers, that may help to better identify subgroups of individuals with different risks of developing specific diseases, in which specific preventive strategies could have an impact on clinical outcomes.

This review is part of the “Personalised Prevention Roadmap for the future HEalThcare” (PROPHET) project, which seeks to highlight the gaps in current personalised preventive approaches, in order to develop a Strategic Research and Innovation Agenda for the European Union.

To systematically map and review the evidence of biomarkers that are available or under development in cancer, cardiovascular and neurodegenerative diseases that are or can be used for personalised prevention in the general population, in clinical or public health settings.

Three rapid scoping reviews are being conducted in parallel (February–June 2023), based on a common framework with some adjustments to suit each specific condition (cancer, cardiovascular or neurodegenerative diseases). Medline and Embase will be searched to identify publications between 2020 and 2023. To shorten the time frames, 10% of the papers will undergo screening by two reviewers and only English-language papers will be considered. The following information will be extracted by two reviewers from all the publications selected for inclusion: source type, citation details, country, inclusion/exclusion criteria (population, concept, context, type of evidence source), study methods, and key findings relevant to the review question/s. The selection criteria and the extraction sheet will be pre-tested. Relevant biomarkers for risk prediction and stratification will be recorded. Results will be presented graphically using an evidence map.

Inclusion criteria

Population: general adult populations or adults from specific pre-defined high-risk subgroups; concept: all studies focusing on molecular, cellular, physiological, or imaging biomarkers used for individualised primary or secondary prevention of the diseases of interest; context: clinical or public health settings.

Systematic review registration

https://doi.org/10.17605/OSF.IO/7JRWD (OSF registration DOI).


In recent years, innovative health research has moved quickly towards a new paradigm. The ability to analyse and process previously unseen sources and amounts of data, e.g. environmental, clinical, socio-demographic, epidemiological, and ‘omics-derived data, has created opportunities in the understanding and prevention of chronic diseases, and in the development of targeted therapies that can cure them. This paradigm has come to be known as “personalised medicine”. According to the European Council Conclusion on personalised medicine for patients (2015/C 421/03), this term defines a medical model which involves characterisation of individuals’ genotypes, phenotypes and lifestyle and environmental exposures (e.g. molecular profiling, medical imaging, lifestyle and environmental data) for tailoring the right therapeutic strategy for the right person at the right time, and/or to determine the predisposition to disease and/or to deliver timely and targeted prevention [ 1 , 2 ]. In many cases, these personalised health strategies have been based on advances in fields such as molecular biology, genetic engineering, bioinformatics, diagnostic imaging and new ‘omics technologies, which have made it possible to identify biomarkers that have been used to design and adapt therapies to specific patients or groups of patients [ 2 ]. A biomarker is defined as a substance, structure, characteristic, or process that can be objectively quantified as an indicator of typical biological functions, disease processes, or biological reactions to exposure [ 3 , 4 ].

Adopting a public health perspective within this framework, one of the most relevant areas that would benefit from these new opportunities is the personalisation of disease prevention. Personalised prevention aims to delay or avoid the occurrence, progression and recurrence of disease by adopting targeted interventions that take into account biological information, environmental and behavioural characteristics, and the socio-economic and cultural context of individuals. These interventions should be timely, effective and equitable in order to maintain the best possible balance in lifetime health trajectory [ 5 ].

Among the main diseases that merit specific attention are chronic noncommunicable diseases, due to their incidence, their mortality or disability-adjusted life years [ 6 , 7 , 8 , 9 ]. Within the European Union (EU), in 2021, one-third of adults reported suffering from a chronic condition [ 10 ]. In addition, in 2019, the leading causes of mortality were cardiovascular disease (CVD) (35%), cancer (26%), respiratory disease (8%), and Alzheimer's disease (5%) [ 11 ]. For all of the above, in 2019, the PRECeDI consortium recommended the identification of biomarkers that could be used for the prevention of chronic diseases to integrate personalised medicine in the field of chronicity. This will support the goal of stratifying populations by indicating an individuals’ risk or resistance to disease and their potential response to drugs, guiding primary, secondary and tertiary preventive interventions [ 12 ]; understanding primary prevention as measures taken to prevent the occurrence of a disease before it occurs, secondary prevention as actions aimed at early detection, and tertiary prevention as interventions to prevent complications and improve quality of life in individuals already affected by a disease [ 4 ].

The “Personalised Prevention roadmap for the future HEalThcare” (PROPHET) project, funded by the European Union’s Horizon Europe research and innovation program and linked to ICPerMed, seeks to assess the effectiveness, clinical utility, and existing gaps in current personalised preventive approaches, as well as their potential to be implemented in healthcare settings. It also aims to develop a Strategy Research and Innovation Agenda (SRIA) for the European Union. This protocol corresponds to one of the first steps in the PROPHET, namely a review that aims to map the evidence and highlight the evidence gaps in research or the use of biomarkers in personalised prevention in the general adult population, as well as their integration with digital technologies, including wearable devices, accelerometers, and other appliances utilised for measuring physical and physiological functions. These biomarkers may be already available or currently under development in the fields of cancer, CVD, and neurodegenerative diseases.

There is already a significant body of knowledge about primary and secondary prevention strategies for these diseases. For example, hypercholesterolemia or dyslipidaemia, hypertension, smoking, diabetes mellitus and obesity or levels of physical activity are known risk factors for CVD [ 6 , 13 ] and neurodegenerative diseases [ 14 , 15 , 16 ]; for cancer, a summary of lifestyle preventive actions with good evidence is included in the European code against cancer [ 17 ]. The question is whether there is any biomarker or combination of biomarkers that can help to better identify subgroups of individuals with different risks of developing a particular disease, in which specific preventive strategies could have an impact on clinical outcomes. Our aim in this context is to show the available research in this field.

Given the context and time constraints, the rapid scoping review design is the most appropriate method for providing landscape knowledge [ 18 ] and provide summary maps, such as Campbell evidence and gap map [ 19 ]. Here, we present the protocol that will be used to elaborate three rapid scoping reviews and evidence maps of research on biomarkers investigated in relation to primary or secondary prevention of cancer, cardiovascular and neurodegenerative diseases, respectively. The results of these three rapid scoping reviews will contribute to inform the development of the PROPHET SRIA, which will guide the future policy for research in this field in the EU.

Review question

What biomarkers are being investigated in the context of personalised primary and secondary prevention of cancer, CVD and neurodegenerative diseases in the general adult population in clinical or public health settings?

Three rapid scoping reviews are being conducted between February and June 2023, in parallel, one for each disease group included (cancer, CVD and neurodegenerative diseases), using a common framework and specifying the adaptations to each disease group in search terms, data extraction and representation of results.

This research protocol, designed according to Joanna Briggs Institute (JBI) and Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) Checklist [ 20 , 21 , 22 ] was uploaded to the Open Science Framework for public consultation [ 23 ], with registration DOI https://doi.org/10.17605/OSF.IO/7JRWD . The protocol was also reviewed by experts in the field, after which modifications were incorporated.

Eligibility criteria

Following the PCC (population, concept and context) model [ 21 , 22 ], the included studies will meet the following eligibility criteria (Table  1 ):

Rationale for performing a rapid scoping review

As explained above, these scoping reviews are intended to be one of the first materials produced in the PROPHET project, so that they can inform the first draft of the SRIA. Therefore, according to the planned timetable, the reviews should be completed in only 4 months. Thus, following recommendations from the Cochrane Rapid Review Methods Group [ 24 ] and taking into account the large number of records expected to be assessed, according to the preliminary searches, and in order to meet these deadlines, specific restrictions were defined for the search—limited to a 3-year period (2020–2023), in English only, and using only MEDLINE and EMBASE as possible sources—and it was decided that the title-abstract and full-text screening phase would be carried out by a single reviewer, after an initial training phase with 10% of the records assessed by two reviewers to ensure concordance between team members. This percentage could be increased if necessary.

Rationale for population selection

These rapid scoping reviews are focused on the general adult population. In addition, they give attention to studies conducted among populations that present specific risk factors relevant to the selected diseases or that include these factors among those considered in the study.

For cancer, these risk (or preventive) factors include smoking [ 25 ], obesity [ 26 ], diabetes [ 27 , 28 , 29 ], Helicobacter pylori infection/colonisation [ 30 ], human papillomavirus (HPV) infection [ 30 ], human immunodeficiency virus (HIV) infection [ 30 ], alcohol consumption [ 31 ], liver cirrhosis and viral (HVB, HVC, HVD) hepatitis [ 32 ].

For CVD, we include hypercholesterolemia or dyslipidaemia, arterial hypertension, smoking, diabetes mellitus, chronic kidney disease, hyperglycaemia and obesity [ 6 , 13 ].

Risk groups for neurodegenerative diseases were defined based on the following risk factors: obesity [ 15 , 33 ], arterial hypertension [ 15 , 33 , 34 , 35 ], diabetes mellitus [ 15 , 33 , 34 , 35 ], dyslipidaemia [ 33 ], alcohol consumption [ 36 , 37 ] and smoking [ 15 , 16 , 33 , 34 ].

After the general search, only relevant and/or disease-specific subpopulations will be used for each specific disease. On the other hand, pregnancy is an exclusion criterion, as the very specific characteristics of this population group would require a specific review.

Rationale for disease selection

The search is limited to diseases with high morbidity and mortality within each of the three disease groups:

Cancer type

Due to time constraints, we only evaluate those malignant neoplasms with the greatest mortality and incidence rates in Europe, which according to the European Cancer Information System [ 38 ] are breast, prostate, colorectum, lung, bladder, pancreas, liver, stomach, kidney, and corpus uteri. Additionally, cervix uteri and liver cancers will also be included due to their preventable nature and/or the existence of public health screening programs [ 30 , 31 ].

Cardiovascular diseases

For CVD, we evaluate the following main causes of death: ischemic heart disease (49.2% of all CVD deaths), stroke (35.2%) (this includes ischemic stroke, intracerebral haemorrhage and subarachnoid haemorrhage), hypertensive heart disease (6.2%), cardiomyopathy and myocarditis (1.8%), atrial fibrillation and flutter (1.7%), rheumatic heart disease (1.6%), non-rheumatic valvular heart disease (0.9%), aortic aneurysm (0.9%), peripheral artery disease (0.4%) and endocarditis (0.4%) [ 6 ].

In this scoping review, specifically in the context of CVD, rheumatic heart disease and endocarditis are not considered because of their infectious aetiology. Arterial hypertension is a risk factor for many cardiovascular diseases and for the purposes of this review is considered as an intermediary disease that leads to CVD.

Neurodegenerative diseases

The leading noncommunicable neurodegenerative causes of death are Alzheimer’s disease or dementia (20%), Parkinson’s disease (2.5%), motor neuron diseases (0.4%) and multiple sclerosis (0.2%) [ 8 ]. Alzheimer’s disease, vascular dementia, frontotemporal dementia and Lewy body disease will be specifically searched, following the pattern of European dementia prevalence studies [ 39 ]. Additionally, because amyotrophic lateral sclerosis is the most common motor neuron disease, it is also included in the search [ 8 , 40 , 41 ].

Rationale for context

Public health and clinical settings from any geographical location are being considered. The searches will only consider the period between January 2020 and mid-February 2023 due to time constraints.

Rationale for type of evidence

Qualitative studies are not considered since they cannot answer the research question. Editorials and opinion pieces, protocols, and conference abstracts will also be excluded. Clinical practice guidelines are not included since the information they contain should be in the original studies and in reviews on which they are based.

Pilot study

We did a pilot study to test and refine the search strategies, selection criteria and data extraction sheet as well as to get used to the software—Covidence [ 42 ]. The pilot study consisted of selecting from the results of the preliminary search matrix 100 papers in order of best fit to the topic, and 100 papers at random. The team comprised 15 individual reviewers (both in the pilot and final reviews) who met daily to revise, enhance, and reach consensus on the search matrices, criteria, and data extraction sheets.

Regarding the selected databases and the platforms used, we conducted various tests, including PubMed/MEDLINE and Ovid/MEDLINE, as well as Ovid/Embase and Elsevier/Embase. Ultimately, we chose Ovid as the platform for accessing both MEDLINE and Embase, utilizing thesaurus Mesh and EmTrees. We manually translated these thesauri to ensure consistency between them. Given that the review team was spread across the UK and Spain, we centralised the search results within the UK team's access to the Ovid license to ensure consistency. Additionally, using Ovid exclusively for accessing both MEDLINE and Embase streamlined the process and allowed for easier access to preprints, which represent the latest research in this rapidly evolving field.

Identification of research

The searches are being conducted in MEDLINE via Ovid, Embase via Ovid and Embase preprints via Ovid. We also explored the feasibility of searching in CDC-Authored Genomics and Precision Health Publications Databases [ 43 ]. However, the lack of advanced tools to refine the search, as well as the unavailability of bulk downloading, prevented the inclusion of this data source. Nevertheless, a search with 15 records for each disease group showed a full overlap with MEDLINE and/or Embase.

Search strategy definition

An initial limited search of MEDLINE via PubMed and Ovid was undertaken to identify relevant papers on the topic. In this step, we identified key text words in their titles and abstracts, as well as thesaurus terms. The SR-Accelerator, Citationchaser, and Yale MeSH Analyzer tools were used to assist in the construction of the search matrix. With all this information, we developed a full search strategy adapted for each included database and information source, optimised by research librarians.

Study evidence selection

The complete search strategies are shown in Additional file 3. The three searches are being conducted in parallel. When performing the search, no limits to the type of study or setting are being applied.

Following each search, all identified citations will be collated and uploaded into Covidence (Veritas Health Innovation, Melbourne, Australia, available at www.covidence.org ) with the citation details, and duplicates will be removed.

In the title-abstract and full-text screening phase, the first 10% of the papers will be evaluated by two independent reviewers (accounting for 200 or more papers in absolute numbers in the title-abstract phase). Then, a meeting to discuss discrepancies will lead to adjusting inclusion and exclusion criteria and to acquire consistency between reviewers’ decisions. After that, the full screening of the search results will be performed by a single reviewer. Disagreements that arise between reviewers at each stage of the selection process will be resolved through discussion, or with additional reviewers. We maintain an active forum to facilitate permanent contact among reviewers.

The results of the searches and the study inclusion processes will be reported and presented in a flow diagram following the PRISMA-ScR recommendations [ 22 ].

Expert consultation

The protocol has been refined after consultation with experts in each field (cancer, CVD, and neurodegenerative diseases) who gave input on the scope of the reviews regarding the diverse biomarkers, risk factors, outcomes, and types of prevention relevant to their fields of expertise. In addition, the search strategies have been peer-reviewed by a network of librarians (PRESS-forum in pressforum.pbworks.com) who kindly provided useful feedback.

Data extraction

We have developed a draft data extraction sheet, which is included as Additional file 4, based on the JBI recommendations [ 21 ]. Data extraction will include citation details, study design, population type, biomarker information (name, type, subtype, clinical utility, use of AI technology), disease (group, specific disease), prevention (primary or secondary, lifestyle if primary prevention), and subjective reviewer observations. The data extraction for all papers will be performed by two reviewers to ensure consistency in the classification of data.

Data analysis and presentation

The descriptive information about the studies collected in the previous phase will be coded according to predefined categories to allow the elaboration of visual summary maps that can allow readers and researchers to have a quick overview of their main results. As in the previous phases, this process will be carried out with the aid of Covidence.

Therefore, a summary of the extracted data will be presented in tables as well as in static and, especially, through interactive evidence gap maps (EGM) created using EPPI-Mapper [ 44 ], an open-access web application developed in 2018 by the Evidence for Policy and Practice Information and Coordinating Centre (EPPI-Centre) and Digital Solution Foundry, in partnership with the Campbell Collaboration, which has become the standard software for producing visual evidence gap maps.

Tables and static maps will be made by using R Studio, which will also be used to clean and prepare the database for its use in EPPI-Mapper by generating two Excel files: one containing the EGM structure (i.e. what will be the columns and rows of the visual table) and coding sets, and another containing the bibliographic references and their codes that reviewers had added. Finally, we will use a Python script to produce a file in JSON format, making it ready for importation into EPPI-Reviewer.
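The exact JSON schema consumed by EPPI-Mapper is not described in this protocol, so the conversion step can only be sketched with a hypothetical structure: coded references grouped into the row (biomarker category) by column (disease) cells of the map. Field names and the example rows below are illustrative assumptions, not the project's actual format.

```python
import json

# Hypothetical extraction rows (biomarker category, disease, study design);
# the real EPPI-Mapper JSON schema differs — this only illustrates the
# "references + codes -> JSON" conversion step described in the text.
rows = [
    {"id": "ref1", "biomarker": "genomic", "disease": "cancer", "design": "cohort"},
    {"id": "ref2", "biomarker": "imaging", "disease": "CVD", "design": "case-control"},
    {"id": "ref3", "biomarker": "genomic", "disease": "cancer", "design": "RCT"},
]

def to_map_json(rows):
    """Group coded references into a row (biomarker) x column (disease) map."""
    cells = {}
    for r in rows:
        cells.setdefault((r["biomarker"], r["disease"]), []).append(
            {"id": r["id"], "design": r["design"]}
        )
    return {
        "rows": sorted({r["biomarker"] for r in rows}),
        "columns": sorted({r["disease"] for r in rows}),
        "cells": [
            {"row": k[0], "column": k[1], "references": v}
            for k, v in cells.items()
        ],
    }

print(json.dumps(to_map_json(rows), indent=2))
```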

The maps are matrices with biomarker categories/subcategories defining the rows and diseases serving as columns. These define cells, each containing small squares, one for each paper included in it. We will use a colour code to reflect the study design. There will also be a second sublevel in the columns, depending on the map. Thus, for each group of diseases, we will produce three interactive EGMs: two for primary prevention and one for secondary prevention. For primary prevention, the first map will stratify the data to show whether, and which, lifestyle has been considered in each paper in combination with the studied biomarker. The second map for primary prevention and the map for secondary prevention will include, as a second sublevel, the disease-specific subpopulations in which the biomarker has been used or evaluated (e.g. cirrhosis for liver cancer). The maps will also include filters that allow users to select records based on additional features, such as the use of artificial intelligence in the content of the papers. Furthermore, the EGM, which will be freely available online, will enable users to view and export selected bibliographic references and their abstracts. An example of these interactive maps with dummy data is provided in Additional file 5.

Finally, we will prepare two scientific reports for PROPHET. The main report, which will follow the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) recommendations, will summarise the results of the three scoping reviews, provide a general and global interpretation of the results, comment on their implications for the SRIA, and discuss the limitations of the process. The second report will present the specific methodology for the dynamic maps.

This protocol summarises the procedure to carry out three parallel rapid scoping reviews to provide an overview of the available research and gaps in the literature on biomarkers for personalised primary and secondary prevention for the three most common chronic disease groups: cancer, CVD and neurodegenerative diseases. The result will be a common report for the three scoping reviews and the online publication of interactive evidence gap maps to facilitate data visualisation.

This work will be complemented, in a further step of the PROPHET project, by a subsequent mapping report on the scientific evidence for the clinical utility of biomarkers. Both reports are part of an overall mapping effort to characterise the current knowledge and environment around personalised preventive medicine. In this context, PROPHET will also map personalised prevention research programs, as well as bottlenecks and challenges in the adoption of personalised preventive approaches or in the involvement of citizens, patients, health professionals and policy-makers in personalised prevention. The overall results will contribute to the development of the SRIA concept paper, which will help define future priorities for personalised prevention research in the European Union.

In regard to this protocol, one of its strengths is that the same approach can be applied in all three scoping reviews. This will improve the consistency and comparability of their results, allowing efforts to be better leveraged; it will also facilitate coordination among the staff conducting the different reviews and allow them to discuss the reviews together, providing the more global perspective needed for the SRIA. In addition, the collaboration of researchers with different backgrounds, the inclusion of librarians in the research team, and the specific software tools used have helped us guarantee the quality of the work and have shortened the time invested in defining the final version of this protocol. Another strength is that we conducted a pilot study to test and refine the search strategy, selection criteria and data extraction sheet. Finally, the platform used to access the bibliographic databases was selected after a prior evaluation (Ovid-MEDLINE versus PubMed MEDLINE, Ovid-Embase versus Elsevier-Embase, etc.).

Only 10% of the papers will undergo screening by two reviewers, and if time permits, we will compute kappa statistics to assess reviewer agreement during the screening phases. Additionally, ongoing communication and the exchange and discussion of uncertainties will ensure a high level of consensus in the review process.
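The planned kappa statistic for inter-reviewer agreement needs no specialist software. The sketch below implements Cohen's kappa for two reviewers' include/exclude screening decisions, using made-up decisions rather than project data:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' paired categorical decisions."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Example: 10 screening decisions (1 = include, 0 = exclude).
a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
print(round(cohens_kappa(a, b), 2))  # prints 0.6
```

Here 8 of 10 decisions agree (observed agreement 0.8) against a chance agreement of 0.5, giving kappa = 0.6.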

The main limitation of this work is the very broad field it covers: personalised prevention in all chronic diseases. We have therefore limited it to the chronic diseases with the greatest impact on the population and to the last 3 years, conducting a rapid scoping review due to time constraints, following the recommendations of the Cochrane Rapid Review Methods Group [ 24 ]. However, as our aim is to identify gaps in the literature in an area of growing interest (personalisation and prevention), we believe that the records retrieved will provide a solid foundation for evaluating the available literature. Additionally, systematic reviews, which may encompass studies predating 2020, have the potential to provide valuable insights beyond the temporal constraints of our search.

Thus, this protocol reflects the decisions set by the PROPHET timetable without losing the quality and rigour of the work. In addition, the data extraction phase will be done by two reviewers for 100% of the papers to ensure the consistency of the extracted data. Lastly, beyond these three scoping reviews, the primary challenge resides in combining their findings with those from numerous other reviews within the project, ultimately producing a cohesive concept paper for the Strategic Research and Innovation Agenda (SRIA) of the European Union, firmly rooted in evidence-based conclusions.

Council of the European Union. Council conclusions on personalised medicine for patients (2015/C 421/03). Brussels: European Union; 2015 Dec. Available at: https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:52015XG1217(01)&from=FR .

Goetz LH, Schork NJ. Personalized medicine: motivation, challenges, and progress. Fertil Steril. 2018;109(6):952–63.


FDA-NIH Biomarker Working Group. BEST (Biomarkers, EndpointS, and other Tools) Resource. Silver Spring (MD): Food and Drug Administration (US); 2016 [cited 3 February 2023]. Available at: http://www.ncbi.nlm.nih.gov/books/NBK326791/ .

Porta M, Greenland S, Hernán M, dos Santos Silva I, Last JM; International Epidemiological Association, editors. A dictionary of epidemiology. 6th ed. Oxford: Oxford University Press; 2014. p. 343.


PROPHET. Project kick-off meeting. Rome. 2022.

Roth GA, Mensah GA, Johnson CO, Addolorato G, Ammirati E, Baddour LM, et al. Global burden of cardiovascular diseases and risk factors, 1990–2019. J Am Coll Cardiol. 2020;76(25):2982–3021.


GBD 2019 Cancer Collaboration, Kocarnik JM, Compton K, Dean FE, Fu W, Gaw BL, et al. Cancer incidence, mortality, years of life lost, years lived with disability, and disability-adjusted life years for 29 cancer groups from 2010 to 2019: a systematic analysis for the global burden of disease study 2019. JAMA Oncol. 2022;8(3):420.

Feigin VL, Vos T, Nichols E, Owolabi MO, Carroll WM, Dichgans M, et al. The global burden of neurological disorders: translating evidence into policy. The Lancet Neurology. 2020;19(3):255–65.


GBD 2019 Collaborators, Nichols E, Abd‐Allah F, Abdoli A, Abosetugn AE, Abrha WA, et al. Global mortality from dementia: application of a new method and results from the Global Burden of Disease Study 2019. A&D Transl Res & Clin Interv. 2021;7(1). Available at: https://onlinelibrary.wiley.com/doi/10.1002/trc2.12200 [cited 7 February 2023].

Eurostat. Self-perceived health statistics. European health interview survey (EHIS). 2022. Available at: https://ec.europa.eu/eurostat/statistics-explained/index.php?title=Self-perceived_health_statistics [cited 7 February 2023].

OECD/European Union. Health at a Glance: Europe 2022: State of Health in the EU Cycle. Paris: OECD Publishing; 2022. Available at: https://www.oecd-ilibrary.org/social-issues-migration-health/health-at-a-glance-europe-2022_507433b0-en .

Boccia S, Pastorino R, Ricciardi W, Ádány R, Barnhoorn F, Boffetta P, et al. How to integrate personalized medicine into prevention? Recommendations from the Personalized Prevention of Chronic Diseases (PRECeDI) Consortium. Public Health Genomics. 2019;22(5–6):208–14.

Visseren FLJ, Mach F, Smulders YM, Carballo D, Koskinas KC, Bäck M, et al. 2021 ESC Guidelines on cardiovascular disease prevention in clinical practice. Eur Heart J. 2021;42(34):3227–337.

World Health Organization. Global action plan on the public health response to dementia 2017–2025. Geneva: WHO Document Production Services; 2017. p. 27.

Norton S, Matthews FE, Barnes DE, Yaffe K, Brayne C. Potential for primary prevention of Alzheimer’s disease: an analysis of population-based data. Lancet Neurol. 2014;13(8):788–94.

Mentis AFA, Dardiotis E, Efthymiou V, Chrousos GP. Non-genetic risk and protective factors and biomarkers for neurological disorders: a meta-umbrella systematic review of umbrella reviews. BMC Med. 2021;19(1):6.

Schüz J, Espina C, Villain P, Herrero R, Leon ME, Minozzi S, et al. European Code against Cancer 4th Edition: 12 ways to reduce your cancer risk. Cancer Epidemiol. 2015;39:S1-10.

Tricco AC, Langlois EV, Straus SE; Alliance for Health Policy and Systems Research, World Health Organization. Rapid reviews to strengthen health policy and systems: a practical guide. Geneva: World Health Organization; 2017. Available at: https://apps.who.int/iris/handle/10665/258698 [cited 3 February 2023].

White H, Albers B, Gaarder M, Kornør H, Littell J, Marshall Z, et al. Guidance for producing a Campbell evidence and gap map. Campbell Systematic Reviews. 2020;16(4). Available at: https://onlinelibrary.wiley.com/doi/10.1002/cl2.1125 [cited 3 February 2023].

Aromataris E, Munn Z, editors. JBI Manual for Evidence Synthesis. JBI; 2020.

Peters MDJ, Marnie C, Tricco AC, Pollock D, Munn Z, Alexander L, et al. Updated methodological guidance for the conduct of scoping reviews. JBI Evid Synth. 2020;18(10):2119–26.

Tricco AC, Lillie E, Zarin W, O’Brien KK, Colquhoun H, Levac D, et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and Explanation. Ann Intern Med. 2018;169(7):467–73.

OSF. Open Science Framework webpage. Available at: https://osf.io/ [cited 8 February 2023].

Garritty C, Gartlehner G, Nussbaumer-Streit B, King VJ, Hamel C, Kamel C, et al. Cochrane Rapid Reviews Methods Group offers evidence-informed guidance to conduct rapid reviews. J Clin Epidemiol. 2021;130:13–22.

Leon ME, Peruga A, McNeill A, Kralikova E, Guha N, Minozzi S, et al. European code against cancer, 4th edition: tobacco and cancer. Cancer Epidemiology. 2015;39:S20-33.

Anderson AS, Key TJ, Norat T, Scoccianti C, Cecchini M, Berrino F, et al. European code against cancer 4th edition: obesity, body fatness and cancer. Cancer Epidemiology. 2015;39:S34-45.

Barone BB, Yeh HC, Snyder CF, Peairs KS, Stein KB, Derr RL, et al. Long-term all-cause mortality in cancer patients with preexisting diabetes mellitus: a systematic review and meta-analysis. JAMA. 2008;300(23):2754–64.


Barone BB, Yeh HC, Snyder CF, Peairs KS, Stein KB, Derr RL, et al. Postoperative mortality in cancer patients with preexisting diabetes: systematic review and meta-analysis. Diabetes Care. 2010;33(4):931–9.

Noto H, Tsujimoto T, Sasazuki T, Noda M. Significantly increased risk of cancer in patients with diabetes mellitus: a systematic review and meta-analysis. Endocr Pract. 2011;17(4):616–28.

Villain P, Gonzalez P, Almonte M, Franceschi S, Dillner J, Anttila A, et al. European code against cancer 4th edition: infections and cancer. Cancer Epidemiology. 2015;39:S120-38.

Scoccianti C, Cecchini M, Anderson AS, Berrino F, Boutron-Ruault MC, Espina C, et al. European Code against Cancer 4th Edition: Alcohol drinking and cancer. Cancer Epidemiology. 2016;45:181–8.

El-Serag HB. Epidemiology of viral hepatitis and hepatocellular carcinoma. Gastroenterology. 2012;142(6):1264-1273.e1.

Li XY, Zhang M, Xu W, Li JQ, Cao XP, Yu JT, et al. Midlife modifiable risk factors for dementia: a systematic review and meta-analysis of 34 prospective cohort studies. CAR. 2020;16(14):1254–68.

Ford E, Greenslade N, Paudyal P, Bremner S, Smith HE, Banerjee S, et al. Predicting dementia from primary care records: a systematic review and meta-analysis. PLoS ONE. 2018;13(3):e0194735.

Xu W, Tan L, Wang HF, Jiang T, Tan MS, Tan L, et al. Meta-analysis of modifiable risk factors for Alzheimer’s disease. J Neurol Neurosurg Psychiatry. 2015;86(12):1299–306.


Guo Y, Xu W, Liu FT, Li JQ, Cao XP, Tan L, et al. Modifiable risk factors for cognitive impairment in Parkinson’s disease: A systematic review and meta-analysis of prospective cohort studies. Mov Disord. 2019;34(6):876–83.

Jiménez-Jiménez FJ, Alonso-Navarro H, García-Martín E, Agúndez JAG. Alcohol consumption and risk for Parkinson’s disease: a systematic review and meta-analysis. J Neurol. 2019;266(8):1821–34.

ECIS European Cancer Information System. Data explorer | ECIS: estimates of cancer incidence and mortality in 2020 for all cancer sites. 2023. Available at: https://ecis.jrc.ec.europa.eu/explorer.php?$0-0$1-AE27$2-All$4-2$3-All$6-0,85$5-2020,2020$7-7,8$CEstByCancer$X0_8-3$CEstRelativeCanc$X1_8-3$X1_9-AE27$CEstBySexByCancer$X2_8-3$X2_-1-1 [cited 22 February 2023].

Bacigalupo I, Mayer F, Lacorte E, Di Pucchio A, Marzolini F, Canevelli M, et al. A systematic review and meta-analysis on the prevalence of dementia in Europe: estimates from the highest-quality studies adopting the DSM IV diagnostic criteria. JAD. 2018;66(4):1471–81.

Barceló MA, Povedano M, Vázquez-Costa JF, Franquet Á, Solans M, Saez M. Estimation of the prevalence and incidence of motor neuron diseases in two Spanish regions: Catalonia and Valencia. Sci Rep. 2021;11(1):6207.

Ng L, Khan F, Young CA, Galea M. Symptomatic treatments for amyotrophic lateral sclerosis/motor neuron disease. Cochrane Database of Systematic Reviews. 2017;2017(1). Available at: http://doi.wiley.com/10.1002/14651858.CD011776.pub2 [cited 13 February 2023].

Covidence systematic review software. Melbourne, Australia: Veritas Health Innovation; 2023. Available at: https://www.covidence.org .

Centers for Disease Control and Prevention. Public Health Genomics and Precision Health Knowledge Base (v8.4). 2023. Available at: https://phgkb.cdc.gov/PHGKB/specificPHGKB.action?action=about .

Digital Solution Foundry and EPPI Centre. EPPI-Mapper. UCL Social Research Institute, University College London; 2022.


Acknowledgements

We are grateful for the library support received from Teresa Carretero (Instituto de Salud Carlos III, ISCIII) and, from Concepción Campos-Asensio (Hospital Universitario de Getafe, Comité ejecutivo BiblioMadSalud) for the seminar on the Scoping Reviews methodology and for their continuous teachings through their social networks.

Also, we would like to thank Dr. Héctor Bueno (Centro Nacional de Investigaciones Cardiovasculares (CNIC), Hospital Universitario 12 de Octubre) and Dr. Pascual Sánchez (Fundación Centro de Investigación de Enfermedades Neurológicas (CIEN)) for their advice in their fields of expertise.

The PROPHET project has received funding from the European Union’s Horizon Europe research and innovation program under grant agreement no. 101057721. UK participation in Horizon Europe Project PROPHET is supported by UKRI grant number 10040946 (Foundation for Genomics & Population Health).

Author information

Plans-Beriso E and Babb-de-Villiers C contributed equally to this work.

Kroese M and Pérez-Gómez B contributed equally to this work.

Authors and Affiliations

Department of Epidemiology of Chronic Diseases, National Centre for Epidemiology, Instituto de Salud Carlos III, Madrid, Spain

E Plans-Beriso, C Barahona-López, P Diez-Echave, O R Hernández, E García-Ovejero, O Craciun, P Fernández-Navarro, N Fernández-Larrea, E García-Esquinas, M Pollan-Santamaria & B Pérez-Gómez

CIBER of Epidemiology and Public Health (CIBERESP), Madrid, Spain

E Plans-Beriso, D Petrova, C Barahona-López, P Diez-Echave, O R Hernández, N F Fernández-Martínez, P Fernández-Navarro, N Fernández-Larrea, E García-Esquinas, V Moreno, F Rodríguez-Artalejo, M J Sánchez, M Pollan-Santamaria & B Pérez-Gómez

PHG Foundation, University of Cambridge, Cambridge, UK

C Babb-de-Villiers, H Turner, L Blackburn & M Kroese

Instituto de Investigación Biosanitaria Ibs. GRANADA, Granada, Spain

D Petrova, N F Fernández-Martínez & M J Sánchez

Escuela Andaluza de Salud Pública (EASP), Granada, Spain

Cambridge University Medical Library, Cambridge, UK

National Library of Health Sciences, Instituto de Salud Carlos III, Madrid, Spain

V Jiménez-Planet

Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), L’Hospitalet de Llobregat, Barcelona, 08908, Spain

Colorectal Cancer Group, ONCOBELL Program, Institut de Recerca Biomedica de Bellvitge (IDIBELL), L’Hospitalet de Llobregat, Barcelona, 08908, Spain

Department of Preventive Medicine and Public Health, Universidad Autónoma de Madrid, Madrid, Spain

F Rodríguez-Artalejo

IMDEA-Food Institute, CEI UAM+CSIC, Madrid, Spain


Contributions

BPG and MK supervised and directed the project. EPB and CBV coordinated and managed the development of the project. CBL, PDE, ORH, CBV and EPB developed the search strategy. All authors reviewed the content, commented on the methods, provided feedback, contributed to drafts and approved the final manuscript.

Corresponding author

Correspondence to E Plans-Beriso .

Ethics declarations

Competing interests

There are no conflicts of interest in this project.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Glossary.

Additional file 2: Glossary of biomarkers that may define high risk groups.

Additional file 3: Search strategy.

Additional file 4: Data extraction sheet.

Additional file 5: Example of interactive maps in cancer and primary prevention.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article

Plans-Beriso, E., Babb-de-Villiers, C., Petrova, D. et al. Biomarkers for personalised prevention of chronic diseases: a common protocol for three rapid scoping reviews. Syst Rev 13, 147 (2024). https://doi.org/10.1186/s13643-024-02554-9


Received : 19 October 2023

Accepted : 03 May 2024

Published : 01 June 2024

DOI : https://doi.org/10.1186/s13643-024-02554-9


  • Personalised prevention
  • Precision Medicine
  • Precision prevention
  • Cardiovascular diseases
  • Chronic diseases

Systematic Reviews

ISSN: 2046-4053




Published on 31.5.2024 in Vol 26 (2024)

Comparison of the Working Alliance in Blended Cognitive Behavioral Therapy and Treatment as Usual for Depression in Europe: Secondary Data Analysis of the E-COMPARED Randomized Controlled Trial

Authors of this article:


Original Paper

  • Asmae Doukani 1, MSc;
  • Matteo Quartagno 2, PhD;
  • Francesco Sera 3, PhD;
  • Caroline Free 1, PhD;
  • Ritsuko Kakuma 1, PhD;
  • Heleen Riper 4, PhD;
  • Annet Kleiboer 5, 6, PhD;
  • Arlinda Cerga-Pashoja 1, PhD;
  • Anneke van Schaik 4, 7, PhD;
  • Cristina Botella 8, 9, PhD;
  • Thomas Berger 10, PhD;
  • Karine Chevreul 11, 12, PhD;
  • Maria Matynia 13, PhD;
  • Tobias Krieger 10, PhD;
  • Jean-Baptiste Hazo 11, 12, PhD;
  • Stasja Draisma 14, PhD;
  • Ingrid Titzler 15, PhD;
  • Naira Topooco 16, PhD;
  • Kim Mathiasen 17, 18, PhD;
  • Kristofer Vernmark 16, PhD;
  • Antoine Urech 10, 19, PhD;
  • Anna Maj 13, PhD;
  • Gerhard Andersson 20, 21, 22, PhD;
  • Matthias Berking 23, MD, PhD;
  • Rosa María Baños 9, 24, PhD;
  • Ricardo Araya 25, PhD

1 Department of Population Health, London School of Hygiene & Tropical Medicine, London, United Kingdom

2 Medical Research Council Clinical Trials Unit, University College London, London, United Kingdom

3 Department of Statistics, Computer Science and Applications “G. Parenti”, University of Florence, Florence, Italy

4 Department of Psychiatry, Amsterdam University Medical Centre, Vrije Universiteit Amsterdam, Amsterdam, Netherlands

5 Department Clinical, Neuro, and Developmental Psychology, Vrije Universiteit Amsterdam, Amsterdam, Netherlands

6 Amsterdam Public Health Institute, Amsterdam, Netherlands

7 Academic Department for Depressive Disorders, Dutch Mental Health Care, Amsterdam, Netherlands

8 Department of Basic Psychology, Clinical and Psychobiology, Universitat Jaume I, Castellón de la Plana, Spain

9 Centro de Investigación Biomédica en Red Fisiopatología Obesidad y Nutrición, Instituto Carlos III, Madrid, Spain

10 Department of Clinical Psychology and Psychotherapy, University of Bern, Bern, Switzerland

11 Unité de Recherche Clinique in Health Economics, Assistance Publique–Hôpitaux de Paris, Paris, France

12 Health Economics Research Unit, Inserm, University of Paris, Paris, France

13 Faculty of Psychology, SWPS University, Warsaw, Poland

14 Department on Aging, Netherlands Institute of Mental Health and Addiction (Trimbos Institute), Utrecht, Netherlands

15 Department of Clinical Psychology and Psychotherapy, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany

16 Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden

17 Department of Clinical Medicine, Faculty of Health Sciences, University of Southern Denmark, Odense, Denmark

18 Centre for Digital Psychiatry, Mental Health Services of Southern Denmark, Odense, Denmark

19 Department of Neurology, Inselspital Bern, Bern University Hospital, Bern, Switzerland

20 Department of Behavioral Sciences and Learning, Linköping University, Linköping, Sweden

21 Department of Clinical Neuroscience, Karolinska Institute, Stockholm, Sweden

22 Department of Biomedical and Clinical Sciences, Linköping University, Linköping, Sweden

23 Department of Clinical Psychology and Psychotherapy, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany

24 Department of Personality, Evaluation and Psychological Treatments, Universidad de Valencia, Valencia, Spain

25 Department of Health Service and Population Research, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, United Kingdom

Corresponding Author:

Asmae Doukani, MSc

Department of Population Health

London School of Hygiene & Tropical Medicine

Keppel Street

London, WC1E 7HT

United Kingdom

Phone: 44 020 7636 8636 ext 2463

Email: [email protected]

Background: In recent years, increasing interest has centered on the psychotherapeutic working alliance as a means of understanding clinical change in digital mental health interventions. However, little is understood about how, and to what extent, a digital mental health program can have an impact on the working alliance and clinical outcomes in a blended (therapist plus digital program) cognitive behavioral therapy (bCBT) intervention for depression.

Objective: This study aimed to test the difference in working alliance scores between bCBT and treatment as usual (TAU), examine the association between working alliance and depression severity scores in both arms, and test for an interaction between system usability and working alliance with regard to the association between working alliance and depression scores in bCBT at 3-month assessments.

Methods: We conducted a secondary data analysis of the E-COMPARED (European Comparative Effectiveness Research on Blended Depression Treatment versus Treatment-as-usual) trial, which compared bCBT with TAU across 9 European countries. Data were collected in primary care and specialized services between April 2015 and December 2017. Eligible participants aged 18 years or older and diagnosed with major depressive disorder were randomized to either bCBT (n=476) or TAU (n=467). bCBT consisted of 6-20 sessions of bCBT (involving face-to-face sessions with a therapist and an internet-based program). TAU consisted of usual care for depression. The main outcomes were scores of the working alliance (Working Alliance Inventory-Short Revised–Client [WAI-SR-C]) and depressive symptoms (Patient Health Questionnaire-9 [PHQ-9]) at 3 months after randomization. Other variables included system usability scores (System Usability Scale-Client [SUS-C]) at 3 months and baseline demographic information. Data from baseline and 3-month assessments were analyzed using linear regression models that adjusted for a set of baseline variables.
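The adjusted models with a usability-by-alliance interaction can be sketched as follows. This uses simulated data rather than the E-COMPARED dataset, plain least squares in NumPy rather than the authors' modelling code, and illustrative variable names and coefficients (all assumptions, not trial values):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 300

# Simulated stand-ins for the trial variables (NOT the E-COMPARED data):
# working alliance (WAI-SR-C), system usability (SUS-C), baseline PHQ-9.
wai = rng.normal(60, 10, n)
sus = rng.normal(70, 12, n)
baseline_phq = rng.normal(15, 4, n)

# Simulate 3-month PHQ-9 with a negative alliance effect that is
# strengthened by usability (a negative, mean-centred interaction term).
wai_c, sus_c = wai - wai.mean(), sus - sus.mean()
noise = rng.normal(0, 2, n)
phq9 = 20 - 0.06 * wai - 0.005 * wai_c * sus_c + 0.4 * baseline_phq + noise

# Ordinary least squares: intercept, main effects, interaction, covariate.
X = np.column_stack([np.ones(n), wai, sus, wai_c * sus_c, baseline_phq])
coef, *_ = np.linalg.lstsq(X, phq9, rcond=None)
names = ["intercept", "wai", "sus", "wai_x_sus", "baseline_phq"]
print({k: round(float(v), 3) for k, v in zip(names, coef)})
```

A negative `wai` coefficient mirrors the alliance-depression association, and a negative `wai_x_sus` coefficient mirrors the reported moderation by usability.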

Results: Of the 945 included participants, 644 (68.2%) were female, and the mean age was 38.96 years (IQR 38). bCBT was associated with higher composite WAI-SR-C scores compared to TAU (B=5.67, 95% CI 4.48-6.86). There was an inverse association between WAI-SR-C and PHQ-9 in bCBT (B=−0.12, 95% CI −0.17 to −0.06) and TAU (B=−0.06, 95% CI −0.11 to −0.02), in which as WAI-SR-C scores increased, PHQ-9 scores decreased. Finally, there was a significant interaction between SUS-C and WAI-SR-C with regard to the inverse association between higher WAI-SR-C scores and lower PHQ-9 scores in bCBT (b=−0.030, 95% CI −0.05 to −0.01; P=.005).

Conclusions: To our knowledge, this is the first study to show that bCBT may enhance the client working alliance when compared to evidence-based routine care for depression that services reported offering. The working alliance in bCBT was also associated with clinical improvements that appear to be enhanced by good program usability. Our findings add further weight to the view that the addition of internet-delivered CBT to face-to-face CBT may positively augment experiences of the working alliance.

Trial Registration: ClinicalTrials.gov NCT02542891, https://clinicaltrials.gov/study/NCT02542891; German Clinical Trials Register DRKS00006866, https://drks.de/search/en/trial/DRKS00006866; Netherlands Trials Register NTR4962, https://www.onderzoekmetmensen.nl/en/trial/25452; ClinicalTrials.Gov NCT02389660, https://clinicaltrials.gov/study/NCT02389660; ClinicalTrials.gov NCT02361684, https://clinicaltrials.gov/study/NCT02361684; ClinicalTrials.gov NCT02449447, https://clinicaltrials.gov/study/NCT02449447; ClinicalTrials.gov NCT02410616, https://clinicaltrials.gov/study/NCT02410616; ISRCTN Registry ISRCTN12388725, https://www.isrctn.com/ISRCTN12388725?q=ISRCTN12388725&filters=&sort=&offset=1&totalResults=1&page=1&pageSize=10; ClinicalTrials.gov NCT02796573, https://classic.clinicaltrials.gov/ct2/show/NCT02796573

International Registered Report Identifier (IRRID): RR2-10.1186/s13063-016-1511-1

Introduction

Depression is one of the most significant contributors to the global disease burden, affecting an estimated 264 million people worldwide [ 1 , 2 ]. Depression accounts for 7.2% of the overall disease burden in Europe, costing an estimated €113.405 billion (US $123.038 billion) per year, yet 45% of people with major depression go untreated [ 3 ]. High costs and suboptimal access to mental health care are among the many reasons to foster digital mental health interventions (DMHIs), which promise greater quality of care and lower costs of delivery [ 4 , 5 ].

Evidence concerning the effectiveness of DMHIs has increased substantially over the past decade. Growing evidence indicates that internet-delivered cognitive behavioral therapy (iCBT) might be just as effective as face-to-face cognitive behavioral therapy (CBT) for a range of mental health conditions, particularly depression [ 6 - 13 ]. iCBT is delivered with varying degrees of support ranging from a stand-alone self-administered digital program to a blended treatment with the active involvement of a therapist through regular face-to-face meetings. Blended psychotherapies provide higher levels of therapist support compared to guided approaches that provide minimal or some guidance from a mental health practitioner [ 4 ]. Blended delivery has gained interest, with emerging evidence suggesting that such interventions can lead to improved adherence and treatment outcomes [ 14 ].

As interest in DMHIs increases, considerable attention has centered around the concept of the client-therapist alliance, of which there are many variations (therapeutic, working, helping, etc). While different therapeutic approaches have historically failed to agree on a definition of the alliance, Edward Bordin [ 15 - 17 ] proposed a pan-theoretical tripartite conceptualization called the working alliance that is characterized by 3 key dimensions, including the emotional “bond” between the client and the therapist, the agreement on the therapeutic “goals,” and the “task” needed to advance the client’s goals toward clinical improvement. This concept is particularly important because it has consistently predicted positive treatment outcomes for a range of psychological approaches, including CBT for depression [ 18 - 20 ].

The client-therapist alliance was identified as a key research priority for research policy and funding in digital technologies in mental health care, in a large consensus study involving people with lived experiences of mental health problems and service use, their carers, and mental health practitioners [ 21 ]. The integration of digital technologies in psychotherapy has led to changes in the way the alliance is conceptualized and assessed [ 19 ], with variability depending on the type of DMHI (digital program [ 22 ], avatar [ 23 ], or mobile app [ 24 ]).

The literature investigating the client-therapist alliance has largely focused on addressing 2 key questions. The first question is “Do alliance scores predict changes in clinical outcomes?” [ 21 , 25 - 29 ], and the second question, which has been focused on to a lesser extent, is “Does the alliance vary depending on how psychotherapy is delivered?” Systematic reviews that have addressed these questions specifically in relation to interventions that are guided, adopt CBT [ 21 ], or target the treatment of depression [ 27 ] found that the working alliance can be established in guided DMHIs at a comparable level to face-to-face therapy [ 21 ]; however, the literature on the outcome-alliance relationship is mixed [ 21 , 26 , 27 ].

To this end, only 3 studies have examined the working alliance in blended CBT (bCBT). The first was an uncontrolled study in Sweden, which offered 4 face-to-face and 10 iCBT sessions to a total of 73 participants in primary care services and which was part of the E-COMPARED (European Comparative Effectiveness Research on Blended Depression Treatment versus Treatment-as-usual) study [ 30 ]. The findings showed that the alliance was rated highly by both clients and therapists. However, only therapist alliance ratings were associated with client score changes in depression, while client ratings were not.

The second study was conducted in the Netherlands and recruited 102 participants from specialist care services. Participants were randomized either to bCBT (n=47), which consisted of a 20-week intervention (10 face-to-face and 10 online sessions), or to a control condition (n=45), which consisted of 15-20 face-to-face CBT sessions [ 31 ]. Similar to the findings from the study conducted in Sweden [ 30 ], the working alliance was rated highly by both clients and therapists, and no differences were observed between their scores. Client ratings of the working alliance were associated with lower depression scores over time in face-to-face CBT but not in bCBT. Therapist working alliance ratings were not significantly associated with depression scores over time in either treatment condition [ 31 ].

The third and most recent study was conducted in Denmark. The study recruited a total of 76 participants who were randomized either to bCBT (n=38), which consisted of 6 face-to-face sessions alternated with 6-8 online modules of an internet-based program, or to a control condition (n=38), which consisted of 12 face-to-face CBT sessions [ 32 ]. The findings showed a significant difference between client and therapist working alliance scores, with clients rating the working alliance higher than therapists. Across the pooled data, however, only therapist ratings were significantly associated with depression outcomes. Working alliance ratings were comparable between face-to-face CBT and bCBT, and within each treatment condition, working alliance ratings did not significantly predict treatment outcomes. It is not clear why an effect was found for therapist ratings across the pooled data but not within treatment conditions [ 32 ]; these findings might indicate that the study was insufficiently powered to detect an association for client ratings within each treatment condition.

While research has mainly focused on measuring the alliance between the client and therapist, emerging qualitative research suggests that DMHIs may offer additional relational alliance benefits [ 29 , 33 , 34 ]. An example comes from a qualitative study that examined the demands on the working alliance in a bCBT intervention for people with mild-to-moderate depression in the United Kingdom, as part of the E-COMPARED trial [ 35 ]. Qualitative data indicated a potential fourth dimension called “usability heuristics,” which appeared to uniquely promote the working alliance in bCBT. Usability heuristics refers to the digital program’s role in promoting active engagement, self-discovery, and autonomous problem-solving, with higher levels expected to enhance the quality of the working alliance. Features that enable usability heuristics include digital technologies that increase access and immediacy to the therapeutic task (availability), appropriately respond to the client’s input (interactivity), are easy to use, have esthetic appeal, and promote self-directed therapy [ 36 ]. Findings regarding usability heuristics and the respective subfeatures were also reported in another qualitative study that tested this framework in a Spanish sample of participants who experienced self-guided or low-intensity supported iCBT [ 37 ]. It is therefore possible that experiences of digital program features may influence the way that the working alliance is experienced in blended formats of CBT [ 36 ].

Aims and Objectives

To our knowledge, we report the largest investigation of the working alliance in bCBT for depression, using pooled data from 9 country sites involved in a pragmatic noninferiority randomized controlled trial investigating the effectiveness of bCBT for depression compared with treatment as usual (TAU) [ 35 ]. Further, our study explores whether system usability, a newly conceptualized feature of the working alliance in bCBT, moderates the association between the working alliance and treatment outcome [ 36 ]. Our primary objectives are to test the difference in working alliance scores between bCBT and TAU (objective 1) and to determine if working alliance scores are associated with depression scores (objective 2). Our secondary objective is to test for an interaction between system usability and the working alliance with regard to the association between the working alliance and depression scores in bCBT (objective 3).

Study Design and Settings

We conducted a nonprespecified secondary analysis of data collected in the E-COMPARED study, a large European 2-arm, noninferiority randomized controlled trial investigating the effectiveness of bCBT compared with TAU across 9 European countries (France: ClinicalTrials.gov NCT02542891, September 4, 2015; Germany: German Clinical Trials Register DRKS00006866, December 2, 2014; The Netherlands: Netherlands Trials Register NTR4962, January 5, 2015; Poland: ClinicalTrials.gov NCT02389660, February 18, 2015; Spain: ClinicalTrials.gov NCT02361684, January 8, 2015; Sweden: ClinicalTrials.gov NCT02449447, March 30, 2015; Switzerland: ClinicalTrials.gov NCT02410616, April 2, 2015; United Kingdom: ISRCTN Registry ISRCTN12388725, March 20, 2015; Denmark: ClinicalTrials.gov NCT02796573, June 1, 2016) [ 35 , 38 ]. Data were collected between April 2015 and December 2017. Clients seeking treatment for depression were recruited, assessed, and treated across routine primary care in Germany, Poland, Spain, Sweden, and the United Kingdom, and specialized mental health services in France, the Netherlands, Switzerland, and Denmark [ 35 ]. Following the start of recruitment, an additional satellite site was added in Denmark to boost recruitment [ 38 ]. The E-COMPARED trial was funded by the European Commission FP7-Health-2013-Innovation-1 program (grant agreement number: 603098).

Participants

Recruitment procedures differed in each country, but all sites screened new clients seeking help for depression, who scored 5 or higher on the Patient Health Questionnaire-9 (PHQ-9) [ 39 ]. The study was explained to potential participants either face-to-face or over a telephone call. Clients who agreed to take part in the study were invited to an initial appointment to assess eligibility. The inclusion criteria applied at all sites were as follows: age ≥18 years and meeting the diagnostic criteria for major depressive disorder as confirmed by the MINI International Neuropsychiatric Interview (M.I.N.I) version 5.0 [ 40 ]. The exclusion criteria were as follows: high risk of suicide and psychiatric comorbidity (ie, substance dependence, bipolar affective disorder, psychotic illness, or obsessive compulsive disorder) assessed during the M.I.N.I. interview; receiving psychological treatment for depression in primary or specialized mental health care at the point of recruitment; inability to comprehend the spoken and written language of the country site; lacking access to a computer or a fast internet connection (ie, broadband or comparable); and lacking a smartphone or being unwilling to carry a smartphone if one was provided by the research team [ 35 ].

After baseline assessments, participants were randomized to 1 of 2 treatment arms (bCBT or TAU) using block randomization, with stratification by country [ 35 ]. All participants provided written informed consent before taking part in the trial [ 35 ].

Ethical Considerations

The trial was conducted in accordance with the Declaration of Helsinki and was approved by all local ethics committees. Ethics approval to conduct a secondary analysis was obtained from the London School of Hygiene and Tropical Medicine Research Ethics Committee on October 7, 2019 (ethics reference number: 17852). For further information on the trial, including local ethics approvals and the randomization process, see the trial protocol [ 35 ].

Interventions: bCBT and TAU

bCBT for depression consisted of integrating a digital program (iCBT plus mobile app) with face-to-face CBT in a single treatment protocol [ 35 , 41 ]. iCBT programs included 4 mandatory core modules of CBT (ie, psychoeducation, behavioral activation, cognitive restructuring, and relapse prevention) plus optional modules (eg, physical exercise and problem solving) typically completed at home, while face-to-face CBT was delivered in the clinic [ 35 ]. Clients worked through treatment modules, completed exercises, and monitored their symptoms on the digital program, while face-to-face sessions were used by the therapist to set up modules, monitor client progress, and address client-specific needs. Sequencing and time spent on each module were flexibly applied; however, the 4 mandatory modules on the digital program had to be completed. Data on treatment and dosage were not collected for TAU in the trial. See Table 1 for a breakdown of recruitment, bCBT format and dosage, and treatments offered in TAU across all country sites [ 30 , 35 , 42 ]. It was not possible to blind therapists to treatment allocation; however, assessors were blinded [ 35 ].

a bCBT: blended cognitive behavioral therapy.

b TAU: treatment as usual.

c Sequencing of face-to-face and online sessions can include more than one session per week for either component.

d CBT: cognitive behavioral therapy.

e GP: general practitioner.

f IAPT: Improving Access to Psychological Therapies.

g NHS: National Health Service.

h Denmark was added as a satellite recruitment site [ 38 ] after the commencement of the project.

Based on the registered data, 194 therapists delivered trial interventions. In Germany, therapists only delivered bCBT in the treatment arm, whereas therapists from the remaining 8 country sites delivered interventions across both treatment arms. The risk of contamination was not perceived as a concern, as CBT was also offered in TAU, with the focus of the trial on investigating the blending of an internet-based intervention with face-to-face CBT when compared to routine care. Data on therapist ratings of the working alliance will be published in a separate paper to enable comprehensive reporting and discussion of the findings.

Diagnostic Assessment

In the E-COMPARED study [ 35 ], a diagnosis of major depression according to the Diagnostic and Statistical Manual of Mental Disorders IV (DSM-IV) was established at baseline using the M.I.N.I [ 40 ], a structured diagnostic interview that has been translated into 65 languages and is used for both clinical and research practice. The interview compares well with the Structured Clinical Interview for DSM-IV disorders [ 43 ] and the Composite International Diagnostic Interview [ 40 , 43 ]. The M.I.N.I. was also used to assess the following comorbid disorders that were part of the exclusion criteria: substance dependence, bipolar affective disorder, psychotic illness, and obsessive-compulsive disorder. The M.I.N.I was administered face-to-face or via telephone at baseline and 12-month follow-up assessments. Telephone administration of diagnostic interviews has shown good validity and reliability [ 44 , 45 ].

Primary Measures

The study outcomes were the working alliance and depression severity, which were measured using the Working Alliance Inventory-Short Revised–Client (WAI-SR-C) [ 46 ] and the PHQ-9 [ 39 ], respectively. The WAI-SR-C is based on Bordin’s theory of the working alliance [ 15 , 16 ] and contains three 4-item subscales assessing bond, task, and goals. The 12 items are rated on a 5-point scale from 1 (seldom) to 5 (always), with total scores ranging between 12 and 60. Higher scores on the scale indicate a better working alliance. The WAI-SR-C scale has demonstrated good reliability (internal consistency) for all 3 factors, including the bond, task, and goals subscales (Cronbach α=0.92, 0.92, and 0.89, respectively) [ 47 ]. The scale has been shown to correlate with other therapeutic alliance scales such as the California Therapeutic Alliance Rating System [ 19 , 48 ] and the Helping Alliance Questionnaire-II [ 19 , 49 ]. The WAI-SR-C scale was only administered at 3-month assessments. Data for the WAI-SR-C scale were not collected in the TAU arm of the Swedish country site.
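As a concrete illustration, the WAI-SR-C scoring described above (12 items rated 1-5, summed to a 12-60 composite alongside bond, task, and goals subscales) can be sketched as follows. The item-to-subscale assignment below is a placeholder for illustration only; the official WAI-SR scoring key should be used in practice.

```python
# Illustrative WAI-SR-C scorer: 12 items rated 1-5, summed to a composite
# (range 12-60) plus three subscale scores. The item-to-subscale mapping
# here is hypothetical; consult the scale's scoring key for the real one.
SUBSCALE_ITEMS = {
    "task": [1, 2, 3, 4],      # placeholder assignment
    "goals": [5, 6, 7, 8],     # placeholder assignment
    "bond": [9, 10, 11, 12],   # placeholder assignment
}

def score_wai_sr(responses: dict) -> dict:
    """Return subscale and composite scores from item responses (1-12)."""
    if set(responses) != set(range(1, 13)):
        raise ValueError("expected responses to items 1-12")
    if any(not 1 <= v <= 5 for v in responses.values()):
        raise ValueError("item ratings must be on the 1-5 scale")
    scores = {name: sum(responses[i] for i in items)
              for name, items in SUBSCALE_ITEMS.items()}
    scores["composite"] = sum(responses.values())  # 12-60
    return scores
```

Higher composite and subscale sums correspond to the "higher scores indicate a better working alliance" interpretation used throughout the paper.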

The PHQ-9 [ 39 ] was used to assess depression as the trial’s primary clinical outcome. The PHQ-9 is a 9-item scale that can be used to screen and diagnose people for depressive disorders. Each of the 9 items is scored on a 4-point scale from 0 (not at all) to 3 (nearly every day). The total score ranges between 0 and 27, with higher scores indicating greater symptom severity. Depression severity can be grouped as follows: mild (score 5-9), moderate (10-14), moderately severe (15-19), and severe (≥20) [ 39 ]. The PHQ-9 has been shown to have good psychometric properties [ 39 ] and has demonstrated its utility as a valid diagnostic tool [ 50 ]. The PHQ-9 was administered at the baseline and 3-, 6-, and 12-month assessments; however, this study only used baseline and 3-month assessment data, as the study was interested in investigating depression scores that generally corresponded to before and after treatment.
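The PHQ-9 severity grouping can be expressed as a small classifier. This is a minimal sketch using the conventional cutoff points reported for the scale [ 39 ], with scores below 5 labeled "minimal" following common usage of the instrument.

```python
def phq9_severity(total: int) -> str:
    """Map a PHQ-9 total score (0-27) to a severity band [39]."""
    if not 0 <= total <= 27:
        raise ValueError("PHQ-9 total must be between 0 and 27")
    if total <= 4:
        return "minimal"            # below the mild cutoff
    if total <= 9:
        return "mild"
    if total <= 14:
        return "moderate"
    if total <= 19:
        return "moderately severe"
    return "severe"                 # >= 20
```

For example, the sample's median baseline score of 15 falls in the moderately severe band, consistent with the "moderate severity" characterization of the cohort.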

Other Measures

System Usability Scale-Client (SUS-C) [ 51 , 52 ] was used to assess the usability of the digital programs. The SUS-C is a 10-item self-reported questionnaire. Items are measured on a 5-point scale ranging from 1 (strongly disagree) to 5 (strongly agree). The total SUS-C score is the sum of the 10 items and ranges between 10 and 50, producing a global score. Higher scores indicate better system usability. The total sum score has been found to be a valid and interpretable measure for assessing the usability of internet-based interventions by professionals in mental health care settings [ 53 ]. The SUS has shown high internal reliability (eg, coefficient Ω=0.91) and good concurrent validity and sensitivity [ 52 , 53 ]. The SUS-C was administered at the 3-month follow-up assessment.

Demographic data on the participant’s gender, age, educational attainment, marital status, and country site were collected at baseline. Baseline variables entered as covariates in the regression models included age, gender (male, female, and other), marital status (single, divorced, widowed, living together, and married), and educational level (low, middle, and high, corresponding to secondary school education or equivalent [low], college or equivalent [middle], and university degree or higher [high]).

Baseline data were completed online, face-to-face, via telephone, or a combination of these approaches. The 3-month follow-up assessments were largely completed online, with the exception of the PHQ-9 that was collected via telephone to maximize data collection of the trial’s primary outcome. Data that were directly collected by researchers (ie, either in person or via telephone) were double entered to increase the accuracy of the data entry process.

Statistical Analysis

The study used an intention-to-treat (ITT) population for the data analysis [ 54 ]. While the ITT approach is standard for RCTs, some methodologists advise that a per-protocol population is more suitable for pragmatic noninferiority trials owing to concerns that a “flawed trial” is likely to incorrectly demonstrate noninferiority (eg, a trial that loses the ability to distinguish any true differences between treatment groups that are present). However, contrary to the primary analysis in the E-COMPARED trial, noninferiority tests were not performed in our analyses. A decision was made to use a pure ITT population in order to maintain the original treatment group composition achieved after the random allocation of trial participants, therefore minimizing the confounding between the treatment groups and providing unbiased estimates of the treatment effects on the working alliance [ 54 ].

Data of the E-COMPARED trial were downloaded from a data repository. All analyses employed an ITT population. All models were adjusted for baseline PHQ-9 scores, age, gender, marital status, educational attainment, and country site. Analyses were performed on SPSS (version 26 or above) [ 55 ], STATA (version 16 or above) [ 56 ], and PROCESS Macro plug-in for SPSS (version 3.5 or above) [ 57 ]. Reported P values are 2-tailed, with significance levels at P ≤.05.

Treatment Assignment as a Predictor for WAI-SR-C Scores at 3-Month Assessments

In order to test if treatment assignment predicted WAI-SR-C scores at 3-month assessments (objective 1), a fixed effects linear regression model [ 58 ] was fitted separately for WAI-SR-C composite and subscale scores (goals, task, and bond). Four models were fitted altogether.
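As an illustration of objective 1, the sketch below fits an ordinary least squares regression of 3-month WAI-SR-C scores on treatment assignment using synthetic data and a reduced covariate set. The actual models were fitted in SPSS/STATA and also adjusted for age, gender, marital status, education, and country site; all data and effect sizes here are invented for demonstration.

```python
import numpy as np

# Synthetic illustration: does treatment assignment (1 = bCBT, 0 = TAU)
# predict 3-month working alliance scores, adjusting for baseline PHQ-9?
rng = np.random.default_rng(0)
n = 200
treatment = rng.integers(0, 2, n)            # randomized allocation
baseline_phq9 = rng.integers(5, 28, n)       # baseline depression severity
# Simulate a +4-point alliance advantage for bCBT (arbitrary for the demo).
wai = 40 + 4 * treatment - 0.2 * baseline_phq9 + rng.normal(0, 3, n)

# Design matrix: intercept, treatment dummy, baseline covariate.
X = np.column_stack([np.ones(n), treatment, baseline_phq9])
coef, *_ = np.linalg.lstsq(X, wai, rcond=None)
intercept, b_treatment, b_phq9 = coef        # b_treatment ~ simulated +4
```

In the paper, four such models were fitted: one for the WAI-SR-C composite and one per subscale (goals, task, bond).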

Association Between PHQ-9 Scores and WAI-SR-C Scores at 3-Month Assessments

To determine if WAI-SR-C scores were associated with PHQ-9 scores at 3-month assessments (objective 2), a fixed effects linear regression model was fitted to investigate this association separately for the bCBT and TAU arms in order to understand the alliance-outcome association within different treatment conditions in the trial. The model was also fitted separately for WAI-SR-C composite and subscale scores. Eight models were fitted altogether.

Testing the Interaction Between WAI-SR-C and SUS-C Scores With Regard to the Relationship Between WAI-SR-C and PHQ-9 Scores

To test the interaction between 3-month SUS-C and 3-month WAI-SR-C scores in a model examining the relationship between 3-month WAI-SR-C and 3-month PHQ-9 scores, a multiple regression model was fitted separately for WAI-SR-C composite and subscale scores in order to estimate the size of the interaction. Four models were fitted altogether.
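The moderation model for objective 3 can be sketched in the same way: a regression of PHQ-9 scores on WAI-SR-C, SUS-C, and their product term. The actual analysis used the PROCESS macro with the full covariate set; the data and coefficients below are synthetic, chosen so that the interaction term is negative, mirroring the direction of the reported effect.

```python
import numpy as np

# Synthetic moderation sketch: PHQ-9 ~ WAI + SUS + WAI*SUS.
rng = np.random.default_rng(1)
n = 300
wai = rng.uniform(12, 60, n)                 # working alliance composite
sus = rng.uniform(10, 50, n)                 # system usability
# Simulated negative interaction: the inverse alliance-depression
# association strengthens as usability increases (values arbitrary).
phq9 = 28 - 0.1 * wai - 0.05 * sus - 0.004 * wai * sus + rng.normal(0, 2, n)

X = np.column_stack([np.ones(n), wai, sus, wai * sus])
coef, *_ = np.linalg.lstsq(X, phq9, rcond=None)
b_interaction = coef[3]                      # product-term coefficient
```

A negative product-term coefficient is what the paper reports for the composite and the goals and task subscales (but not the bond subscale).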

Missing Data

Multiple imputation was used to handle high levels of missing data, under the missing at random (MAR) assumption. In particular, 36.6% (345/943) of data were missing for the PHQ-9, 20.7% (195/943) were missing for the WAI-SR-C, and 27.9% (133/476) were missing for the SUS-C at 3-month assessments. We imputed data sets using the chained equation approach [ 59 ]. Tabulations of missing data across treatment conditions and country sites are presented in Tables S1-S3 in Multimedia Appendix 1 . Chi-square results of differences in missing and complete data between E-COMPARED country sites are presented in Tables S4 and S5 in Multimedia Appendix 1 . In the imputation model, we included all variables that were part of the analyses, including observations from the PHQ-9 at baseline and demographic variables. To account for the interaction term in the regression model, data were imputed using the just another variable (JAV) approach [ 60 ]. Multiple imputation was performed separately for bCBT and TAU to allow for condition-specific variables to be considered. For example, the SUS-C variable was only entered in the bCBT arm, as those in the TAU arm did not use a digital program.
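As a simplified illustration of the imputation logic, the sketch below performs regression-based imputation for a single incomplete variable under the MAR assumption. A full chained-equations (MICE) procedure cycles this step over several incomplete variables and adds stochastic draws across multiple imputed data sets; mature implementations (eg, mice in R or Stata's mi impute chained) should be used in practice. All data here are synthetic.

```python
import numpy as np

# Toy single-variable regression imputation under MAR.
rng = np.random.default_rng(2)
n = 500
x = rng.normal(0, 1, n)                    # fully observed predictor
y = 2 * x + rng.normal(0, 1, n)            # outcome with true slope 2
y_obs = y.copy()
miss = rng.random(n) < 0.3                 # ~30% missingness in y
y_obs[miss] = np.nan

# Fit y ~ x on complete rows, then fill missing y from the fitted line.
A = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(A[~miss], y_obs[~miss], rcond=None)
y_imp = y_obs.copy()
y_imp[miss] = A[miss] @ coef               # deterministic fill (no draws)
```

Proper multiple imputation would repeat this with random residual draws to generate several completed data sets and pool the analysis results across them.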

Post hoc Analysis

Post hoc sensitivity analyses were conducted to examine if the multiple imputation approach that was used to handle missing data would lead to different conclusions when compared to a complete case analysis. Under the MAR assumption, consistent findings between the primary analysis and sensitivity analysis can strengthen the reliability of the findings [ 61 - 64 ], at least in situations where both the primary and sensitivity analyses are expected to be valid under similar assumptions (eg, multiple imputation and complete case analysis under the MAR assumption in the outcome variable only).

Owing to the heterogeneity of interventions offered in the TAU arm within the current pragmatic trial, a subgroup analysis was conducted to explore the magnitude of treatment effects on the working alliance when using a subset of the sample, which compared bCBT with face-to-face CBT offered in the TAU arm in Denmark, France, Poland, Switzerland, and the United Kingdom country sites of the E-COMPARED trial [ 35 ]. The subanalysis replicated the main analysis in just 5 country sites. This enabled the working alliance in bCBT to be directly compared with a defined comparator. Results between the primary analysis and the subanalysis were compared to understand if results vary when there are multiple interventions in TAU and when there is a defined comparator (ie, face-to-face CBT) [ 65 - 67 ].

Clinical and Demographic Characteristics

Table 2 summarizes the baseline characteristics. Among the 943 participants who consented and were randomized in the trial (bCBT=476; TAU=467) (see Figure S1 in Multimedia Appendix 1 for the trial’s profile), most were female (644/943, 68.3%), were middle-aged, and had a university degree or higher (447/943, 47.4%). The PHQ-9 scores (median 15, IQR 7) reflected depression of moderate severity. PHQ-9 scores at 3 months will be reported in the main trial paper, which is being prepared. The median WAI-SR-C score was 47.42 (IQR 6) in the bCBT arm and 42 (IQR 8) in the TAU arm. The median SUS-C score was 42 (IQR 9) in the bCBT arm. See Table 3 for the median (IQR) values of the WAI-SR-C and SUS-C scores across treatment groups, and see Tables S6-S8 in Multimedia Appendix 1 for the median (IQR) values of the WAI-SR-C and SUS-C scores by country site.

c Data were collected with respect to what would be considered low, middle, and high levels of education in each setting. Data were missing for 1 of 943 (0.2%) individuals in the bCBT arm.

d Self-reported country of birth can be found in Table S9 in Multimedia Appendix 1 .

e PHQ-9: Patient Health Questionnaire-9.

f PHQ-9 severity cutoff points are as follows: 5-9, mild depression; 10-14, moderate depression; 15-19, moderately severe depression; and ≥20, severe depression [ 39 ].

c WAI-SR-C: Working Alliance Inventory-Short Revised–Client.

d SUS-C: System Usability Scale-Client.

e N/A: not applicable.

Treatment Assignment as a Predictor for WAI-SR-C Scores

Treatment assignment significantly predicted WAI-SR-C composite, goals, task, and bond scores (See Table 4 for model summaries). Being allocated to bCBT predicted higher WAI-SR-C composite and subscale scores at 3-month assessments when compared to TAU.

a WAI-SR-C: Working Alliance Inventory-Short Revised–Client.

b Separate models were generated for WAI-SR-C composite and subscale scores (ie, goals, task, and bond).

c Unstandardized beta.

Across both treatment arms, WAI-SR-C composite scores and goals and task subscale scores were significantly associated with PHQ-9 scores, in which lower PHQ-9 scores were associated with higher WAI-SR-C composite scores and goals and task subscale scores. WAI-SR-C bond scores were not significantly associated with PHQ-9 scores in both treatment arms (see Table 5 for model summaries).

c bCBT: blended cognitive behavioral therapy.

d TAU: treatment as usual.

e Unstandardized beta.

There was a significant interaction between WAI-SR-C and SUS-C scores with regard to the association between WAI-SR-C composite scores and PHQ-9 scores at 3 months ( b =−0.008, 95% CI −0.01 to −0.00; P =.03). Similar findings were noted for the goals ( b =−0.021, 95% CI −0.04 to −0.00; P =.03) and task ( b =−0.028, 95% CI −0.05 to −0.01; P =.003) subscales but not for the bond subscale ( b =−0.010, 95% CI −0.03 to 0.01; P =.30). Figure 1 shows an inverse association between WAI-SR-C scores (composite and the goals and task subscales, but not the bond subscale) and PHQ-9 scores among those with high SUS-C scores.


Sensitivity and Subgroup Results

The sensitivity analysis with the complete case data set and the subgroup analysis of 5 country sites that only offered face-to-face CBT in the TAU arm produced results comparable to those of the main analysis. However, the interaction between SUS-C and WAI-SR-C scores (for the composite scale and all subscales) with regard to the association between WAI-SR-C and PHQ-9 scores was not significant in the sensitivity analysis. Other differences are summarized in Results S2 in Multimedia Appendix 1 , while the full results of the sensitivity and subgroup analyses can be found in Results S3 and S4 in Multimedia Appendix 1 .

Principal Findings

This study investigated the client-rated working alliance in a bCBT intervention for depression when compared to TAU [ 35 ]. Overall, our study found that treatment allocation (bCBT versus TAU) was a significant predictor of working alliance scores, in which ratings of the working alliance (composite scale and goals, task, and bond subscales) were higher in bCBT than in TAU. The working alliance was significantly associated with treatment outcomes: across both bCBT and TAU groups, as working alliance composite, goals, and task scores increased, PHQ-9 scores decreased, although no such association was found for bond scores. Finally, there was a significant interaction between system usability and working alliance (composite scale and goals and task subscales, but not bond subscale) scores in the relationship between the working alliance and PHQ-9 scores at 3-month assessments, with the inverse association evident at average and above-average usability.

To our knowledge, our study is the first to report that working alliance composite scores and all subscale scores were higher in bCBT than in TAU. A post hoc analysis using data from country sites that only offered face-to-face CBT in the TAU arm found that the working alliance was significantly higher in the bCBT arm compared to face-to-face CBT. These findings indicate that a blended approach may offer additional alliance-building benefits when compared to face-to-face CBT and other types of usual care for depression offered in TAU such as talking therapies and psychopharmacological interventions. A possible explanation for our findings is that the digital elements of the intervention may enable better definition and coverage of the goals and the task than what might be possible in face-to-face sessions alone [ 68 ]. A study exploring program usage across 4 country sites of the E-COMPARED study found that clients received an average of 10 messages from their therapists online [ 69 ]. Features of the digital program that enabled the client to receive contact from the therapist away from the clinic may therefore play a role in increasing the availability of the therapist and enhancing opportunities to further strengthen the working alliance [ 69 ].

Further support for our findings comes from a qualitative study that examined the working alliance in bCBT in the United Kingdom country site of the E-COMPARED trial [ 36 ], which found that participants preferred bCBT compared to face-to-face CBT alone. The “immediacy” of access to the therapeutic task was reported to enhance engagement with the intervention and provide a higher sense of control and independence. The digital program was also described as a “secure base” that allowed participants to progressively explore self-directed treatment [ 36 ]. Similarly, a qualitative study from the German country site of the E-COMPARED trial found that bCBT was perceived to strengthen patient self-management and autonomy in relation to place and location [ 70 ].

Our study appears to be the first to identify a significant association between lower depression scores and higher working alliance composite scores and goals and task subscale scores but not bond subscale scores. In alignment with our findings, a narrative review of the working alliance in online therapy found that most guided iCBT studies included in the review reported significant associations between outcomes and the task and goals subscale scores but not the bond subscale scores [ 26 ]. A possible explanation could be that the bond is experienced differently in bCBT compared to traditional formats of CBT [ 26 ]. Bordin’s [ 15 , 16 ] conceptualization of the working alliance suggests that while the pan-theoretical theory allows for the basic measurement of the goals, task, and bond to produce beneficial therapeutic change, the ideal alliance profile is likely to be different across therapeutic approaches and interventions [ 15 , 16 , 18 ]. The findings may therefore indicate that the working alliance profile might differ in bCBT. However, further research is needed to investigate this.

Finally, our finding that average and higher system usability ratings may strengthen the working alliance (especially the task subscale) may point to the digital program’s influence on how the working alliance is experienced. This is not surprising given that CBT activities (eg, content and exercises) were primarily completed in the iCBT program, and it may indicate the program’s relevance in building the working alliance and supporting the task, potentially in parallel to the bond. These findings partially test and support a conceptual framework of the working alliance that incorporates features derived from the digital program within a blended setting, called “usability heuristics” (the promotion of active engagement and autonomous problem solving), in which “ease of use” and “interactivity” were identified as key features for optimizing “active engagement” with the task in the iCBT program [ 36 ]. These qualitative findings were mirrored in another study that tested the abovementioned framework, in which usability heuristics emerged as a fourth dimension when examining the working alliance in self-guided and low-intensity supported iCBT for depression [ 37 ]. High and low iCBT program functionalities were also identified by therapists as facilitators and barriers in building the working alliance in bCBT in the German and UK country sites of the E-COMPARED trial [ 36 , 70 - 72 ]. Although our findings remain preliminary and do not show a causal effect, further investigation concerning the effect of the digital program on the working alliance may be a fruitful direction for future research.

Collectively, our findings suggest that blending face-to-face CBT with an iCBT program may enhance the working alliance and treatment outcomes for depression. These findings hold important implications for clinical practice, especially following the COVID-19 pandemic, which resulted in major shifts from in-person care to blended health care provision. The findings of this study suggest that a blended approach may enhance rather than worsen mental health care. Our study’s findings regarding the interaction between system usability and the working alliance in terms of treatment outcomes represent a preliminary step to quantitatively understand the influence of the digital program and its role in how the working alliance is experienced. While further research is required to explore digital taxonomies that contribute toward fostering the working alliance in bCBT, our findings build on previous qualitative research [ 29 , 34 , 36 , 68 ] to explore a conceptualization of the working alliance that goes beyond the client and the therapist in order to consider the role of the digital program. The impact of the digital program on the working alliance may support the case for employing digital navigators who can help clients use the intervention, troubleshoot technology and program usability issues, and remove the added burden of managing program-related problems that would otherwise fall on the therapist [ 70 , 72 , 73 ].

We propose 4 directions for future research. First, future research is required to build a comprehensive understanding of what, how, and when digital features (eg, usage, interface, interactivity, and accessibility) influence the working alliance [ 36 ]. Second, psychometric scales measuring the working alliance in bCBT should be adapted or developed to conceptually reflect a construct that also incorporates the client-program working alliance [ 42 ]. Third, the working alliance should be investigated early in the intervention and across multiple stages of treatment [ 74 ]. Fourth, future research should investigate if our results can be replicated across different DMHIs and treatment dosages.

Limitations

Several study limitations should be noted. First, working alliance data were collected at a single point that corresponded with 3-month assessments. While this is common in clinical trials [ 25 , 58 ], measurement of the alliance is recommended early in treatment, within the first 5 sessions, and at different points across treatment [ 74 - 77 ]. However, the number of face-to-face sessions varied between the 9 country sites (eg, 5 to 10 sessions), which would have posed significant challenges for the systematic data collection required in a clinical trial [ 54 ]. Second, the study engaged in multiple comparisons, which may have increased the risk of type 1 error (a positive result may be due to chance). However, given the exploratory nature of this analysis and the fact that the different outcomes are likely to be highly correlated, a multiple comparison adjustment was not deemed necessary [ 78 ]. Third, the results of the analysis are valid under the MAR assumption, which we believe to be plausible because the effect of country sites appears to influence the missingness of the main outcome variables, stemming from country-specific data collection procedures and experiences. This is supported by chi-square analyses that indicate significantly higher rates of missing data for the PHQ-9 and WAI-SR-C in some countries compared to others. Nevertheless, it should be noted that this paper cannot rule out that data are missing not at random. Future research can explore this further using a sensitivity analysis. Fourth, the heterogeneity of interventions offered in the TAU group prevents the study from conclusively tying causation to a specific comparator intervention. However, it should be noted that the interventions offered by services in TAU were regarded as evidence-based, largely consisting of CBT and psychopharmacological interventions [ 35 ].
This may reduce the limitations associated with the multiple treatments offered in TAU [ 66 , 79 ], while adhering to the pragmatic trial's ancillary objective of not imposing specific constraints on clients and clinicians concerning data collection [ 79 ]. Additional steps were also taken to address this limitation by conducting a subanalysis with the subset of trial country sites that offered only face-to-face CBT in TAU. The findings were comparable to those of the main analysis, highlighting that the addition of iCBT to face-to-face CBT may improve the quality of the working alliance. Fifth, bCBT delivery varied across the trial's country sites in both the number of sessions and the types of iCBT programs offered. However, the study focused on investigating the noninferiority of blended CBT, given the sufficient level of evidence concerning its key treatment components, such as the CBT approach, and different delivery formats, including in-person and internet-based delivery of CBT for depression [ 80 , 81 ]. Although the number of treatment sessions varied between settings, to our knowledge there is no evidence that the number of CBT sessions affects the client-therapist alliance, as the alliance is typically developed early in treatment, within the first 5 sessions [ 74 - 77 ]. Moreover, a study exploring the usage of different components of bCBT and treatment engagement relative to intended use in the E-COMPARED study concluded that personalized blended care was more suitable than attempting to achieve a standardized optimal blend [ 69 ]. These variations in the number of treatment sessions may enable a pragmatic understanding of the working alliance in bCBT interventions in real-world clinical settings [ 66 ].
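The multiple-comparisons point raised in these limitations can be illustrated with a short calculation. This is a sketch only, assuming fully independent tests; the trial's outcomes are correlated, so the true inflation of the type 1 error rate would be smaller, which is one reason the authors give for not adjusting [ 78 ]:

```python
# Sketch: family-wise type 1 error risk across k tests at alpha = 0.05,
# under the (simplifying) assumption that the tests are independent.
def familywise_error_rate(k: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive among k independent tests."""
    return 1 - (1 - alpha) ** k

for k in (1, 3, 5, 10):
    print(f"{k} tests -> {familywise_error_rate(k):.3f}")
```

With 10 independent tests at alpha = .05, the chance of at least one spurious positive rises to roughly 40%, which is why correlated outcomes and an exploratory framing matter for interpreting the adjustment decision.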

Conclusions

To our knowledge, this is the first study to show that bCBT may enhance the working alliance when compared to routine care for depression and when compared to face-to-face CBT. The working alliance in bCBT was also associated with clinical improvements in depression, which appear to be enhanced by good program usability. Collectively, our findings appear to add further weight to the view that the addition of iCBT to face-to-face CBT may positively augment experiences of the working alliance.

Authors' Contributions

AD had full access to all of the data and takes full responsibility for the integrity of the data and the accuracy of the data analysis. AD, MQ, FS, RA, and CF contributed to the design concept. AD drafted the manuscript. AD, RA, MQ, FS, CF, RK, HR, AK, ACP, AvS, CB, TB, KC, MM, TK, JBH, SD, IT, NT, KM, KV, AU, GA, MB, and RMB critically revised the manuscript for important intellectual content. AD, MQ, RA, HR, FS, AK, RK, CF, ACP, and SD contributed to data acquisition, analysis, and interpretation. AD contributed to statistical analysis. HR, SD, AK, MB, and ACP provided administrative, technical, or material support. AD, RA, MQ, FS, RK, and CF supervised the study.

Conflicts of Interest

None declared.

Supplementary methods and results, including information on participants' country of birth, missing data, medians and IQRs for the working alliance, participant characteristics, depression and system usability scores, the trial profile, and results of the sensitivity and subgroup analyses.

CONSORT-eHEALTH checklist (V.1.6.1).

  • GBD 2017 Disease and Injury Incidence and Prevalence Collaborators. Global, regional, and national incidence, prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories, 1990-2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet. Nov 10, 2018;392(10159):1789-1858. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Fact Sheet: Suicide. World Health Organization. URL: https://www.who.int/news-room/fact-sheets/detail/suicide [accessed 2024-03-24]
  • Kohn R, Saxena S, Levav I, Saraceno B. The treatment gap in mental health care. Bull World Health Organ. Nov 2004;82(11):858-866. [ FREE Full text ] [ Medline ]
  • Fairburn CG, Patel V. The impact of digital technology on psychological treatments and their dissemination. Behav Res Ther. Jan 2017;88:19-25. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Torous J, Jän Myrick K, Rauseo-Ricupero N, Firth J. Digital Mental Health and COVID-19: Using Technology Today to Accelerate the Curve on Access and Quality Tomorrow. JMIR Ment Health. Mar 26, 2020;7(3):e18848. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Kaltenthaler E, Parry G, Beverley C. Computerized Cognitive Behaviour Therapy: A Systematic Review. Behav. Cogn. Psychother. Feb 18, 2004;32(1):31-55. [ CrossRef ]
  • Ruwaard J, Lange A, Schrieken B, Emmelkamp P. Efficacy and effectiveness of online cognitive behavioral treatment: a decade of interapy research. Stud Health Technol Inform. 2011;167:9-14. [ Medline ]
  • Foroushani P, Schneider J, Assareh N. Meta-review of the effectiveness of computerised CBT in treating depression. BMC Psychiatry. Aug 12, 2011;11(1):131. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Ivarsson D, Blom M, Hesser H, Carlbring P, Enderby P, Nordberg R, et al. Guided internet-delivered cognitive behavior therapy for post-traumatic stress disorder: A randomized controlled trial. Internet Interventions. Mar 2014;1(1):33-40. [ FREE Full text ] [ CrossRef ]
  • Cuijpers P, Donker T, Johansson R, Mohr DC, van Straten A, Andersson G. Self-guided psychological treatment for depressive symptoms: a meta-analysis. PLoS One. Jun 21, 2011;6(6):e21274. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Karyotaki E, Ebert DD, Donkin L, Riper H, Twisk J, Burger S, et al. Do guided internet-based interventions result in clinically relevant changes for patients with depression? An individual participant data meta-analysis. Clin Psychol Rev. Jul 2018;63:80-92. [ CrossRef ] [ Medline ]
  • Andrews G, Basu A, Cuijpers P, Craske M, McEvoy P, English C, et al. Computer therapy for the anxiety and depression disorders is effective, acceptable and practical health care: An updated meta-analysis. J Anxiety Disord. Apr 2018;55:70-78. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Josephine K, Josefine L, Philipp D, David E, Harald B. Internet- and mobile-based depression interventions for people with diagnosed depression: A systematic review and meta-analysis. J Affect Disord. Dec 01, 2017;223:28-40. [ CrossRef ] [ Medline ]
  • Erbe D, Eichert H, Riper H, Ebert D. Blending Face-to-Face and Internet-Based Interventions for the Treatment of Mental Disorders in Adults: Systematic Review. J Med Internet Res. Sep 15, 2017;19(9):e306. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Bordin ES. The generalizability of the psychoanalytic concept of the working alliance. Psychotherapy: Theory, Research & Practice. 1979;16(3):252-260. [ CrossRef ]
  • Bordin ES. Theory and research on the therapeutic working alliance: New directions. In: Horvath AO, Greenberg LS, editors. The working alliance: Theory, research, and practice. New York, NY. John Wiley & Sons; 1994:13-37.
  • Raue P, Goldfried M. The therapeutic alliance in cognitive-behavior therapy. In: Horvath AO, Greenberg LS, editors. The working alliance: Theory, research, and practice. New York, NY. John Wiley & Sons; 1994:131-152.
  • Lambert MJ. Psychotherapy outcome research: Implications for integrative and eclectical therapists. In: Norcross JC, Goldfried MR, editors. Handbook of psychotherapy integration. New York, NY. Basic Books; 1992:94-129.
  • Norcross JC, Lambert MJ. Psychotherapy relationships that work II. Psychotherapy (Chic). Mar 2011;48(1):4-8. [ CrossRef ] [ Medline ]
  • Cameron S, Rodgers J, Dagnan D. The relationship between the therapeutic alliance and clinical outcomes in cognitive behaviour therapy for adults with depression: A meta-analytic review. Clin Psychol Psychother. May 2018;25(3):446-456. [ CrossRef ] [ Medline ]
  • Pihlaja S, Stenberg J, Joutsenniemi K, Mehik H, Ritola V, Joffe G. Therapeutic alliance in guided internet therapy programs for depression and anxiety disorders - A systematic review. Internet Interv. Mar 2018;11:1-10. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Gómez Penedo J, Berger T, Grosse Holtforth M, Krieger T, Schröder J, Hohagen F, et al. The Working Alliance Inventory for guided Internet interventions (WAI-I). J Clin Psychol. Jun 2020;76(6):973-986. [ CrossRef ] [ Medline ]
  • Heim E, Rötger A, Lorenz N, Maercker A. Working alliance with an avatar: How far can we go with internet interventions? Internet Interv. Mar 1, 2018;11:41-46. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Henson P, Wisniewski H, Hollis C, Keshavan M, Torous J. Digital mental health apps and the therapeutic alliance: initial review. BJPsych Open. Jan 2019;5(1):e15. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Sucala M, Schnur JB, Constantino MJ, Miller SJ, Brackman EH, Montgomery GH. The therapeutic relationship in e-therapy for mental health: a systematic review. J Med Internet Res. Aug 02, 2012;14(4):e110. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Berger T. The therapeutic alliance in internet interventions: A narrative review and suggestions for future research. Psychother Res. Sep 2017;27(5):511-524. [ CrossRef ] [ Medline ]
  • Wehmann E, Köhnen M, Härter M, Liebherz S. Therapeutic Alliance in Technology-Based Interventions for the Treatment of Depression: Systematic Review. J Med Internet Res. Jun 11, 2020;22(6):e17195. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Hayati R, Bastani P, Kabir M, Kavosi Z, Sobhani G. Scoping literature review on the basic health benefit package and its determinant criteria. Global Health. Mar 02, 2018;14(1):26. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Tremain H, McEnery C, Fletcher K, Murray G. The Therapeutic Alliance in Digital Mental Health Interventions for Serious Mental Illnesses: Narrative Review. JMIR Ment Health. Aug 07, 2020;7(8):e17204. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Vernmark K, Hesser H, Topooco N, Berger T, Riper H, Luuk L, et al. Working alliance as a predictor of change in depression during blended cognitive behaviour therapy. Cogn Behav Ther. Jul 2019;48(4):285-299. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Kooistra L, Ruwaard J, Wiersma J, van Oppen P, Riper H. Working Alliance in Blended Versus Face-to-Face Cognitive Behavioral Treatment for Patients with Depression in Specialized Mental Health Care. J Clin Med. Jan 27, 2020;9(2):347. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Askjer S, Mathiasen K. The working alliance in blended versus face-to-face cognitive therapy for depression: A secondary analysis of a randomized controlled trial. Internet Interv. Sep 2021;25:100404. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Barazzone N, Cavanagh K, Richards D. Computerized cognitive behavioural therapy and the therapeutic alliance: a qualitative enquiry. Br J Clin Psychol. Nov 2012;51(4):396-417. [ CrossRef ] [ Medline ]
  • Clarke J, Proudfoot J, Whitton A, Birch M, Boyd M, Parker G, et al. Therapeutic Alliance With a Fully Automated Mobile Phone and Web-Based Intervention: Secondary Analysis of a Randomized Controlled Trial. JMIR Ment Health. Feb 25, 2016;3(1):e10. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Kleiboer A, Smit J, Bosmans J, Ruwaard J, Andersson G, Topooco N, et al. European COMPARative Effectiveness research on blended Depression treatment versus treatment-as-usual (E-COMPARED): study protocol for a randomized controlled, non-inferiority trial in eight European countries. Trials. Aug 03, 2016;17(1):387. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Doukani A, Free C, Michelson D, Araya R, Montero-Marin J, Smith S, et al. Towards a conceptual framework of the working alliance in a blended low-intensity cognitive behavioural therapy intervention for depression in primary mental health care: a qualitative study. BMJ Open. Sep 23, 2020;10(9):e036299. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Barceló-Soler A, García-Campayo J, Araya R, Doukani A, Gili M, García-Palacios A, et al. Working alliance in low-intensity internet-based cognitive behavioral therapy for depression in primary care in Spain: A qualitative study. Front Psychol. 2023;14:1024966. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Mathiasen K, Andersen TE, Riper H, Kleiboer AAM, Roessler KK. Blended CBT versus face-to-face CBT: a randomised non-inferiority trial. BMC Psychiatry. Dec 05, 2016;16(1):432. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Kroenke K, Spitzer RL, Williams JBW. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. Sep 2001;16(9):606-613. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Lecrubier Y, Sheehan D, Weiller E, Amorim P, Bonora I, Sheehan KH, et al. The Mini International Neuropsychiatric Interview (MINI). A short diagnostic structured interview: reliability and validity according to the CIDI. Eur. psychiatr. Apr 16, 2020;12(5):224-231. [ CrossRef ]
  • van der Vaart R, Witting M, Riper H, Kooistra L, Bohlmeijer E, van Gemert-Pijnen L. Blending online therapy into regular face-to-face therapy for depression: content, ratio and preconditions according to patients and therapists using a Delphi study. BMC Psychiatry. Dec 14, 2014;14:355. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Herrero R, Vara M, Miragall M, Botella C, García-Palacios A, Riper H, et al. Working Alliance Inventory for Online Interventions-Short Form (WAI-TECH-SF): The Role of the Therapeutic Alliance between Patient and Online Program in Therapeutic Outcomes. Int J Environ Res Public Health. Aug 25, 2020;17(17):6169. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Sheehan D, Lecrubier Y, Harnett Sheehan K, Janavs J, Weiller E, Keskiner A, et al. The validity of the Mini International Neuropsychiatric Interview (MINI) according to the SCID-P and its reliability. European Psychiatry. 1997;12(5):232-241. [ FREE Full text ] [ CrossRef ]
  • Rohde P, Lewinsohn PM, Seeley JR. Comparability of telephone and face-to-face interviews in assessing axis I and II disorders. Am J Psychiatry. Nov 1997;154(11):1593-1598. [ CrossRef ] [ Medline ]
  • Ruskin PE, Reed S, Kumar R, Kling MA, Siegel E, Rosen M, et al. Reliability and acceptability of psychiatric diagnosis via telecommunication and audiovisual technology. Psychiatr Serv. Aug 1998;49(8):1086-1088. [ CrossRef ] [ Medline ]
  • Horvath AO, Greenberg LS. Development and validation of the Working Alliance Inventory. Journal of Counseling Psychology. 1989;36(2):223-233. [ CrossRef ]
  • Cahill J, Barkham M, Hardy G, Gilbody S, Richards D, Bower P, et al. A review and critical appraisal of measures of therapist-patient interactions in mental health settings. Health Technol Assess. Jun 2008;12(24):iii, ix-iii, 47. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Fenton L, Cecero J, Nich C, Frankforter T, Carroll K. Perspective is everything: the predictive validity of six working alliance instruments. J Psychother Pract Res. 2001;10(4):262-268. [ FREE Full text ] [ Medline ]
  • Luborsky L, Barber J, Siqueland L, Johnson S, Najavits L, Frank A, et al. The Revised Helping Alliance Questionnaire (HAq-II) : Psychometric Properties. J Psychother Pract Res. 1996;5(3):260-271. [ FREE Full text ] [ Medline ]
  • Wittkampf KA, Naeije L, Schene AH, Huyser J, van Weert HC. Diagnostic accuracy of the mood module of the Patient Health Questionnaire: a systematic review. Gen Hosp Psychiatry. Sep 2007;29(5):388-395. [ CrossRef ] [ Medline ]
  • Brooke J. SUS: A 'Quick and Dirty' Usability Scale. In: Jordan PW, Thomas B, McClelland IL, Weerdmeester B, editors. Usability Evaluation In Industry. London, UK. CRC Press; 1996.
  • Bangor A, Kortum PT, Miller JT. An Empirical Evaluation of the System Usability Scale. International Journal of Human-Computer Interaction. Jul 30, 2008;24(6):574-594. [ CrossRef ]
  • Mol M, van Schaik A, Dozeman E, Ruwaard J, Vis C, Ebert D, et al. Dimensionality of the system usability scale among professionals using internet-based interventions for depression: a confirmatory factor analysis. BMC Psychiatry. May 12, 2020;20(1):218. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Ranganathan P, Pramesh C, Aggarwal R. Common pitfalls in statistical analysis: Intention-to-treat versus per-protocol analysis. Perspect Clin Res. 2016;7(3):144-146. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • IBM SPSS Statistics 26. IBM Corp. URL: https://www.ibm.com/support/pages/ibm-spss-statistics-26-documentation [accessed 2024-03-24]
  • Stata Statistical Software: Release 16. Stata Corp. URL: https://www.scirp.org/reference/referencespapers?referenceid=2757660 [accessed 2024-03-24]
  • Hayes AF. Introduction to Mediation, Moderation, and Conditional Process Analysis (Second Edition): A Regression-Based Approach. New York, NY. Guilford Press; 2017.
  • Kirkwood BR, Stern JAC. Essential Medical Statistics, 2nd Edition. Hoboken, NJ. Wiley; 2003.
  • Carpenter JR, Kenward MG. Multiple Imputation and its Application. New York, NY. John Wiley & Sons; 2013.
  • von Hippel PT. 8. How to Impute Interactions, Squares, and other Transformed Variables. Sociological Methodology. Aug 01, 2009;39(1):265-291. [ FREE Full text ] [ CrossRef ]
  • de Souza R, Eisen R, Perera S, Bantoto B, Bawor M, Dennis B, et al. Best (but oft-forgotten) practices: sensitivity analyses in randomized controlled trials. Am J Clin Nutr. Jan 2016;103(1):5-17. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Thabane L, Mbuagbaw L, Zhang S, Samaan Z, Marcucci M, Ye C, et al. A tutorial on sensitivity analyses in clinical trials: the what, why, when and how. BMC Med Res Methodol. Jul 16, 2013;13:92. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Parpia S, Morris T, Phillips M, Wykoff C, Steel D, Thabane L, et al. Retina Evidence Trials InterNational Alliance (R.E.T.I.N.A.) Study Group. Sensitivity analysis in clinical trials: three criteria for a valid sensitivity analysis. Eye (Lond). Nov 2022;36(11):2073-2074. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Morris TP, Kahan BC, White IR. Choosing sensitivity analyses for randomised trials: principles. BMC Med Res Methodol. Jan 24, 2014;14(1):11. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Burke J, Sussman J, Kent D, Hayward R. Three simple rules to ensure reasonably credible subgroup analyses. BMJ. Nov 04, 2015;351:h5651. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Dawson L, Zarin D, Emanuel E, Friedman L, Chaudhari B, Goodman S. Considering usual medical care in clinical trial design. PLoS Med. Sep 2009;6(9):e1000111. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Farrokhyar F, Skorzewski P, Phillips M, Garg S, Sarraf D, Thabane L, et al. Retina Evidence Trials InterNational Alliance (R.E.T.I.N.A.) Study Group. When to believe a subgroup analysis: revisiting the 11 criteria. Eye (Lond). Nov 2022;36(11):2075-2077. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Doukani A, Free C, Araya R, Michelson D, Cerga-Pashoja A, Kakuma BR. Practitioners' experience of the working alliance in a blended cognitive-behavioural therapy intervention for depression: qualitative study of barriers and facilitators. BJPsych Open. Jul 25, 2022;8(4):e142. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Kemmeren LL, van Schaik A, Smit JH, Ruwaard J, Rocha A, Henriques M, et al. Unraveling the Black Box: Exploring Usage Patterns of a Blended Treatment for Depression in a Multicenter Study. JMIR Ment Health. Jul 25, 2019;6(7):e12707. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Titzler I, Saruhanjan K, Berking M, Riper H, Ebert D. Barriers and facilitators for the implementation of blended psychotherapy for depression: A qualitative pilot study of therapists' perspective. Internet Interv. Jun 2018;12:150-164. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Titzler I, Berking M, Schlicker S, Riper H, Ebert D. Barriers and Facilitators for Referrals of Primary Care Patients to Blended Internet-Based Psychotherapy for Depression: Mixed Methods Study of General Practitioners' Views. JMIR Ment Health. Aug 18, 2020;7(8):e18642. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Cerga-Pashoja A, Doukani A, Gega L, Walke J, Araya R. Added value or added burden? A qualitative investigation of blending internet self-help with face-to-face cognitive behaviour therapy for depression. Psychother Res. Nov 05, 2020;30(8):998-1010. [ CrossRef ] [ Medline ]
  • Wisniewski H, Torous J. Digital navigators to implement smartphone and digital tools in care. Acta Psychiatr Scand. Apr 2020;141(4):350-355. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Crits-Christoph P, Gibbons MBC, Hamilton J, Ring-Kurtz S, Gallop R. The dependability of alliance assessments: the alliance-outcome correlation is larger than you might think. J Consult Clin Psychol. Jun 2011;79(3):267-278. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Piper W, Azim H, Joyce A, McCallum M. Transference interpretations, therapeutic alliance, and outcome in short-term individual psychotherapy. Arch Gen Psychiatry. Oct 1991;48(10):946-953. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Eames V, Roth A. Patient attachment orientation and the early working alliance-a study of patient and therapist reports of alliance quality and ruptures. Psychother Res. Dec 23, 2000;10(4):421-434. [ CrossRef ] [ Medline ]
  • Norcross JC, Wampold BE. Evidence-based therapy relationships: research conclusions and clinical practices. Psychotherapy (Chic). Mar 2011;48(1):98-102. [ CrossRef ] [ Medline ]
  • Rothman K. No Adjustments Are Needed for Multiple Comparisons. Epidemiology. 1990;1(1):43-46. [ FREE Full text ] [ CrossRef ]
  • Giraudeau B, Caille A, Eldridge S, Weijer C, Zwarenstein M, Taljaard M. Heterogeneity in pragmatic randomised trials: sources and management. BMC Med. Oct 28, 2022;20(1):372. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Karyotaki E, Efthimiou O, Miguel C, Bermpohl F, Furukawa T, Cuijpers P, Individual Patient Data Meta-Analyses for Depression (IPDMA-DE) Collaboration, et al. Internet-Based Cognitive Behavioral Therapy for Depression: A Systematic Review and Individual Patient Data Network Meta-analysis. JAMA Psychiatry. Apr 01, 2021;78(4):361-371. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Kambeitz-Ilankovic L, Rzayeva U, Völkel L, Wenzel J, Weiske J, Jessen F, et al. A systematic review of digital and face-to-face cognitive behavioral therapy for depression. NPJ Digit Med. Sep 15, 2022;5(1):144. [ FREE Full text ] [ CrossRef ] [ Medline ]

Edited by Y Hong; submitted 29.03.23; peer-reviewed by A AL-Asadi, A González-Robles; comments to author 26.01.24; revised version received 09.02.24; accepted 11.02.24; published 31.05.24.

©Asmae Doukani, Matteo Quartagno, Francesco Sera, Caroline Free, Ritsuko Kakuma, Heleen Riper, Annet Kleiboer, Arlinda Cerga-Pashoja, Anneke van Schaik, Cristina Botella, Thomas Berger, Karine Chevreul, Maria Matynia, Tobias Krieger, Jean-Baptiste Hazo, Stasja Draisma, Ingrid Titzler, Naira Topooco, Kim Mathiasen, Kristofer Vernmark, Antoine Urech, Anna Maj, Gerhard Andersson, Matthias Berking, Rosa María Baños, Ricardo Araya. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 31.05.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.

The Influence of Tax Knowledge, Taxpayer Awareness, and the Quality of Tax Authority Services on Taxpayer Compliance in Paying Land and Building Tax in Bojongloa Kaler District

  • Geovanca Arto, Department of Accounting, Universitas Indonesia Membangun
  • Kasir Kasir, Department of Accounting, Universitas Indonesia Membangun

The research objective is to provide a deep understanding of the factors that influence tax compliance, so that the government and other stakeholders can take strategic steps to increase tax revenue. This quantitative study applied descriptive and verificative approaches. Primary data were obtained directly from the sample under study through the distribution of questionnaires, which were then analyzed using statistical tests, while secondary data were obtained from documents and literature relevant to this research. Sampling was carried out using a purposive sampling technique with the Slovin formula. The population in the study was 13,779 land and building taxpayers whose tax objects were located in Bojongloa Kaler District; with this population and a margin of error of 10%, the sample comprised 100 respondents in Bojongloa Kaler District. Based on the research results, it is concluded that there is a strong positive influence of taxation knowledge (X1) and taxpayer awareness (X2) on taxpayer compliance (Y), with a significance level of 0.000 (p < 0.005) and t count 4.482 > t table 1.984 for X1, and 0.000 (p < 0.005) with t count 3.783 > t table 1.984 for X2. However, there is no strong influence of the quality of tax authority services (X3) on taxpayer compliance (Y), with a significance level of 0.547 (p > 0.005) and t count 0.605 < t table 1.984. It can therefore be concluded that taxation knowledge and taxpayer awareness strongly influence taxpayer compliance, while the quality of tax authority services does not.
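The sampling step described in the abstract can be reproduced with Slovin's formula, n = N / (1 + N * e^2), using the stated figures (N = 13,779 taxpayers, e = 10%). This is a minimal sketch; the function name is ours, not the authors':

```python
import math

def slovin_sample_size(population: int, margin_of_error: float) -> int:
    """Slovin's formula n = N / (1 + N * e^2), rounded up to whole respondents."""
    return math.ceil(population / (1 + population * margin_of_error ** 2))

print(slovin_sample_size(13_779, 0.10))  # -> 100, matching the abstract's sample
```

Note that for large populations the formula is dominated by 1 / e^2, which is why a 10% margin of error yields roughly 100 respondents almost regardless of population size.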

  • Click here for the PDF file

Copyright (c) 2024 JURNAL MANEKSI

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License .


Editorial Office: Politeknik Negeri Ambon, Center for Research and Community Service (Pusat Penelitian dan Pengabdian Masyarakat), Jl. Ir. M. Putuhena, Wailela-Rumahtiga, Ambon, Maluku, Indonesia, Postal Code 97234. Contact: 081343016488

COMMENTS

  1. What Is Primary Data and Secondary Data in Research Methodology

    Primary data is collected through surveys, interviews, experiments, or observations while secondary data is obtained from existing sources such as books, journals, newspapers, and websites. Collecting both types of data requires careful planning and execution to ensure accuracy and reliability. Analyzing the results of primary and secondary ...

  2. Primary Data

    The purpose of primary data is to gather information directly from the source, without relying on secondary sources or pre-existing data. This data is collected through research methods such as surveys, interviews, experiments, and observations. Primary data is valuable because it is tailored to the specific research question or problem at hand ...

  3. Dissertations 4: Methodology: Methods

    The use of primary data, as opposed to secondary data, demonstrates the researcher's effort to do empirical work and find evidence to answer her specific research question and fulfill her specific research objectives. Thus, primary data contribute to the originality of the research. Ultimately, you should state in this section of the methodology:

  4. Primary Research vs Secondary Research in 2024: Definitions

    When doing secondary research, researchers use and analyze data from primary research sources. Secondary research is widely used in many fields of study and industries, such as legal research and market research. In the sciences, for instance, one of the most common methods of secondary research is a systematic review. ... Combining primary and ...

  5. Primary vs secondary research

    Primary research definition. When you conduct primary research, you're collecting data by doing your own surveys or observations. Secondary research definition: In secondary research, you're looking at existing data from other researchers, such as academic journals, government agencies or national statistics. Free Ebook: The Qualtrics ...

  6. Primary Research

    Primary research is a research method that relies on direct data collection, rather than relying on data that's already been collected by someone else. In other words, primary research is any type of research that you undertake yourself, firsthand, while using data that has already been collected is called secondary research .

  7. Primary vs. Secondary Data

    Primary data collection vs. secondary data collection. The distinction between primary and secondary data lies in their origin and the method through which they are collected. Collecting primary data means obtaining information directly from the source. Researchers collect these data for the specific purpose of addressing the research question at hand. The focus on collecting data from ...

  8. What is Secondary Research?

    When to use secondary research. Secondary research is a very common research method, used in lieu of collecting your own primary data. It is often used in research designs or as a way to start your research process if you plan to conduct primary research later on. Since it is often inexpensive or free to access, secondary research is a low-stakes way to determine if further primary research ...

  9. Acquiring data in medical research: A research primer for low- and

    Sources of data: primary vs secondary data. To answer a research question, there are many potential sources of data. Two main categories are primary data and secondary data. Primary data is newly collected data; it can be gathered directly from people's responses (surveys), or from their biometrics (blood pressure, weight, blood tests, etc.).

  10. Primary Data

    What is the difference between primary and secondary data? Understanding the distinction between primary and secondary data is fundamental in the realm of research, as it influences the research design, methodology, and analysis. Primary data is information collected firsthand for a specific research purpose.

  11. Primary vs. Secondary Sources

    Primary sources provide raw information and first-hand evidence. Examples include interview transcripts, statistical data, and works of art. Primary research gives you direct access to the subject of your research. Secondary sources provide second-hand information and commentary from other researchers. Examples include journal articles, reviews ...

  12. Secondary Qualitative Research Methodology Using Online Data within the

    Data collected for primary research with a particular question in mind may not fit or be appropriate for the secondary research question (Heaton, 2008; Sindin, 2017). Yet, if the data is relevant to the secondary research question although not in the ideal format, with flexibility of the methodology of the secondary analysis, this can be resolved.

  13. Primary & Secondary Data Definitions

    Primary Data: Data that has been generated by the researcher himself or herself (surveys, interviews, experiments), specially designed for understanding and solving the research problem at hand. Secondary Data: Using existing data generated by large government institutions, healthcare facilities, etc. as part of organizational record keeping. The data is then extracted from more varied datafiles.

  14. Secondary Data

    Types of secondary data are as follows: Published data: Published data refers to data that has been published in books, magazines, newspapers, and other print media. Examples include statistical reports, market research reports, and scholarly articles. Government data: Government data refers to data collected by government agencies and departments.

  15. Secondary Data Analysis: Using existing data to answer new questions

    Secondary data analysis is a valuable research approach that can be used to advance knowledge across many disciplines through the use of quantitative, qualitative, or mixed methods data to answer new research questions (Polit & Beck, 2021). This research method dates to the 1960s and involves the utilization of existing or primary data ...

  16. Difference Between Primary and Secondary Data

    In research, there are different methods used to gather information, all of which fall into two categories, i.e. primary data, and secondary data. As the name suggests, primary data is one which is collected for the first time by the researcher while secondary data is the data already collected or produced by others.

  17. PDF An Introduction to Secondary Data Analysis

    Secondary analysis of qualitative data is a topic unto itself and is not discussed in this volume. The interested reader is referred to references such as James and Sorenson (2000) and Heaton (2004). The choice of primary or secondary data need not be an either/or question. Most researchers in epidemiology and public health will work with both ...

  18. Primary vs Secondary Research Methods: 15 Key Differences

    Primary research is a research approach that involves gathering data directly while secondary research is a research approach that involves relying on already existing data when carrying out a systematic investigation. This means that in primary research, the researcher is directly involved in the data collection and categorization process.

  19. Primary vs Secondary Research: Differences, Methods, Sources, and More

    Navigating the Pros and Cons. Balance Your Research Needs: Consider starting with secondary research to gain a broad understanding of the subject matter, then delve into primary research for specific, targeted insights that are tailored to your precise needs. Resource Allocation: Evaluate your budget, time, and resource availability. Primary research can offer more specific and actionable data ...

  20. Research Methods In Psychology

    Open questions in questionnaires and accounts from observational studies collect qualitative data. Primary data is first-hand data collected for the purpose of the investigation. Secondary data is information that has been collected by someone other than the person who is conducting the research e.g. taken from journals, books or articles.

  21. Difference Between Primary and Secondary Data

    Learn the difference between primary and secondary data in research. The researcher's objectives and purposes will determine the choice of research method. In this article, we compare primary and secondary data: the former allows more precise and detailed analysis but demands more time and resources to collect.

  22. Conducting secondary analysis of qualitative data: Should we, can we

    SDA involves investigations where data collected for a previous study is analyzed - either by the same researcher(s) or different researcher(s) - to explore new questions or use different analysis strategies that were not a part of the primary analysis (Szabo and Strang, 1997). For research involving quantitative data, SDA, and the process of sharing data for the purpose of SDA, has become ...

  23. Secondary Data in Research

    In simple terms, secondary data is every dataset not obtained by the author, or, to be more specific, "the analysis of data gathered by someone else" (Boslaugh, 2007:IX). Secondary data may ...

  24. Primary, Secondary and Tertiary Sources

    A secondary source is a document or work where its author had an indirect part in a study or creation; an author is usually writing about or reporting the work or research done by someone else. Secondary sources can be used for additional or supporting information; they are not the direct product of research or the making of a creative work.

  25. Biomarkers for personalised prevention of chronic diseases: a common

    In recent years, innovative health research has moved quickly towards a new paradigm. The ability to analyse and process previously unseen sources and amounts of data, e.g. environmental, clinical, socio-demographic, epidemiological, and 'omics-derived, has created opportunities in the understanding and prevention of chronic diseases, and in the development of targeted therapies that can ...

  26. Long-term consequences of urinary tract infection in childhood: an

    Background Childhood urinary tract infection (UTI) can cause renal scarring, and possibly hypertension, chronic kidney disease (CKD), and end-stage renal failure (ESRF). Previous studies have focused on selected populations, with severe illness or underlying risk factors. The risk for most children with UTI is unclear. Aim To examine the association between childhood UTI and outcomes in an ...

  27. Journal of Medical Internet Research

    Methods: We conducted a secondary data analysis of the E-COMPARED (European Comparative Effectiveness Research on Blended Depression Treatment versus Treatment-as-usual) trial, which compared bCBT with TAU across 9 European countries. Data were collected in primary care and specialized services between April 2015 and December 2017.

  28. Pengaruh Pengetahuan Perpajakan, Kesadaran Wajib Pajak Dan Kualitas [The Effect of Tax Knowledge, Taxpayer Awareness, and Quality]

    This quantitative study applied descriptive and verification approaches. Primary data was obtained directly from the sample under study through the distribution of questionnaires, which were then analyzed using statistical tests, while secondary data was obtained through documents and literature relevant to the research.
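
The distinction running through the sources above can be summarized in a minimal sketch. Everything here is illustrative: the `Response` class, the survey values, and the published report figures are invented for the example, not drawn from any of the studies cited.

```python
# Hypothetical sketch: the same question ("how satisfied are users?")
# answered with primary vs. secondary data. All names and values are
# illustrative, not from any real study.

from dataclasses import dataclass

@dataclass
class Response:
    participant_id: int
    satisfaction: int  # 1-5 Likert-scale rating

def collect_primary(responses: list[Response]) -> float:
    """Primary data: gathered firsthand from a survey we designed ourselves."""
    return sum(r.satisfaction for r in responses) / len(responses)

def analyze_secondary(published_means: dict[str, float]) -> float:
    """Secondary data: reusing figures someone else already published."""
    return sum(published_means.values()) / len(published_means)

# Primary: our own survey of three participants
survey = [Response(1, 4), Response(2, 5), Response(3, 3)]
print(collect_primary(survey))  # 4.0

# Secondary: averages extracted from existing industry reports
reports = {"report_2022": 3.8, "report_2023": 4.2}
print(analyze_secondary(reports))
```

The trade-off the sources describe shows up even in this toy: the primary path requires designing the instrument and recruiting participants, while the secondary path only requires locating and extracting figures that already exist.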