Library Guides: Understanding Research and Critical Appraisal: Critical appraisal of research papers

The need for critical appraisal

Why critical appraisal?

As noted earlier, research studies can vary in methodological quality, and critical appraisal provides a means to consistently evaluate the validity, reliability and relevance of research results.

The purpose of critical appraisal is not to criticise a paper, but to appraise the methodology for any factors which could impact upon the findings. Below are some initial questions to ask of a paper when you set about reading it.

What is the research question?
Why was the research needed?
What type of study design was used?
Was the study design appropriate for answering the research question?

The answers to these questions will typically be found in the abstract, the introduction or the methods section of a paper, and should offer an initial sense of the study.

Validity

Critical appraisal will help to determine the validity of a study in two important ways. Internal validity relates to the reliability and accuracy of research results, effected by the appropriateness of the research methods used to address the research question, and the extent to which a study measures what it sets out to measure. External validity relates to the extent to which research results are generalisable to populations beyond the study participants.

Participants

Depending on the study design used, there are various details relating to research participants that should be included in a paper. These might be, for example: how the participants were recruited, demographic and other relevant characteristics of the study participants, how participants were assigned to study groups, how many participants took part, and how many participants (if any) dropped out before the end of the study.

Confounding

Many research studies aim to establish an association between a treatment or exposure, and an outcome. A confounding factor is any factor which might influence the association of the two, either suggesting an association where none exists (or distorting the nature of the association), or masking a true association, and leading to incorrect conclusions. Confounding factors might include demographic characteristics of participants, such as age or gender, or lifestyle factors such as diet or exercise.

Researchers should take steps to identify potential confounding factors which might influence a study, and control these, either through the design of the study, or through statistical analysis of the research data.

Bias

There are various forms of bias which may be introduced into research at various points during a study, and effect the accuracy of the study findings. Researchers should be aware of any potential biases in relation to their study, and take steps to avoid these, and as readers, we should be also be aware of the relevant biases and the measures which might be taken to avoid or minimise their impact.

Below are some potential biases which can effect research findings, or the application of research evidence to practice. For a comprehensive list of biases, see the Catalogue of Bias website.

Selection bias is introduced when the participants in a study have some systematically different characteristic to those not under study. This bias can also be introduced if there is some systematic difference between participants in the treatment group, and those in the control group. When reading a paper, look for details on the number of participants screened and included, the randomisation procedure used (if applicable), baseline comparisons of study groups, and procedures for handling any missing data.

Observer bias is introduced when there are differences between the observations or assessments made by researchers and the true values of such observations. When a study is collecting subjective data, observer bias might occur owing to the beliefs, values or perspective of the researcher(s). When a study is collecting objective data, observer bias might occur owing to different practices in the interpretation or recording of data. Blinding of those researchers who are to assess outcomes can help, where applicable, as can researcher acknowledgement of the potential impact of their bias(es).

Attrition bias is introduced when those participants who drop out of a study differ in some systematic way from the participants who complete the study. Attrition bias can distort the findings of a study, leading to incorrect assumptions about the effect of a treatment, for example. When reading a paper, look for information on how participant data was analysed, and for information on the number of participants lost and their reasons for leaving the study.

Reporting bias is introduced when researchers selectively report - or suppress - relevant data or information from a study. Reporting bias might arise from researchers withholding relevant conflicts of interest, changing study outcomes to fit data, over-reporting potential benefits or under-reporting potential harms.

Publication bias is a form of reporting bias, which results from non-publication of research studies which produce either negative or insubstantial findings. The absence of this research data can lead to the distortion of the full body of evidence on a subject, where only positive findings are published. The registration and reporting of clinical trials in trial registries allows access to some of this data, but this is reliant on researchers complying with requirements. Systematic reviews and meta-analyses can also play a part in reducing the impact of publication bias, by going beyond published journal articles when collecting their data sources, and by employing appropriate statistical methods to assess for publication bias.

Statistics

Two very commonly reported statistical calculations of the results of a trial are the p-value and the confidence interval (CI).

The p-value indicates the probability of any given outcome arising through chance. A smaller p-value equates to a smaller likelihood of the outcome occurring by chance, and the conventional cut off point for a statistically significant p-value is 0.05 or less (equivalent to a probability of 1 in 20 or less).

The confidence interval is a statistical range within which the true effect or result of a study lies, and is a means for researchers to convey their confidence in their findings. The convention is to state the 95% confidence interval, which is the range within which true results would lie on 95% of occasions.

As research studies involve only sample populations, reporting the p-value and/or CI can help to statistically generalise the applicability of research findings from the study participants to the overall population. In both cases, however, it is important to note that a smaller study sample population offers less certainty in the ability to generalise findings.

Significance

A research study might produce findings which have statistical significance or clinical significant, or both. The p-value and/or CI reported in a study can help determine the statistical significance of results, while the clinical significance of a study relates to whether the findings translate to worthwhile or noticeable improvements for a real population.

It is important to bear in mind that findings which are not statistically significant may have clinical significance.

Further statistical measures of clinical effectiveness

For further information on common statistical terms and concepts you will encounter when appraising papers such as odds, odd ratios, risk, risk ratios and numbers needed to treat, you will find a number of helpful articles here:

AKT Statistics Topics – Dr Chris Cates' EBM Website (nntonline.net)

Critical appraisal checklists

The below sites host checklists which can be used to help appraise the reliability and applicability of publications, ranging from peer-reviewed articles to grey literature

Amstar systematic review checklist
AMSTAR (A MeaSurement Tool to Assess systematic Reviews) can be used to assess the methodological quality of a systematic review and as a guide to performing a systematic review. Two agreements are required during quality assessment ensuring lower risk of bias. AMSTAR has guidelines explaining each outlined item
CASP Appraisal Checklists
This set of eight critical appraisal tools are designed to be used when reading research, these include tools for Systematic Reviews, Randomised Controlled Trials, Cohort Studies, Case Control Studies, Economic Evaluations, Diagnostic Studies, Qualitative studies and Clinical Prediction Rule.
Critical Appraisal tools
A series of critical appraisal worksheets for different types of research studies, from the Oxford Centre for Evidence Based Medicine (CEBM) - including versions in multiple languages
Critical Appraisal Tools
Critical appraisal checklists for a broad range of research study types, from the Joanna Briggs Institute
SIGN checklists
These checklists were subjected to evaluation and adaptation to meet the Scottish Intercollegiate Guidelines Network's (SIGN’s) requirements for a balance between methodological rigour and practicality of use.
Mixed Methods Appraisal Tool (MMAT)
The MMAT is intended to be used as a checklist for concomitantly appraising and/or describing studies included in systematic mixed studies reviews (reviews including original qualitative, quantitative and mixed methods studies).
AACODS grey literature checklist
The AACODS checklist is a tool for evaluating the Authority, Accuracy, Coverage, Objective, Date and Significance of grey literature.

Critical appraisal for anti-racism

Traditional appraisal tools do not prompt appraisers to look into questions around racial under-representation in research and racial bias in interpretation of findings with regards to minoritised ethnic groups in medical research. In order to combat this, librarian and information specialist, Ramona Naicker has developed a supplementary tool that can be used to address these issues.

Watch the video below for an outline of the issues and how the checklist can be used.