Critical appraisal of evidence
Appraising systematic reviews
Appraising randomised controlled trials
Appraising cohort studies
Appraising qualitative studies
Before we even consider critical appraisal and the quality of an article, it should be clear that every article should first be assessed according to its level in the ‘hierarchy of evidence’ as shown in Step 2. The hierarchy of evidence reflects the potential of each study to answer a particular type of question (i.e. treatment, diagnosis, prognosis, aetiology, screening intervention). However, each design within the hierarchy has its own inherent strengths and weaknesses, and its usefulness can vary according to what you want to find out.
Critical appraisal is the process of systematically examining research evidence to assess its validity and relevance before using it to inform a clinical decision (Hill & Spittlehouse 2003). It can help clinicians identify the strengths and weaknesses of a research article and make a judgement on the usefulness of its findings. Critical appraisal assists clinicians to decide whether a piece of reported research should be used to influence their clinical practice. In other words, the underlying reason for critically appraising articles is to make an informed decision as to whether the results of a paper are believable and whether they are applicable to your clinical situation. If you are to practise according to the principles of EBP, it makes sense to underpin your practice with good quality evidence rather than poor quality evidence, which may be biased and unreliable; there is little point changing practice on the basis of poor evidence.
Critical appraisal is typically done using standard appraisal forms or checklists. The critical appraisal tool chosen to appraise a research article will depend on how the research was designed (i.e. the study design). Some tools are generic and can therefore be used for many different types of research design. Reading the “overview” of each critical appraisal tool will provide information about its scope and coverage.
For a comprehensive list of critical appraisal tools, visit this site: International Centre for Allied Health Evidence repository of critical appraisal tools.
How to appraise a systematic review using the CASP (Critical Appraisal Skills Programme) tool
Screening Questions |
|
Did the review address a clearly focused question? Hint: An issue can be focused in terms of: The population studied; The intervention (exposure) given; The outcome considered |
This question determines the relevance of the study to your clinical practice. Examine if the study clearly defined their population of interest, the nature of the intervention (or exposure) and the outcomes of interest. If these are not clearly defined, you may find it difficult to determine which patients the results apply to, the intervention that the study proposes and whether this intervention produces outcomes which you (as a practitioner) and your patients/clients consider important. |
Did the authors look for the appropriate sort of papers? Hint: The ‘best sort of studies’ would address the review’s question, and have an appropriate study design. |
This question looks at the appropriateness of the study designs included in the systematic review in addressing the research question. For example, if the review is asking about the effectiveness of an intervention, a randomised controlled trial is considered a gold standard. On the other hand, if the review question is about prognosis or aetiology, then cohort studies are considered appropriate. For a guide as to which study designs are appropriate for a review question, you may refer to the ‘hierarchy of evidence’ table by NHMRC. |
Your answer to these screening questions will help you determine whether or not it is worth continuing with the paper. Answering yes to both questions means it is worth continuing, otherwise, you may have to search for another paper. |
|
Detailed Questions |
|
Do you think the important, relevant studies were included? Hint: Look for: which bibliographic databases were used; follow up from reference lists; personal contact with experts; search for unpublished as well as published studies; search for non-English language studies |
Here you are looking at how thorough the search was done, and if other potentially important sources were explored. A good systematic review should have undertaken a comprehensive search of the literature and is unlikely to have missed any relevant articles. |
Did the review’s authors do enough to assess the quality of the included studies? |
The authors need to consider the rigour of the studies they have identified. Lack of rigour may affect the studies’ results. |
If the results of the review have been combined, was it reasonable to do so? Hint: Consider whether: the results of each study are clearly displayed; the results were similar from study to study (look for tests of heterogeneity); the reasons for any variations in results are discussed |
This question looks at the appropriateness of meta-analysis (i.e. a statistical technique for combining the findings from independent studies) in synthesising the results of a systematic review. If a meta-analysis was not undertaken and a narrative synthesis was done instead, this should be justified by the authors. For more information about meta-analysis: Crombie I and Davies H. What is meta-analysis? 2009. |
What are the overall results of the review? |
Consider whether you are clear about the review’s bottom line results: what these are, and how the results were expressed (e.g. odds ratio, mean difference). |
How precise are the results? Hint: Look at the confidence intervals, if given (i.e. if a meta-analysis was undertaken) |
Precision of the results can be determined from the width of the confidence intervals. A confidence interval describes a range of values within which you can be reasonably sure that the true effect actually lies. If the confidence interval is relatively narrow (e.g. 0.70 to 0.80), the effect size is known precisely. If the interval is wider (e.g. 0.60 to 0.93) the uncertainty is greater, although there may still be enough precision to make decisions about the utility of the intervention. Intervals that are very wide (e.g. 0.50 to 1.10) indicate that there is little knowledge about the effect, and that further information is needed [Cochrane Library]. The width of a confidence interval for a meta-analysis depends on the precision of the individual study estimates and on the number of studies combined. As more studies are added to a meta-analysis, the width of the confidence interval usually decreases. However, if the additional studies increase the heterogeneity in the meta-analysis and a random-effects model is used, it is possible that the confidence interval width will increase [Cochrane Library]. |
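The pooling described above can be sketched numerically. The following is a minimal illustration of inverse-variance fixed-effect pooling on the log odds ratio scale; the study estimates and standard errors are entirely hypothetical, not taken from any review.

```python
import math

# Hypothetical log odds ratios and standard errors for three studies.
# These numbers are invented for illustration only.
studies = [(-0.40, 0.25), (-0.30, 0.20), (-0.35, 0.30)]

# Inverse-variance fixed-effect pooling: weight each study by 1/SE^2,
# so more precise studies contribute more to the pooled estimate.
weights = [1 / se ** 2 for _, se in studies]
pooled = sum(w * est for (est, _), w in zip(studies, weights)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

# 95% confidence interval on the log scale, back-transformed to an OR.
lo, hi = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"Pooled OR {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(lo):.2f} to {math.exp(hi):.2f})")
```

Note how the pooled standard error shrinks as weights accumulate: combining studies is exactly what narrows the confidence interval, as the paragraph above describes.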
Can the results be applied to the local population? |
Hint: Consider whether: the patients covered by the review could be sufficiently different to your population to cause concern; your local setting is likely to differ much from that of the review |
Were all outcomes considered? |
Hint: Consider whether the review considered the outcomes that the clinician and the patient are likely to view as important. |
Are the benefits worth the harms and costs? |
|
Activity
Read the following article and complete the CASP for systematic review tool.
To check your appraisal, open the link below:
CASP Appraisal for Baker et al 2011
The PEDro scale is a tool for assessing the methodological quality of randomised controlled trials. Please click the link below to access the PEDro scale. There are explicit criteria for addressing each of the questions in the scale.
PEDro scale: guidelines for appraisal
Activity
Read the following article and complete the PEDro scale.
To check your appraisal, open the link below:
PEDro Appraisal for Hansson et al 2010
How to appraise a randomised controlled trial using the CASP (Critical Appraisal Skills Programme) tool
Screening Questions |
|
Did the trial address a clearly focused question? Hint: An issue can be focused in terms of: The population studied; The intervention (exposure) given; The comparator given; The outcome considered |
This question determines the relevance of the study to your clinical practice. Examine if the study clearly defined their population of interest, the nature of the intervention (or exposure) and the outcomes of interest. If these are not clearly defined, you may find it difficult to determine which patients the results apply to, the intervention that the study proposes and whether this intervention produces outcomes which you (as a practitioner) and your patients/clients consider important. |
Was the assignment of patients to treatments randomised? |
Randomisation minimises bias and confounding factors which adds to the strength of the study. Why randomise: Beller EM, Gebski V, Keech AC. Randomisation in clinical trials. Med J Aust 2002; 177(10): 565-567. |
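To make the idea of randomised allocation concrete, here is a minimal sketch of block randomisation in Python. The block size, group labels and seed are illustrative assumptions, not taken from any trial protocol.

```python
import random

def block_randomise(n_participants, block_size=4, seed=None):
    """Allocate participants to two arms using permuted blocks.

    Within each block of `block_size`, half go to treatment and half
    to control, in random order, keeping group sizes near-balanced.
    """
    rng = random.Random(seed)
    allocation = []
    while len(allocation) < n_participants:
        block = (["treatment"] * (block_size // 2) +
                 ["control"] * (block_size // 2))
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n_participants]

schedule = block_randomise(10, seed=42)
print(schedule)
```

Because each block is balanced, the two arms stay close in size throughout recruitment, while the within-block shuffle keeps individual allocations unpredictable.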
Were all of the patients who entered the trial properly accounted for at its conclusion? |
Loss to follow up can lead to attrition bias in randomised trials. Attrition bias: Dumville JC, Torgerson DJ, Hewitt CE. Reporting attrition in randomised controlled trials. BMJ 2006 April; 332(7547): 969-971. |
Your answer to these screening questions will help you determine whether or not it is worth continuing with the paper. Answering yes to these questions means it is worth continuing, otherwise, you may have to search for another paper. |
|
Detailed Questions |
|
Were patients, health workers and study personnel ‘blind’ to treatment? Hint: Look for: Were the patients blinded; Were the health workers blinded; Were the study personnel blinded? |
This adds to the rigour of the study and minimises bias. Studies can be single, double or triple blinded. Triple blinding is often seen in pharmaceutical research but is difficult to execute in allied health research. Blinding: Day SJ and Altman DG. Blinding in clinical trials and other studies. BMJ 2000; 321:504. |
Were the groups similar at the start of the trial? |
This question helps determine whether both groups are equal at baseline in terms of any factors that might have an effect on the outcome, such as age, sex, social class, level of health/fitness and education. Baseline imbalance: Roberts C and Torgerson DJ. Baseline imbalance in randomised controlled trials. BMJ 1999; 319:185.1. |
Aside from the experimental intervention, were the groups treated equally? |
This question addresses if there is any reason (apart from the intervention) that influences the performance of one group over the other. |
How large was the treatment effect? What outcomes were measured? |
Treatment effect in a randomised trial can be estimated from a sample using a comparison of mean outcomes for treated and untreated groups. What is effect size: Using effect size - or why the P value is not enough |
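One common standardised measure of a treatment effect on a continuous outcome is Cohen's d: the difference in group means divided by the pooled standard deviation. A minimal sketch, using made-up outcome scores:

```python
import math

# Illustrative outcome scores for treated and control groups.
# All values are invented for this example.
treated = [12.1, 11.4, 13.0, 12.7, 11.9, 12.5]
control = [11.0, 11.8, 10.9, 12.3, 11.5, 12.0]

def mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Cohen's d: mean difference divided by the pooled standard deviation.
n1, n2 = len(treated), len(control)
pooled_sd = math.sqrt(((n1 - 1) * sample_var(treated) +
                       (n2 - 1) * sample_var(control)) / (n1 + n2 - 2))
d = (mean(treated) - mean(control)) / pooled_sd
print(f"Cohen's d = {d:.2f}")
```

Unlike a P value, d conveys the magnitude of the difference in standard-deviation units, which is why it is useful for judging clinical importance.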
How precise was the estimate of the treatment effect? Hint: Look at the confidence intervals, if given. |
Precision of the results can be determined based on the width of the confidence intervals. The confidence interval describes a range of values within which you can be reasonably sure that the true effect actually lies. If the confidence interval is relatively narrow (e.g. 0.70 to 0.80), the effect size is known precisely. If the interval is wider (e.g. 0.60 to 0.93) the uncertainty is greater, although there may still be enough precision to make decisions about the utility of the intervention. Intervals that are very wide (e.g. 0.50 to 1.10) indicate that there is little knowledge about the effect, and that further information is needed [Cochrane Library]. |
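To see how sample size drives precision, the sketch below computes a 95% confidence interval for a risk difference in two hypothetical trials that share the same point estimate but differ in size; all counts are invented for illustration.

```python
import math

def risk_diff_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Risk difference (treatment minus control) with a 95% Wald CI."""
    p_t, p_c = events_t / n_t, events_c / n_c
    diff = p_t - p_c
    se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    return diff, diff - z * se, diff + z * se

# Same event proportions (30% vs 50%), different sample sizes.
small = risk_diff_ci(12, 40, 20, 40)      # 40 participants per arm
large = risk_diff_ci(120, 400, 200, 400)  # 400 participants per arm
print(f"Small trial: {small[0]:.2f} (95% CI {small[1]:.2f} to {small[2]:.2f})")
print(f"Large trial: {large[0]:.2f} (95% CI {large[1]:.2f} to {large[2]:.2f})")
```

Both trials estimate the same effect, but the larger trial's interval is far narrower: tenfold more participants shrink the standard error, and with it the uncertainty around the estimate.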
Can the results be applied to the local population? |
Consider whether the patients covered by the trial could be sufficiently different to your population to cause concern, and whether your local setting is likely to differ much from that of the trial.
|
Were all clinically important outcomes considered? If not, does this affect the decision? |
Consider whether the trial considered the outcomes that the clinician and the patient are likely to view as important. |
Are the benefits worth the harms and costs? |
This is unlikely to be addressed by the trial. But what do you think? |
Activity
Read the following article and complete the CASP tool for randomised controlled trials.
To check your appraisal, open the link below:
CASP Appraisal for Jull et al 2002
How to appraise cohort studies using the CASP (Critical Appraisal Skills Programme) tool
Screening Questions |
|
Did the study address a clearly focused issue? Hint: An issue can be focused in the following terms: the population studied; the risk factors studied; the outcomes considered; is it clear whether the study tried to detect a beneficial or harmful effect? |
This question determines the relevance of the study to your clinical practice. Examine if the study clearly defined their population of interest and the outcomes of interest. If these are not clearly defined, you may find it difficult to determine which patients the results apply to, and whether the outcomes produced are ones which you (as a practitioner) and your patients/clients consider important. |
Did the authors use an appropriate method to answer their question? |
Consider:
|
Your answer to these screening questions will help you determine whether or not it is worth continuing with the paper. Answering yes to both questions means it is worth continuing, otherwise, you may have to search for another paper. |
|
Detailed Questions |
|
Was the cohort recruited in an acceptable way? |
Here you are looking for selection bias which might compromise the generalisability of the findings:
|
Was the exposure accurately measured to minimize bias? |
For this question you are looking for measurement or classification bias:
|
Was the outcome accurately measured to minimize bias? |
For this question you are looking for measurement or classification bias:
|
A. Have the authors identified all important confounding factors? List the ones you think might be important, that the author missed. B. Have they taken account of the confounding factors in the design and/or analysis? |
Consider:
What is a confounding factor: Dealing with confounding in the analysis |
A. Was the follow up of subjects complete enough? B. Was the follow up of subjects long enough? |
Consider:
|
What are the results of this study? |
Consider:
|
How precise are the results? |
Are confidence intervals reported? Precision of the results can be determined based on the width of the confidence intervals. If the confidence interval is relatively narrow (e.g. 0.70 to 0.80), the effect size is known precisely. If the interval is wider (e.g. 0.60 to 0.93) the uncertainty is greater, although there may still be enough precision to make decisions about the utility of the intervention. Intervals that are very wide (e.g. 0.50 to 1.10) indicate that there is little knowledge about the effect, and that further information is needed [Cochrane Library]. |
Do you believe the results? |
Points to consider:
|
Can the results be applied to the local population? |
Consider whether:
|
Do the results of this study fit with other available evidence? |
Consider if the findings of this study contradict other studies on the same topic and how that would influence your application of the findings. |
Activity
Read the following article and complete the CASP for cohort studies tool.
To check your appraisal, open the link below:
CASP Appraisal for Howard 2005
How to appraise qualitative studies using the CASP (Critical Appraisal Skills Programme) tool
Screening Questions |
|
Was there a clear statement of the aims of the research? |
Consider:
|
Is a qualitative methodology appropriate? |
Consider if the research seeks to interpret or illuminate the actions and/or subjective experiences of research participants |
Your answer to these screening questions will help you determine whether or not it is worth continuing with the paper. Answering yes to both questions means it is worth continuing, otherwise, you may have to search for another paper. |
|
Detailed Questions |
|
Was the research design appropriate to address the aims of the research? |
Consider if the researcher has justified their choice of research design (e.g. have they discussed how they decided which method to use)? |
Was the recruitment strategy appropriate to the aims of the research? |
Consider:
|
Were the data collected in a way that addressed the research issue? |
Consider:
|
Has the relationship between researcher and participants been adequately considered? |
Consider:
|
Have ethical issues been taken into consideration? |
Consider:
|
Was the data analysis sufficiently rigorous? |
Consider:
|
Is there a clear statement of findings? |
Consider:
|
How valuable is the research? |
Consider:
|
Activity
Read the following article and complete the CASP for qualitative studies tool.
To check your appraisal, open the link below:
CASP Appraisal for Sommerseth & Dysvik 2008