
Understanding research and critical appraisal

This guide provides an outline of the key types of research study designs, and the difference between primary and secondary research.


The below glossary provides brief explanations for some of the terminology that you may encounter in research articles.

Absolute risk

Absolute risk measures the size of a risk in a person or group of people. This could be the risk of developing a disease over a certain period, or it could be a measure of the effect of a treatment – for example, how much the risk is reduced by treatment in a person or group. There are different ways of expressing absolute risk. For example, someone with a 1 in 10 risk of developing a certain disease has "a 10% risk" or "a 0.1 risk", depending on whether percentages or decimals are used.
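The "1 in 10" example above can be worked through as simple arithmetic. This sketch uses the same made-up figures as the text, just to show how the decimal and percentage forms relate:

```python
# Illustrative arithmetic only: expressing the same absolute risk
# as a fraction, a decimal, and a percentage.
cases = 1          # people who develop the disease
population = 10    # people in the group

absolute_risk = cases / population   # as a decimal: 0.1
as_percentage = absolute_risk * 100  # as a percentage: 10%

print(f"{absolute_risk} risk, i.e. {as_percentage:.0f}%")
```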



Blinding

Blinding means withholding from someone which treatment a person has received or, in some cases, the outcome of their treatment, to avoid them being influenced by that knowledge. The person who is blinded could be either the person being treated or the researcher assessing the effect of the treatment (single blind), or both (double blind).



Clustering

Clustering is a form of randomisation where participants are randomised to either intervention or control groups not as individuals, but in groups (clusters).
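A minimal sketch of the idea, using hypothetical clinics as the clusters: whole clusters are assigned to a trial arm, rather than individual patients within them.

```python
import random

# Sketch of cluster randomisation: entire clusters (hypothetical
# clinics) are allocated to arms, not individual participants.
clusters = ["clinic_a", "clinic_b", "clinic_c", "clinic_d"]

rng = random.Random(0)          # seeded only for reproducibility
rng.shuffle(clusters)
intervention = clusters[:2]     # first half of the shuffled list
control = clusters[2:]          # second half

print("intervention:", intervention, "control:", control)
```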


Confidence interval

A confidence interval (CI) is a statistical calculation that gives the range within which the estimated mean result from a sample population in a study is found. There will always be some uncertainty in results, because studies are conducted on samples and not entire populations; the size of the CI indicates how confident researchers are in the estimated results, and their statistical significance. A smaller interval indicates a more precise, more stable estimate.

By convention, a 95% CI is typically used when reporting results in medical trials, meaning that the confidence interval shows the range within which the researchers are confident that the true result from a population would lie 95% of the time.
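As a rough illustration of how a 95% CI around a sample mean is computed, here is a sketch using hypothetical measurements and the normal approximation (z = 1.96); for a small sample like this, a t-distribution critical value would strictly be more appropriate:

```python
import statistics

# Hypothetical sample of measurements (not from any real trial).
sample = [5.1, 4.8, 5.3, 5.0, 4.9, 5.2, 5.4, 4.7, 5.0, 5.1]

mean = statistics.mean(sample)
sem = statistics.stdev(sample) / len(sample) ** 0.5  # standard error of the mean

# 95% CI via the normal approximation (z = 1.96).
lower, upper = mean - 1.96 * sem, mean + 1.96 * sem
print(f"mean {mean:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

A narrower interval (smaller `sem`, e.g. from a larger sample) would indicate a more precise estimate, as described above.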


Confounding factor

A confounder is a factor that may distort the true relationship between two (or more) characteristics. When it is not taken into account, false conclusions can be drawn about associations or causality. An example is to conclude that if people who carry a lighter are more likely to develop lung cancer, it is because carrying a lighter causes lung cancer. In fact, smoking is a confounder here. People who carry a lighter are more likely to be smokers, and smokers are more likely to develop lung cancer. A list of possible confounding factors in a study should be provided in the methods section.


Control group

A control group serves as a basis of comparison in a study. In this group, no experimental stimulus or intervention is received. In many cases, the control group will receive a placebo.


Hazard ratio

A measure of the relative probability of an event in two groups over time.

A hazard ratio of one indicates that the relative probability of the event is the same in the two groups over time. A hazard ratio greater than or less than one indicates that the event is more likely over time in one of the two groups than in the other.
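As a simplified sketch only: if each group is assumed to have a constant event rate (hazard) per unit time, the hazard ratio is the ratio of those rates. Real hazard ratios are estimated from survival models (such as Cox regression), not this arithmetic, and the rates below are made up:

```python
# Simplifying assumption: constant hazard in each group.
hazard_treatment = 0.01  # hypothetical events per person-month
hazard_control = 0.02

hazard_ratio = hazard_treatment / hazard_control
print(hazard_ratio)  # 0.5: the event is half as likely over time on treatment
```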


Intention-to-treat analysis

Intention-to-treat (ITT) analysis is the preferred way to analyse the results of randomised controlled trials (RCTs).

In ITT analysis, people are analysed in the treatment groups to which they were assigned at the start of the RCT, regardless of whether they drop out of the trial, don't attend follow-up, or switch treatment groups.

If follow-up data is not available for a participant in one of the treatment groups, the person would normally be assumed to have had no response to treatment, and that their outcomes are no different from what they were at the start of the trial.
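The principle can be sketched with a few made-up participants: everyone is analysed in the group they were randomised to, even if they switched treatment or dropped out, and a missing outcome is conservatively counted as no response:

```python
# Minimal ITT sketch with hypothetical participants.
participants = [
    {"id": 1, "assigned": "drug", "received": "drug", "improved": True},
    {"id": 2, "assigned": "drug", "received": "placebo", "improved": False},  # switched
    {"id": 3, "assigned": "drug", "received": None, "improved": None},        # dropped out
    {"id": 4, "assigned": "placebo", "received": "placebo", "improved": False},
]

def itt_improvement_rate(group):
    # Analyse by *assigned* group, regardless of what was received.
    members = [p for p in participants if p["assigned"] == group]
    # Missing outcomes (None) count as "no response to treatment".
    improved = sum(1 for p in members if p["improved"] is True)
    return improved / len(members)

print(itt_improvement_rate("drug"))  # 1 of 3, despite the switch and dropout
```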


Number needed to treat

The number needed to treat (NNT) is the number of people who would need to be treated with the intervention under study for one person to experience the beneficial outcome, or for one negative outcome to be prevented. A lower NNT is better.
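The NNT is the reciprocal of the absolute risk reduction. A sketch with made-up event rates:

```python
# NNT from absolute risk reduction (ARR), using hypothetical rates.
risk_control = 0.20    # 20% of untreated people have the bad outcome
risk_treated = 0.15    # 15% of treated people do

arr = risk_control - risk_treated   # absolute risk reduction: 0.05
nnt = 1 / arr                       # treat ~20 people to prevent one outcome
print(round(nnt))
```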



P-value

The P-value represents the statistically calculated probability of the results of a study occurring by chance. The lower the P-value, the smaller the likelihood that the results have occurred by chance. A P-value of less than 0.05 (P < 0.05) is conventionally considered statistically significant, and a P-value of less than 0.01 (P < 0.01) statistically highly significant.
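As a worked illustration, here is an exact binomial calculation on hypothetical coin-toss data: the probability of seeing a result at least this extreme (9 or more heads in 10 tosses) if the coin were fair:

```python
from math import comb

# One-sided exact binomial P-value: P(at least k heads in n fair tosses).
n, k = 10, 9
p_value = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
print(p_value)  # about 0.011, below the conventional 0.05 threshold
```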


Phase I trial

Phase I trials are the early phases of drug testing in humans. These are usually quite small studies that primarily test a drug's safety and suitability for use in humans, rather than its effectiveness.

They often involve between 20 and 100 healthy volunteers, although they sometimes involve people who have the condition the drug is aimed at treating. To test the safe dosage range of the drug, very small doses are given initially and are gradually increased until the levels suitable for use in humans are found.


Phase II trial

During this phase of testing, the effectiveness of a drug in treating the targeted disease in humans is examined for the first time and more is learnt about appropriate dosage levels.

This stage usually involves 200 to 400 volunteers who have the disease or condition the drug is designed to treat. The effectiveness of the drug is examined, and more safety testing and monitoring of its side effects are carried out.


Phase III trial

In this phase of human testing, the effectiveness and safety of the drug undergo rigorous examination in a large, carefully controlled trial.

The drug is tested in a much larger sample of people with the disease or condition than before, with some trials including thousands of volunteers. Participants are followed up for longer than in previous phases, sometimes over several years.

These controlled tests usually compare the effectiveness of the new drug with either existing drugs or a placebo. These trials are designed to give the drug as unbiased a test as possible to ensure that the results accurately represent its benefits and risks.

The large number of participants and the extended period of follow-up give a more reliable indication of whether the drug will work, and allow rarer or longer-term side effects to be identified.


Publication bias

Publication bias arises because researchers and editors tend to handle positive experimental results differently from negative or inconclusive results. Trials reporting negative or inconclusive results are less likely to be published. It is especially important to detect publication bias in studies that pool the results of several trials, such as systematic reviews.


Relative risk or risk ratio

Relative risk compares a risk in two different groups of people. All sorts of groups are compared to others in medical research to see if belonging to a particular group increases or decreases the risk of developing certain diseases. This measure of risk is often expressed as a percentage increase or decrease, for example, "a 20% increase in risk" of treatment A compared with treatment B.
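The "20% increase" phrasing above comes directly from the ratio of the two risks. A sketch with made-up group risks:

```python
# Relative risk from hypothetical group risks, and its expression
# as a percentage increase.
risk_group_a = 0.12   # hypothetical risk in group A
risk_group_b = 0.10   # hypothetical risk in group B

relative_risk = risk_group_a / risk_group_b   # 1.2
percent_change = (relative_risk - 1) * 100    # +20%
print(f"RR {relative_risk:.1f}: a {percent_change:.0f}% increase in risk")
```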


Sample size

Having an adequate sample size allows the findings from research participants to be generalised to the overall population. Researchers should calculate the required sample size before a study begins, to ensure that their study is likely to detect effects. Studies with too small a sample size might produce findings which are not statistically significant, but which might still be of clinical significance.
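One common textbook calculation, sketched here for illustration: the sample size needed to estimate a proportion to within a margin of error e at 95% confidence, using p = 0.5 as the conservative choice that maximises the required sample. Real trial sample-size calculations depend on the design and the expected effect size:

```python
from math import ceil

# Sample size for estimating a proportion: n = z^2 * p(1-p) / e^2.
z = 1.96   # 95% confidence
p = 0.5    # conservative assumed proportion
e = 0.05   # desired margin of error (±5%)

n = ceil(z ** 2 * p * (1 - p) / e ** 2)
print(n)  # 385 participants for a ±5% margin
```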


Selection bias

Selection bias primarily relates to the recruitment of research participants, and might be introduced by the method used to select participants, or by any other reason which could influence who chooses, or is chosen, to participate in a study.

Selection bias can result in a sample population that is significantly different from the general population, thereby limiting the generalisability of any findings or results.


Statistical significance

If the results of a test have statistical significance, it means that they are not likely to have occurred by chance alone. In such cases, we can be more confident that we are observing a ‘true’ result.