

CURRICULUM IN CARDIOLOGY - STATISTICS

Year: 2019 | Volume: 5 | Issue: 2 | Page: 108-110

How to read a forest plot
Ushmita Seth
Technology Consultant, B Tech (Delhi Technological University), Delhi, India
Date of Submission: 21-Jun-2019
Date of Decision: 10-Jul-2019
Date of Acceptance: 28-Jul-2019
Date of Web Publication: 19-Aug-2019
Correspondence Address: Ms. Ushmita Seth, Technology Consultant, B Tech (Delhi Technological University), Delhi, India
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/jpcs.jpcs_39_19
As data-based practice began to accumulate evidence, forest plots were introduced to harness the collective power of statistical data. A forest plot, also known as a blobbogram, is a graphical representation of a meta-analysis. It lets you view and analyze the sample statistics from multiple similar studies in one place, along with summary statistics at the bottom. The plot includes the point estimate of the sample statistic as well as its confidence interval (usually taken as 95%).
Keywords: Forest plot, point estimate, statistics
How to cite this article: Seth U. How to read a forest plot. J Pract Cardiovasc Sci 2019;5:108-10
A forest plot is a graphical display of one common statistic drawn from a number of studies addressing the same problem. It tackles the complexity of drawing collective inferences from separate experiments, which together can lead to a more powerful conclusion.
In 1990, oncologist Richard Peto joked that the plot was named after fellow breast cancer researcher Pat Forrest, which led to the plot frequently being misspelled as “forrest plot.” In fact, it was so named because the graph resembles a forest when rotated through a right angle [Figure 1]. The plot consists of lines with a large dot somewhere along each line; each line represents a tree and the dot corresponds to the leaf cover.^{[1]}
Let us understand the different branches of a forest plot with the given example [Table 1].
Here is a common representation of the raw data^{[2]} for the plot. The first column signifies the name of the study. The second and third columns describe the experimental results for treatment and control groups, respectively. “n” stands for the number of patients who had the outcome, and “N” stands for the total number of people in the group.
The fourth column generally indicates the point estimate of the common statistic that is being used to compare all the studies. It could be a relative statistic, such as odds ratio (OR) or relative risk (RR), or it could be an absolute statistic such as standardized mean difference or absolute risk reduction. The fifth and sixth columns represent the upper and lower bounds of the confidence interval (CI), respectively.
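As a rough illustration of how the “n” and “N” columns relate to the point estimate and CI columns, the sketch below derives an OR and an approximate 95% CI for a single study. It assumes the common log-odds (Woolf) method; the function name and the counts are hypothetical and are not taken from Table 1.

```python
import math

def odds_ratio_with_ci(n_treat, N_treat, n_ctrl, N_ctrl, z=1.96):
    """Return (OR, lower, upper) for one study via the log-odds (Woolf) method.

    n_* = patients with the outcome, N_* = total patients in the group.
    z = 1.96 corresponds to an approximate 95% confidence interval.
    """
    a = n_treat            # treatment group, outcome present
    b = N_treat - n_treat  # treatment group, outcome absent
    c = n_ctrl             # control group, outcome present
    d = N_ctrl - n_ctrl    # control group, outcome absent
    if min(a, b, c, d) <= 0:
        raise ValueError("this simple method needs all four cells to be positive")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))

# Hypothetical study: 10/100 events on treatment, 20/100 on control.
or_, lower, upper = odds_ratio_with_ci(10, 100, 20, 100)
print(f"OR = {or_:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

The interval is computed on the log scale and then exponentiated, which is why CIs for ratio statistics are not symmetric around the point estimate.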
The pooling of results from diverse statistical analyses is done by one of two methods: a fixed-effects model or a random-effects model.^{[3]} The random-effects pooling model has been recommended for clinical psychology and the health sciences.^{[4]} The fixed-effects model assumes that all studies are conducted on a single homogeneous population. To pool the effect sizes, a weighted average of the sample statistic is computed, with studies of smaller variance (i.e., greater precision) given a larger weight.
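The weighted average described for the fixed-effects model can be sketched as follows. This is a minimal example assuming inverse-variance weights (weight = 1/SE^2) and effect sizes expressed on the log scale, as is usual for ratio statistics such as OR; the function name and the input values are hypothetical.

```python
import math

def fixed_effect_pool(effects, std_errors, z=1.96):
    """Inverse-variance weighted average of study effects.

    effects    : per-study effect sizes (e.g., log odds ratios)
    std_errors : per-study standard errors on the same scale
    Returns (pooled, lower, upper, weights).
    """
    weights = [1.0 / se ** 2 for se in std_errors]  # smaller variance, larger weight
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled - z * pooled_se, pooled + z * pooled_se, weights

# Hypothetical log odds ratios and standard errors for three studies.
log_ors = [-0.80, -0.30, -0.50]
ses = [0.40, 0.20, 0.25]
pooled, lo, hi, weights = fixed_effect_pool(log_ors, ses)
print(f"pooled OR = {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(lo):.2f} to {math.exp(hi):.2f})")
```

The second study has the smallest standard error and therefore pulls the pooled estimate most strongly toward its own value.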
However, in practice, the studies are almost never all drawn from the same population, so the random-effects model offers an alternative. Here, we assume that the studies are conducted not on one single population but on a “diverse” set of populations. We, therefore, assume that there is not a single true effect size but a distribution of true effect sizes, and we want to estimate the mean of this distribution.
θ_{k} = θ_{F} + ϵ_{k} + ζ_{k}
θ_{k} = Observed effect size of an individual study k
θ_{F} = True effect size of the population
ϵ_{k} = Sampling error
ζ_{k} = Second source of error, arising because the true effect size θ_{F} is itself part of a distribution of true effect sizes (across the universe of populations)
To take ζ_{k} into account, we have to estimate the variance of the distribution of true effect sizes, which is denoted by τ^{2} (tau-squared). There are several estimators for τ^{2}.
As in the fixed-effects model, each study must be assigned a weight that determines its influence on the overall meta-analysis. The choice of estimator determines the final calculation of the variance and, therefore, leads to different pooled effect size estimates and CIs. An article by Veroniki et al.^{[5]} provides a summary of various estimators and their biases.
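To make the role of τ^{2} in the weights concrete, here is a minimal sketch that uses the DerSimonian-Laird estimator. That choice is an assumption made for illustration; it is only one of the several estimators mentioned above, and the article does not prescribe it. The function name and the data are hypothetical.

```python
import math

def dersimonian_laird(effects, std_errors, z=1.96):
    """Random-effects pooling with the DerSimonian-Laird estimate of tau^2."""
    k = len(effects)
    if k < 2:
        raise ValueError("need at least two studies")
    # Step 1: fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Step 2: Cochran's Q and the method-of-moments estimate of tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)  # truncated at zero
    # Step 3: re-weight each study with tau^2 added to its own variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_ws = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_ws
    se_pooled = math.sqrt(1.0 / sum_ws)
    return {"tau2": tau2, "pooled": pooled, "se": se_pooled,
            "lower": pooled - z * se_pooled, "upper": pooled + z * se_pooled}

# Hypothetical, visibly inconsistent log odds ratios with equal precision.
result = dersimonian_laird([-0.9, 0.1, -0.4], [0.2, 0.2, 0.2])
print(result)
```

Because τ^{2} is added to every study's variance, the random-effects weights are more nearly equal than the fixed-effects weights, and the pooled CI is wider whenever τ^{2} is greater than zero.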
Let us now draw the forest plot corresponding to the above data [Figure 1].
First, we look at the two axes. The X-axis is the scale for the statistic being displayed (OR in our case). The vertical line is not a Y-axis as such; it is the line of “null effect” for the statistic being used, that is, the value of the point statistic which signifies no difference between treatment and control groups. It is placed at 1 for a relative statistic and at 0 for an absolute statistic.
Next, the results of each study are placed one below the other on the plot. For each study, the position of the square along the X-axis marks its point estimate, the size of the square reflects the study's weight in the meta-analysis (which largely tracks its sample size), and the length of the horizontal line on which the square lies represents the CI for the point estimate. If the horizontal line crosses the line of null effect, the value of null effect lies within the CI and could even be the true value. Therefore, the result of that study is not statistically significant. A forest plot thus shows the estimated effect and CI for each individual study as well as the overall estimated effect and its CI.
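The rule for reading each horizontal line can be expressed as a small check. This is only a sketch: the helper name and the example intervals are hypothetical, and the null value follows the convention stated above (1 for relative statistics, 0 for absolute ones).

```python
def crosses_null(lower, upper, relative=True):
    """Return True if the CI [lower, upper] contains the value of null effect.

    relative=True  -> null effect at 1 (e.g., OR, RR)
    relative=False -> null effect at 0 (e.g., mean difference, risk reduction)
    """
    null_value = 1.0 if relative else 0.0
    return lower <= null_value <= upper

# Hypothetical studies: (name, lower bound, upper bound) of a 95% CI for an OR.
studies = [("Study A", 0.20, 1.01), ("Study B", 0.45, 0.93)]
for name, lower, upper in studies:
    verdict = "not significant" if crosses_null(lower, upper) else "significant"
    print(f"{name}: CI {lower:.2f} to {upper:.2f} -> {verdict}")
```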
The diamond at the bottom represents the summary statistic and its CI based on the meta-analysis. The center of the diamond (the vertical line joining its top and bottom points) represents the point estimate. The left and right points represent the bounds of the CI. As the diamond pools all the individual studies, its CI is usually the narrowest (the width of a CI shrinks as sample size grows, because a larger sample size means a smaller standard error, and vice versa).
The final point about analyzing a forest plot is its “heterogeneity.” Heterogeneity is the variability between study results that creeps into the final estimate because the individual studies have been conducted using different methods across different populations. Therefore, an additional commonly used metric called “I^{2}” or I-squared^{[6]} is reported at the end of the plot. As a rule of thumb, if I^{2} is <50%, the individual studies fall within the acceptable range of inconsistency. If it is >50%, they are too inconsistent to be pooled together for the meta-analysis.
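For readers who want to see where the I^{2} figure comes from, the sketch below computes it from Cochran's Q as I^{2} = max(0, (Q - df)/Q) x 100%, with df equal to the number of studies minus one. The function name and the input values are hypothetical, and the effect sizes are assumed to be on the log scale for ratio statistics.

```python
def i_squared(effects, std_errors):
    """Return I^2 (as a percentage) for a set of study effects.

    Q is the inverse-variance weighted sum of squared deviations of each
    study's effect from the fixed-effect pooled estimate.
    """
    k = len(effects)
    if k < 2:
        raise ValueError("need at least two studies")
    w = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects))
    if q <= 0:
        return 0.0
    return max(0.0, (q - (k - 1)) / q) * 100.0

# Hypothetical log odds ratios: one inconsistent set, one perfectly consistent set.
inconsistent = i_squared([-0.9, 0.1, -0.4], [0.2, 0.2, 0.2])
consistent = i_squared([-0.4, -0.4, -0.4], [0.2, 0.3, 0.25])
print(f"I^2 = {inconsistent:.0f}% vs {consistent:.0f}%")
```

In this sketch the first set of studies lands well above the 50% threshold discussed above, while the second set shows no heterogeneity at all.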
Conclusion   
A forest plot is a graphical display of results from a number of studies addressing the same question. It is called a forest plot [Figure 2] because it represents a forest of lines. It was developed as a means of graphically representing a meta-analysis. Forest plots are commonly presented with two columns. The left-hand column lists the names of the studies. The right-hand column is a plot of the measure of effect (e.g., RR) for each of these studies, represented by a square, with its CI represented by a horizontal line. The overall measure of effect is often marked by a dashed vertical line and is plotted as a diamond, the lateral points of which indicate the CI for this estimate. A vertical line representing no effect is also plotted, and if the points of the diamond overlap the line of no effect, the overall result cannot be said to differ from no effect at the given level of confidence.
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
References   
1.  Lewis S, Clarke M. Forest plots: Trying to see the wood and the trees. BMJ 2001;322:1479-80. 
2.  Available from: https://www.students4bestevidence.net/wpcontent/uploads/2016/06/Howtoreadaforestplot2.jpg. [Last accessed on 2019 Jul 08]. 
3.  Borenstein M, Hedges LV, Higgins JP, Rothstein HR. Introduction to Meta-Analysis. United Kingdom: John Wiley & Sons; 2011. 
4.  
5.  Veroniki AA, Jackson D, Viechtbauer W, Bender R, Bowden J, Knapp G. Methods to estimate the between-study variance and its uncertainty in meta-analysis. Res Synth Methods 2016;7:55-79. 
6.  Higgins JP, Thompson SG. Quantifying heterogeneity in a meta-analysis. Stat Med 2002;21:1539-58. 
[Figure 1], [Figure 2]
[Table 1]
