

CURRICULUM IN CARDIOLOGY - STATISTICAL PAGES

Year: 2017  Volume: 3  Issue: 1  Page: 36-38

Decoding the Bland–Altman plot: Basic review
Aakshi Kalra
Research Fellow, FIND (International Diagnostic Organization)
Date of Web Publication: 17-Jul-2017
Correspondence Address: Aakshi Kalra Research Fellow, FIND (International Diagnostic Organization)
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/jpcs.jpcs_11_17
The Bland–Altman plot is a method for comparing two measurements of the same variable. The concept is that the X-axis is the mean of the two measurements and the Y-axis is the difference between them. The chart can then highlight anomalies; for example, if one method always gives too high a result, all points lie above or below the zero line. It can also reveal that one method overestimates high values and underestimates low values. If the points on the Bland–Altman plot are scattered above and below zero, it suggests that there is no consistent bias of one approach versus the other. It is, therefore, a good first step when comparing two measurement techniques for a variable.
Keywords: Bland–Altman plot, line of agreement, two measurements
How to cite this article: Kalra A. Decoding the Bland–Altman plot: Basic review. J Pract Cardiovasc Sci 2017;3:36-8
Introduction   
In the current era of research, keeping pace with new methods of measurement has become indispensable. Researchers in the medical field often need to compare two methods of measurement: a new method may have to be compared with an existing one, two instruments may not be aligned with each other so that a tool is needed to measure and appraise their differences, or a new method may have to be checked against a gold-standard test. One should keep in mind that whenever a variable is measured with an instrument, some degree of error is implied, and no instrument can be expected to be 100% accurate. Even so, there is a need to ensure agreement between the two methods, whether a new and an existing one or two available tests.
The dilemma is how to assess agreement, and correlation is often confused with agreement. The main difference is that correlation describes the strength of the linear relationship between two variables, not the differences between them, which is what the limits of agreement describe. Nevertheless, two methods designed to measure the same variable should correlate well. Correlation is usually represented by the correlation coefficient (or “r”). The value of r varies from −1.0 to +1.0; the closer the coefficient is to +1.0, the stronger the positive linear relationship.^{[1]} For agreement, Bland and Altman introduced a plot to illustrate the agreement between two quantitative measurements. The following section describes the basics of the plot in detail.
Bland–Altman Plot   
The Bland–Altman plot is a graphical method that plots the difference between two measurements against their mean for each subject.^{[2],[3]} The analysis consists of studying the mean difference and constructing limits of agreement. The plot only defines the intervals of agreement; it does not say whether those limits are acceptable. Acceptable limits must be defined beforehand, based on clinical, biological, and other considerations.^{[4],[5]} Bland and Altman defined the limits of agreement with a simple formula based on the mean and the standard deviation (SD) of the differences between the two measurements.
Interpretation   
The graph is plotted on XY axes, where the X-axis represents the mean of the two measurements and the Y-axis shows the difference between the two measurements. The differences can also be plotted as percentages or ratios. It is recommended that 95% of the data points should lie within ±1.96 SD of the mean difference; these bounds are the limits of agreement.^{[6],[7]}
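The limits of agreement described above can be sketched in a few lines of Python. This is a minimal illustration and not the procedure used in the article: the function name `limits_of_agreement`, the choice of difference as A minus B, and the five pairs of weights are assumptions made for the example, and the values are not taken from Table 1.

```python
from statistics import mean, stdev

def limits_of_agreement(a, b, z=1.96):
    """Return (bias, lower, upper) for paired measurements a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]  # per-subject difference, A minus B
    bias = mean(diffs)                     # mean difference
    sd = stdev(diffs)                      # sample SD of the differences
    return bias, bias - z * sd, bias + z * sd

# Hypothetical weights in kg (illustrative only, not the article's data).
method_a = [50.0, 52.0, 61.0, 58.0, 47.0]
method_b = [51.0, 54.0, 60.0, 60.0, 49.0]

bias, lower, upper = limits_of_agreement(method_a, method_b)
print(f"bias = {bias:.2f} kg, limits of agreement = ({lower:.2f}, {upper:.2f}) kg")
```

The sample SD (n − 1 in the denominator) is used here, which is the usual choice when the limits are estimated from a sample; the limits are always symmetric around the bias.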
Learning with an example
Consider a hypothetical situation in a laboratory where a test is being done on twenty adolescents. The prerequisite is to measure the weight of all the adolescents in kilograms, which is a critical parameter for the final conclusions from the test. Two methods are available, A and B (the results from both methods are presented in [Table 1]).
Table 1: Data of twenty adolescents: weight measured through two methods, A and B
The Pearson correlation coefficient (r) between the two methods is 0.95 with P < 0.001. This suggests that the correlation is significant and that there is a positive relationship between methods A and B. This establishes correlation between the tests but does not necessarily depict agreement. There would be perfect agreement only if the points lay exactly along the line of equality. A change in the scale of measurement does not affect the correlation, but it does affect the agreement.
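The point about scale can be illustrated with a short Python sketch. It assumes a made-up case in which the second "method" simply reports the same weights in pounds instead of kilograms; the helper `pearson_r` and the data are hypothetical and serve only to show that correlation stays perfect while the mean difference is far from zero.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical weights; the second list is the first one rescaled to pounds.
weights_kg = [45.0, 50.0, 55.0, 60.0, 65.0]
weights_lb = [w * 2.20462 for w in weights_kg]

r = pearson_r(weights_kg, weights_lb)
mean_diff = mean(a - b for a, b in zip(weights_kg, weights_lb))

print(f"r = {r:.3f}")                  # correlation is unaffected by the rescaling
print(f"mean difference = {mean_diff:.1f}")  # agreement is clearly lost
```

A correlation coefficient alone would therefore rate these two sets of readings as ideal, even though no pair of values agrees.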
For calculating agreement:
X-axis: the mean of the two measurements, and
Y-axis: the difference between the two measurements
In essence, we estimate the difference between the two methods and plot it against the mean of the two, the mean being the best available estimate of the “true value.” As mentioned in the explanation of the plot, limits of agreement are then defined.
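The axis definitions above translate directly into one (mean, difference) pair per subject. The Python sketch below assumes that the difference is taken as A minus B; the function name `bland_altman_points` and the three pairs of weights are hypothetical.

```python
def bland_altman_points(a, b):
    """Return one (mean, difference) pair per subject; difference is A minus B."""
    if len(a) != len(b):
        raise ValueError("both methods must have one value per subject")
    return [((x + y) / 2, x - y) for x, y in zip(a, b)]

# Hypothetical weights in kg for three subjects (illustrative only).
method_a = [50.0, 52.0, 61.0]
method_b = [51.0, 54.0, 60.0]

points = bland_altman_points(method_a, method_b)
for x_mean, y_diff in points:
    print(f"X (mean) = {x_mean:.1f} kg, Y (difference) = {y_diff:+.1f} kg")
```

These pairs are what a scatter plot would display, with the first element on the X-axis and the second on the Y-axis.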
From our example [Table 2], the average of the differences is −1.3 units. Because the mean difference is nonzero and negative (taking each difference as A minus B), the data suggest that, on average, the second method (B) measures 1.3 units more than the first.
The three lines in [Figure 1] represent the mean of the differences, called the bias, and the two limits of agreement, mean +1.96 SD and mean −1.96 SD. In this example, many points lie outside the limits. To interpret the results, it is important to decide a priori what level of error would be acceptable to the researcher. In this example, more than 50% of the values lie outside the limits, which indicates that there is no agreement between the tests. As a general rule, the biological or clinical goals define whether the agreement interval is too wide or sufficiently narrow for a given purpose.
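Both checks mentioned above, the share of points outside a pair of limits and the comparison of the limits with an a priori acceptable error, can be sketched as follows. The function names, the differences, and the thresholds of 2 kg and 4 kg are hypothetical; in practice the acceptable difference has to come from clinical or biological reasoning.

```python
from statistics import mean, stdev

def fraction_outside(diffs, lower, upper):
    """Share of differences that fall below lower or above upper."""
    outside = [d for d in diffs if d < lower or d > upper]
    return len(outside) / len(diffs)

def agreement_acceptable(lower, upper, max_diff):
    """True if both limits of agreement lie within +/- max_diff."""
    return -max_diff <= lower and upper <= max_diff

# Hypothetical differences (A minus B) in kg, not the article's Table 2.
diffs = [-1.0, -2.0, 1.0, -2.0, -2.0, 0.5, -3.0, -1.5]
bias = mean(diffs)
sd = stdev(diffs)
lower, upper = bias - 1.96 * sd, bias + 1.96 * sd

print(f"limits of agreement: ({lower:.2f}, {upper:.2f}) kg")
print("share outside limits:", fraction_outside(diffs, lower, upper))
print("acceptable at +/- 2 kg:", agreement_acceptable(lower, upper, max_diff=2.0))
print("acceptable at +/- 4 kg:", agreement_acceptable(lower, upper, max_diff=4.0))
```

The same limits can pass or fail depending on the threshold chosen, which is why the acceptable error needs to be fixed before looking at the data.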
This analysis can easily be done in Excel, SPSS (SPSS for Windows, Version 16.0. Chicago, USA, SPSS Inc.), and various online calculators. The present exercise was done in Excel using “data analysis” under the “data” tab. If “data analysis” is not there by default, it can be added from the add-in option through the “file” tab. Under the “data analysis” pack, regression should be selected, as it gives not only r but also the P value [Figure 2].
Figure 2: Output in Excel after following the steps mentioned in the text. Highlighted portion depicts r (yellow) and P (gray).
After this, as shown in [Table 2], calculate the mean and difference columns and, using the “insert” tab, add a scatter plot chart. The “select data” option enables adding the lines of agreement to the plot. The following screen appears once the required data points have been selected [Figure 3].
Conclusion   
The Bland–Altman plot is a useful graphical representation of the agreement between two tests or measurement tools. Its interpretation depends on the predetermined conditions and requirements.
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
References   
1.  Taylor R. Interpretation of the correlation coefficient: A basic review. J Diagn Med Sonogr 1990;6:35-9.
2.  Altman D, Bland J. Measurement in medicine: The analysis of method comparison studies. The Statistician 1983;32:307-17.
3.  Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Methods Med Res 1999;8:135-60.
4.  Giavarina D. Understanding Bland Altman analysis. Biochem Med (Zagreb) 2015;25:141-51.
5.  Dewitte K, Fierens C, Stöckl D, Thienpont LM. Application of the Bland-Altman plot for interpretation of method-comparison studies: A critical investigation of its practice. Clin Chem 2002;48:799-801.
6.  Sedgwick P. Limits of agreement (Bland-Altman method). BMJ 2013;346:f1630.
7.  Earthman CP. Body composition tools for assessment of adult malnutrition at the bedside: A tutorial on research considerations and clinical applications. JPEN J Parenter Enteral Nutr 2015;39:787-822.



