When examining binary outcomes, the diagonal elements of a contingency table give the frequency of agreement. Cohen's kappa is a measure of agreement that is considered more robust than simple percentage agreement, because it takes into account the possibility of agreement occurring by chance. It is given by: kappa = (Po - Pe) / (1 - Pe), where Po is the observed proportion of agreement and Pe is the proportion of agreement expected by chance.
Cohen's kappa can also be used when the same rater evaluates the same patients at two time points (e.g. 2 weeks apart) or, in the example above, re-evaluates the same answer sheets after 2 weeks. Its limitations are: (i) it does not take into account the magnitude of the differences, so it is unsuitable for ordinal data, (ii) it cannot be used if there are more than two raters, and (iii) it does not distinguish between agreement on positive and on negative results, which can be important in clinical situations (for example, misdiagnosing a disease and falsely excluding it can have different consequences).
For ordinal data with more than two categories, it is useful to know whether the ratings of the different raters differ only slightly or by a substantial amount. For example, microbiologists may grade bacterial growth on culture plates as none, occasional, moderate or confluent. In this case, a plate rated "occasional" by one rater and "moderate" by the other represents a lower degree of disagreement than a plate rated "no growth" by one and "confluent" by the other. The weighted kappa statistic takes this difference into account. It gives a higher value when the raters' responses correspond more closely, with the maximum value for perfect agreement; conversely, a larger difference between two ratings gives a lower value of weighted kappa. The scheme used to assign weights to the differences between categories (linear, quadratic) may vary.
A common question in clinical research is whether a new measurement method agrees with an established method.
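As a rough illustration of the unweighted and weighted kappa statistics described above, the following Python sketch computes both from two raters' labels. The function name `cohen_kappa`, its parameters, and the plate ratings are hypothetical and chosen only for illustration; the weights are assumed to be the usual linear or quadratic penalties on the distance between ordered categories.

```python
def cohen_kappa(ratings_a, ratings_b, categories, weights=None):
    """Cohen's kappa for two raters.

    categories: all possible labels, in order (order matters when weighting).
    weights: None (unweighted), "linear" or "quadratic".
    """
    if weights not in (None, "linear", "quadratic"):
        raise ValueError("weights must be None, 'linear' or 'quadratic'")
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of items")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")

    index = {category: i for i, category in enumerate(categories)}
    n = len(ratings_a)

    # Contingency table: rows are rater A, columns are rater B.
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        counts[index[a]][index[b]] += 1
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(counts[i][j] for i in range(k)) for j in range(k)]

    def penalty(i, j):
        # Disagreement penalty: 0 on the diagonal, growing with distance.
        if weights is None:
            return 0.0 if i == j else 1.0
        distance = abs(i - j) / (k - 1)
        return distance if weights == "linear" else distance ** 2

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(penalty(i, j) * counts[i][j] for i, j in cells) / n
    expected = sum(
        penalty(i, j) * row_totals[i] * col_totals[j] for i, j in cells
    ) / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - observed / expected


# Hypothetical ratings of six culture plates by two raters.
GROWTH = ["none", "occasional", "moderate", "confluent"]
rater_1 = ["none", "occasional", "moderate", "confluent", "none", "moderate"]
rater_2 = ["none", "moderate", "moderate", "confluent", "occasional", "confluent"]

kappa_unweighted = cohen_kappa(rater_1, rater_2, GROWTH)
kappa_linear = cohen_kappa(rater_1, rater_2, GROWTH, weights="linear")
kappa_quadratic = cohen_kappa(rater_1, rater_2, GROWTH, weights="quadratic")
```

With `weights=None` the calculation reduces to (Po - Pe) / (1 - Pe). In this made-up data set every disagreement is between adjacent categories, so the weighted versions penalise them less than the unweighted statistic does.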
As a statistical consultant at PHASTAR, I see an increase in studies comparing a new diagnostic tool based on artificial intelligence or machine learning with an existing tool or with a clinician. The methodology for analyzing binary data is well established, but the methodology for continuous outcomes is less developed.
Here we will review the current methodology and highlight some of the common pitfalls. It should be noted that agreement analysis does not establish the accuracy of the measurement methods; it only shows the extent to which different measurement techniques agree. To properly assess a new measurement method, it is also necessary to consider quantities relating to the validity of the measurements, such as sensitivity, specificity, and positive and negative predictive values.
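The validity measures mentioned above can be computed from the four cells of a 2x2 table that compares the new method with a reference standard. The sketch below is a minimal illustration under that assumption; the function name `validity_measures` and the counts are hypothetical and not taken from any study.

```python
def validity_measures(tp, fp, fn, tn):
    """Validity measures from a 2x2 table against a reference standard.

    tp: new method positive, reference positive
    fp: new method positive, reference negative
    fn: new method negative, reference positive
    tn: new method negative, reference negative
    """
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0 or tn + fn == 0:
        raise ValueError("every row and column of the table must be non-empty")
    return {
        "sensitivity": tp / (tp + fn),  # reference positives that are detected
        "specificity": tn / (tn + fp),  # reference negatives that are cleared
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts for illustration only.
measures = validity_measures(tp=90, fp=30, fn=10, tn=870)
```

The predictive values depend on how common the condition is in the sample, so values estimated in one population may not carry over to another with a different prevalence.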