One of the more interesting phenomena in medicine is the pattern by which new tests, or new uses for old tests, are introduced. In most cases the initial reports are highly enthusiastic. Also in most cases there is eventual follow-up by other investigators who either cannot reproduce the initial good results or who uncover substantial drawbacks to the test. In some cases the problem is that there may be no way to provide an unequivocal standard against which test accuracy can be measured. An example is acute myocardial infarction, because there is no conclusive method to definitively separate severe myocardial ischemia from early infarction (i.e., severe reversible change from irreversible change). Another example is acute pancreatitis. In other cases the initial investigators may use analytical methods (e.g., “homemade” reagents) that are not identical to those of subsequent users. Other possible sources of variance include different populations tested, different conditions under which testing is carried out, and effects of medication. Historical perspective thus suggests that initial, highly enthusiastic claims about laboratory tests should be received with caution.

Many readers of medical articles do not pay much attention to the technical sections, where the materials and methods are outlined, where the selection and acquisition of subjects or patient specimens are described, and where the actual experimental data are presented. Unfortunately, the conclusions (both in the article and in the abstract) rather frequently are not supported by, or at times are not even compatible with, the actual data (due to insufficient numbers of subjects, conflicting results, or, most often, magnification of the significance of relatively small differences or trends). This often makes a test appear to give clear-cut differentiation, whereas in reality there is substantial overlap between the two groups and the test cannot reliably classify individual patients into either group. Another pitfall in medical reports is obtaining test sensitivity by comparing the test being evaluated with some other procedure or test. While there usually is no other way to obtain this information, the reader must be aware that the gold standard against which the new test is being compared may itself not be 100% sensitive. It is rare for a report to state the actual sensitivity of the gold standard being used; even when it does, one may find that several evaluations of the gold standard test have been done without all evaluations being equally favorable. Therefore, one may find that a new test claimed to be 95% sensitive is really only 76% sensitive, because the gold standard test against which the new test is being compared is itself only 80% sensitive.
One should be especially wary when the gold standard is identified only as “a standard test” or “another (same method) test.” In addition, even if the gold standard is claimed to be 100% sensitive, this is unlikely to be true in practice: some patients with subclinical or atypical illness would never be tested by the gold standard at all, and others could be missed because of interference by medications, various technical problems, or the way the gold standard’s reference range was established (discussed previously).
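The 95%-versus-76% arithmetic above can be made explicit. The sketch below uses a deliberately simplified model (an assumption, not stated in the text): the gold standard detects a fraction of all true cases, the new test's reported sensitivity is measured only among gold-standard-positive patients, and the new test detects none of the cases the gold standard misses. Under those assumptions the true sensitivity is simply the product of the two figures.

```python
def true_sensitivity(apparent_sensitivity: float,
                     gold_standard_sensitivity: float) -> float:
    """Estimate a new test's true sensitivity from its reported (apparent)
    sensitivity, given an imperfect gold standard.

    Simplifying assumptions (hypothetical model, for illustration only):
    - apparent_sensitivity was measured only in gold-standard-positive cases
    - the new test detects none of the true cases the gold standard missed
    """
    # Fraction of ALL true cases detected = (fraction found by gold standard)
    # x (fraction of those also found by the new test)
    return apparent_sensitivity * gold_standard_sensitivity


# Example from the text: reported 95% sensitivity against an 80%-sensitive
# gold standard corresponds to a true sensitivity of about 76%.
estimate = true_sensitivity(0.95, 0.80)
print(f"True sensitivity: {estimate:.0%}")  # prints "True sensitivity: 76%"
```

If the new test happened to detect some of the cases missed by the gold standard, its true sensitivity would be somewhat higher than this product; the model above is therefore a lower-bound sketch, not a general formula.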