
Monday, June 4, 2007

PSA Screening and Early Detection - Part 5. More Diagnostic Testing Concepts

PSA Screening and Early Detection. Part 1 - Guides
PSA Screening and Early Detection. Part 2 - Key Points on PSA
PSA Screening and Early Detection. Part 3. Current Environment
PSA Screening and Early Detection - Part 4. Diagnostic Testing Concepts [previous]
PSA Screening and Early Detection - Part 5. More Diagnostic Testing Concepts [current]

[Updated May 17, 2010]

In the following we give a number of actual examples of press releases and research results about PSA and related testing and show how to analyze these sources in terms of the diagnostic testing concepts that were discussed in the last part of this series. The examples illustrate how to gain insight into these articles and even uncover errors. In keeping with the focus of this blog the examples all involve prostate cancer, but the material can also be useful for understanding diagnostic testing for other diseases.

First we review the material from the last part, covering it from additional perspectives. Some of the discussion could, in principle, involve algebra (but nothing beyond elementary high school level); however, we will avoid even that by making use of the free symbolic algebra program, Mathomatic, which can perform all the calculations with hardly any knowledge of mathematics. It is free and available on Windows, Mac and Linux. It can also be accessed online, i.e. over the internet without installing it on your computer, by running the following line from the Windows or the UNIX command line:

telnet 63011


The prevalence of prostate cancer is the proportion of men in a population that have the disease at a point in time.

The prevalence is also your probability of having prostate cancer before you know your test results. Because of this, the prevalence is sometimes referred to as the prior probability.

Based on data derived from a meta study in Vollmer, 2006 [PMID: 16613336] the prevalence of prostate cancer among men subject to test is 0.9595% among men aged 55, 5.015% among men aged 65 and 11.947% among men aged 75. We shall use 5% as an overall prevalence when we wish to use a single figure regardless of age.

Sensitivity/Specificity and PPV/NPV

We can divide the subjects under test into two groups in two different ways. We can either divide them into the diseased vs. healthy populations or we can divide them into the groups having a positive test (i.e. indicative of cancer) and negative test (i.e. indicative of no cancer). We consider each of these divisions in turn:

1. Diseased vs. Healthy

The first way to divide the population into two is to divide subjects into the diseased and healthy populations. The fraction of correct tests in each of these populations is called the sensitivity and specificity, respectively. For example, in the PCPT trial discussed in the last part of this series out of every 5 diseased subjects 2 had positive tests (assuming a cutoff of 2.5) so the sensitivity is 2/5 = 40%. Also from the PCPT out of 95 healthy men we had 77 negative tests which is a specificity of 77/95 = 81%.

Tests with high sensitivity are good at detecting that disease is present.

Tests with high specificity are good at avoiding needless treatment on healthy individuals. In the case of prostate cancer if one gets a positive PSA test one will proceed to have a biopsy so the higher the specificity of the PSA test the less likely a healthy individual will needlessly get a biopsy.

The sensitivity and specificity depend on the cutoff but not on the prevalence. They are measures of the test but not of the population under test (all other factors being the same). Thus if we wish to decide which of several alternative tests to undergo we would want one with the higher sensitivity and higher specificity.

2. Those Testing Positive vs. Those Testing Negative

The second way of dividing the population into two is to divide it into those who test positive and those who test negative. (A positive test is indicative of disease and a negative test is indicative of being healthy. This is different from actually having or not having cancer because tests are not perfect. Having a positive test result does not necessarily mean you have cancer and having a negative test does not necessarily mean you do not have cancer.) The fraction of correct tests in each of these populations is called the positive predictive value (PPV) and negative predictive value (NPV). That is, the PPV is the number of men with cancer and positive tests as a fraction of those with positive tests. The NPV is the number of men who do not have cancer and have negative tests as a fraction of all men who have negative tests.

In the PCPT trial discussed in the last part of this series out of every 20 positive tests 2 actually had cancer which yields a PPV of 2/20 = 10%. Out of every 80 negative tests 77 did not have cancer which gives an NPV of 77/80 = 96%.

Like sensitivity and specificity, the PPV and NPV depend on the cutoff used for the test but unlike sensitivity and specificity they also depend on the prevalence of disease in the population. Since PPV and NPV depend on the prevalence they are not pure measures of the test. Their values will change by applying the test to a different population. In particular, if the prevalence is higher the PPV will be higher and the NPV will be lower.

The PPV and NPV answer the questions: if I have a positive test what is my chance of having cancer (PPV) and if I have a negative test what is my chance of being free of cancer (NPV)? Thus the PPV and 1-NPV are the prevalences of cancer in the positive and negative test groups respectively. As a result they are sometimes referred to as the posterior probabilities in contrast to the prevalence in the entire population which is called the prior probability. Thus before you are tested your probability of cancer equals the prevalence and after you are tested your probability of cancer is PPV or 1-NPV depending on whether your test was positive or negative, respectively.

Stated in terms of prevalence, the prevalence is the fraction of cancer in the entire population while the PPV is the prevalence of cancer in the subgroup that tested positive (and 1-NPV is the prevalence of cancer in the subgroup that tested negative).

Since anyone with a positive test is sent for a biopsy PPV is also the proportion of biopsies that were really needed while 1-PPV is the proportion of biopsies that were done needlessly (since the subjects tested positive but did not have cancer).

Since PPV rises as prevalence rises and since men with a family history of prostate cancer have a higher prevalence they would have a higher PPV. In fact [PMID: 16515992] found that the PPV in a group of men with prostate cancer family history was 32.2% vs. 23.6% in a group with no family history.


We use prevalence to decide whether to get tested, sensitivity and specificity to decide which test to undergo and we use PPV and NPV to assess our likelihood of being diseased after we get back our test results. We deal with each of these cases:

  1. Prevalence is used to decide whether to get tested in the first place. If the prevalence is very low, i.e. the disease is rare, then you probably don't need to get tested. Prostate cancer is sufficiently prevalent past a certain age that you probably want to get tested (at least according to nearly every patient and physician group but not according to government groups as discussed in Part 3 of this series). If you are in a high risk group (family member with prostate cancer, African American, carrier of BRCA1 and BRCA2 genes) then it would be even more important as the prevalence among your group is even higher. The idea of screening high risk groups is called targeted screening and is discussed in this Nov 2006 paper by Mitra et al.

  2. Once you have decided to get tested the sensitivity and specificity can be used to determine which test to take. You want the test with the highest sensitivity and specificity. You want high sensitivity so that it detects disease if it's there and you want high specificity so you are not subject to a biopsy if you don't have disease. Of course, since PSA testing is the only widely available test for screening you do not really have a choice here but in the future there may be new tests and in that case if one of the future tests has higher sensitivity and higher specificity than the PSA test then you would want that one.

  3. Once you are tested, if you test positive the PPV is your probability of having cancer and if you test negative the NPV is your probability of not having cancer.

Data on PSA Testing

The following table summarizes the 2x2 table for each of the three data sources in the last part of this series together with the sensitivity, specificity, PPV, NPV and prevalence derived from each 2x2 table. The cutoff for the PCPT data was 2.5 and is unknown for the other two data sets.

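Data source: PCPT
           PC   no PC   Total
PSA +ve     2      18      20
PSA -ve     3      77      80
Total       5      95     100
Sensitivity 2/5 = 40.0%, Specificity 77/95 = 81.1%, PPV 2/20 = 10.0%, NPV 77/80 = 96.3%, Prevalence 5/100 = 5%

Data source: Ontario
           PC   no PC   Total
PSA +ve     3       5       8
PSA -ve     2      90      92
Total       5      95     100
Sensitivity 60.0%, Specificity 94.7%, PPV 37.5%, NPV 97.8%, Prevalence 5/100 = 5%

Data source: AAFP
           PC   no PC   Total
PSA +ve     3       7      10
PSA -ve     1      89      90
Total       4      96     100
Sensitivity 75.0%, Specificity 93.0%, PPV 30.0%, NPV 98.9%, Prevalence 4%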

A 2009 meta analysis of PSA testing [PMID: 1974436] summarized various studies on PSA accuracy with the following table:

Study Year No TP FP FN TN Sensitivity Specificity
1 Aragona 2005 3171 1073 1695 98 305 0.92 0.15
2 Beneduce 2007 101 42 31 8 20 0.84 0.39
3 Ciatto 2004 410 167 171 18 54 0.90 0.24
4 Espana 1998 170 53 96 15 6 0.78 0.06
5 Fischer 2005 178 61 76 13 28 0.82 0.27
6 Hofer 2000 184 67 81 7 33 0.91 0.29
7 McArdle 2004 171 93 52 10 16 0.90 0.24
8 Ryden 2007 361 180 146 8 27 0.96 0.16
9 Unal 2000 59 30 10 0 19 1.00 0.66
10 Wymenga 2000 716 253 228 68 15 0.79 0.06

In terms of our previous 2x2 tables, for each study the TP (true positive) and FP (false positive) entries form the first row of the 2x2 table for that study and the FN (false negative) and TN (true negative) form the bottom row. In this notation Sensitivity = TP / (TP + FN) and Specificity = TN /(TN + FP) so that for example in the first row we have Sensitivity = TP / (TP + FN) = 1073 / (1073 + 98) = 0.92 and Specificity = TN /(TN + FP) = 305 / ( 305 + 1695 ) = 0.15 .
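As a rough check, the sensitivity and specificity columns can be recomputed from the TP, FP, FN and TN counts. The following is a minimal Python sketch; the choice of Python and the function names are ours for illustration and are not part of the meta analysis. It uses rows 1, 2 and 9 of the table above.

```python
def sensitivity(tp, fn):
    """Fraction of diseased subjects who test positive: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of healthy subjects who test negative: TN / (TN + FP)."""
    return tn / (tn + fp)


# (study, TP, FP, FN, TN) copied from rows 1, 2 and 9 of the table above.
rows = [
    ("Aragona", 1073, 1695, 98, 305),
    ("Beneduce", 42, 31, 8, 20),
    ("Unal", 30, 10, 0, 19),
]

for study, tp, fp, fn, tn in rows:
    print(study, round(sensitivity(tp, fn), 2), round(specificity(tn, fp), 2))
# Aragona 0.92 0.15
# Beneduce 0.84 0.39
# Unal 1.0 0.66
```

The printed values agree with the Sensitivity and Specificity columns for those three studies.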


Because the sensitivity and specificity are pure measures of the test whereas PPV and NPV partly measure the test and partly measure the population, the sensitivity and specificity are the two numbers that are typically shown for a diagnostic test. Once we are tested we will be interested in the PPV if we test positive and NPV if we test negative. To get PPV and NPV given the sensitivity, specificity and prevalence use these formulas (where p = prevalence, Sn = Sensitivity, Sp = Specificity, PPV = positive predictive value, NPV = negative predictive value).

PPV = p * Sn / ((p * Sn) + ((1-p) * (1-Sp)))
NPV = (1-p) * Sp / ((1-p) * Sp + p * (1-Sn))

We will mainly use the first of these two formulas. That formula relates PPV to Sn, Sp and p. By solving the formula for p, Sn or Sp we can get a new formula which gives p, Sn or Sp in terms of the other three variables. When we need to do that we will use the free Mathomatic software to calculate the solution in order to avoid explicit algebraic manipulation. Similar comments apply to the NPV formula.
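For readers who prefer a general purpose language to Mathomatic, the two formulas can also be written as small Python functions. This is only a sketch: the function and variable names are our own, and the inputs are the PCPT figures from the table earlier in this post.

```python
def ppv(p, sn, sp):
    """Positive predictive value from prevalence p, sensitivity sn, specificity sp."""
    return p * sn / (p * sn + (1 - p) * (1 - sp))


def npv(p, sn, sp):
    """Negative predictive value from prevalence p, sensitivity sn, specificity sp."""
    return (1 - p) * sp / ((1 - p) * sp + p * (1 - sn))


# PCPT figures: prevalence 5/100, sensitivity 2/5, specificity 77/95.
p, sn, sp = 5 / 100, 2 / 5, 77 / 95

print(round(ppv(p, sn, sp), 4))  # 0.1, the same as 2/20
print(round(npv(p, sn, sp), 4))  # 0.9625, the same as 77/80
```

Both results match the PPV and NPV read directly off the PCPT 2x2 table, which is a useful sanity check on the formulas.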


Example 1. PPV of the EPCA-2 Prostate Cancer Test

According to this June 26, 2007 ABC News article the EPCA-2 "test has been shown to correctly identify which male patients did not have cancer 97 percent of the time, and which men did have prostate cancer 94 percent of the time.

By comparison, figures from the National Cancer Institute and others have shown the PSA test has specificity rate as low as 15 to 30 percent - meaning that on average, for every four or six men who test positive, only one actually has prostate cancer.

'For men with elevated PSA levels, only one in six who get biopsies today actually have prostate cancer,' he explained. 'This amounts to 1.3 million to 1.6 million men being biopsied to find the 230,000 or so who actually have prostate cancer.'"

The first paragraph says that the sensitivity of the EPCA-2 test is 94% and the specificity is 97%.

The second paragraph illustrates a confusion on the writer's part. He refers to the term specificity but then defines it as the percentage of positive tests for which the subject has cancer. That is the definition of PPV, not specificity.

It's already clear that the EPCA-2 test is superior to the ordinary PSA test (if the numbers in the article are correct) since EPCA-2 has both sensitivity and specificity in the 90's; however, since the article intends to compare the PPV of the EPCA-2 with the PSA test, let us make such a comparison ourselves.


We have:

p = 5% (assumed)
Sn = 94%
Sp = 97%

PPV = p * Sn / (p * Sn + (1-p) * (1-Sp))
= 0.05 * 0.94 / (0.05 * 0.94 + (1-0.05) * (1-0.97))
= 62.25%

Note that this PPV of 62.25% for the EPCA-2 is the one that should be compared to the PSA PPV. The PSA PPVs from the PCPT trial, the Ontario data and the AAFP data were 10%, 37.5% and 30%. Thus, while EPCA-2 still shows a significant advantage over the PSA test, the article misrepresents that advantage by wrongly comparing the 15 - 30% number to the 94% or 97% numbers.

Also aside from any data and calculation issues there is the issue that the study is not independent. See [comment] and [PMID: 11829700].

(Aside: The EPCA-2 may also be usable to determine how aggressive the cancer is as well as whether there is cancer or not. See 2007 ASCO presentation by Dr. Getzenberg. Also according to the Schlotz article starting on page 8 of [link] there are indications that EPCA-2 may be able to distinguish between organ confined and non-organ confined disease.)

Example 2. Miraculins has a press release which says that 70% of biopsies done as a result of a positive PSA test are needless -- the patient never had prostate cancer. That corresponds to a PPV of 30%. Assuming the sensitivity and specificity for the PSA test are as in each of the 3 data sets (PCPT, Ontario, AAFP) in our table above what is the prevalence?


We have three of the 4 variables in the PPV equation above, i.e. we have PPV, Sn and Sp, so we can solve for the remaining one, i.e. p. An easy way to do that is to use the free Mathomatic software which is available for Windows, Mac and Linux. Download it, unzip it and double click it to start it up, or just run it online via telnet as described near the beginning of this post. Then copy the following three lines to the clipboard and paste them into the running instance of Mathomatic. The first line is the PPV equation we showed previously, the second line tells it to solve for p and the third line tells it to calculate the result. After we enter calculate it will prompt us for the values of the other variables and then display the value of p implied by them:

PPV = p * Sn / ((p * Sn) + ((1-p) * (1-Sp)))
p
calculate

The mathomatic session looks like this. Lines that begin with 1-> or the word Enter were entered by us and all other lines were output by Mathomatic.

1-> PPV = p * Sn / ((p * Sn) + ((1-p) * (1-Sp)))

                       p*Sn
#1: PPV = -----------------------------
          ((p*Sn) + ((1 - p)*(1 - Sp)))

1-> p

                PPV*(1 - Sp)
#1: p = --------------------------
        (Sn - (PPV*(Sn + Sp - 1)))

1-> calculate ; PCPT data
Enter PPV: .3
Enter Sn: .4
Enter Sp: .811
p = 0.168399168399
1-> calculate ; Ontario data
Enter PPV: .3
Enter Sn: .6
Enter Sp: .947
p = 0.036476256022
1-> calculate ; AAFP data
Enter PPV: .3
Enter Sn: .75
Enter Sp: .93
p = 0.0384615384615

From the above we see that the implied prevalence is 17% based on the PCPT trial data or slightly less than 4% based on the other two data sets. 17% seems to be a bit high but the 4% seems within the ballpark we would expect so we accept the figures as reasonable.

Note that the formulas generated by Mathomatic can be used directly (as opposed to using Mathomatic's calculate command), so if, in the future, we have another situation where we wish to calculate prevalence from sensitivity, specificity and PPV we could use the formula above without access to Mathomatic at all. For example, in the case of the PCPT trial we could redo the calculation above like this:

p = PPV*(1 - Sp) / (Sn - PPV*(Sn + Sp - 1))
= .3 * (1 - .811) / (.4 - .3 * (.4 + .811 - 1))
= .168
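The same formula is easy to reuse in a short script. The sketch below is our own Python illustration (the function name prevalence_from_ppv is made up for this post); it repeats the three Mathomatic calculations from Example 2 with PPV = 0.3.

```python
def prevalence_from_ppv(ppv, sn, sp):
    """Prevalence implied by a given PPV, sensitivity sn and specificity sp."""
    return ppv * (1 - sp) / (sn - ppv * (sn + sp - 1))


# (sensitivity, specificity) for each PSA data set, as entered in Example 2.
data_sets = {
    "PCPT": (0.4, 0.811),
    "Ontario": (0.6, 0.947),
    "AAFP": (0.75, 0.93),
}

for name, (sn, sp) in data_sets.items():
    print(name, round(prevalence_from_ppv(0.3, sn, sp), 4))
# PCPT 0.1684
# Ontario 0.0365
# AAFP 0.0385
```

These agree with the values of p that Mathomatic displayed.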

Example 3. For this example we will calculate the specificity of the Miraculins test using a prevalence of 5%, the sensitivity given in the article and the PPV implied by the article. The article says that the fraction of all biopsies that are needless with the Miraculins test is 20% less than the needless biopsies based on the PSA test alone. Also the sensitivity of their test is 96%. Let us calculate the specificity of their test.

Answer. The fraction of all biopsies that are needless is 1-PPV and we know that this is 20% less than the 1-PPV of the PSA test. For the PSA test, the PCPT trial yielded a PPV of 10% so 1-PPV is 90% and if the Miraculins' 1-PPV is 20% less than that then the Miraculins' 1-PPV is (1 - 0.20) * 0.90 = 0.72 so the Miraculins' PPV is 1-0.72 = 0.28. Assuming a prevalence of p = 0.05 we have:

- PPV = 0.28
- p = 0.05
- Sn = 0.96

We have previously entered the PPV equation into Mathomatic so we need not enter it again. We can just enter:

Sp
calculate

Entering Sp causes Mathomatic to solve the equation already entered so as to express Sp in terms of the other variables. When we then enter calculate it prompts for the value of each of those variables and finally gives a specificity of 0.87 for the Miraculins test (based on our assumptions and the data in the news article). The actual session, including the output from Mathomatic, is shown below.

1-> Sp

              p*Sn*(1 - PPV)
#1: Sp = -1*(-------------- - 1)
              (PPV*(1 - p))

1-> calculate
Enter PPV: 0.28
Enter p: 0.05
Enter Sn: 0.96
Sp = 0.87007518797

Example 4. Sensitivity

Suppose that in the last example we did not know the sensitivity but did know values of the other three variables. Then we could solve for sensitivity as shown below.

1-> Sn

         (1 - Sp)*PPV*(1 - p)
#1: Sn = --------------------
            (p*(1 - PPV))

1-> calculate
Enter PPV: 0.28
Enter p: 0.05
Enter Sp: 0.87
Sn = 0.960555555556

We get a sensitivity of 0.96 which agrees with the prior example.
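The two solved formulas from Examples 3 and 4 can also be evaluated without Mathomatic. The following Python sketch (function names are ours, chosen only for illustration) reproduces both session results.

```python
def specificity_from_ppv(ppv, p, sn):
    """Specificity implied by PPV, prevalence p and sensitivity sn (Example 3)."""
    return 1 - p * sn * (1 - ppv) / (ppv * (1 - p))


def sensitivity_from_ppv(ppv, p, sp):
    """Sensitivity implied by PPV, prevalence p and specificity sp (Example 4)."""
    return (1 - sp) * ppv * (1 - p) / (p * (1 - ppv))


print(round(specificity_from_ppv(0.28, 0.05, 0.96), 4))  # 0.8701
print(round(sensitivity_from_ppv(0.28, 0.05, 0.87), 4))  # 0.9606
```

The small difference between 0.9606 and the original 0.96 comes only from rounding the specificity to 0.87 before feeding it back in.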

Example 5. Specificity of PSA Nanotest

This article on an experimental PSA Nanotest indicates that the test has a sensitivity of Sn = 100% and 1-PPV of 24% so PPV = 76%. Assuming the prevalence is p = 5% what is the specificity?


Using an algebraically equivalent rearrangement of the equation for specificity that we got from Mathomatic in Example 3, we have:

Sp = (p*((Sn*(1 - PPV)) + PPV) - PPV) / (PPV*(p - 1))
= (.05 * ((1 * (1-.76)) + .76) - .76) / (.76 * (.05 - 1))
= .983
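The specificity formula used here is written differently from the one Mathomatic printed in Example 3, but the two are algebraically equivalent. A quick numerical check, sketched in Python with function names of our own choosing, evaluates both forms on the inputs of Examples 3 and 5.

```python
import math


def sp_example3_form(ppv, p, sn):
    """Specificity in the form printed by Mathomatic in Example 3."""
    return -1 * (p * sn * (1 - ppv) / (ppv * (1 - p)) - 1)


def sp_example5_form(ppv, p, sn):
    """Specificity in the rearranged form used in Example 5."""
    return (p * (sn * (1 - ppv) + ppv) - ppv) / (ppv * (p - 1))


# (PPV, p, Sn) for the Nanotest (Example 5) and the Miraculins test (Example 3).
for ppv, p, sn in [(0.76, 0.05, 1.0), (0.28, 0.05, 0.96)]:
    a = sp_example3_form(ppv, p, sn)
    b = sp_example5_form(ppv, p, sn)
    print(round(a, 3), round(b, 3), math.isclose(a, b))
# 0.983 0.983 True
# 0.87 0.87 True
```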

Example 6. Color Doppler

In Kuligowska et al, 2001 the authors write: "Color Doppler US alone had a sensitivity of 43.2%, a specificity of 66.4%, a PPV of 40.8%, an NPV of 68.5%, and an accuracy of 58.3%." Let us determine what prevalence the paper is assuming to get this PPV.

In Example 2 we already computed the formula for prevalence using Mathomatic so instead of repeating it let us just write it down again from Example 2:

p = PPV*(1 - Sp) / (Sn - (PPV*(Sn + Sp - 1)))
= .408 * (1 - .664) / (.432 - .408 * (.432 + .664 - 1))
= 0.35

Assuming that the population under test is those who have a positive PSA test, the prevalence in that population would be the PPV of the PSA test. The implied prevalence of 35% is higher than the PSA PPV from the PCPT trial (10%) and the AAFP data (30%) but is consistent with the Ontario data (37.5%).

If we were to use the PPV of the PCPT trial as the prevalence then the PPV of color doppler would be much less than the 40% claimed above:

PPV = p * Sn / (p * Sn + (1-p) * (1-Sp))
= 0.10 * 0.432 / ((0.10 * 0.432 + (1 - 0.10) * (1 - 0.664)))
= 0.125

Thus based on the PPV of the PSA test from the PCPT data if you have a positive color doppler there is a 12.5% chance of having prostate cancer. Repeating the calculation with the PSA PPVs from the Ontario and AAFP data (37.5% and 30%) as the prevalence gives 43.5% and 35.5%. Thus if we accept the Ontario data the PPV reported for color doppler is about right but it seems high relative to the PCPT and AAFP data.

Interestingly the PPV of color doppler is higher than for PSA even though both the sensitivity and the specificity are lower than for PSA. That is because the color doppler is used on a population with higher prevalence of prostate cancer, namely the population of patients with positive PSA tests.

Example 7. Power Doppler

Power doppler is a type of color doppler that uses the amplitude of the echo rather than its frequency shift and is believed to be better at detecting cancer.

In a 1998 investigation reported in [PMID: 9586699] there were 23 patients with prostate cancer and 19 of them were successfully detected with power Doppler sonography. Also there were 17 who did not have prostate cancer and 4 of them had positive tests anyway. Thus we have this table:

           PC   no PC   Total
Test +ve   19       4      23
Test -ve    4      13      17
Total      23      17      40

The diagonal elements divided by their column totals give the sensitivity, 19/23 = 82.6%, and specificity, 13/17 = 76.5%, and the upper diagonal element divided by its row total gives the positive predictive value, PPV, which is 19/23 = 82.6% -- which in this case is coincidentally the same as the sensitivity. The prevalence is the ratio of the first column total to the grand total which is 23/40 = 57.5%. This is much higher than the prevalence of prostate cancer in the general population but such imaging would likely only be done on patients who already had some positive indication from a PSA test or DRE. At any rate the PPV, but not the sensitivity and specificity, depends on the prevalence.

A second 1998 power doppler investigation was described like this: [PMID: 9772875]

OBJECTIVE: To determine the role of transrectal power Doppler ultrasonography (PDU) in the diagnosis of prostate cancer. PATIENTS AND METHODS: Thirty-six patients (mean age 66.4 years, SD 7.7, range 59-82) with possible prostate cancer, suspected from an abnormal digital rectal examination or elevated prostate specific antigen level, underwent transrectal ultrasonography, transrectal PDU and biopsy. The vascularity on PDU was graded on a scale of 0-2, where grade 1-2 was considered positive and grade 0 negative. RESULTS: The vascularity was grade 2 in 11 patients, grade 1 in 11 and grade 0 in 14; 20 of the 36 (56%) patients had prostate cancer. Of the 22 patients positive on PDU, 18 had malignant disease and four benign; two of 20 patients with histopathologically confirmed malignancy had a normal PDU. The sensitivity of PDU was 90%, the specificity 75% and the positive predictive value 82%. CONCLUSION: Focal hypervascularity on PDU was associated with an increased likelihood of prostate cancer. Although ultrasonography alone cannot detect all cancers, even using PDU, the technique appears to increase the sensitivity and to help identify appropriate sites for biopsy.

The reader may wish to try filling out the table before looking:

           PC   no PC   Total
Test +ve   18       4      22
Test -ve    2      12      14
Total      20      16      36

and then calculate the sensitivity, specificity, PPV and prevalence (which should be 18/20 = 90%, 12/16 = 75%, 18/22 = 81.8% and 20/36 = 56%).

These numbers are reasonably close to the first investigation so their consistency seems favorable.
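To compare the two investigations side by side, the four statistics can be computed straight from the 2x2 counts. This is a small Python sketch; the summarize function and the dictionary keys are our own naming, and the counts are the ones given above for each study.

```python
def summarize(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and prevalence from 2x2 counts."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "prevalence": (tp + fn) / total,
    }


# (TP, FP, FN, TN) for the two 1998 power Doppler investigations.
studies = {
    "PMID 9586699": (19, 4, 4, 13),
    "PMID 9772875": (18, 4, 2, 12),
}

for name, counts in studies.items():
    stats = summarize(*counts)
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

Running it gives sensitivity, specificity, PPV and prevalence of about 0.826, 0.765, 0.826 and 0.575 for the first study and 0.9, 0.75, 0.818 and 0.556 for the second.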

A third study found that targeting biopsies at areas of high blood flow resulted in higher detection rates: [Science Daily].

More information on power Doppler sonography is summarized on the web site of Dr. Bard here: [here].

Other definitions

The cancer detection rate (or just detection rate) is the fraction of all subjects tested in whom cancer is found. That is, it is the number of subjects that tested positive and had cancer divided by the total number of subjects. It is sometimes loosely called the true positive rate, although that term more commonly refers to the sensitivity.

The term test prevalence is used to refer to the fraction of positive tests among the entire population tested. This statistic does not take into account whether the test is correct or not. It just uses the number of all positive tests as the numerator.

In these terms the PPV is the cancer detection rate divided by the test prevalence while the sensitivity is the cancer detection rate divided by the true prevalence.
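These two relationships can be checked on the PCPT 2x2 counts used earlier in this post (2 true positives, 18 false positives, 3 false negatives and 77 true negatives). The Python sketch below is only an illustration and the variable names are ours.

```python
# PCPT 2x2 counts: true positives, false positives, false negatives, true negatives.
tp, fp, fn, tn = 2, 18, 3, 77
n = tp + fp + fn + tn

detection_rate = tp / n          # positive test and cancer, out of everyone tested
test_prevalence = (tp + fp) / n  # all positive tests, out of everyone tested
true_prevalence = (tp + fn) / n  # all cancers, out of everyone tested

ppv = detection_rate / test_prevalence
sensitivity = detection_rate / true_prevalence

print(round(ppv, 3), round(sensitivity, 3))  # 0.1 0.4
```

The results match the PPV of 10% and sensitivity of 40% obtained directly from the table.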

Cutoff Value

As one varies the cutoff value that separates positive from negative test scores the sensitivity and specificity change. Similarly the PPV and NPV change. In fact, one can arrange for the PSA test to have any sensitivity desired by modifying the cutoff level sufficiently. Similarly one can arrange for the PSA test to have any specificity desired; however, if we fix either the sensitivity or the specificity then the other will be determined. We cannot fix both at once.

ROC Curve

If for each cutoff value we plot the fraction of positive tests among diseased individuals (along the vertical axis) against the fraction of positive tests among the healthy individuals (along the horizontal axis) we get a curve known as the receiver operating characteristic (ROC) curve. In terms of sensitivity and specificity this amounts to plotting the sensitivity against 1-specificity for each possible cutoff value. The curve starts at (0,0), which corresponds to a very high cutoff value, and rises to (1,1), which corresponds to a cutoff value of 0 or at least a very low cutoff value. More information on ROC curves can be found in these articles: Wikipedia and anaesthetist.


The area under the curve (AUC) is the fraction of sensitivity/specificity combinations that are worse than that of the test under consideration. The larger the AUC the better.
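To make the construction concrete, here is a minimal Python sketch that builds the ROC points and the AUC for a tiny data set. The PSA values are invented purely for illustration and are not taken from any study; the function names are ours, and the area is approximated with the trapezoid rule.

```python
# Hypothetical PSA values (ng/mL), invented purely for illustration.
diseased = [1.8, 3.1, 4.6, 6.2, 9.5]
healthy = [0.6, 0.9, 1.4, 2.2, 2.7, 3.5, 5.0]


def roc_points(diseased, healthy):
    """Return (1 - specificity, sensitivity) pairs, from highest cutoff to lowest.

    A value at or above the cutoff counts as a positive test.
    """
    cutoffs = sorted(set(diseased) | set(healthy), reverse=True)
    points = [(0.0, 0.0)]  # cutoff above every value: nobody tests positive
    for cutoff in cutoffs:
        sn = sum(x >= cutoff for x in diseased) / len(diseased)
        one_minus_sp = sum(x >= cutoff for x in healthy) / len(healthy)
        points.append((one_minus_sp, sn))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


points = roc_points(diseased, healthy)
print(points[0], points[-1])  # (0.0, 0.0) (1.0, 1.0)
print(round(auc(points), 3))  # 0.8
```

With real data the same two functions could be applied to the measured PSA values of biopsy-confirmed cancer and non-cancer groups.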

Comparing Tests

There are a number of ways of comparing tests:
  1. ROC. If we plot the ROC curves for two tests on the same graph and if one lies entirely above the other then that one has a higher sensitivity than the other for every specificity and so the higher curve represents a uniformly superior test.

  2. Sn+Sp-1. A test with sensitivity plus specificity equal to 1 is no better than random. For example, suppose we flip a coin and assign subjects a positive test if it comes up heads. That test has a sensitivity and specificity each of 0.5 so sensitivity plus specificity equal 1. If the coin is biased, e.g. .9 heads, then we get a sensitivity of .9 and specificity of .1. By varying the bias we can get any desired combination of sensitivity and specificity that sum to 1. Thus we can use sensitivity + specificity - 1 as a gauge of how much better a particular test at a particular cutoff is relative to a random assignment.

  3. AUC. As mentioned previously, the area under an ROC curve (AUC) is the proportion of sensitivity, specificity pairs that are less than the test in question. This can be used as a measure of test desirability. A larger number is better.

  4. Utilities. Except for #1 these methods do not explicitly take into account the seriousness of the two sorts of errors: not detecting that someone has prostate cancer vs. subjecting patients to needless further testing. Clearly the first error (missing someone with cancer) is the more serious; however, it's probably not feasible to subject everyone to biopsy so some tradeoff needs to be made. If one could assign costs (not necessarily monetary) to the two errors a tradeoff analysis could be made.
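As an illustration of method 2 in the list above, the quantity Sn+Sp-1 (known in the statistics literature as Youden's J statistic) can be computed for the three PSA data sets used in this post. The Python sketch below uses our own function name and the sensitivity and specificity figures entered in Example 2; it also confirms that a coin flip scores zero whatever its bias.

```python
def sn_plus_sp_minus_1(sn, sp):
    """How far a test at a given cutoff is from a random assignment (0 = random)."""
    return sn + sp - 1


# (sensitivity, specificity) for each PSA data set, as entered in Example 2.
psa_data = {"PCPT": (0.40, 0.811), "Ontario": (0.60, 0.947), "AAFP": (0.75, 0.93)}
for name, (sn, sp) in psa_data.items():
    print(name, round(sn_plus_sp_minus_1(sn, sp), 3))
# PCPT 0.211
# Ontario 0.547
# AAFP 0.68

# A coin that comes up heads with probability h gives Sn = h and Sp = 1 - h.
for h in (0.5, 0.9):
    print(h, abs(sn_plus_sp_minus_1(h, 1 - h)) < 1e-12)
# 0.5 True
# 0.9 True
```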

Bayes Odds Form

The Bayes odds forms for PPV and 1-NPV can be entered into Mathomatic instead of the explicit forms given before. They are entirely equivalent so it's a matter of taste which one we use. If we were solving the equations by hand the odds forms do have the advantage of being easier to solve for the component variables. The odds forms also have a certain attractive compactness and elegance to them.

The Bayes Odds form of the equations for PPV and NPV can be written as shown:

PPV / (1-PPV) = Sn / (1-Sp) * p/(1-p)
(1-NPV) / NPV = (1-Sn) / Sp * p/(1-p)

Here the quantities Sn / (1-Sp) and (1-Sn) / Sp are known as the positive and negative likelihood ratios. The left sides are the odds of PPV and the odds of 1-NPV. The p/(1-p) on the right hand side is the odds of p. The term odds is used here in the same sense as in racetrack betting.

Odds and probabilities are equivalent. In fact, the following two formulas can be used for translating between odds and probabilities:

odds = probability / (1 - probability)
probability = odds / (1 + odds)

The Bayes odds forms are easier to solve by hand: instead of solving directly for PPV or p as probabilities, one solves for their odds and then uses the odds to probability conversion formula.
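The odds forms translate directly into code. The following Python sketch (all function names are ours) computes PPV and 1-NPV by multiplying the prior odds by Sn / (1-Sp) or (1-Sn) / Sp and converting back to a probability, using the EPCA-2 figures from Example 1.

```python
def to_odds(probability):
    """Convert a probability to odds."""
    return probability / (1 - probability)


def to_probability(odds):
    """Convert odds back to a probability."""
    return odds / (1 + odds)


def ppv_via_odds(p, sn, sp):
    """PPV from the odds form: odds(PPV) = Sn / (1-Sp) * odds(p)."""
    return to_probability(sn / (1 - sp) * to_odds(p))


def one_minus_npv_via_odds(p, sn, sp):
    """1-NPV from the odds form: odds(1-NPV) = (1-Sn) / Sp * odds(p)."""
    return to_probability((1 - sn) / sp * to_odds(p))


# EPCA-2 figures from Example 1: p = 5%, Sn = 94%, Sp = 97%.
print(round(ppv_via_odds(0.05, 0.94, 0.97), 4))            # 0.6225
print(round(one_minus_npv_via_odds(0.05, 0.94, 0.97), 4))  # 0.0032
```

The first result is the same 62.25% obtained from the explicit PPV formula in Example 1.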

See: [link] and [link].

Tabular Form

We can form a 2x2 table in which diseased and healthy correspond to columns one and two and positive test (i.e. indicative of disease) and negative test (i.e. indicative of healthy) correspond to rows. If p is prevalence, pT is test prevalence, Sn is sensitivity, Sp is specificity, PPV is positive predictive value and NPV is negative predictive value we can fill it in two ways like this:

         Diseased             Healthy            Total
+Test    p * Sn               (1-p) * (1-Sp)     pT
-Test    p * (1-Sn)           (1-p) * Sp         1-pT
Total    p                    1-p                1

         Diseased             Healthy            Total
+Test    pT * PPV             pT * (1-PPV)       pT
-Test    (1-pT) * (1-NPV)     (1-pT) * NPV       1-pT
Total    p                    1-p                1

The first form is particularly useful since it allows us to form the entire body of the table given just p, Sn and Sp. The totals can then be calculated by summing the rows and columns. If we were given the PPV, NPV and pT then we could calculate the body of the table using the second formulation although this is less common.

The table can also be used to create the formulas that we have used Mathomatic to derive. For example, with reference to the first table, since PPV is the detection rate, i.e. the upper left cell, divided by the test prevalence, i.e. the first row total, we have:

PPV = detection rate / test prevalence
= [upper left cell] / [first row total]
= [upper left cell] / ([upper left cell] + [upper right cell in body])
= [p * Sn] / ([p * Sn] + [(1-p) * (1-Sp)])

which indeed is the formula we started out with when discussing PPV.
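The first tabular form can be built mechanically from p, Sn and Sp. The Python sketch below is our own illustration (the function name table_from_rates is hypothetical); it fills in the body of the table for the PCPT figures, sums the rows and columns, and reads PPV and NPV off the result.

```python
def table_from_rates(p, sn, sp):
    """Body, row totals and column totals of the 2x2 table, as fractions of all subjects."""
    body = [
        [p * sn, (1 - p) * (1 - sp)],  # +Test row: diseased, healthy
        [p * (1 - sn), (1 - p) * sp],  # -Test row: diseased, healthy
    ]
    row_totals = [sum(row) for row in body]  # pT and 1-pT
    col_totals = [body[0][0] + body[1][0], body[0][1] + body[1][1]]  # p and 1-p
    return body, row_totals, col_totals


# PCPT figures: prevalence 5%, sensitivity 2/5, specificity 77/95.
body, row_totals, col_totals = table_from_rates(0.05, 2 / 5, 77 / 95)

ppv = body[0][0] / row_totals[0]  # upper left cell / first row total
npv = body[1][1] / row_totals[1]  # lower right cell / second row total

print([[round(cell, 2) for cell in row] for row in body])  # [[0.02, 0.18], [0.03, 0.77]]
print(round(ppv, 4), round(npv, 4))                        # 0.1 0.9625
```

Multiplying every cell by 100 recovers the PCPT counts of 2, 18, 3 and 77 used throughout this post.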
