The Paris System for Reporting Urinary Cytology: Early Review of the Literature Reveals Successes and Rare Shortcomings

Morgan L. Cowan, MD (1)
Christopher J. VandenBussche, MD, PhD (1,2)

Departments of Pathology (1) and Oncology (2), The Johns Hopkins University School of Medicine, Baltimore, MD

Corresponding Author: Christopher J. VandenBussche, Department of Pathology, The Johns Hopkins Hospital, 600 N. Wolfe St., Baltimore, MD 21287. Phone: (410) 955-1180. Fax: (410) 614-9556. Email: [email protected]

Funding sources: None

Disclosures: Dr. VandenBussche serves as a consultant for Personal Genome Diagnostics. The authors have no other relevant financial disclosures or conflicts of interest.

Keywords: urine, urothelial carcinoma, urothelial neoplasia, bladder cancer, the Paris System for Reporting Urinary Cytology

Running title: Paris Literature Review

Acknowledgement: The authors thank Travis Sluka for providing the figure of cells with various N/C ratios.





Abstract

The Paris System for Reporting Urinary Cytology (TPS) provides recommendations for the diagnosis of urinary tract cytology (UTC) specimens and has found acceptance on an international level. Since the official release of TPS in 2016, numerous research studies have been published analyzing its impact. This review summarizes the studies published since the release of TPS, highlighting areas in which TPS has performed well and other areas in which TPS may need improvement.



Introduction

The Paris System for Reporting Urinary Cytology (TPS) emerged following discussions of urinary tract cytopathology at the 2013 International Cytology Congress in Paris, France [1, 2]. The need for a coherent and consistent system became apparent to anyone observing the great variability with which urinary tract cytology (UTC) specimens were assessed, both between individuals and between institutions. Of particular concern was the inconsistent and high rate of indeterminate diagnoses, such that an “atypical” diagnosis implied only a low risk of malignancy, which greatly diminished the utility of the diagnosis for clinicians.

To develop this system, expert international working committees assessed the evidence for certain practices, and practicing cytopathologists were surveyed regarding their practice patterns and preferences. The goals of the working committees were to: (1) introduce and define a set of practical diagnostic categories using a standardized nomenclature; (2) define cytomorphologic criteria associated with each diagnostic category; and (3) raise awareness of unusual or striking findings that can be dismissed rather than being classified as atypical. The finalized recommendations of TPS were officially released in 2016, both published in full in book format and presented at the 2016 International Cytology Congress in Yokohama, Japan.

Several institutions have published studies describing the impact of TPS on their practice in the brief period since its release. Additionally, some institutions have applied TPS criteria to archival UTC specimens in order to increase the number of specimens available to study.

Alterations in Institutional Diagnoses Based on TPS Criteria

Most studies have focused on the impact TPS has had on indeterminate diagnoses. TPS committees recognized the variability in both the frequency (ranging from 1.9% to 26%) and the predictive value (ranging from 8.3% to 37.5%) of “atypical” diagnoses among institutions [3-13].
To address this variability, TPS developed strict criteria to define the Atypical Urothelial Cells (AUC) category, intending to reduce the number of unnecessary indeterminate diagnoses (i.e., “false positives”). It should be noted that institutions with higher risk populations are likely to have a correspondingly increased frequency of AUC diagnoses. The rate of malignancy (ROM), meaning the percentage of patients with an indeterminate diagnosis who go on to receive a frankly malignant diagnosis on subsequent testing, is one way to monitor the utility of the indeterminate category. An ideal ROM for the AUC category has not been defined, although under ideal conditions, the ROM would be high enough to affect clinical decision-making without reducing the high ROM of the Suspicious for High Grade Urothelial Carcinoma category.

In four prospective studies comparing the assignment of specimens to diagnostic categories both before and after establishment of TPS, TPS resulted in a decrease in the rate of atypical diagnoses, with the decrease ranging from 0.9% to 13% (Table 1) [14-17]. In these studies, the overall rate of an atypical diagnosis post-TPS ranged from 14.4% to 26%. Two studies reported a small decline in their “suspicious” rates (declines of 1.3% and 0.6%), with total Suspicious for HGUC (SHGUC) diagnosis rates of 4.5% and 2.4%. One study showed a small increase in the rate of SHGUC (3.3% to 4.4%). The rate of HGUC diagnoses for the three reporting institutions actually increased using TPS from 01%, 0.2%, and 1.2% to 2.9%, 3.2%, and 5.0%, respectively.

Two studies show an increase in the predictive power (and therefore clinical utility) of an AUC diagnosis with TPS. In these studies, the percentage of patients with an AUC diagnosis ultimately diagnosed with HGUC increased (from 33% to 53% and from 28.3% to 46.1%). In both of these studies, the predictive value of SHGUC for HGUC slightly declined (91% to 83% and 81.5% to 75.0%) while the predictive value of the HGUC category did not decline (Table 2) [14, 18, 19].

The majority of studies (n=7) were structured as a retrospective slide review.
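Because the ROM discussed above is a simple proportion, a short calculation can make the definition concrete. The following is only a minimal sketch: the function name `rom_by_category` and the case list are hypothetical and are not drawn from any of the cited studies. It assumes each case has a cytologic category and a known follow-up result; cases without follow-up are simply left out of the input.

```python
from collections import defaultdict


def rom_by_category(cases):
    """Return the rate of malignancy (ROM, in percent) for each category.

    `cases` is an iterable of (category, malignant_on_followup) pairs.
    Only cases with known follow-up should be passed in.
    """
    totals = defaultdict(int)
    malignant = defaultdict(int)
    for category, is_malignant in cases:
        totals[category] += 1
        if is_malignant:
            malignant[category] += 1
    return {cat: 100.0 * malignant[cat] / totals[cat] for cat in totals}


# Hypothetical follow-up data, for illustration only.
example_cases = (
    [("AUC", True)] * 2 + [("AUC", False)] * 2
    + [("SHGUC", True)] * 3 + [("SHGUC", False)] * 1
)

example_rom = rom_by_category(example_cases)
print(example_rom)
```

Restricting the input to cases with matched follow-up mirrors how many studies compute ROM, which is why the choice of follow-up period and the exclusion of patients without biopsy can shift the result.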
In these retrospective studies, slides previously signed out before the introduction of TPS were re-reviewed and reclassified using TPS criteria (Table 3) [20-26]. The ROM for each TPS category ranged as follows: NHGUC (Negative for High Grade Urothelial Carcinoma), 1.6-36.8%; AUC, 0-50%; SHGUC, 11.1-100%; and HGUC, 35.5-100%. These ROMs are based on a subsequent biopsy-proven HGUC diagnosis on follow up. One reason for variability among the results and studies could be the differences in defining the follow up period for HGUC diagnosis. Additionally, many of the studies included very few cases, limiting confidence in the results. The retrospective nature of the studies also limits the interpretation of results. This may in part explain why the results of the Granados et al. study differ significantly from those of the other studies. Despite the outliers, most studies demonstrate a gradual increase in ROM across TPS diagnostic categories, and their data correlate well with the prospective data seen in Table 2. Parenthetically, these data in Table 2 demonstrate a high ROM, even for the NHGUC category; however, exclusion of UTC specimens without matched follow up biopsies introduces a bias towards patients with abnormalities identified on cystoscopy, who are therefore much more likely to have a malignancy.

Standardization of Nomenclature

In the absence of a formal reporting system, most cytologic diagnoses fall into general categories such as benign, indeterminate, and malignant. Some institutions have used a single indeterminate category, while others have favored a two-tiered system containing a low-risk category (such as “atypical”) and a high-risk category (such as “suspicious”). All publications thus far assessing the impact of TPS have come from institutions previously employing a two-tiered system; thus, the impact of switching from a one-tiered system to the two-tiered TPS is unknown. Previous studies have supported the use of a two-tiered system; Joudi et al. demonstrated that their “suspicious” category had a distinct positive predictive value (PPV) when compared to the “positive” (malignant) category (79.2% vs. 55.3%), providing evidence that the two should be kept as distinct categories. Numerous additional studies have indicated a much higher PPV for “suspicious” over merely “atypical” indeterminate categories [27]. In the modern era of ancillary molecular testing, the advantages of a two-tiered system have been illuminated.
For example, using the UroVysion ancillary FISH assay, Glass et al. found that a system with two indeterminate categories performed better than systems with additional indeterminate categories [28]. The TPS committee kept Atypical Urothelial Cells as a distinct diagnosis, and currently no data exist to support or refute this decision. The clinical utility of the AUC category will depend on its performance for predicting HGUC and whether a useful reflex ancillary test can determine which patients are at higher risk of malignancy.

There are also no studies evaluating the impact of subtle changes in nomenclature on a pathologist’s utilization of a given category. For instance, to most, the word “suspicious” carries a stronger malignant implication than the phrase “cannot exclude”; thus, a pathologist may have a higher threshold for labelling a specimen “suspicious for high grade urothelial carcinoma” versus “cannot exclude high grade urothelial carcinoma”. Given the high predictive value of most institutions’ higher-risk indeterminate categories, the decision by TPS committees to include the word “suspicious” in the higher-risk diagnostic category appears to be appropriate, and the outcome of this decision is not limited to the pathologist. One study by Fite et al. has shown that patients are more likely to receive follow up when the word “suspicious” is included in their diagnosis, although whether the motivation arises from the treating physician or patient is unknown [29].

Definition of Cytomorphologic Criteria

TPS defined HGUC criteria intending to create a system with a very high specificity for the diagnosis of malignancy. Given the goal of very high specificity, it may not be surprising that most institutions report a decrease in HGUC diagnoses following establishment of TPS. From these studies emerges a flaw of TPS: in some instances, populations of overtly malignant cells do not meet criteria for a malignant diagnosis. Several studies evaluating cytomorphologic features of HGUC specimens after implementing TPS criteria note a wide variety in appearances of malignant cells, especially in degenerated specimens.
Combining the high specificity of the TPS criteria with a wide range of appearance in malignant cells suggests that if interpreted too literally, specimens containing overtly malignant cells would be classified into an indeterminate rather than malignant category and that these criteria may be too strict.
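To illustrate how a strictly literal reading of such criteria can under-classify a specimen, the sketch below encodes only the two thresholds reported in the reclassification study discussed in this section: an N/C ratio above 0.7 and at least 5 non-degenerated cells meeting all criteria. Everything else is an assumption made for illustration: the `Cell` structure, the function names, and the modelling of the remaining three cytomorphologic criteria as a simple count. It is not an implementation of TPS itself.

```python
from dataclasses import dataclass

NC_RATIO_CUTOFF = 0.7        # N/C threshold as reported in the text
MIN_QUALIFYING_CELLS = 5     # minimum number of non-degenerated cells


@dataclass
class Cell:
    nc_ratio: float               # estimated nuclear-to-cytoplasmic ratio
    other_criteria_met: int       # how many of the other 3 criteria hold (0-3)
    degenerated: bool = False     # degenerated cells are not assessed


def count_qualifying_cells(cells):
    """Count intact cells that satisfy the N/C cutoff and all other criteria."""
    return sum(
        1
        for c in cells
        if not c.degenerated
        and c.nc_ratio > NC_RATIO_CUTOFF
        and c.other_criteria_met == 3
    )


def meets_strict_criteria(cells):
    """True only if enough intact cells meet every criterion."""
    return count_qualifying_cells(cells) >= MIN_QUALIFYING_CELLS


# Hypothetical specimen: 4 intact qualifying cells plus 3 degenerated ones.
specimen = [Cell(0.8, 3)] * 4 + [Cell(0.8, 3, degenerated=True)] * 3
print(meets_strict_criteria(specimen))
```

In this hypothetical specimen the degenerated cells are excluded from the count, so the specimen falls one cell short of the threshold even though seven abnormal cells are present.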

Cowan et al. reviewed 112 UTC specimens from 80 patients subsequently diagnosed with HGUC and reclassified them based on TPS criteria [30]. They found that the most restrictive morphologic criterion for an HGUC diagnosis was the N/C ratio (>0.7). However, most specimens disqualified from an HGUC diagnosis were disqualified due to an inadequate number of cells meeting all 4 cytomorphologic criteria (at least 5 non-degenerated cells). Interestingly, the same study found that approximately 40% of specimens previously diagnosed into indeterminate categories were classified into a higher-risk category. Thus, despite being restrictive and at times onerous, TPS criteria improved risk stratification for patients based on future risk of malignancy.

Several studies have expanded on the earlier discussion that TPS, when interpreted strictly, fails to identify HGUC with certain morphologies that most experts would regard as definitively malignant. In addition, TPS recommends against assessing degenerated cells. Confoundingly, in many instances, overtly malignant cells may appear degenerated. Degenerated malignant cells can display pyknotic (“ink black”) nuclei and may have more abundant, vacuolated cytoplasm [31]. Malignant cells may sometimes appear large and multi-nucleated, resembling benign umbrella cells [32]. In these instances, the N/C ratio of the malignant cell is often below 0.5, falling short of the TPS minimum cutoff for atypia, let alone malignancy. The fact that TPS criteria, when strictly applied, may under-classify malignant cells demonstrates one limitation of a system with a high threshold for malignancy.

Several studies have evaluated whether pathologists can accurately determine the N/C ratio, as N/C ratio determinations were not commonly performed prior to TPS, yet these ratios play a decisive role in the diagnostic categorizations of TPS (Figure 1).
Vaickus and Tambouret found that trained morphologists can make accurate estimations of the N/C ratio, and that these estimates are more accurate at higher N/C ratios [33]. Conversely, Layfield et al. found that pathologists demonstrated poor accuracy and interobserver reproducibility when assessing the N/C ratio, with worse accuracy at higher ratios [34]. Zhang et al. showed that pathologists tend to overestimate the N/C ratio, especially around the key cut-offs defined by TPS (0.5 and 0.7) [35]. It remains to be seen whether, with additional training and practice, pathologists will assess these values more accurately.

Hang et al. examined whether a nuclear-to-cytoplasmic ratio of 0.5 for atypia (as used in TPS) was an appropriate cut-off for the Atypical Urothelial Cells category [36]. Using specimens previously diagnosed as indeterminate, they calculated the N/C ratios of the atypical cells present in specimens from patients both with and without HGUC on follow up biopsy. In this study, an N/C ratio of 0.486 provided the optimal sensitivity and specificity for the prediction of HGUC, supporting the use of 0.5 as a cut-off. The study also demonstrated that small variations in the N/C ratio cut-off result in large shifts in the sensitivity and specificity of the exam. Such small variations are likely too subtle for the unaided human eye to reliably discern, and suggest a potential role for computer-aided image analysis in assessing such cases.

Long et al. assessed interobserver agreement among individuals classifying specimens according to TPS criteria and found high levels of interobserver variability for categories other than NHGUC [37]. They determined that 15% of identified disagreements were clinically impactful and argued that TPS criteria may be insufficient to standardize classification between individuals. The Paris Interobserver Reproducibility Study (PIRST) surveyed 1,313 participants worldwide regarding the classification of images into TPS diagnostic categories [38]. The highest concordance among observers was seen for the NHGUC, Low Grade Urothelial Neoplasia (LGUN), and HGUC categories, with less concordance seen for the AUC and SHGUC categories. Concordance was not impacted by practice type (academic vs. non-academic) or country (United States vs. international).
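The cut-off analysis described above for the N/C ratio can be sketched as a sweep over candidate thresholds, computing sensitivity and specificity at each one. This is a hedged illustration only: the measurements are invented, the function names are hypothetical, and the use of Youden's J (sensitivity + specificity - 1) to pick the "optimal" cut-off is an assumption, since the text does not state which optimality criterion the study used.

```python
def sensitivity_specificity(ratios_hguc, ratios_benign, cutoff):
    """Treat N/C ratio >= cutoff as a positive call.

    ratios_hguc:   ratios from patients with HGUC on follow-up
    ratios_benign: ratios from patients without HGUC on follow-up
    """
    true_pos = sum(1 for r in ratios_hguc if r >= cutoff)
    true_neg = sum(1 for r in ratios_benign if r < cutoff)
    return true_pos / len(ratios_hguc), true_neg / len(ratios_benign)


def best_cutoff(ratios_hguc, ratios_benign, candidates):
    """Pick the candidate cut-off maximizing Youden's J (an assumed criterion)."""
    def youden_j(cutoff):
        sens, spec = sensitivity_specificity(ratios_hguc, ratios_benign, cutoff)
        return sens + spec - 1.0
    return max(candidates, key=youden_j)


# Invented N/C measurements, for illustration only.
hguc_ratios = [0.48, 0.52, 0.55, 0.58, 0.62, 0.70]
benign_ratios = [0.30, 0.38, 0.42, 0.46, 0.49, 0.56]
candidate_cutoffs = [0.45, 0.50, 0.55]

for cutoff in candidate_cutoffs:
    sens, spec = sensitivity_specificity(hguc_ratios, benign_ratios, cutoff)
    print(f"cutoff={cutoff:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")

chosen_cutoff = best_cutoff(hguc_ratios, benign_ratios, candidate_cutoffs)
print("chosen cutoff:", chosen_cutoff)
```

Even in this toy data set, moving the threshold by 0.05 in either direction changes sensitivity or specificity noticeably, which is the kind of behavior that makes visual estimation near the cut-off difficult.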
One consideration regarding variances between observers and institutions is the difference arising from specimen preparation. A study by Straccia et al. comparing Cytospin and ThinPrep preparation methods indicated that preparation method did not significantly impact the predictive value of the SHGUC category for malignancy, but resulted in a decreased positive predictive value for the AUC category when ThinPrep was used (7% vs. 33%; ThinPrep vs. Cytospin) [19].

Cytology is of particular importance in the detection of upper tract lesions, especially because these areas may not be easily biopsied during cystoscopy procedures. Zhang et al. found that upper tract HGUC lesions often cause atypical changes that are associated with higher rates of AUC diagnoses in voided specimens, but that dedicated upper tract washings increased the ability to definitively diagnose these malignancies [39]. Bertsch et al. found that TPS improved correlation of cytologic and surgical pathology diagnoses for both lower urinary tract lesions and upper tract lesions, although the improvement was greater for the lower tract lesions [40]. Only one study has applied TPS criteria specifically to upper tract lesions, and this study found that the TPS categories NHGUC, AUC, SHGUC, and HGUC had rates of malignancy similar to those in other studies containing predominantly voided urine specimens (Table 3) [26]. The LGUN category, in particular, performed extremely well in this setting, with all LGUN lesions being classified as LGUN.

As has been noted by urologists, the performance of TPS in the setting of recent treatment will be an important factor in determining the clinical utility of TPS. The assessment of urinary tract specimens following treatment is difficult, as the background of inflammation and degenerated cells obscures any remaining viable malignant cells [41]. Pierconti et al. examined a subset of patients with non-muscle invasive HGUC being treated with chemohyperthermia or electromotive drug administration, and found that SHGUC diagnoses were highly associated with HGUC on subsequent biopsy [42].

The diagnosis and categorization of infections by BK polyomavirus remain controversial.
While treatment effects can cause malignant cells to be overlooked, infection with BK polyomavirus may cause such dramatic cytomorphologic changes in benign cells that the infected cells can be misinterpreted as malignant. Over time, such infected cells have earned the nickname “decoy cells” due to this distracting atypia. TPS recommends that changes secondary to BK polyomavirus be classified under the NHGUC category. However, degenerated HGUC cells can overlap with BK-like alterations, and Allison et al. [43] found that even expert cytopathologists had difficulty predicting which specimens would ultimately be associated with HGUC or with benign follow up. The authors speculate that this inability to predict which specimens are benign may vary between laboratories depending on whether a given laboratory screens specimens from renal transplant patients for BK virus, as BK virus reactivation is increased in this patient population. Additionally, some studies have shown a potential link between BK polyomavirus and urothelial carcinoma, which may explain the slightly increased risk of HGUC for patients with specimens containing BK-like changes [44, 45].

Low Grade Urothelial Neoplasia (LGUN)

The diagnosis of LGUN using UTC remains controversial. While TPS allows for this diagnosis, it recommends its use only under stringent conditions. Studies continue to demonstrate that the detection of LGUN by UTC has low sensitivity, low specificity, and high interobserver variability. Washing and instrumented specimens from patients with LGUN may be cellular and contain unusual-appearing urothelial cells, but applying TPS criteria to such cases rarely results in a diagnosis greater than Atypical Urothelial Cells (AUC) [46]. Both retrospective and prospective studies of TPS provide some insight into its impact on diagnosis of LGUN lesions (Tables 4 and 5). LGUN remains a diagnostic category of TPS; however, some institutions and/or authors have chosen not to utilize it and instead to classify specimens containing features of LGUN, but not of HGUC, as NHGUC (rather than LGUN).
Only three studies have examined the corresponding surgical follow up among UTC specimens diagnosed as LGUN (Table 4). Roy et al. examined 13 cases and demonstrated that 77% had LGUN on follow up [25]. Bertsch et al. examined 5 cases and found 80% had LGUN on follow up and 20% had HGUC/CIS [40]. Zheng et al. examined 11 cases and found all 11 had LGUN on follow up; of note, this study only considered specimens from patients with lesions in the upper urinary tract and thus may not apply to the entire urinary tract [26]. While these three studies show that the diagnosis of LGUN reliably indicates a neoplasm, the diagnosis of LGUN should not be taken as an exclusion of a high grade lesion. Approximately 20% of LGUN diagnoses came from patients with HGUC in two of these studies. As always, tumor sampling and tumoral heterogeneity may help explain these discrepancies.

More data are available describing classification of UTC specimens from patients with LGUN using TPS (Table 5) [14, 18, 22-26, 40]. For those utilizing the LGUN category, 28-100% of such specimens were classified as LGUN. The ranges for other diagnoses were: NHGUC, 0-70%; AUC, 0-68%; SHGUC, 0-20%; and HGUC, 0-20%. While these percentages vary widely between studies, importantly, they show that LGUN lesions are rarely diagnosed as SHGUC or HGUC. The sub-population of LGUN lesions diagnosed as HGUC could result from either cytomorphologic overlap between LGUN and HGUC, LGUN lesions with a minor (