Papers
A measure of the validity of the International Classification of Primary Care in the classification of reasons for encounter
Helena Britt BA PhD
Director, Family Medicine Research Unit, Department of General Practice, University of Sydney, Acacia House, Westmead Hospital, WESTMEAD NSW 2145, AUSTRALIA
Abstract
Objective: To assess the concurrent validity of the International Classification of Primary Care (ICPC) in the classification of patient reasons for encounter. Design: Analyses of utilisation of ICPC codes in classifying patient reasons for encounter (RFEs). Setting: Primary health care. Subjects: 146,940 patient RFEs recorded by general practitioners in an Australian national survey. Main outcome measures: Relative frequency of utilisation, classed as: frequent (>5/1,000 contacts); intermediate (1–5/1,000); marginal (0.5–1/1,000); or rare (<0.5/1,000 contacts). Relative use of rag-bag codes, which group multiple concepts. Results: Of 1,371 available codes, 76.7% were used at least once. Over two-thirds of all RFEs were classified in Components 1 (symptoms/complaints) or 7 (diagnoses/disease), in which 93.8% of rubrics were selected. Only 48 ICPC codes accounted for 65.5% of the RFEs in Components 1 and 7 (47.1% of all RFEs). Codes never used totalled 319, only 43 of these being in Components 1 and 7. Only 3.3% of RFEs were classified in the 86 identified rag-bag codes. In relative terms, the proportion of RFEs associated with the eye (19.6%) and the male genital system (17.6%) coded in rag-bags was high. Conclusion: ICPC was found to be a valid tool with which to classify RFEs, but some suggestions for improving future versions of the classification are put forward.
Introduction
In Australia, general practice (family medicine) is the usual point of entry into medical care. There is no patient registration and patients are free to consult multiple practitioners.
There has been one major study of patient reasons for consulting a general practitioner (GP). The Australian Morbidity and Treatment Survey (AMTS) 1990–1991 was a cross-sectional study in which a random sample of 495 GPs actively recorded details about more than 100,000 doctor–patient contacts on structured paper-based contact forms. Variables included patient demographics, patient reasons for encounter (RFEs), problems managed and treatments provided. At least one and up to three RFEs were requested for each contact[1]. The 146,940 patient RFEs recorded were secondarily coded with the International Classification of Primary Care (ICPC)[2].
ICPC is also being used to classify RFEs and/or problems managed in Norway[3], Canada[4,5], the Netherlands[6,7,8], Belgium[9] and the United States[10]. The extent to which ICPC is a valid tool with which to classify such data should therefore be assessed.
In the development of a database such as that gained in the AMTS, data collection moves through specific stages: GP sample selection; cluster sampling of patients around each GP; GP data recording; secondary coding; and data entry. At each stage the data can be invalidated by the application of inappropriate methods.
Previous work has reported: the extent to which the GP and patient samples represent all GPs and all patients attending any general practitioner[11]; the degree to which patient RFEs perceived and recorded by the GP accurately reflect patient recalled RFEs[12]; and the reliability of secondary coding of the recorded RFEs[13]. A related paper investigated inter-practitioner reliability and validity of the recorded morbidity[14].
This paper addresses the next question: whether ICPC is a valid tool with which to classify the data. Application of an inappropriate classification, just as application of unsuitable statistical methods, can render results drawn from a valid dataset invalid. For example, ICD and the Canadian General Practice Classification are unsuitable tools for classifying primary care data because they lack rubrics for many problems commonly managed in general practice[15,16].
Validity of an instrument is usually measured by comparing results with a gold standard set by another instrument. This approach is often referred to as convergent or criterion validity. Where no gold standard is available, validation can only occur "by the gradual accumulation of data from many different kinds of investigations"[17].
ICPC was designed and field-tested by a group of experts in classification of general practice data[18], and so could be said to have face validity. It could also be said to have consensual validity since its structure and rubrics were chosen through consensus of these experts[19]. However, these types of validity are biased by the subjective opinions of the experts and further assessment of the validity of ICPC is warranted. There is no gold standard against which results using ICPC can be compared, but we can begin to assess the concurrent validity of ICPC in its RFE mode, using the AMTS data. Kidder and Judd[20] define concurrent validity as "the ability of a test (or instrument) to distinguish between people who are known to differ". Messick[21] suggests it would be better described as "diagnostic utility". This paper explores several aspects of the validity of ICPC as a tool to classify patient RFEs in a manner that reflects the breadth of available information.
ICPC is a hierarchical classification, which only provides a specific code for the more common or more important problems dealt with in primary care. Any classification must also include some unspecific codes to cover less common conditions and poorly described information not classified elsewhere. Wood et al[22] refer to these as rag-bag codes. However, if rubrics are poorly selected and do not satisfactorily describe the concept, or describe problems rarely seen in primary care, many may never be used. If too large a proportion of concepts are classified in rag-bag codes, the classification has failed to identify some concepts which should have their own rubrics. Likewise, if a large proportion fall into few individual rubrics and the remainder are lumped into the rag-bag codes, many rubrics would remain unused and the classification would not discriminate sufficiently between individual concepts.
Studies investigating utilisation of the codes available in a classification are rare. As part of a broader study of the introduction of ICPC in Norway, Brage et al[3] included a brief overview of utilisation rates in coding morbidity for sickness certification. Lamberts et al[22] compared code utilisation in the classification of morbidity and, in some cases, RFEs, from a number of studies in which data were either already coded in ICPC or were mapped to it. Results were reported for 91 rag-bag codes, but the codes included were not identified. Results were reported in terms of rate per 1,000 patient years, rendering them non-comparable with those of the present study, which is consultation-based. However, Lamberts' utilisation classes have been adapted for the following analysis, with results reported in terms of rate per 1,000 contacts.
Method
The validity of ICPC in classifying patient RFEs was assessed by measuring the extent to which available rubrics were utilised and there was a place for everything needing to be classified. There were three stages of analysis:
1. The number of codes utilised at least once in coding patient RFEs was determined
2. Utilisation rates of codes in the commonly-used Components 1 (symptoms and complaints) and 7 (diagnoses) were defined as:
   - Frequent: selected >5 times per 1,000 contacts
   - Intermediate: 1–5 per 1,000 contacts
   - Marginal: 0.5 to <1 per 1,000 contacts
   - Rare: <0.5 per 1,000 contacts
3. The relative utilisation of rag-bag codes was calculated
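The utilisation classes defined in the second stage can be written as a small function. This is a minimal sketch, not the procedure used in the study: the function names are illustrative, the figure of 100,000 contacts in the usage line is an assumption (the survey recorded "more than 100,000" contacts), and a rate of exactly 5 per 1,000 is treated as intermediate because "frequent" is defined as strictly greater than 5.

```python
def rate_per_1000(times_selected: int, n_contacts: int) -> float:
    """Rate at which a code was selected, per 1,000 contacts."""
    return 1000.0 * times_selected / n_contacts


def utilisation_class(rate: float) -> str:
    """Map a rate per 1,000 contacts onto the utilisation classes."""
    if rate > 5:
        return "frequent"
    if rate >= 1:
        return "intermediate"
    if rate >= 0.5:
        return "marginal"
    if rate > 0:
        return "rare"
    return "not used"


# Hypothetical example: a code selected 218 times in an assumed 100,000 contacts.
example_class = utilisation_class(rate_per_1000(218, 100_000))
```

With these assumed inputs the example code falls in the intermediate class.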
As Lamberts did not report the ICPC codes classed as rag-bags, subjective judgment of inclusions was required. Each of the seventeen ICPC chapters (except chapter Z: social) has two definite rag-bag codes, -29 (other symptoms and complaints) and -99 (other disease/disorder). More were identified through a search for "other" in the rubric, but where the rubric also incorporated a specific diagnosis it was excluded. The 86 rag-bag codes (6.3% of all ICPC codes) identified were: A17; A29; A77; A78; A99; B03; B74; B77; B79; B86; B99; D29; D73; D77; D80; D99; F05; F29; F73; F79; F99; H29; H74; H79; H99; K03; K29; K84; K99; L19; L29; L76; L81; L99; N29; N73; N81; N85; N99; P29; P79; P99; R29; R83; R85; R88; R99; S11; S19; S29; S76; S79; S80; S83; S99; T29; T73; T99; U29; U77; U79; U80; U85; U99; W18; W20; W29; W75; W77; W95; W96; W99; X15; X29; X77; X80; X81; X99; Y04; Y08; Y14; Y29; Y78; Y84; Y99; Z29.
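The list above can be tallied by chapter, since the first letter of each ICPC code identifies its chapter. The following sketch is only a check on the list as printed here; the variable names are illustrative.

```python
from collections import Counter

# Rag-bag codes exactly as listed in the paragraph above.
RAG_BAG_CODES = """
A17 A29 A77 A78 A99 B03 B74 B77 B79 B86 B99 D29 D73 D77 D80 D99
F05 F29 F73 F79 F99 H29 H74 H79 H99 K03 K29 K84 K99
L19 L29 L76 L81 L99 N29 N73 N81 N85 N99 P29 P79 P99
R29 R83 R85 R88 R99 S11 S19 S29 S76 S79 S80 S83 S99 T29 T73 T99
U29 U77 U79 U80 U85 U99 W18 W20 W29 W75 W77 W95 W96 W99
X15 X29 X77 X80 X81 X99 Y04 Y08 Y14 Y29 Y78 Y84 Y99 Z29
""".split()

# The chapter is the leading letter of each code.
codes_per_chapter = Counter(code[0] for code in RAG_BAG_CODES)

total_rag_bags = len(RAG_BAG_CODES)
# Share of the 1,371 ICPC codes that were classed as rag-bags.
share_of_all_codes = round(100 * total_rag_bags / 1371, 1)
```

The tally gives 86 codes in total (6.3% of 1,371), with the largest counts (8) in chapters S (skin) and W (pregnancy, family planning) and a single code in chapter Z (social).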
Results
Codes selected at least once
In classifying 146,940 RFEs, 76.7% of the 1,371 codes were used at least once. Proportional usage in Components 1 (symptoms/complaints) and 7 (diagnoses/disease) was far higher (93.8%). Of the 680 process codes (Components 2–6) available, 404 (59.4%) were selected at least once. Half the codes in Component 2 (diagnostic and preventive procedures) and Component 3 (medication, treatment, therapeutic procedures) were used at least once. Both rubrics in Component 4 (results) were selected in every chapter, while the single rubric in Component 5 was employed in 16 of the 17 chapters. The seven standard codes in Component 6 (referrals and other reasons for contact) provided 119 options, 91 of which were selected once or more, almost half of those not utilised being -68 codes (other referrals not elsewhere classified) (see Table 1).
Table 1: Utilisation of ICPC components in classifying RFEs and diagnoses in the AMTS

| | Component 1 | Component 2 | Component 3 | Component 4 | Component 5 | Component 6 | Component 7 | Total |
|---|---|---|---|---|---|---|---|---|
| Codes available (N) | 327 | 340 | 170 | 34 | 17 | 119 | 364 | 1371 |
| Codes used | 317 | 171 | 92 | 34 | 16 | 91 | 331 | 1052 |
| % used per component | 96.9 | 50.3 | 54.1 | 100.0 | 94.1 | 76.4 | 90.9 | 76.7 |

% used, Components 2–6 combined: 59.4
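The summary percentages quoted with Table 1 follow directly from the counts of codes available and codes used. The sketch below recomputes them from the table; the dictionary and function names are illustrative.

```python
# Counts per ICPC component, taken from Table 1.
codes_available = {1: 327, 2: 340, 3: 170, 4: 34, 5: 17, 6: 119, 7: 364}
codes_used = {1: 317, 2: 171, 3: 92, 4: 34, 5: 16, 6: 91, 7: 331}


def percent_used(components) -> float:
    """Percentage of available codes used at least once in the given components."""
    components = list(components)
    used = sum(codes_used[c] for c in components)
    available = sum(codes_available[c] for c in components)
    return round(100 * used / available, 1)


overall = percent_used(range(1, 8))           # all seven components
symptom_and_diagnosis = percent_used([1, 7])  # Components 1 and 7
process = percent_used(range(2, 7))           # Components 2 to 6
```

These give 76.7% overall, 93.8% for Components 1 and 7, and 59.4% for the process components (404 of 680 codes).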
Utilisation rates
In Components 1 and 7 only 43 codes were never selected. The few (48) codes arising at a frequent rate (>5/1000 contacts) accounted for almost half (47.1%) of all RFEs and for 65.5% of those in Components 1 and 7. A further 118 arose at an intermediate rate (1–5/1000 contacts), and the cumulative result demonstrated that 166 rubrics accounted for almost two-thirds (64.6%) of all RFEs and for almost 90% of those in Components 1 and 7. The majority of the 48 frequent codes were symptom and complaint descriptions (n=41, 85.4%) rather than diagnostic labels. Of the 118 intermediate codes, 85 (72.0%) were in Component 1 (see Table 2).
Table 2: Utilisation rates of codes from ICPC Component 1 (symptoms and complaints) and Component 7 (diagnosis/disease) (N=691) in the classification of patient reasons for encounter (rate per 1,000 contacts)

| | Frequent | Intermediate | Marginal | Rare | Not used |
|---|---|---|---|---|---|
| Rate of utilisation per 1,000 contacts | >5 | 1–5 | 0.5–1 | <0.5 | 0 |
| Total number of codes (n) | 48 | 118 | 70 | 412 | 43 |
| Component 1 codes (n) | 41 | 85 | 42 | 148 | 10 |
| Component 7 codes (n) | 7 | 33 | 28 | 264 | 33 |
| Frequency (n) | 69,247 | 25,669 | 5,200 | 5,599 | |
| Average frequency of use (n) | 1433 | 218 | 74 | 14 | |
| Average rate/1,000 contacts | 14.6 | 2.2 | 0.74 | 0.14 | |
| Cumulative % of Component 1&7 RFEs | 65.5 | 89.8 | 94.7 | 100.0 | |
| Cumulative % of total RFEs (all components) | 47.1 | 64.6 | 68.2 | 71.4 | |
The 70 codes applied at a marginal rate (0.5–1/1000 contacts) had little effect on the cumulative result, where 236 rubrics covered 68.2% of all RFEs and 94.7% of those in Components 1 and 7. The remaining 412 symptom or diagnostic codes were rarely selected (<0.5/1000 contacts) and accounted for only 3.2% of total RFEs.
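The cumulative coverage within Components 1 and 7 can be recomputed from the "Frequency (n)" row of Table 2. The following is a sketch of that arithmetic only; the variable names are illustrative, and the denominator is simply the sum of the four class frequencies.

```python
from itertools import accumulate

# RFEs classified by codes in each utilisation class (Table 2, "Frequency (n)").
rfes_by_class = {
    "frequent": 69_247,
    "intermediate": 25_669,
    "marginal": 5_200,
    "rare": 5_599,
}

# All RFEs classified in Components 1 and 7.
total_c1_c7 = sum(rfes_by_class.values())

# Running total across classes, expressed as a percentage of Components 1 and 7.
cumulative_pct = {
    cls: round(100 * running / total_c1_c7, 1)
    for cls, running in zip(rfes_by_class, accumulate(rfes_by_class.values()))
}
```

The result reproduces the "Cumulative % of Component 1&7 RFEs" row: 65.5, 89.8, 94.7 and 100.0.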
Utilisation of rag-bag codes
The 86 codes identified as rag-bags accounted for 3.3% of all RFEs. Almost half of these RFEs (44.6%) were in -29 (other symptom, complaint) or -99 (other disease) codes. Rag-bag codes were relatively evenly spread across chapters, the majority including 4 or 5, with the highest number (8) in the skin and pregnancy/family planning chapters, and the fewest (1) in the social chapter (see Table 3, Column 1).
Table 3: Distribution of rag-bag utilisation in the classification of patient reasons for contact by ICPC chapter

| ICPC Chapter | No. of rag-bag codes available | RFEs coded in rag-bags (n) | % of total rag-bag RFEs (N=4870) | % of total RFEs in chapter |
|---|---|---|---|---|
| General | 5 | 390 | 8.0 | 0.2 |
| Blood | 6 | 67 | 1.4 | 5.8 |
| Digestive | 5 | 156 | 3.2 | 1.3 |
| Eye | 5 | 681 | 14.0 | 19.9 |
| Ear | 4 | 321 | 6.6 | 6.7 |
| Cardiovascular | 4 | 152 | 3.1 | 1.3 |
| Musculoskeletal | 5 | 629 | 12.9 | 3.5 |
| Neurological | 5 | 98 | 2.0 | 1.5 |
| Psychological | 3 | 153 | 2.4 | 3.2 |
| Respiratory | 5 | 339 | 6.9 | 1.4 |
| Skin | 8 | 905 | 18.6 | 5.8 |
| Endocrine, nutritional, metabolic | 3 | 115 | 2.4 | 2.5 |
| Urological | 6 | 67 | 1.4 | 2.6 |
| Pregnancy, family planning | 8 | 59 | 1.2 | 1.6 |
| Female genital | 6 | 544 | 11.2 | 8.5 |
| Male genital | 7 | 133 | 2.7 | 17.6 |
| Social | 1 | 66 | 1.3 | 5.9 |
| Total rag-bags | 86 | 4870 | 100.0 | 3.3 |
| Rubrics -29 | 17 | 1790 | 36.8 | 1.3 |
| Rubrics -99 | 16 | 382 | 7.8 | 0.2 |
The majority of the 4,870 RFEs allocated to rag-bag codes were in the musculoskeletal (12.9%), skin (18.6%), eye (14.0%), and female genital (11.2%) chapters (Column 3). Utilisation rates were then calculated relative to total RFEs classified in each chapter (Column 4). One in five RFEs associated with the eye and 17.6% of those in the male genital system were classified in rag-bag codes. Although those in the female genital chapter represented more than 10% of all rag-bags, they accounted for only 8.5% of the chapter total. The remaining chapters showed a far smaller proportion of rag-bags to total (0.2%–5.9%).
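The headline rag-bag proportions can be checked from the totals reported in Table 3 and the 146,940 RFEs in the survey. This is a sketch of the arithmetic only, with illustrative variable names.

```python
TOTAL_RFES = 146_940     # all RFEs recorded in the AMTS
RAG_BAG_RFES = 4_870     # RFEs coded in any of the 86 rag-bag codes (Table 3)
RFES_IN_29 = 1_790       # RFEs coded in -29 rubrics
RFES_IN_99 = 382         # RFEs coded in -99 rubrics

# Rag-bag RFEs as a share of all RFEs.
rag_bag_share = round(100 * RAG_BAG_RFES / TOTAL_RFES, 1)

# -29 and -99 rubrics as a share of rag-bag RFEs, and of all RFEs.
share_29_99_of_rag_bags = round(100 * (RFES_IN_29 + RFES_IN_99) / RAG_BAG_RFES, 1)
share_29_99_of_all = round(100 * (RFES_IN_29 + RFES_IN_99) / TOTAL_RFES, 1)
```

This yields 3.3% of all RFEs in rag-bag codes, with the -29 and -99 rubrics making up 44.6% of rag-bag RFEs and 1.5% of all RFEs.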
Discussion
The results of this study demonstrate that the symptom component in ICPC facilitated the coding of a large proportion of patient RFEs. The breakdown of utilisation rates for rubrics in each of Components 1 and 7 demonstrated that patients frequently describe their reasons for attendance in symptomatic terms. Since no other classification provides such a wide selection of rubrics that describe symptoms and complaints it is unlikely that patient RFEs could be successfully classified with other available systems.
The breadth of codes used in classifying RFEs was good, with very few symptom and diagnostic codes not used. While the codes in Components 2–6, common to all chapters, were not all selected in every chapter, they were used sufficiently across chapters to be retained in their present cross-sectional form. Their uniform application facilitates meaningful analysis.
The fact that such a small number of codes accounted for more than two-thirds of symptomatic and diagnostic RFEs parallels Australian diagnostic data where the 22 most commonly managed problems account for more than 40% of all problems managed. However, as almost 95% of symptomatic and diagnostic RFEs were classified with only 17% of the available codes, a more detailed breakdown of some of these rubrics may be beneficial. For example, ICPC code T90 includes all forms of diabetes except that associated with pregnancy. From a population health viewpoint, differentiation between insulin-dependent and non-insulin-dependent diabetes is essential, and the rubric should be broken into two.
In many countries including Australia, the lack of specificity demonstrated in this example is being overcome by the development of extended versions of the classification for application in computerised clinical systems. Such extensions will allow the clinician to save a more specific code and description of the problem in the computer (e.g. IDDM and NIDDM) while retaining international standards at the upper ICPC level.
The very small proportion of total RFEs that were classed in the rag-bag codes suggests that ICPC provides sufficient individual rubrics to cover the large majority of patient RFEs described in general practice. Codes -29 and -99 accounted for only 1.5% of total RFEs. In contrast, in the Norwegian Sickness Benefit Register, where the practitioners coded data, 8.9% of all diagnoses were coded in this manner[3]. One would anticipate that secondary coding of patient RFEs would have resulted in higher relative use of rag-bags. Firstly, the patient description of the reason for contact is likely to be less specific than the doctor's problem label/diagnosis. Secondly, a clinician who is coding has greater knowledge of the problem than is available to secondary coders working with paper records. The differences in results may be due to the use of an expanded and highly detailed index in the present study[13] that led the coders to the correct rubric. In Brage's study[3], where a problem label was not listed in the index, practitioners may have tended to choose "other symptoms" and "other diagnoses" codes. While the data being classified are different in the two studies, the Norwegian study also noted the high proportion of rag-bag allocation in the eye chapter, and this suggests that the rubrics available in this chapter should be critically evaluated.
In future editions of ICPC, consideration should be given to the deletion of the 43 unused symptom and diagnostic codes. Detailed analysis of codes used less than once per 100,000 contacts may also provide data indicating individual codes which may not be required. Some regrouping may be beneficial. However, ICPC is an international classification and the needs of other countries must also be considered. Some rubrics infrequently applied in the AMTS may arise more often elsewhere.
Deletion of any codes should only be considered in light of related rag-bag usage. For example, in the chapter pertaining to the blood, two symptom codes were not used, but the rag-bag codes were also rarely adopted. In contrast, in the skin chapter the eight rag-bag codes were all used quite frequently, while the code for disability/impairment of the skin was never selected.
Conclusion
The results of this study suggest that ICPC is a valid tool for the classification of patient RFEs but that there is room for improvement. In future revisions of ICPC, consideration should be given to allocating more specific codes where a code is frequently utilised and covers multiple concepts. Review of the codes that were never or infrequently applied may also be beneficial, especially in chapters where use of rag-bag codes was relatively common.
Acknowledgments
The study on which this work is based was generously supported by a grant from the Australian National Health and Medical Research Council and the (then) Australian Commonwealth Department of Human Services and Health, through the General Practice Evaluation Program. The paper could not have been prepared without the statistical assistance of Geoffrey Sayer and the administrative support of Donna McIntyre of the Family Medicine Research Unit.
References