Development and Validation of an Administrative Algorithm to Identify Veterans With Epilepsy
Background: Accurate epilepsy identification in large health care systems has the potential to improve health care delivery and resource allocation. This article summarizes the creation and validation of a 3-tiered algorithm to identify veterans with epilepsy (VWE) receiving care from the Veterans Health Administration (VHA) using administrative data.
Methods: A 3-tier algorithm was developed to identify patients with epilepsy utilizing International Classification of Diseases diagnosis codes and prescription data. Tier 1 integrates seizure-specific diagnostic codes and antiseizure medication data. Tier 2 includes patients with inpatient visits. Tier 3 identifies untreated or less obvious cases by including patients with multiple outpatient visits. VHA administrative databases linked to the VHA Corporate Data Warehouse were used to identify VWE. Tier 1 validation was based on 625 patients and Tiers 2 and 3 validation was based on 300 total patients. Validation was conducted by expert epilepsy clinicians (epileptologists and a nurse care coordinator) comparing algorithm classifications against the International League Against Epilepsy definition of epilepsy to ascertain positive predictive values (PPVs). Annual trends for the number of VWE cases identified by the algorithm within the VHA are also presented.
Results: Tier 1 demonstrated a PPV of 85.1% (95% CI, 82.1%-87.8%). Tiers 2 and 3 offered broader identification and had lower PPVs: Tier 2 PPV was 61.9% (95% CI, 53.4%-70.4%) and Tier 3 PPV was 59.8% (95% CI, 52.5%-67.1%).
Conclusions: By efficiently segmenting veterans based on reliable administrative data, this 3-tiered algorithm supports enhanced surveillance, targeted health care provision, and optimal resource utilization. Though it is tailored to the VHA, this algorithmic approach holds promise for broader application in health care systems facing similar epidemiologic and administrative challenges.
Epilepsy affects about 4.5 million people in the United States and 150,000 new individuals are diagnosed each year.1,2 In 2019, epilepsy-attributable health care spending for noninstitutionalized people was around $5.4 billion and total epilepsy-attributable and epilepsy or seizure health care-related costs totaled $54 billion.3
Accurate surveillance of epilepsy in large health care systems can potentially improve health care delivery and resource allocation. A 2012 Institute of Medicine (IOM) report identified 13 recommendations to guide public health action on epilepsy, including validation of standard definitions for case ascertainment, identification of epilepsy through screening programs or protocols, and expansion of surveillance to better understand disease burden.4
A systematic review of validation studies concluded that it is reasonable to use administrative data to identify people with epilepsy in epidemiologic research. Combining the International Classification of Diseases (ICD) codes for epilepsy (ICD-10, G40-41; ICD-9, 345) with antiseizure medications (ASMs) could provide high positive predictive values (PPVs) and combining symptoms codes for convulsions (ICD-10, R56; ICD-9, 780.3, 780.39) with ASMs could lead to high sensitivity.5 However, identifying individuals with epilepsy from administrative data in large managed health care organizations is challenging.6 The IOM report noted that large managed health care organizations presented varying incidence and prevalence estimates due to differing methodology, geographic area, demographics, and definitions of epilepsy.
The Veterans Health Administration (VHA) is the largest integrated US health care system, providing care to > 9.1 million veterans.7 To improve the health and well-being of veterans with epilepsy (VWEs), a network of sites was established in 2008 called the US Department of Veterans Affairs (VA) Epilepsy Centers of Excellence (ECoE). Subsequent to the creation of the ECoE, efforts were made to identify VWEs within VHA databases.8,9 Prior to fiscal year (FY) 2016, the ECoE adopted a modified version of a well-established epilepsy diagnostic algorithm developed by Holden et al for large managed care organizations.10 The original algorithm identified patients by cross-matching ASMs with ICD-9 codes for an index year. However, it failed to capture a considerable number of stable patients with epilepsy in the VHA due to incomplete documentation and produced false positives by including patients identified from diagnostic clinics. The modified algorithm the ECoE used prior to FY 2016 considered additional prior years and excluded encounters from diagnostic clinics. The result was an improvement in the sensitivity and specificity of the algorithm. Researchers evaluating 500 patients with epilepsy estimated that the modified algorithm had a PPV of 82.0% (95% CI, 78.6%-85.4%).11
After implementation of ICD-10 codes in the VHA in FY 2016, the task of reliably and efficiently identifying VWE led to a 3-tier algorithm. This article presents a validation of the different tiers of this algorithm after the implementation of ICD-10 diagnosis codes and summarizes the surveillance data collected over the years within the VHA showing the trends of epilepsy.
Methods
The VHA National Neurology office commissioned a Neurology Cube dashboard in FY 2021 in collaboration with VHA Support Service Center (VSSC) for reporting and surveillance of VWEs as a quality improvement initiative. The Neurology Cube uses a 3-tier system for identifying VWE in the VHA databases. VSSC programmers extract data from the VHA Corporate Data Warehouse (CDW) and utilize Microsoft SQL Server and Microsoft Power BI for Neurology Cube reports. The 3-tier system identifies VWE and divides them into distinct groups. The first tier identifies VWE with the highest degree of confidence; Tiers 2 and 3 represent identification with successively lesser degrees of confidence (Figure 1).

Tier 1
Definition. For a given index year and the preceding 2 years, any of the following diagnosis codes on ≥ 1 clinical encounter are considered: 345.xx (epilepsy in ICD-9), 780.3x (other convulsions in ICD-9), G40.xxx (epilepsy in ICD-10), R40.4 (transient alteration of awareness), R56.1 (posttraumatic seizures), or R56.9 (unspecified convulsions). To reduce false positive rates, EEG clinic visits, which may include long-term monitoring, are excluded. Patients identified with ICD codes are then evaluated for an ASM prescription for ≥ 30 days during the index year. ASMs are listed in Appendix 1.
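The Tier 1 definition can be illustrated with a minimal Python sketch. This is not the VSSC implementation, which runs against the CDW in Microsoft SQL Server. The record types `Encounter` and `AsmSupply`, their field names, the prefix-based code matching, and the choice to sum days supplied across prescriptions within the index year are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date

# Code families from the Tier 1 definition; matching by string prefix is an assumption.
ICD9_PREFIXES = ("345", "780.3")
ICD10_PREFIXES = ("G40", "R40.4", "R56.1", "R56.9")


@dataclass(frozen=True)
class Encounter:            # hypothetical record layout
    patient_id: str
    visit_date: date
    icd_version: int        # 9 or 10
    code: str
    eeg_clinic: bool        # True for EEG clinic visits, which are excluded


@dataclass(frozen=True)
class AsmSupply:            # hypothetical record layout
    patient_id: str
    fiscal_year: int
    days_supplied: int


def fiscal_year(d: date) -> int:
    """Fiscal year runs October 1 to September 30 (FY 2016 starts October 1, 2015)."""
    return d.year + 1 if d.month >= 10 else d.year


def has_tier1_code(e: Encounter) -> bool:
    prefixes = ICD9_PREFIXES if e.icd_version == 9 else ICD10_PREFIXES
    return e.code.startswith(prefixes)


def tier1_patients(encounters, supplies, index_fy):
    """Return patient IDs meeting the sketched Tier 1 criteria for index_fy."""
    # Step 1: >= 1 qualifying code in the index year or the 2 preceding years,
    # ignoring EEG clinic encounters.
    coded = {
        e.patient_id
        for e in encounters
        if not e.eeg_clinic
        and index_fy - 2 <= fiscal_year(e.visit_date) <= index_fy
        and has_tier1_code(e)
    }
    # Step 2: >= 30 days of ASM supply during the index year
    # (summing across prescriptions is an assumption).
    days = {}
    for s in supplies:
        if s.fiscal_year == index_fy:
            days[s.patient_id] = days.get(s.patient_id, 0) + s.days_supplied
    return {p for p in coded if days.get(p, 0) >= 30}
```

In this sketch the diagnosis lookback spans 3 fiscal years while the ASM requirement applies only to the index year, mirroring the asymmetry in the definition above.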
Validation. The development and validation of ICD-9 diagnosis codes crossmatched with an ASM prescription in the VHA has been published elsewhere.11 In FY 2017, after implementation of ICD-10 diagnostic codes, Tier 1 development and validation were performed in 2 phases. Even though Tier 1 study phases were conducted and completed during FY 2017, the patients for Tier 1 were identified from evaluation of FY 2016 data (October 1, 2015, to September 30, 2016). After the pilot analysis, the Tier 1 definition was implemented, and a chart review of 625 randomized patients was conducted at 5 sites for validation. Adequate preliminary data was not available to perform a sample size estimation for this study. Therefore, a practical target of 125 patients was set for Tier 1 from each site to obtain a final sample size of 625 patients. This second phase validated that the crossmatch of ICD-10 diagnosis codes with ASMs had a high PPV for identifying VWE.
Tiers 2 and 3
Definitions. For an index year, Tier 2 includes patients with ≥ 1 inpatient encounter documentation of either ICD-9 345.xx or ICD-10 G40.xxx, excluding EEG clinics. Tier 3 includes patients who have had ≥ 2 outpatient encounters with diagnosis codes 345.xx or G40.xxx on 2 separate days, excluding EEG clinics. Tiers 2 and 3 do not require ASM prescriptions; this helps to identify VWEs who may be getting their medications outside of VHA or those who have received a new diagnosis.
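A comparable sketch for Tiers 2 and 3 follows, again as an illustration rather than the production logic. It assumes the input has already been restricted to the index year, that patients captured by a higher tier are passed in through the hypothetical `already_assigned` argument so the groups stay distinct, and that a patient meeting both definitions is counted in Tier 2. The `Encounter` layout and function names are hypothetical.

```python
from collections import defaultdict
from datetime import date
from typing import NamedTuple


class Encounter(NamedTuple):   # hypothetical record layout, index-year encounters only
    patient_id: str
    visit_date: date
    code: str
    inpatient: bool
    eeg_clinic: bool


def is_epilepsy_code(code: str) -> bool:
    # ICD-9 345.xx or ICD-10 G40.xxx; prefix matching is an assumption.
    return code.startswith(("345", "G40"))


def tier2_and_tier3(encounters, already_assigned=frozenset()):
    """Return (tier2_ids, tier3_ids) for one index year under the sketched rules."""
    inpatient_ids = set()
    outpatient_days = defaultdict(set)
    for e in encounters:
        if e.eeg_clinic or not is_epilepsy_code(e.code):
            continue
        if e.patient_id in already_assigned:
            continue
        if e.inpatient:
            inpatient_ids.add(e.patient_id)
        else:
            # Track distinct calendar days so 2 codes on one day count once.
            outpatient_days[e.patient_id].add(e.visit_date)
    tier2 = inpatient_ids
    tier3 = {p for p, days in outpatient_days.items() if len(days) >= 2} - tier2
    return tier2, tier3
```

Storing outpatient visit dates in a set is what enforces the "2 separate days" requirement: two qualifying codes recorded on the same day do not satisfy Tier 3.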
Validations. Tiers 2 and 3 were included in the epilepsy identification algorithm in FY 2021 after validation was performed on a sample of 8 patients in each tier. Five patients were subsequently identified as having epilepsy in Tier 2 and 6 patients were identified in Tier 3. A more comprehensive validation of Tiers 2 and 3 was performed during FY 2022 that included patients at 5 sites seen during FY 2019 to FY 2022. Since yearly trends showed only about 8% of total patients were identified as having epilepsy through Tiers 2 and 3, we sought ≥ 20 patients per tier for the 5 sites for a total of 200 patients to ensure representation across the VHA. The final count was 126 patients for Tier 2 and 174 patients for Tier 3 (n = 300).
Gold Standard Criteria for Epilepsy Diagnosis
We used the International League Against Epilepsy (ILAE) definition of epilepsy for the validation of the 3 algorithm tiers. ILAE defines epilepsy as ≥ 2 unprovoked (or reflex) seizures occurring > 24 hours apart or 1 unprovoked (or reflex) seizure and a probability of further seizures similar to the general recurrence risk (≥ 60%) after 2 unprovoked seizures, occurring over the next 10 years.12
A standard protocol was provided to evaluators to identify patients using the VHA Computerized Patient Record System (Appendix 1). After review, evaluators categorized each patient in 1 of 4 ways: (1) Yes, definite: The patient’s health care practitioner (HCP) believes the patient has epilepsy and is treating with medication; (2) Yes, uncertain: The HCP has enough suspicion of epilepsy that a medication is prescribed, but uncertainty is expressed of the diagnosis; (3) No, definite: The HCP does not believe the patient has epilepsy and is therefore not treating with medication for seizure; (4) No, uncertain: The HCP is not treating with medication for epilepsy, because the diagnostic suspicion is not high enough, but there is suspicion for epilepsy.
As a quality improvement operational project, the Epilepsy National Program Office approved this validation project and determined that institutional review board approval was not required.
Statistical Analysis
Counts and percentages were computed for categories of epilepsy status. PPV of each tier was estimated with asymptotic 95% CIs.
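As an illustration of this calculation, the sketch below computes a PPV and a normal-approximation (Wald) 95% CI, which is one common asymptotic interval; the article does not state which asymptotic method was used. The function name and the example counts are hypothetical: 78 presumed epilepsy cases out of 126 was chosen only because it is consistent with the reported Tier 2 percentage, and the split between definite and uncertain cases is invented for the example.

```python
from math import sqrt
from statistics import NormalDist


def ppv_with_wald_ci(yes_definite, yes_uncertain, total, level=0.95):
    """PPV with "yes, definite" and "yes, uncertain" combined, plus a Wald CI."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = (yes_definite + yes_uncertain) / total
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # about 1.96 for a 95% CI
    half_width = z * sqrt(p * (1 - p) / total)
    # Clamp to [0, 1] because the Wald interval can overshoot near the boundaries.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical counts for illustration only (not reported in the article).
ppv, lower, upper = ppv_with_wald_ci(yes_definite=60, yes_uncertain=18, total=126)
print(f"PPV {ppv:.1%} (95% CI, {lower:.1%}-{upper:.1%})")
```

The Wald interval is symmetric around the point estimate; for proportions close to 0 or 1, or for small samples, a Wilson or exact interval is often preferred.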
Results
ICD-10 codes for 480 patients were evaluated in Tier 1 phase 1; 13.8% were documented with G40.xxx, 27.9% with R56.1, 34.4% with R56.9, and 24.0% with R40.4 (Appendix 2). In total, 68.1% fulfilled the criteria of epilepsy, 19.2% did not, and 12.7% were uncertain. From the validation of Tier 1 phase 2 (n = 625), the PPV of the algorithm for patients presumed to have epilepsy (definite and uncertain) was 85.1% (95% CI, 82.1%-87.8%) (Table).

Of 300 patients evaluated, 126 (42.0%) were evaluated for Tier 2 with a PPV of 61.9% (95% CI, 53.4%-70.4%), and 174 (58.0%) patients were evaluated for Tier 3 with a PPV of 59.8% (95% CI, 52.5%-67.1%). As in Tier 1, patients presumed to have epilepsy (definite and uncertain) were combined to calculate the PPV. Estimates of VHA VWE counts were computed for each tier from FY 2014 to FY 2023 using the VSSC Neurology Cube (Figure 2). For all years, > 92% of patients were classified using the Tier 1 definition.

Discussion
The development and validation of the 3-tier diagnostic algorithm represents an important advancement in the surveillance and management of epilepsy among veterans within the VHA. The validation of this algorithm also demonstrates its practical utility in a large, integrated health care system.
Specific challenges were encountered when attempting to use pre-existing algorithms; these challenges included differences in the usage patterns of diagnostic codes and the patterns of ASM use within the VHA. These challenges prompted the need for a tailored approach, which led to the development of this algorithm. The inclusion of additional ICD-10 codes led to further revisions and subsequent validation. While many of the basic concepts of the algorithm, including ICD codes and ASMs, could work in other institutions, it would be wise for health care organizations to develop their own algorithms because of certain variables, including organizational size, patient demographics, common comorbidities, and the specific configurations of electronic health records and administrative data systems.
Studies have shown that ICD-10 codes for epilepsy (G40.* and/or R56.9) perform well in identifying epilepsy whether they are assigned by neurologists (sensitivity, 97.7%; specificity, 44.1%; PPV, 96.2%; negative predictive value, 57.7%), or in emergency department or hospital discharges (PPV, 75.5%).13,14 The pilot study of the algorithm’s Tier 1 development (phase 1) evaluated whether the selected ICD-10 diagnostic codes accurately included the VWE population within the VHA and revealed that while most codes (eg, epilepsy [G40.xxx]; posttraumatic seizures [R56.1]; and unspecified convulsions [R56.9]), had a low false positive rate (< 16%), the R40.4 code (transient alteration of awareness) had a higher false positivity of 42%. While this is not surprising given the broad spectrum of conditions that can manifest as transient alteration of awareness, it underscores the inherent challenges in diagnosing epilepsy using diagnosis codes.
In phase 2, the Tier 1 algorithm was validated as effective for identifying VWE in the VHA system, as its PPV was determined to be high (85%). In comparison, Tiers 2 and 3, whose criteria did not require data on VHA-prescribed ASM use, had lower PPVs (about 60% for both). This was thought to be acceptable because Tiers 2 and 3 represent a smaller population of the identified VWEs (about 8%). These VWEs may otherwise have been missed, partly because veterans are not required to get ASMs from the VHA.
Upon VHA implementation in FY 2021, this diagnostic algorithm exhibited significant clinical utility when integrated within the VSSC Neurology Cube. It facilitated an efficient approach to identifying VWEs using readily available databases. This led to better tracking of real-time epilepsy cases, which in turn improved resource allocation and supported targeted intervention strategies such as identifying patients with drug-resistant epilepsy and optimizing telehealth and patient outreach to raise awareness of epilepsy care resources within the VHA. Meanwhile, data acquired by the algorithm over the decade since its development (FY 2014 to FY 2023) contributed to more accurate epidemiologic information and identification of historic trends. Development of the algorithm represents one of the ways ECoEs have led to improved care for VWEs. ECoEs have been shown to improve health care for veterans in several metrics.15
A strength of this study is the rigorous multitiered validation process to confirm the diagnostic accuracy of ICD-10 codes against the gold standard ILAE definition of epilepsy to identify “definite” epilepsy cases within the VHA. The use of specific ICD codes further enhances the precision of epilepsy diagnoses. The inclusion of ASMs, which are sometimes prescribed for conditions other than epilepsy, could potentially inflate false positive rates.16
This study focused exclusively on the identification and validation of definite epilepsy cases within the VHA VSSC database, employing more stringent diagnostic criteria to ensure the highest level of certainty in ascertaining epilepsy. It is important to note there is a separate category of probable epilepsy, which involves a broader set of diagnostic criteria. While not covered in this study, probable epilepsy would be subject to future research and validation, which could provide insights into a wider spectrum of epilepsy diagnoses. Such future research could help refine the algorithm’s applicability and accuracy and potentially lead to more comprehensive surveillance and management strategies in clinical practice.
This study highlights the inherent challenges in leveraging administrative data for disease identification, particularly for conditions such as epilepsy, where diagnostic clarity can be complex. However, other conditions such as multiple sclerosis have noted similar success with the use of VHA administrative data for categorizing disease.17
Limitations
The algorithm discussed in this article is, in and of itself, generalizable. However, the validation process was unique to the VHA patient population, limiting the generalizability of the findings. Documentation practices and HCP attitudes within the VHA may differ from those in other health care settings. Identifying people with epilepsy can be challenging because of changing definitions of epilepsy over time. In addition to clinical evaluation, EEG and magnetic resonance imaging results, response to ASM treatment, and video-EEG monitoring of habitual events all can help establish the diagnosis. Therefore, studies may vary in how inclusive or exclusive the criteria are. ASMs such as gabapentin, pregabalin, carbamazepine, lamotrigine, topiramate, and valproate are used to treat other conditions, including headaches, generalized pain, and mood disorders. Consequently, including these ASMs in the Tier 1 definition may have increased the false positive rate. Additional research is needed to evaluate whether excluding these ASMs from the algorithm based on specific criteria (eg, dose of ASM used) can further refine the algorithm to identify patients with epilepsy.
Further refinement of this algorithm may also occur as technology changes. Future electronic health records may allow better tracking of different epilepsy factors, the integration of additional diagnostic criteria, and the use of natural language processing or other forms of artificial intelligence.
Conclusions
This study presents a significant step forward in epilepsy surveillance within the VHA. The algorithm offers a robust tool for identifying VWEs with good PPVs, facilitating better resource allocation and targeted care. Despite its limitations, this research lays a foundation for future advancements in the management and understanding of epilepsy within large health care systems. Since this VHA algorithm is based on ASMs and ICD diagnosis codes from patient records, other large managed health care systems also may be able to adapt this algorithm to their data specifications.

