D. Luster,6
U. Melcher,1
R. Murch,7
H. Scherm,4
R. C. Seem,8
J. L. Sherwood,4
B. W. Sobral,9 and
S. A. Tolin10
Oklahoma State University, Stillwater, Oklahoma,1 Federal Bureau of Investigation, Quantico, Virginia,2 Cobb Consulting Services, Kennewick, Washington,3 University of Georgia, Athens, Georgia,4 Colorado State University, Ft. Collins, Colorado,5 USDA-ARS, Ft. Detrick, Maryland,6 Institute for Defense Analysis, Alexandria, Virginia,7 Cornell University, Geneva, New York,8 Virginia Bioinformatics Institute, Blacksburg, Virginia,9 Virginia Polytechnic Institute and State University, Blacksburg, Virginia,10
SUMMARY INTRODUCTION Vulnerability of U.S. Crops, Rangelands, and Forests History of Plant Pathogens as Bioweapons ROLE OF MICROBIAL FORENSICS IN CROP BIOSECURITY USE OF SURROGATE PATHOSYSTEMS AS MODELS COMPONENTS OF A STRONG MICROBIAL FORENSICS CAPABILITY Sampling Methods, Sample Size, and Quality On-site disease assessment. Characteristics of a good sample. Sample size and sampling pattern. Logistics. Sample storage. First detectors and first responders. Epidemiological Tools and Models To Support Forensic Analysis Comparison and Validation of Current Microbial Forensic Identification and Typing Methods Continuum of attribution. Criteria for selecting appropriate forensic typing methods. Non-nucleic acid-based methods. Serological techniques. Nucleic acid-based methods. Importance of Genome Dynamics, Phylogenetics, and Systematics Influence of Mutation, Evolution, and Environment Background occurrence. Molecular markers for forensic analysis. Confidence levels. Importance of confidence to forensics. Pathogen and Host Gene Expression and Protein Modification Posttranslational protein modification. Host-encoded products. Pathogen-generated secreted products. Standard Criteria for Isolate Discrimination and/or Matching Integrated Informatics and Data Analysis Strategy for Microbial Forensics BUILDING PLANT PATHOGEN FORENSIC CAPABILITY: NEAR- AND LONG-TERM STRATEGIES Recent and Current Initiatives Gaps Assessment and Recommendations Gaps in personnel. Gaps in infrastructure. Gaps in research and technology. Building Plant Pathogen Forensic Capability CONCLUSIONS ACKNOWLEDGMENTS REFERENCES
| SUMMARY |
|---|
| INTRODUCTION |
|---|
Recent acute, but unintentional, introductions of nonindigenous plant pathogens (or their vectors) demonstrate the range of damage and consequences associated with newly introduced exotic pathogens (Table 1). If such an outbreak were to be caused by the intentional release of a naturally occurring or engineered biological agent, the political, economic, and societal impacts would be considerable.
In wealthy countries, a deliberate pathogen introduction event could (i) result in severe negative impacts on crop yield and quality, (ii) cause significant public shock and/or panic due to a loss of confidence in a portion of the food supply, and (iii) negatively impact the national economy, particularly in rural and agricultural sectors, due to yield decreases, reduced income from commodities sold, and the potential effects of quarantines and/or the loss of international markets. In countries of the third world, particularly those in which a single crop (such as rice or cassava) is the primary food commodity for a large segment of the population, it could lead to human hunger, suffering and political disruption, as have resulted in the past from natural and accidental plant pathogen introductions.
Plant pathogens of high risk for the United States that are designated select agents under the Code of Federal Regulations, title 7, part 331, by the Agricultural Bioterrorism Protection Act of 2002 and the United States Department of Agriculture (USDA) Animal and Plant Health Inspection Service (APHIS) (http://www.apsnet.org/online/feature/BioSecurity/ and http://www.cdc.gov/od/sap/docs/salist.pdf), are listed in Table 2. This list is parallel to select agent lists for human and zoonotic diseases, except that all plant pathogen select agents are nonindigenous pathogens not yet known to occur in the United States (termed "exotic" in the Plant Pest Act). Strict regulations, registrations, restrictions, and security are required for their handling and investigation (http://www.aphis.usda.gov/programs/ag_selectagent/index.html). Such regulations may be useful in attributing a crime involving a select agent. However, a plant pathogen select agent introduced into the United States with little likelihood of eradication may be delisted to facilitate the research needed for effective postintroduction disease management. The recent removal of soybean rust from the APHIS select agent list after the natural introduction of the causal fungus into the United States in the fall of 2004 (161, 171) and the concomitant delisting of Plum pox virus (PPV) are two examples of this policy.
There have been no documented cases, as yet, of the deliberate use of pathogens to attack crops or other plants. However, a posture of preparedness dictates that reasonable steps be taken to ensure that appropriate crop biosecurity capabilities be in place before a devastating event occurs, not afterwards. To that end, the examination of natural or accidental plant pathogen or pest introductions can provide insight into the possible impacts of a successful deliberate attack. For example, the potato blight epidemic in Ireland (1845 to 1846) led to extensive famine, resulting in the deaths of 1 million and the emigration of an additional 1.5 million Irish (29, 96). Brown spot of rice contributed to the Great Bengal Famine of 1943. In the United States, a leaf blight in 1970 destroyed about 20% of a corn crop valued at $1 billion (151).
| ROLE OF MICROBIAL FORENSICS IN CROP BIOSECURITY |
|---|
The degree of confidence with which forensic analyses can support identification of a specific microbe, reconstruction of its method of introduction into a particular location, and identification of the perpetrator depends on many factors. Lag times may occur at several stages of the investigation. The first is the time between the introduction of a pathogen and its detection, which is affected by many factors, including weather conditions before, during, and after the introduction. A second lag is the time required to develop and execute an appropriate sampling protocol. Protocols must include validated techniques that minimize the time between on-site sample collection and arrival at a forensics laboratory. Third, the time required for stringent laboratory assessment and the resolution, reliability, and repeatability of the chosen analytical methods affect the success of a forensic investigation. In many cases it may be far easier to determine exclusion (assurance that a particular pathogen or person is not involved in the incident) than absolute attribution (evidence that may uniquely associate a particular pathogen/isolate or person to the incident).
A critical first question with respect to a plant disease outbreak is whether a crime has occurred. Many diseases are already endemic, and once-exotic diseases that are not eradicated may become endemic relatively quickly after introduction. An intentional introduction of a plant pathogen as a biocrime or bioterrorist event might not be recognized as such. Thus, tools for better pathogen resolution, more-relevant background information, and more-robust surveillance mechanisms are needed to better evaluate whether a disease is natural or human incited. On the positive side, crop producers and plant pathologists are already poised to move quickly to apply management strategies to control disease. Therefore, rapid determination of whether criminal activity has occurred is crucial so that responders know if the event should be handled as a crime, with appropriate steps for attribution, or solely as a containment effort. Forensic science can assist in this endeavor.
As the nascent discipline of plant pathogen forensics develops, standard crime scene processing and evidence handling protocols must be validated and adapted to plant pathogen forensics applications. It may be appropriate to develop some new technologies specific for crime scenes involving crops, forests, nurseries, orchards, or rangelands. A thorough analysis is required to identify and assess the information, capabilities, tools, and resources already in existence. Once these are brought to bear on the new applications of forensic science, it will be possible to identify remaining gaps and the needed capabilities to fill them, a step that will serve as the basis for the development of a forward strategy. This subdiscipline of forensics specifically targeted toward microbial pathogens, as applied to bioterrorism and biocrimes involving humans and animals, has been developing over the past few years. However, few if any field or laboratory methods, standard operating procedures (SOPs), or protocols have yet been specifically developed and rigorously validated for application to plant pathogens. As plant pathogen forensics becomes established as a separate subdiscipline of forensic science, a major early area of opportunity will be to critically assess, select, and shepherd existing methods, SOPs, and protocols through an appropriate process so that "sets" of validated "tools" are available and defensible should a crop bioterrorism event occur. To accomplish these near- and long-term goals, plant pathologists and forensic scientists (especially those working in microbial forensics) need to plan and work together.
| USE OF SURROGATE PATHOSYSTEMS AS MODELS |
|---|
| COMPONENTS OF A STRONG MICROBIAL FORENSICS CAPABILITY |
|---|
SOPs for the collection of microbial forensic field samples must allow for variation among crop species and suspected pathogens. Forensic field samples may include whole plants, selected plant parts, plant surface swabs or exudates, soil (with or without root tissue), suspected insect vectors, natural or irrigation water in or near the fields, air samples, and/or biological samples (alternative weed hosts and soil or aquatic organisms, etc.). Containers must be clean and unused, and the samples must be collected directly into the container. Minimum documentation includes an administrative log, a sample log, the complete chain of custody, a collection site map(s) sufficient to allow repeat sampling from the same location (within or among fields), and a laboratory submission or transferal document providing detailed information on the crop, field history, and environment. Photographs showing symptoms, field layout, and other relevant details may supplement, but not replace, this documentation.
Characteristics of a good sample. What constitutes a "good" sample varies depending on the patterns of disease intensity, the pathogen, and the host. If multiple disease foci are present in the field, samples should be collected from a representative number of these locations, as well as from outside the focal areas [Nutter, Phytopathology 94(Suppl.):S130, 2004]. Since pathogen titers often differ in leaves, stems, roots, and flowers and with the distance from the site(s) of initial infection, it is important to sample from different plants and different plant parts. The ease of pathogen collection may vary with the season; for example, tree fruit phytoplasmas overwinter in tree roots and move into above-ground branches in the spring, while PPV is absent from tree samples collected in summer when the temperature rises above 30°C. When pathogen detection is carried out with sensitive assays such as enzyme-linked immunosorbent assay (ELISA) or PCR, it is often practical to combine tissue samples from several plants and analyze them together as a pooled sample; this form of group testing allows a larger proportion of the plant population to be tested with a minimal number of assays, thereby improving the detection limit (77) and reducing laboratory resource demands. A drawback of sample pooling is the loss of spatial and disease incidence information that would result from individual sample processing; however, plants from positive pooled samples could be retested individually in a second round of assays if information on the exact location and disease intensity of each infected plant is needed.
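The trade-off described for pooled samples can be made concrete with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes equal pool sizes, independently infected plants, and a perfect assay (no false positives or negatives), none of which is stated in the text.

```python
import math


def pooled_incidence_estimate(positive_pools: int, total_pools: int, pool_size: int) -> float:
    """Estimate per-plant incidence from the fraction of positive pools.

    Assumptions (not from the source): equal pool sizes, independent
    plants, and an assay with perfect sensitivity and specificity.
    """
    if total_pools <= 0 or pool_size <= 0:
        raise ValueError("total_pools and pool_size must be positive")
    if not 0 <= positive_pools <= total_pools:
        raise ValueError("positive_pools must lie between 0 and total_pools")
    negative_fraction = 1.0 - positive_pools / total_pools
    # A pool is negative only if every plant in it is negative:
    # P(pool negative) = (1 - p) ** pool_size, solved here for p.
    return 1.0 - negative_fraction ** (1.0 / pool_size)


def two_round_assay_count(n_plants: int, pool_size: int, positive_pools: int) -> int:
    """Upper bound on assays when plants in positive pools are retested singly."""
    first_round = math.ceil(n_plants / pool_size)
    second_round = positive_pools * pool_size
    return first_round + second_round
```

Under these assumptions, 100 plants tested in pools of 10 with 2 positive pools would require at most 30 assays in two rounds, compared with 100 assays for individual testing from the outset.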
If necrotic lesions are present, it is best to sample from the lesion edges, where living plant tissue better supports active pathogen growth, as the lesion centers may subsequently be invaded by saprophytic microbes. Seeds are a good source of seed-borne pathogens such as Soybean mosaic virus (SMV) (74), and underground stems and tubers may serve as a source of pathogens, as occurs with the potato ring rot bacterium, Clavibacter michiganensis subsp. sepedonicus (43). Certain specialized pathogen structures, such as the galls of the corn smut fungus, Ustilago maydis, or the tumors produced by the crown gall bacterium, Agrobacterium tumefaciens, may be collected directly (38, 54).
Sample size and sampling pattern. The number of samples collected should be representative of the impacted area and be defensible scientifically and legally. Sampling pattern and sampling size considerations have been described for numerous pathosystems, including those that involve exotic or once-exotic pathogens such as PPV (78) or the citrus canker pathogen, Xanthomonas axonopodis pv. citri (117). In general, sampling for detection (i.e., presence or absence of a pathogen or disease) in a given field requires a different sampling pattern and sample size than sampling to determine the disease incidence or severity in the field. Sometimes, presence-absence data at the field level (prevalence) can be more important for forensic purposes than incidence or severity data, e.g., to determine the overall geographical extent of the disease or in deciding whether a given field should be placed under quarantine. In such cases, sampling can concentrate on high-risk areas within a field, such as borders or wet areas, depending on the pathogen.
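For presence-absence sampling, a common first approximation of the required sample size follows from the probability of missing every infected plant. The sketch below is a hedged illustration, not a protocol from the cited studies: it assumes simple random sampling from a large field, independent plants, and a perfect assay, and it does not model the targeted sampling of high-risk areas mentioned above. Function names are hypothetical.

```python
import math


def detection_probability(incidence: float, n_samples: int) -> float:
    """Probability that at least one of n randomly sampled plants is infected."""
    return 1.0 - (1.0 - incidence) ** n_samples


def samples_for_detection(incidence: float, confidence: float) -> int:
    """Smallest n with detection probability >= confidence.

    Assumes random sampling, a large population, and a perfect assay.
    """
    if not 0.0 < incidence < 1.0:
        raise ValueError("incidence must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve (1 - incidence) ** n <= 1 - confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - incidence))
```

With these assumptions, detecting a pathogen present in 10% of plants with 95% confidence requires 29 samples, whereas an incidence of 1% requires 299.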
In most forensic applications, detailed information on disease incidence or severity will be needed to develop spatial disease intensity maps to identify potential points of inoculation. In this case, sample size and sampling pattern considerations are critical. For assessment of disease incidence, Delp et al. (47) advocate the use of a stratified random sampling pattern in which the field is first divided into several strata (e.g., regions of higher or lower disease risk), followed by the random collection of samples within each stratum. Using this sampling design, percent error in disease estimates is reduced considerably compared with commonly used systematic sampling designs such as diagonal or W-shaped patterns. Among the systematic sampling designs, entire-field X- and W-shaped patterns are equivalent to each other and superior to diagonal or partial-field sampling patterns (101). When applying these sampling designs in the field, the sampler must be mindful of the fact that there may not be a single point of inoculation. For example, deliberate release of a pathogen by airplane may result in line or area sources of inoculum (128).
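The stratified random design can be sketched in a few lines. This is a minimal illustration rather than the procedure of Delp et al.: the strata names, the equal allocation per stratum, and the function name are assumptions made for the example.

```python
import random


def stratified_random_sample(strata, per_stratum, seed=None):
    """Draw a simple random sample of plant IDs within each stratum.

    strata: dict mapping a stratum name to a list of plant identifiers.
    per_stratum: number of plants to draw from each stratum (equal
    allocation is an assumption; proportional allocation is also common).
    seed: optional seed so that the draw can be documented and repeated.
    """
    rng = random.Random(seed)
    selection = {}
    for name, plant_ids in strata.items():
        # Never request more plants than the stratum contains.
        k = min(per_stratum, len(plant_ids))
        selection[name] = rng.sample(list(plant_ids), k)
    return selection


# Hypothetical field divided into three risk strata.
field = {
    "border": list(range(0, 50)),
    "interior": list(range(50, 200)),
    "wet_area": list(range(200, 203)),
}
picks = stratified_random_sample(field, per_stratum=5, seed=1)
```

Recording the seed alongside the sample log is one way to make the selection reproducible, which may help when a sampling plan has to be defended later.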
Logistics. Collection and documentation of 30 to 40 samples could occupy a two-person team for 8 to 12 h, particularly if travel is required between sample locations. Protocols for forensic field sampling should be designed in consideration of the number of available personnel; similarly, sample numbers should be reasonable in the context of the analytic and volume capabilities of the facility performing the laboratory analyses and/or diagnoses. However, no protocol should be designed solely based on resource limitations. The use of custody seals will alert the recipient to any tampering between the time of collection and receipt of the samples.
Because it is not possible to imagine every possible scenario that may require a microbial forensic investigation, a general SOP may not always be available. This limitation should not preclude attempts to collect critical evidence; however, the bases for current protocols and the investigator's experience should be relied on when adapting existing procedures to unique situations. When applying this "common sense and experience rule," all steps and information accrued must be well documented. When samples are obtained from multiple locations (fields and locations within a field, etc.), appropriate decontamination of personnel and equipment is necessary to prevent the investigator from becoming a vector, spreading the pathogen, contaminating pathogen-free samples, and possibly causing false positives. The extent of decontamination will depend on the pathogenicity, virulence, aggressiveness, survival, and mode(s) of dissemination of the suspect organism.
Sample storage. Long-term storage of forensic microbial samples prior to analysis may be necessary; thus, great care must be taken to preserve the integrity and security of the samples. Some plant pathogens may be stored more successfully than others. Storage of viable pathogen cultures is a matter very different from preservation of desiccated leaf tissue, seeds, or fruits. Documentation of environmental conditions during storage is required, and chain-of-custody records must reflect all aspects of storage conditions and exposure to the environment, including records of individuals who may have access to the samples.
First detectors and first responders. "First detectors" on the scene of a deliberate plant pathogen introduction are likely to be growers, crop consultants, Master Gardeners, extension agents, or other local personnel not affiliated with the government. "First responders," individuals authorized to respond and take action after a potential deliberate introduction, generally arrive on the scene later, after being notified by first detectors. Note that the designation "first responders" differs here from that traditionally used for human targets, where the first responders are police or firefighters, etc. Clearly, timely and effective management of a crime scene will be impossible unless first detectors and first responders are equipped with the knowledge and skills to recognize that a crime has occurred and to react appropriately. Although ongoing efforts by the National Plant Diagnostic Network (NPDN) (http://www.npdn.org/) include training of first detectors and first responders (170), the number of plant pathologists trained in field applications of the discipline is on the decline. If the current levels of funding for extension, applied research, and plant disease epidemiology continue to decline, this lack of personnel will become one of the most serious gaps in crop biosecurity efforts.
Climate matching, one of the most commonly used proactive epidemiological tools, helps to identify areas where and when anomalous disease events might occur, the probability that a pathogen could become established at a specific location, and how rapidly it might spread. Use of this tool has increased in popularity as global climate and species occurrence databases improve and expand (135). An empirical "bioclimate envelope" of the pathogen's environmental requirements is derived based on its current distribution, and long-term climate databases are then used to identify which geographical locations meet these requirements (8). Popular software tools include Climex (172, 173, 174), FloraMap (82), and AWhere-ACT (S. N. Collis and J. D. Corbett, Abstr. 4th Int. Conf. Integrating GIS Environ. Modeling [GIS/EM4]: Problems, Prospects and Research Needs, 2 to 8 September 2000, Banff, Alberta, Canada, http://www.colorado.edu/research/cires/banff/pubpapers/152/; F. Zermoglio, J. Corbett, and S. Collis, Abstr. New Tools Spatial Data Anal.: Proc. Center Spatially Integrated Social Sci. Specialist Meet., 10 to 11 May 2002, Santa Barbara, Calif., http://www.csiss.org/events/meetings/spatial-tools/papers/zermoglio.pdf). Bioclimate envelope analyses are often complicated by the existence of nonclimatic barriers to establishment and spread (155; M. J. Samways, Letter, J. Biogeogr. 30:817, 2003), e.g., the absence of efficient vectors or susceptible hosts. Nonetheless, they are useful for first-pass analyses, especially for organisms for which more mechanistic models are not available (172). Recently, Pivonia and Yang (138) used the Climex system to assess the potential year-round establishment in North America of Asiatic soybean rust, an exotic fungal disease that was detected for the first time in soybean production areas in the southern United States in the fall of 2004 (161, 171).
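The idea of an empirical bioclimate envelope can be illustrated with a simple rectilinear (minimum-maximum) envelope. This sketch is not the algorithm used by Climex, FloraMap, or AWhere-ACT; the climate variables, values, and function names are hypothetical and serve only to show the principle.

```python
def fit_envelope(occurrences):
    """Derive a min-max envelope from climate values at known occurrence sites.

    occurrences: list of dicts mapping a climate variable name to its
    long-term value at a site where the pathogen is established.
    """
    if not occurrences:
        raise ValueError("at least one occurrence record is required")
    variables = occurrences[0].keys()
    return {
        var: (min(site[var] for site in occurrences),
              max(site[var] for site in occurrences))
        for var in variables
    }


def within_envelope(site, envelope):
    """True if every climate variable at the site falls inside the envelope."""
    return all(low <= site[var] <= high for var, (low, high) in envelope.items())


# Hypothetical occurrence records (coldest-month temperature in deg C, annual rain in mm).
known_sites = [
    {"t_coldest": 5, "rain": 800},
    {"t_coldest": 12, "rain": 1500},
    {"t_coldest": 9, "rain": 1100},
]
envelope = fit_envelope(known_sites)
```

A candidate location is flagged as climatically suitable only if all variables fall within the observed ranges, which is why nonclimatic barriers such as missing hosts or vectors are not captured by this kind of first-pass analysis.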
When some of the environmental requirements of a pathogen have been determined experimentally, weather-based disease models can help define its likely locations of establishment and persistence (176, 194). This approach is used by the North Carolina State University-APHIS Plant Pest Forecasting System (http://www.nappfast.org/index.htm), a web-based modeling system that incorporates meteorological and crop distribution databases into a geographical information system (GIS). The system contains modules for several well-studied pest species, templates for new pests, and a generic infection model for exotic fungal plant pathogens (109). Another example of a generic weather-based model that can be adapted easily to a wide range of pathogens is the DYMEX simulator developed in Australia (172). When applied in a retrospective manner, disease models can determine whether conditions at a suspected release site were favorable, at a given time, for infection and disease development. Such analysis provides indirect evidence for whether intentional pathogen release at that site may have occurred (128).
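A retrospective check of whether conditions favored infection can be sketched as a search for sufficiently long runs of wet, temperature-suitable hours. The thresholds, the hourly record format, and the function name below are hypothetical; this is not the generic infection model of the cited forecasting system.

```python
def favorable_periods(hourly, t_min, t_max, min_wet_hours):
    """Find runs of consecutive hours that are wet and within a temperature range.

    hourly: list of (temperature_c, leaf_is_wet) tuples, one per hour.
    Returns a list of (start_index, length_in_hours) for runs lasting at
    least min_wet_hours. All thresholds are pathogen specific and must be
    supplied by the analyst; none are taken from the source.
    """
    periods = []
    run_start = None
    for i, (temp, wet) in enumerate(hourly):
        suitable = wet and t_min <= temp <= t_max
        if suitable and run_start is None:
            run_start = i  # a new candidate infection period begins
        elif not suitable and run_start is not None:
            if i - run_start >= min_wet_hours:
                periods.append((run_start, i - run_start))
            run_start = None
    # Close a run that extends to the end of the record.
    if run_start is not None and len(hourly) - run_start >= min_wet_hours:
        periods.append((run_start, len(hourly) - run_start))
    return periods


# Hypothetical record: 6 wet mild hours, 2 dry hours, 8 wet hot hours, 3 wet mild hours.
record = [(20, True)] * 6 + [(20, False)] * 2 + [(35, True)] * 8 + [(18, True)] * 3
```

If no favorable period is found around the suspected release date, infection at that site and time becomes less plausible, although such evidence remains indirect.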
Trajectory analysis utilizes complex atmospheric models and tracks airborne pathogen propagules in real time or in forecast mode. Air parcels pick up spores in source areas and move them into upper-air streams, from which they are eventually deposited at distant locations where the likelihood of infection depends on the presence of a susceptible plant and a favorable environment. Atmospheric dispersion models such as the HYSPLIT4 (for "Hybrid Single-Particle Lagrangian Integrated Trajectory") model are used to calculate the most likely trajectories (28). Use of trajectory analysis in proactive mode is illustrated by the disease warning system for tobacco blue mold, which predicts the seasonal movement of the oomycete pathogen Peronospora tabacina from the Caribbean Basin and the southern United States northward along the east coast of the United States (110). The technique can be applied retrospectively to identify the likely source of an outbreak. For example, retrospective analyses of transatlantic wind patterns strongly suggested that the fungi causing both sugar cane rust and coffee rust were introduced into the Americas by aerial long-distance dispersal, the former from Cameroon in 1978 and the latter from Angola in 1970 (13, 16, 144).
Spatial disease data and associated infrastructure (GIS, GPS, and various remote-sensing platforms) have been applied in plant pathology for some time (107, 126, 140), and similar tools are now being developed and implemented by the NPDN to monitor and map outbreaks of agricultural threat organisms. A three-tiered approach to such analyses [Nutter, Phytopathology 94(Suppl.):S130, 2004] consists of (i) the acquisition of aerial and satellite images prior to conducting disease assessments on the ground; (ii) the ground-based assessment of disease incidence and severity in the affected area, in which the spatial pattern of disease is referenced by GPS; and (iii) the integration, mapping, and spatial analysis of remotely sensed and ground-based data in a GIS. In some cases, it may be possible to develop algorithms that can distinguish between natural and intentionally induced disease outbreaks based on the spatial pattern of disease. However, if an endemic pathogen is introduced at a single point and time, information on spatial patterns alone will add little to distinguishing between the two release scenarios. In such cases, genetic and population genetic analyses, as discussed below, will be critical for attribution.
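One simple descriptor of spatial pattern that such analyses might start from is the variance-to-mean ratio of diseased-plant counts per quadrat. It is chosen here only as an illustration and is not a method named in the text; it describes aggregation and cannot by itself separate natural from deliberate introductions. The function name is hypothetical.

```python
from statistics import mean, variance


def index_of_dispersion(counts):
    """Variance-to-mean ratio of diseased-plant counts per quadrat.

    Values near 1 are consistent with a random pattern, values well above
    1 with aggregation, and values below 1 with a regular pattern.
    """
    if len(counts) < 2:
        raise ValueError("at least two quadrats are required")
    average = mean(counts)
    if average == 0:
        raise ValueError("no diseased plants were recorded")
    # statistics.variance returns the sample variance (n - 1 denominator).
    return variance(counts) / average
```

For example, counts of 0, 0, 0, and 8 diseased plants in four quadrats give a ratio of 8, indicating strong aggregation, whereas identical counts in every quadrat give a ratio of 0.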
Spatially interpolated high-resolution weather data and forecasts are based on simulations with a mesoscale weather model that ingests continental and global real-time atmospheric data, along with static information such as terrain and land use, to produce a numerical simulation with an output grid spacing of between 10 and 40 km (108, 152, 153). Postprocessing of the output interpolates the data to a resolution of 1 km², allowing them to be linked to disease models to provide high-resolution information about future, present, or past disease risk (166). The result is an informational framework within which the previous or potential spread of an intentionally introduced pathogen may be estimated.
Host and pathogen phenology data can help establish when infection occurred. Crop phenology data, derived from ground surveys, remote sensing (149), or crop models (121), can provide critical forensic evidence related to the time of infection, especially for pathogens that require defined host phenology stages for infection and/or with hosts that show age-related susceptibility variation (e.g., see reference 56). Information on pathogen phenology can be equally important. For example, information about leaf age and position, and lesion size and development, was used to determine lesion age in the search for the likely source tree of the current citrus canker epidemic in southern Florida (164). Time of infection can also be reconstructed from propagule monitoring in relation to pathogen phenology. Spore samplers (61) can provide a continuous record of pathogen presence or absence in an area, especially when state-of-the-art high-throughput samplers are coupled with sensitive and specific detection procedures (e.g., PCR-based analyses or biosensors) (186). Such monitoring networks are currently being implemented for early detection of human pathogens in real time (e.g., the BioWatch program, the Biological Aerosol Sentry and Information System, and the Autonomous Pathogen Detection System) (58) and could be extended to include plant pathogenic threat organisms. If properly archived and documented, the samples collected routinely by such networks could be very useful for applications in microbial forensics.
Continuum of attribution. Comparative interpretation of data from an evidence sample and a reference sample is a routine feature of a microbial forensic analysis. Three general categories of interpretation are "inclusion," "exclusion," and "inconclusive." The first two, inclusion (i.e., possibly originating from the same source or sharing a recent common ancestor) and exclusion (i.e., could not have come from the same source), are the two endpoints of a continuum of certainty with respect to attribution, while the variety of possibilities between the endpoints represent various degrees of inclusion and inconclusive data sets. Inclusion is achieved when the patterns or profiles generated from two or more samples are sufficiently similar that the samples could have originated from the same source. The measure of similarity should take into account all variation present in both samples. Because of the clonal nature of many microbial pathogens, it may never be possible to absolutely identify the source of the evidence. In some scenarios it may be possible to state only that two samples are similar or are more similar to each other than to other samples. An alternate definition of inclusion is a failure to exclude the possibility that the two samples had a common origin (or ancestry) or that they belong to the same group. An exclusion event occurs when the sample patterns or profiles are sufficiently dissimilar that the two samples could not have originated from the same source (or are related too distantly). Lastly, an inconclusive interpretation is rendered when the data are insufficient to provide a conclusive interpretation (23).
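The three categories of interpretation can be sketched as a decision rule applied to marker profiles from an evidence sample and a reference sample. The thresholds, marker representation, and function name below are hypothetical; real criteria would have to be validated for each pathogen and typing method and would need to account for within-sample variation.

```python
def interpret_comparison(evidence, reference, min_markers=8,
                         max_mismatch_inclusion=0, min_mismatch_exclusion=3):
    """Classify a profile comparison as inclusion, exclusion, or inconclusive.

    evidence, reference: dicts mapping a marker name to the observed allele,
    with None for markers that could not be typed. The three thresholds are
    illustrative assumptions, not validated forensic criteria.
    """
    shared = [
        marker for marker in evidence
        if evidence[marker] is not None and reference.get(marker) is not None
    ]
    # Too few comparable markers: the data cannot support a conclusion.
    if len(shared) < min_markers:
        return "inconclusive"
    mismatches = sum(1 for marker in shared if evidence[marker] != reference[marker])
    if mismatches <= max_mismatch_inclusion:
        return "inclusion"  # failure to exclude a common source
    if mismatches >= min_mismatch_exclusion:
        return "exclusion"
    return "inconclusive"
```

Note that "inclusion" in this sketch means only that a common source cannot be excluded, which matches the alternate definition given above and does not by itself identify the source.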
International marketing requirements for agricultural commodities such as seeds or planting stock often require certification that one or more pathogens are, to a specified degree of confidence, absent. The high level of diagnostic accuracy needed for such assurances also may be applicable to microbial forensics. For example, the North American potato industry imposes rigorous testing for the potentially devastating ring rot bacterium, C. michiganensis subsp. sepedonicus (45). Zero tolerance trade restrictions for this pathogen have led to a major research focus in several countries to develop new detection methods that surpass current tests for sensitivity, specificity, and efficiency. Such newly developed methods may be the only diagnostic tests available for obtaining information to characterize the source.
Aside from such certification programs, the procedures typically applied to diagnose a naturally occurring plant disease for purposes of disease management are generally much less stringent, reliable, and reproducible than would be required for validated forensic identification. Regardless of the principle of the test, operators must appreciate the limitations of available assays to avoid overinterpretation and overrepresentation of results.
Criteria for selecting appropriate forensic typing methods. Critical characteristics of microbial typing for forensic applications include (i) universality, the ability to type all organisms within the taxon using a particular method; (ii) sensitivity, the percentage of actual positive samples detected (with no false negatives); (iii) specificity, the percentage of actual negative samples identified correctly (with no false positives); (iv) efficiency, the total percentage of correct test results; (v) reproducibility, the same result obtained consistently when a particular isolate is tested repeatedly; and (vi) resolution, the degree of attribution that can be obtained with a method.
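Sensitivity, specificity, and efficiency as defined above can be computed from the counts of correct and incorrect results obtained when a method is run against samples of known status. The sketch below assumes such a validation panel exists; the function name and the example counts are hypothetical.

```python
def typing_method_performance(true_pos, false_neg, true_neg, false_pos):
    """Return sensitivity, specificity, and efficiency as fractions.

    sensitivity: share of actual positive samples that tested positive.
    specificity: share of actual negative samples that tested negative.
    efficiency:  share of all test results that were correct.
    """
    positives = true_pos + false_neg
    negatives = true_neg + false_pos
    if positives == 0 or negatives == 0:
        raise ValueError("the panel needs both positive and negative samples")
    return {
        "sensitivity": true_pos / positives,
        "specificity": true_neg / negatives,
        "efficiency": (true_pos + true_neg) / (positives + negatives),
    }


# Hypothetical validation panel: 50 known positives and 200 known negatives.
performance = typing_method_performance(true_pos=45, false_neg=5, true_neg=190, false_pos=10)
```

For this hypothetical panel, the method detects 90% of the positives, correctly identifies 95% of the negatives, and gives a correct result for 94% of all samples.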
For most plant pathogens, multiple methods of microbial identification and typing are available. Having results from multiple tests will increase the level of accuracy and confidence in microbial forensics investigations. Typing methods currently in use for plant pathogens include both nucleic acid-based and non-nucleic acid-based technologies.
Non-nucleic acid-based methods. The first and most important assays for both microbial forensics and management are those that determine the species of the pathogenic agent. A number of traditional methods, in use long before the advent of molecular biology, remain effective for some applications and may provide significant clues for pathogen identification in a forensics setting. Symptomatology; the ability to cause either no reaction or a "hypersensitive" (resistance) reaction on a nonhost plant; plant host range; insect vector specificity; and pathogen morphology (of bacterial colonies or cells, fungal colonies or fruiting bodies, or virus particles or inclusion bodies, etc.) are often the first steps in identification (90, 158).
The pathogen's host range and the host's specific response to the pathogen also are used for typing of plant pathogens. Species of many plant pathogenic bacteria are further divided into pathovars, based solely on the host range of the bacterium, and methods that define physiological processes or the complement of certain molecules are also used to define taxa. For example, BIOLOG (Hayward, Calif.) and other substrate utilization tests provide profiles of metabolic capabilities, while fatty acid methyl ester (FAME) analysis produces a profile of the microbe's fatty acid composition; in both cases the profiles of the test strain are compared with those in a database of species and strains for the closest match. The accuracy of such assays for microbial identification is limited by the population of characterized strains in the databases.
Like bacteria, plant virus strains belonging to the same species also may be discriminated by the comparative reactions of a set of plant species or cultivars within a species, known as differentials.
Fungi, because of their large genomes and complex life cycles, present particular typing challenges. For example, the fungal mating type, determined by plate mating assays, is the primary mode of identification of the model fungal plant pathogen U. maydis (142, 198). Although the mating assay is reliable and accurate, it is not very definitive for strain attribution because it does not measure other variations in the genome; on average about 1 of 36 of the cells will possess a given mating-type genotype in a random population. Thus, mating-type distinction is a good exclusionary tool but will not achieve absolute attribution.
Serological techniques. ELISA and indirect fluorescent antibody staining are serological assays commonly used for identification of plant pathogens, particularly viruses and bacteria. The sensitivity and specificity of serological assays vary with the titer and specificity of the antibody and whether the antibody is monoclonal or polyclonal (102, 180, 197). Recent adaptations by diagnostic industries for dipstick convenience and portability have enhanced the usefulness of these immunology-based technologies in the field. Cross-reactivity among closely related strains may be a problem; for example, the seven strains of SMV, identified on differential host cultivars (37), are chemically and serologically homogeneous at the coat protein level (74, 80).
Nucleic acid-based methods. The popularity of nucleic acid-based technologies has grown rapidly. Older methods, such as restriction fragment length polymorphism (RFLP), DNA fingerprinting, and phage typing, are still valid and useful. However, complete genome sequences are now available for many economically important viruses and bacteria, and a few fungi, and others are in progress. Thus, many genetic markers are available for analyses. DNA probes, constructed for taxon-specific marker genes, are used widely. Sequencing of particular genome regions known to provide informative data, such as the 16S rRNA, the internal transcribed spacer region between the sequences coding for the 16S and the 23S rRNA, or the groE or recA genes, is often used for bacterial identification. The sensitivity, specificity, and versatility of PCR have made it a method of choice for applications related to sequence analysis and comparison. PCR-based assays have been used widely, for example, in the typing of DNA viruses (100, 139). The discovery of repeated sequences in many bacterial genomes has given rise to a version called rep-PCR, in which electrophoretic banding patterns reflect different numbers and positions of repeated sequences (104, 145).
Real-time PCR has been used for genetic characterization of bacteria (130, 157), viruses (112), and fungi (63). Although the technique is rapid and can be very specific, it may be less sensitive than culture-based assays when plant extracts contain PCR inhibitors or when sample volumes are very small. For cultivable bacteria, PCR can be combined with isolation in BIO-PCR (160). In this assay, viable cells of the target bacterium are enriched in medium and thereby detected at extremely low original levels in seeds and other propagation materials. No DNA extraction is needed since the cells lyse during the initial denaturation step. For higher levels of specificity, BIO-PCR can be performed on membranes, although a possible disadvantage of membrane use is the chance for cross-contamination (159).
Multiplex PCR, in which primers against more than one target are combined in a single reaction mixture, can be employed to detect more than one species of bacterium or virus in the same sample (9). Assay and detection of multiple sites of a microorganism's genome can increase confidence in an identification. Such systems are currently in use for certifying vegetative plant propagules as virus free. Another PCR variant, reverse transcription-PCR (RT-PCR), is useful for plant viruses having RNA genomes. Reasonably "universal" primers have been developed for some virus families, genera, and "species" targeted to taxon-specific sequences (35, 91, 150). RT-PCR clearly differentiates between some strains of common viruses: for example, between PPV strains D and M (27) and SMV strains G2 and G7 (131), as well as between the common strain of Potato virus Y, PVY^O, and strain PVY^N (12). Kim et al. (92) conducted RFLP of RT-PCR products ("restrictotyping") to differentiate five Korean SMV strains. Many other new variations and assay combinations, such as multiplex PCR-ELISA and immunocapture PCR, have been developed (120).
The methods mentioned above are all limited by their reliance on a minute fraction of a taxon's many defining features. For nucleic acid-based methods, the accuracy of the comparison to some degree will be proportional to the length of the fragment (and/or the site) used in the analysis. Direct DNA-DNA hybridization methods provide information about the degree of similarity of entire genomes, without the need for actual sequence information.
Direct comparisons to rank the sensitivity and specificity of certain detection/diagnostic methods have been carried out for some plant pathogens (76, 85, 146). Many research programs shifted from traditional methods (symptomatology, electron microscopy, and host differential reactions) to PCR-based or immunological tests when the latter were demonstrated to be more sensitive or specific (12, 27, 112, 134). In limited cases, methods have been standardized among laboratories to ensure that comparisons between/among the groups were reliable (44). However, such test comparisons and standardizations are not frequently done, because the validation of methods at a level necessary for more rigorous challenge adds significant cost and generally is not required for managing a natural disease outbreak. Thus, for many diagnostic systems the relative effectiveness of one technology over another for critical identification is not known, and in fact, the "best" test will often depend on the "diagnostic" sites and methods available for a given taxon and the databases of information collected on the species and closely related strains and species. For example, in a recent analysis of multiple plant pathogenic strains of the ubiquitous bacterium Serratia marcescens (147, 199), different "identifications" were provided by BIOLOG, FAME, 16S rRNA and groE sequencing, and DNA-DNA hybridization because each of these tests measures or compares a different genome region, gene product, or phenotype. While exclusion and inclusion interpretations can be made with all of these methods, research is needed to establish the most reliable and informational methods for high-priority plant pathogens and to develop the reagents and databases for them.
The diversity of organisms within a microbial population must also be considered in evaluating typing methods. In a given plant, field, or region, some pathogen populations, such as those of U. maydis, are relatively homogeneous. For others, such as PPV, high population-level species diversity means that an individual sample from a single host will contain many mixed sequence variants (163). Researchers in the United States and the European Union are investigating the evolutionary tempo and drift of various regions within the PPV genome, and mutation "hot spots" within the genome could be targets for forensic analysis. Also relevant is the rate at which the pathogens change. For example, the genome of SMV appears to be more stable than that of PPV, although new strains of SMV are reported relatively frequently (49, 55, 92). Furthermore, some isolates described as new may actually be strains or recombinants of existing viruses (65). Tomato spotted wilt virus (TSWV) is quite variable because of reassortment of genomic segments and other mutations. For TSWV and related viruses, three loci (the N and NSm genes and the intergenic region) are used for comparisons. A rule of thumb is that isolates having <90% amino acid sequence identity in their N-protein sequences are distinct strains.
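The 90% rule of thumb for N-protein sequences can be expressed as a short calculation. This is a sketch only: it assumes the two sequences have already been aligned to equal length, and it scores identity over columns in which neither sequence has a gap, which is one of several conventions in use. The function names and the 20-residue sequences are hypothetical and do not represent real TSWV proteins.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent amino acid identity over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def distinct_strains(aligned_a: str, aligned_b: str,
                     threshold: float = 90.0) -> bool:
    """Apply the rule of thumb: identity below the threshold suggests distinct strains."""
    return percent_identity(aligned_a, aligned_b) < threshold


# Hypothetical aligned fragments, for illustration only.
isolate_1 = "MSKVKLTKENIVSLLTQSAD"
isolate_2 = "MSKVKLTKENIVALLTQSAD"  # one substitution relative to isolate_1
isolate_3 = "MSRVKLTKDNIVALLTQSAD"  # three substitutions relative to isolate_1
print(percent_identity(isolate_1, isolate_2), distinct_strains(isolate_1, isolate_2))
print(percent_identity(isolate_1, isolate_3), distinct_strains(isolate_1, isolate_3))
```

In practice the comparison would be made over the full-length N protein, and the result would be only one line of evidence alongside the NSm gene and the intergenic region.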
In considering the dynamics of bacterial genome change, it is useful to distinguish between a pathogen's core genome and its flexible genome (41). The core genome consists of genes ubiquitous in the bacterial species, encoding housekeeping proteins and other proteins essential for survival. These genes are less likely to undergo horizontal gene transfer and either evolve neutrally or are selectively constrained. The flexible genome consists of genes that vary among strains within a species, encoding proteins responsible for adaptation to a particular niche, host, or environment. Such genes, which may be associated with virulence, resistance to antibiotics or toxins, or the mobility of the genome or genome parts, evolve largely through horizontal gene exchange (acquisition and loss). For viruses, regions of genes involved in host interaction or movement within the plant or by vectors, which may exhibit significant variability, would be useful for attribution.
Evaluation of genomic variability is a challenge to forensic investigation because of the difficulty in establishing tightly defined taxonomic groupings. However, certain aspects of variability among pathogen genomes can also provide outstanding support for identification. For example, a forensics-useful application became apparent when specific genomic regions of TSWV showed strong homology among strains from Florida and Georgia, and those strains could be resolved from strains from other parts of the world (133). Differences in fungal mating types may also be used as a forensic tool; U. maydis is a cosmopolitan species with much variation in mating type among populations, even within a small area (198). For some pathogens, specific regions of variability are potentially useful forensics tools. Pseudomonas syringae is a highly clonal and stable species, in which a genomic pathogenicity island (PAI) contains the "hypersensitive reaction and pathogenicity" (hrp) genes (1). The same PAI also contains the exchangeable effector locus (EEL), which is thought to have been acquired independently after the acquisition of the hrp-encoded PAI but before divergence of the pathovars (48). Divergence in EELs has occurred more recently through the acquisition of new effectors and by point mutation. Thus, the EEL may be useful in forensic investigations.
Pathogen populations are not homogeneous in nature. A given plant may be affected by a mixed pathogen population, including members of different pathogen kingdoms (e.g., viruses and bacteria in the same plant), different species (e.g., two phytoplasmas transmitted by the same vector species), or different strains/pathovars of a single species (e.g., pathovars tomato and maculicola of P. syringae). Even a natural population of a single strain of most pathogens may consist of many sequence variants. Most pathogen characterization is done, however, on very homogeneous populations initiated ("cloned") from single cells or propagules, a process considered essential for reproducible and comparable laboratory characterizations of microorganisms. Mutations of pathogens are common during laboratory maintenance and subculturing due to a lack of selection for characteristics needed to persist in nature. For example, pathogens propagated on artificial media without contact with host plant tissue may lose pathogenicity or aggressiveness after a number of passages, and insect-transmitted pathogens similarly may lose the ability to be so transmitted, even if propagated by grafting on a susceptible host plant. A pathogen stored frozen may undergo lower rates of mutation than one stored at higher temperatures or than pathogens in nature. Thus, there may be a degree of uncertainty regarding variation with those samples maintained in the laboratory. Comparative genomic sequence characterizations of mixed populations are needed to identify the degree of variation among individuals in a population, rates of mutation, and the extent of sequence divergence. Currently, such data are not available for most plant pathogens.
Populations of pathogens continue to undergo change in nature, although rates of change are not well characterized. In some cases, information may be gleaned from comparisons between populations in different countries or within regions of the same country. North American isolates of SMV were less diverse than those obtained from Asian countries, possibly because Asia is likely the center of viral origin and the opportunity for pathogen evolution has existed longer there (49). Deployment of host resistance genes in a crop species may select for variants that overcome the resistance; for TSWV and many other pathogens, this phenomenon can occur within only a few growing seasons.
The presence and number of extrachromosomal elements, plasmids, and viruses have been used for differentiation of cellular (nonvirus) pathogen strains. For example, the bacterium P. syringae pv. tomato DC3000 contains two plasmids, but although plasmids have been implicated in the pathogenicity of a number of plant pathogenic bacteria, curing experiments revealed no correlation between P. syringae pv. tomato plasmid presence and pathogenicity (26). Multiple prophages (virus sequences integrated into the genome) are present in many plant pathogenic bacteria, but their role in gene expression has not been well studied (26). All fungi have extrachromosomal DNA within their mitochondria and also may have linear and circular plasmids (68). For fungi that have no naturally occurring plasmids, such as the model fungus U. maydis, the presence of such elements in a field isolate of the fungus could be a sign of genetic modification by humans (179). Some fungi also have mycoviruses, which may reduce aggressiveness and affect gene expression (3, 116, 185).
Simplistically, the probability of two microbes being derived recently from the same source, the attribution probability ($p_a$), is given by the formula $p_a = 1 - p_e$, where $p_e$ is the exclusion probability, the probability that they were not so derived. The desire in forensic microbiology is to have one of these probabilities be so high that there is little reason to doubt any associations or lack thereof.
DNA typing of humans is based on the frequency of alleles in the human population, the number of loci, and the mode of inheritance. The autosomal markers were chosen to be biologically (and thus statistically) independent. Independence is due to the presence of genes on multiple chromosomes and a high frequency of meiotic homologous recombination between distant markers on the same chromosome (except for those genetic markers residing on the nonrecombinant region of the Y chromosome or on the mitochondrial genome). As a result, the frequencies of the markers in the population can be used multiplicatively (with slight modifications) to calculate probabilities for exclusion or attribution. In contrast, most markers used with bacteria and viruses, and many fungi, are located on the same piece of DNA or RNA and thus may not be subject to frequent recombination. Exceptions are plasmids, horizontally transferred sequences, and genomes of multipartite viruses (69). Additionally, the degree of recombination varies among species. Because of the potential nonindependence of marker pairs, data are needed on both the frequencies of alleles at loci and the degree of linkage of the allele frequencies at pairs of loci to render the most effective estimate of the rarity of the nucleic acid profile.
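The effect of marker nonindependence on profile rarity can be shown with a small numerical sketch. It multiplies allele frequencies as one would for independent markers and then uses the standard pairwise linkage disequilibrium coefficient, D = p_AB - p_A p_B, to show how far a linked pair departs from that product. The function names and all frequencies are hypothetical.

```python
from math import prod


def profile_frequency_independent(allele_freqs):
    """Expected profile frequency if every marker is statistically independent."""
    return prod(allele_freqs)


def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float) -> float:
    """D = observed two-locus haplotype frequency minus the product of allele frequencies."""
    return p_ab - p_a * p_b


# Hypothetical allele frequencies at three loci.
freqs = [0.1, 0.2, 0.05]
print(profile_frequency_independent(freqs))  # roughly 1 in 1,000 if independent

# Hypothetical clonal case: every isolate carrying allele A also carries allele B,
# so the observed haplotype frequency equals the frequency of A itself.
p_a, p_b, p_ab = 0.1, 0.2, 0.1
print(p_a * p_b)                               # product-rule estimate for the pair
print(linkage_disequilibrium(p_ab, p_a, p_b))  # positive D: the pair is linked
```

In the clonal case the product rule would understate the haplotype frequency fivefold, making the profile appear rarer than it is, which is why both allele frequencies and pairwise linkage data are needed.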
Microbes evolve much faster than do humans or plants. Rapidly evolving regions of microbial genomes have promise for plant pathogen attribution-exclusion decisions because newly arising alleles may be novel and thus unique in the world population of that microbe. However, regions that evolve so rapidly that changes could occur during the time of divergence of the donor from that of the crime scene microbe lack the inherency mentioned above. They may still be useful, but the approaches for interpretation will be based more on a similarity/dissimilarity matrix. There is a need to identify the regions of plant pathogen genomes that are the most informative for the questions that may arise during a microbial forensics investigation.
Certain sites in genome sequences are under neither positive nor negative selection (94). These neutral sites should evolve at the same rates, since the processes that substitute one nucleotide for another in an organism are thought to be sequence independent and absence of directed selection is assumed. The frequency of differences at neutral sites of pairs of isolates with known divergence times can be used to calculate the mutation rates and the chance of multiple mutations at a particular locus. Knowing the neutral mutation rate and the frequency of differences at neutral sites between a suspect microbe and a crime scene microbe allows calculation of the time of divergence of the two microbes (under certain assumptions). If this time is within a window consistent with the suspected crime scenario, then a failure to exclude is supported. If the time scale is longer than the suspected separation of the two, exclusion is supported. In some microbial forensics cases the time may be a period of months, years, or decades. The confidence of attribution or exclusion based on a neutral site mutation rate depends on the accuracy with which that rate is known and the reliability of the observed difference frequency (and storage or environmental influences). Confidence in the neutral mutation rate increases with the number of isolate pairs used in its calculation. Thus, multiple sets of isolate pairs whose times since divergence from a common ancestor are known from historical records are needed. Sequences of regions containing multiple neutral sites, and improved computational methods for estimating the rates of change (59), also are important.
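The divergence-time reasoning can be sketched as follows. The sketch makes several assumptions that the text does not specify: a Jukes-Cantor correction for multiple substitutions at the same site, a strict clock with a known neutral rate per site per year, and change accumulating along both lineages since their common ancestor (hence the factor of 2). The function names and the numerical values are hypothetical.

```python
import math


def jukes_cantor_distance(differences: int, sites: int) -> float:
    """Substitutions per site, corrected for multiple hits (Jukes-Cantor model)."""
    p = differences / sites
    if p >= 0.75:
        raise ValueError("sequences too divergent for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def divergence_time(differences: int, sites: int,
                    rate_per_site_per_year: float) -> float:
    """Years since the two isolates shared an ancestor, under a strict neutral clock."""
    distance = jukes_cantor_distance(differences, sites)
    # Both lineages accumulate substitutions, so the pairwise distance is 2 * rate * time.
    return distance / (2.0 * rate_per_site_per_year)


def fails_to_exclude(estimated_years: float, scenario_window_years: float) -> bool:
    """True if the estimated divergence fits within the suspected time window."""
    return estimated_years <= scenario_window_years


# Hypothetical case: 2 differences across 10,000 neutral sites, assumed rate 1e-4.
years = divergence_time(differences=2, sites=10_000, rate_per_site_per_year=1e-4)
print(years, fails_to_exclude(years, scenario_window_years=3.0))
```

A point estimate such as this would need to be accompanied by an interval reflecting uncertainty in both the rate and the small number of observed differences before it could support an exclusion decision.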
Although high confidence of strain identity is one goal of forensic attribution, strain identity alone does not always lead to absolute attribution. For example, the anthrax bacteria of the 2001 outbreaks were identified to strain, but identification of the perpetrator could not be ascertained directly from this information.
Background occurrence. Traditionally, when a plant pathogen is discovered in a geographic area in which the pathogen was previously unknown, a peer-reviewed note is published. Such information is often incomplete and nonuniform. The disappearance of a plant pathogen from an area is not frequently reported. Native plants and plants with unapparent symptoms are usually not surveyed, except when alternative hosts are being sought. It is common to provide some characterization with respect to pathogen markers, but marker characterization methods are not standardized. There is no single distribution source where such typing data are stored (87). However, some individual investigators or groups of investigators have created and are maintaining databases of isolates of concern to them. For example, there are extensive databases for the Geminiviridae (http://www.danforthcenter.org/iltab/Geminiviridae/ and http://gemini.biosci.arizona.edu/) and for two fungal genera, Phytophthora and Fusarium. At the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/entrez/), new sequence entries contain limited information about the host of isolation of the plant pathogen being sequenced and the rough geographical location. More-precise locations and dates of sampling and isolation are not usually provided.
Molecular markers for forensic analysis. As stated above, most plant pathogens lack a standard marker system. For those that do, markers are based on any of a variety of biomolecules and are taxon dependent. Rarely does a marker satisfy all three of the criteria defined above: inherent, diagnostic, and Boolean. Many marker systems were developed for use in systematics, the hierarchical assignment of organisms to a taxonomic classification. Several series of terms describe the hierarchy: strain, subspecies, species, genus, family, etc. There is a well-defined system for classification of viruses into families, genera, and species, but these taxa are being redefined with molecular techniques. Below the species level, further subdivision into subtypes, groups, or strains is common. Taxonomic levels do not reflect consistent evolutionary time periods. Although systematics has provided many useful markers, it does not always follow evolutionary descent patterns (187) and it is less relevant to microbial forensic considerations. Phylogenetics, in contrast, reconstructs the order of the organism's divergence from a common ancestor, so that under a certain set of assumptions, placement of a crime scene isolate on a phylogenetic tree identifies its closest known relatives or most recent common ancestor. Phylogenetic analyses, using a specific gene or genomic region, have been conducted on many plant pathogens. Meaningful trees require considerable sequence variation in the targeted portions of the genome, but such variability should not interfere with reliable sequence alignment (119).
Few plant pathogen marker systems have been used widely enough to acquire good estimates of the frequencies of particular markers in populations. For most, the scope of the population is geographically restricted to a plot/field, county, state, or region, sometimes a nation. Seldom are data on worldwide frequencies calculable. Of marker types, nucleotide sequence is the best developed because more isolates have been examined by this method than by others. However, the diversity of microbes of potential phytopathogenic threat is so large that obtaining complete sequences of enough microbes for a good level of background knowledge is a daunting task (175). Genome microarrays and subtractive hybridization approaches may be good alternative technologies for resequencing (11). Indeed "high-resolution differentiation between closely related" microbes was obtained in a microarray study in which 295 of 300 pairs of bacterial strains were statistically differentiable (192).
Confidence levels. The overall mean mutation rate, if measured for unique-sequence nucleic acids in the genome and if expressed as mutations per cell division (or round of virus replication) per genome, is the same for all organisms (51). Thus, organisms with large genomes have fewer mutations per division per thousand base pairs than those with small genomes, but because they have larger genomes they will have the same overall number of mutations.
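The arithmetic behind this statement is simple division. The sketch below assumes a single per-genome rate shared by two organisms; the value 0.003 mutations per genome per replication and the two genome sizes are illustrative placeholders chosen for the example, not figures taken from the text, and the function names are hypothetical.

```python
def mutations_per_kb(genomic_rate: float, genome_size_bp: int) -> float:
    """Mutations per replication per thousand base pairs."""
    return genomic_rate / genome_size_bp * 1000.0


def mutations_per_genome(rate_per_kb: float, genome_size_bp: int) -> float:
    """Recover the whole-genome rate from the per-kilobase rate."""
    return rate_per_kb * genome_size_bp / 1000.0


GENOMIC_RATE = 0.003         # assumed mutations per genome per replication
SMALL_GENOME = 5_000_000     # hypothetical 5-Mb genome
LARGE_GENOME = 40_000_000    # hypothetical 40-Mb genome

small_kb = mutations_per_kb(GENOMIC_RATE, SMALL_GENOME)
large_kb = mutations_per_kb(GENOMIC_RATE, LARGE_GENOME)
print(small_kb, large_kb)  # the larger genome has the lower per-kb rate
print(mutations_per_genome(small_kb, SMALL_GENOME),
      mutations_per_genome(large_kb, LARGE_GENOME))  # both return the genomic rate
```

For a forensic marker of fixed length, this implies that fewer changes per replication would be expected in an organism with a large genome than in one with a small genome.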
Regions of plant pathogen genomes evolve at different rates among different microbes, within microbes, and even at the individual base pair level. Viruses have different patterns of evolution (75), and some RNA viruses, such as influenza virus, evolve rapidly, while others, such as Tobacco mosaic virus, evolve slowly (60). In general, DNA viruses, particularly those with large genomes, evolve relatively slowly. For viruses, initiation of an infection with a genome that has been cloned in a bacterial plasmid rapidly results in the generation of a collection of genomic sequences whose diversity is characteristic of the virus and the host (162, 163). Forensic investigators should not limit characterizations to single cloned samples from suspect and crime scene viruses, since even fairly diverse sequences could be drawn from a single population of genomes. As noted earlier, TSWV isolates could be differentiated by analysis of five viral genes (178). Ideally, the population should be subjected to nucleotide sequencing directly, using analysis protocols that identify polymorphic positions.
For bacteria, common markers include the 16S rRNA, the spacer between the 16S and 23S rRNAs, and the groE and recA genes. Each has a slightly different rate of evolution, allowing coverage of several taxonomic levels. However, the rate of nucleotide substitutions is usually insufficient to establish that two bacteria with identical sequences for one of these genes had a common ancestor within a period consistent with forensic scenarios. Comparing the sequences of several genes, or of the whole genome, provides greater confidence. New developments in nucleic acid sequencing, such as a highly processive technique that employs amplification of DNA fragments in microspheres followed by pyrosequencing (113), allow rapid and cost-effective sequencing of entire microbial genomes. Finding one or more nucleotide differences between crime scene and reference sequences might be an indication that the suspect pathogen should be excluded as a source of the evidence microorganism. However, substitutions may have accumulated if many generations of propagation, or propagation in a selective environment, occurred since the two lines were separated. Separately propagated lines may also show dramatic changes in genome size (195).
Rates of molecular evolution of bacterial plant pathogens would be useful for forensic analyses, but the assumption that they have a molecular clock could be limited because selection for mutator strains (10) may occur after divergence of the suspect microbe from the common ancestor. Mutator strains constitute about 1% of natural populations (97), but their frequency (169) is increased during stationary phase (103). If one of the two bacteria being compared has had a high mutation rate since derivation from the parent, then the distance between the two bacterial isolates is inflated and can lead to inappropriate exclusion. On the other hand, estimates are that a thousand generations are needed to fix mutator strains (93) in a population (168).
Fungal retrotransposons provide a potential gene set with a level of variation sufficient for discrimination (62). Since several idiosyncrasies of their replication are documented, they are hot spots for the accumulation of nucleotide sequence changes. Different genes have different rates of evolution. Those rates are not constant at broad taxonomic levels, since the molecular clock for a particular protein-coding gene does not tick at the same rate in all lineages (6). Yet, at lower taxonomic levels, the assumption of a molecular clock has proven useful (15). Indeed, for fungi, the overall rates of molecular evolution have been judged indistinguishable for Neurospora crassa and Saccharomyces cerevisiae (14). In a series of four intraspecies comparisons based on whole bacterial genome sequences (Chlamydia pneumoniae, Escherichia coli, Helicobacter pylori, and Neisseria meningitidis), essential genes were found to be more conserved than nonessential ones and duplicated genes (except in C. pneumoniae) had more differences than unique genes (83). Thus, duplicated genes (193) might be favored for microbial forensics, although distinguishing among nearly identical sequences may be a problem. Strain-specific genes tended to be uncharacterized ones.
Genomic processes other than nucleotide substitution may be occurring rapidly enough to assist the forensic investigator. Rates of deletions, inversions, and translocations per site, the expansion and contraction of regions of repeated sequences, the movement of mobile elements, the invasion of prophage genomes, and the acquisition or loss of plasmids all may provide useful clues. In a comparison of two Chlamydia species, the rates of deletions, inversions, and translocations per site were substantially less than the neutral substitution rate (42). Potentially more helpful are the expansion and contraction of regions of repeated sequences, the movement of mobile elements, the invasion of prophage genomes, and the acquisition or loss of plasmids. The utility of repeated sequences was examined for the fungus Beauveria bassiana (39). Variation in the number of GA dinucleotide