Short Communication

Faculty of Life Sciences, University of Manchester, Michael Smith Building, Oxford Road, Manchester M13 9PT, UK
Correspondence
Jun Fan
jun.fan-rres{at}bbsrc.ac.uk
| ABSTRACT |
|---|
Present address: Centenary Building, Rothamsted Research, Harpenden AL5 2JQ, UK.
Supplementary material is available with the online version of this paper.
| MAIN TEXT |
|---|
Cases caused by HEV have been found worldwide. Outbreaks of HEV have been recorded in Asia, Africa and Mexico (Emerson & Purcell, 2003; Worm et al., 2002). Cases of acute, sporadic hepatitis have been found in the USA, Europe and Africa (Coursaget et al., 1998; Erker et al., 1999; Schlauder et al., 1999; Zanetti et al., 1999).
The HEV genome has three partially overlapping open reading frames (ORFs). ORF1 is located at the 5' terminus of the genome and encodes non-structural proteins (Erker et al., 1999; Koonin et al., 1992). ORF2 is at the 3' terminus of the HEV genome and encodes the viral capsid protein, which has three glycosylation sites (Worm et al., 2002). ORF3 overlaps with either ORF1 or ORF2 (Wang et al., 2000).
Genotyping of HEV can have diagnostic and clinical implications. Several genotyping schemes for HEV have been proposed by different research groups. All schemes agree on four genotypes: genotype 1 includes Asian and African pandemic strains (Aye et al., 1992; Chatterjee et al., 1997; Coursaget et al., 1998; Tam et al., 1991), genotype 2 includes Mexican and Nigerian strains (Buisson et al., 2000; Huang et al., 1992), genotype 3 includes the USA, Japanese and European strains (Erker et al., 1999; Schlauder et al., 1998, 1999; Takahashi et al., 2002) and genotype 4 includes the strains from China and Japan (Takahashi et al., 2002; Wang et al., 1999, 2000).
These groups determined genotypes from either partial or full-length genome sequences using the following methods: nucleotide identities/genetic distances, deduced amino acid similarities, phylogenetic trees and restriction endonuclease sites. Comparing nucleotide identities/genetic distances is the most commonly used method to define HEV genotypes. Worm et al. (2002) defined an HEV genotype as viruses having nucleotide divergence of less than 20 % in the ORF2 region. Another scheme defines a genotype as viruses having nucleotide divergence of less than 15 % in the RNA polymerase region (Arankalle et al., 1999). It was also suggested that the genetic distance of the ORF1 region should be greater than 0.235 between genotypes (Wang et al., 1999).
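The threshold rules above can be illustrated with a minimal sketch. It assumes that "nucleotide divergence" means the uncorrected proportion of differing sites (p-distance) over ungapped alignment columns, which the cited schemes may not all use; the function names and the toy sequences are hypothetical and are not HEV data.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected proportion of differing sites between two aligned
    sequences. Columns with a gap in either sequence are skipped
    (an assumption; other gap treatments are possible)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def same_genotype(seq_a: str, seq_b: str, threshold: float = 0.20) -> bool:
    """Apply a divergence cut-off such as the 20 % ORF2 rule."""
    return p_distance(seq_a, seq_b) < threshold


# Toy aligned fragments (hypothetical, for illustration only).
toy_alignment = {
    "strain_A": "ATGCGTACGTTAGC",
    "strain_B": "ATGCGTACGTTAGT",
    "strain_C": "ATGAGTCCGATCGC",
}

for (name_a, seq_a), (name_b, seq_b) in combinations(toy_alignment.items(), 2):
    d = p_distance(seq_a, seq_b)
    print(f"{name_a} vs {name_b}: divergence={d:.3f}, "
          f"same genotype under 20 % rule: {same_genotype(seq_a, seq_b)}")
```

Passing `threshold=0.15` would mimic the RNA polymerase region rule; a model-corrected distance, as implied by the 0.235 cut-off, would need a substitution model that this sketch does not implement.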
Amino acid and phylogenetic analyses can be treated as extensions of genetic distance analysis: amino acid sequences are deduced from nucleotide sequences, and phylogenetic trees are constructed from genetic distances. The HEV-US1 and HEV-US2 sequences were shown to be only 74 % identical to the Burmese and Mexican strains, which, combined with amino acid and phylogenetic analyses, indicated the presence of a third HEV genotype (Erker et al., 1999).
It can be concluded that analyses of genetic distance, amino acid similarity and phylogenetic relationships have become common practice in defining HEV genotypes (Erker et al., 1999; Hsieh et al., 1998; Schlauder et al., 1999; Schlauder & Mushahwar, 2001; Wang et al., 1999).
Restriction endonuclease analysis (REA) has been used much less. Different restriction endonucleases are applied to the strains, and different genotypes are expected to give different restriction site patterns, which can be observed as PCR-restriction fragment length polymorphisms (PCR-RFLP) in the enzyme digestion patterns. Gouvea et al. (1998) compared ten complete and five partial HEV sequences from genotypes 1 and 2 using REA. In another case, Taiwanese, Asian and Mexican strains were distinguished by PCR-RFLP (Hsieh et al., 1998).
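As a rough illustration of how a digestion pattern separates sequences, the sketch below performs an in silico digest of a linear amplicon. EcoRI (recognition site GAATTC, cutting after the first G) is chosen purely as a familiar example and is not claimed to be an enzyme used in the cited studies; the amplicon sequences are invented.

```python
from typing import List


def fragment_lengths(seq: str, site: str, cut_offset: int) -> List[int]:
    """Return fragment lengths, in 5'->3' order, after cutting a linear
    sequence at every occurrence of `site`. `cut_offset` is the number
    of bases of the site that stay on the upstream fragment."""
    seq = seq.upper()
    cut_positions = []
    start = seq.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cut_positions + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical amplicons: the second differs by one base that destroys the site.
amplicons = {
    "toy_type_1": "AAAGAATTCCCC",
    "toy_type_2": "AAAGAGTTCCCC",
}

for name, seq in amplicons.items():
    pattern = fragment_lengths(seq, site="GAATTC", cut_offset=1)
    print(name, pattern)
```

A single substitution inside the recognition site changes the pattern from two fragments to one, which is the signal that PCR-RFLP reads from a gel.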
In the early stages of defining HEV genotypes, partial regions were commonly used because the majority of available HEV strains had been sequenced only partially. It had also been demonstrated that an ORF1 region generates trees consistent with those generated from full-length sequences (Schlauder et al., 1998, 1999; Wang et al., 1999; Zanetti et al., 1999).
However, Zhai et al. (2006) compared 24 partial genomic regions statistically using one-way and two-way ANOVA and found that, in 23 of the 24 cases, the partial genome region cannot sufficiently determine the genotype of the complete genome.
Here I provide a new approach based on full-length genome sequences and the ORF structure (the lengths of three ORFs and the overlapping relationships among the ORFs), to investigate genotypes in HEV.
All available complete HEV sequences were retrieved from GenBank. The details of the retrieved sequences are listed in Supplementary Table S1, available in JGV Online.
Since there is only one available complete sequence for genotype 2, three genotype alignments were built using CLUSTAL W (Thompson et al., 1994) for genotypes 1, 3 and 4, respectively, followed by manual curation referring to the corresponding genotype amino acid alignments. The four-genotype alignment was constructed by profile alignment. Three sequences (GenBank accession nos AB091395, AB189071 and L08816) were removed from the alignment because they are identical to AB200239, AB189074 and NC_001434, respectively. The final alignment contains 73 sequences of up to 7169 bp (including gaps) and is available in JGV Online.
There are obvious differences in the ORF structure among genotypes. The average and SD values of the ORF structure are listed in Table 1. Genotype 1 and genotype 2 have a similar ORF structure. In genotypes 1, 2 and 3, ORF2 overlaps with ORF3, whilst in genotype 4 ORF2 overlaps with ORF1. ORF1 in genotypes 3 and 4 is on average 40 nt longer than in genotypes 1 and 2.
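The ORF structure described here (the three ORF lengths plus their pairwise overlaps) can be summarized per genotype as a mean and SD. The sketch below is one possible way to compute such a summary; the `Orf` type, the function names and every coordinate are hypothetical placeholders, not the values in Table 1.

```python
from itertools import combinations
from statistics import mean, stdev
from typing import Dict, List, NamedTuple, Tuple


class Orf(NamedTuple):
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def orf_length(orf: Orf) -> int:
    return orf.end - orf.start + 1


def overlap(a: Orf, b: Orf) -> int:
    """Number of nucleotides shared by two ORFs (0 if they do not overlap)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def orf_structure(orfs: Dict[str, Orf]) -> Dict[str, int]:
    """Lengths of all ORFs and overlaps of all ORF pairs for one strain."""
    features = {f"len_{name}": orf_length(orf) for name, orf in orfs.items()}
    for (name_a, a), (name_b, b) in combinations(sorted(orfs.items()), 2):
        features[f"ovl_{name_a}_{name_b}"] = overlap(a, b)
    return features


def summarise(strains: Dict[str, List[Dict[str, Orf]]]) -> Dict[str, Dict[str, Tuple[float, float]]]:
    """Mean and sample SD of every feature, per genotype.
    Each genotype needs at least two strains for an SD."""
    summary = {}
    for genotype, members in strains.items():
        rows = [orf_structure(m) for m in members]
        summary[genotype] = {
            key: (mean(r[key] for r in rows), stdev(r[key] for r in rows))
            for key in rows[0]
        }
    return summary


# Toy coordinates (hypothetical; deliberately far smaller than real HEV ORFs).
toy_strains = {
    "X": [
        {"ORF1": Orf(1, 300), "ORF2": Orf(390, 900), "ORF3": Orf(295, 400)},
        {"ORF1": Orf(1, 306), "ORF2": Orf(396, 906), "ORF3": Orf(301, 406)},
    ],
}

for genotype, stats in summarise(toy_strains).items():
    for feature, (avg, sd) in stats.items():
        print(f"genotype {genotype} {feature}: mean={avg:.1f} SD={sd:.2f}")
```

A strain whose features fall far outside its assigned genotype's mean and SD would be a candidate for closer inspection.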
To test this hypothesis, an alignment was built that includes the DQ450072 sequence and the four genotype-consensus sequences. Parti-on (Fan & Robertson, unpublished tool) and Recco (Maydt & Lengauer, 2006), with its parameter set to 0.2, were used to detect breakpoints.
Parti-on located two optimal breakpoints between genotypes 3 and 4 at position 4749–4758 and 5226–5270 with P-values lower than 0.001 for both breakpoints (Supplementary Fig. S1, available in JGV Online). Recco detected two breakpoints at similar positions. The two breakpoints were located in the RNA-dependent RNA polymerase (RDRP) region in ORF1 and the overlapping region of ORF2 and ORF3, respectively.
To verify the statistical support for the detected recombination events and to infer the phylogenetic relationship for the recombinant, the alignment was divided into three genomic regions according to the detected breakpoints. For each region, a maximum-likelihood (ML) tree was constructed based on the best-fitting evolutionary model determined by MODELTEST (Posada & Crandall, 1998). The reliability of the branching order was verified with 1000 neighbour-joining bootstrap replicates. To apply the one-tailed Kishino–Hasegawa (KH) test and the Shimodaira–Hasegawa (SH) test, the alternative tree for each region was constructed by manually relocating the recombinant into the phylogenetic position of the ML tree(s) inferred from adjacent regions. These phylogenetic analyses were implemented with PAUP* (Swofford, 1999).
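Dividing an alignment at breakpoints can be sketched as follows. Because each breakpoint was located as an interval, this sketch assumes that the interval columns themselves are excluded from the flanking regions; the text does not state how they were handled, and the function names and toy alignment are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive


def region_bounds(length: int, breakpoints: List[Interval]) -> List[Interval]:
    """Regions between breakpoint intervals, excluding the intervals
    themselves (an assumption of this sketch)."""
    regions = []
    region_start = 1
    for bp_start, bp_end in sorted(breakpoints):
        if bp_start <= region_start or bp_end > length:
            raise ValueError("breakpoint interval leaves an empty region")
        regions.append((region_start, bp_start - 1))
        region_start = bp_end + 1
    if region_start > length:
        raise ValueError("last breakpoint leaves an empty final region")
    regions.append((region_start, length))
    return regions


def split_alignment(alignment: Dict[str, str],
                    breakpoints: List[Interval]) -> List[Tuple[Interval, Dict[str, str]]]:
    """Slice every aligned sequence into the regions given by the breakpoints."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    bounds = region_bounds(lengths.pop(), breakpoints)
    return [((start, end), {name: seq[start - 1:end] for name, seq in alignment.items()})
            for start, end in bounds]


# Toy alignment (hypothetical sequences, 20 columns).
toy = {
    "s1": "ACGTACGTACGTACGTACGT",
    "s2": "ACGTACGAACGTACGTACCT",
}
for (start, end), sub in split_alignment(toy, [(6, 8), (14, 15)]):
    print(f"region {start}-{end}:", sub)

# Region coordinates implied by the reported breakpoint intervals under the
# same assumption, for an alignment of 7169 columns.
print(region_bounds(7169, [(4749, 4758), (5226, 5270)]))
```

Each sub-alignment would then be passed separately to the tree-building and topology-testing software.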
The phylogenetic analyses indicated that there are two recombination events between genotype 3 and genotype 4 (Fig. 1), which is statistically supported by both the KH test and the SH test (P<0.001). The presence of recombination events caused the discordant phylogenetic relationships in different regions.
The bootscanning result indicated that the only available complete sequence in genotype 2, the Mexican strain, could be another inter-genotype recombinant.
The procedure used to detect and verify breakpoints in DQ450072 was also applied to the Mexican strain. Two recombination events were detected between genotypes 1 and 3 with strong bootstrap support, located at positions 1847–1856 and 2069–2072 (Fig. 2). However, neither the KH test nor the SH test supported the hypothesis that genotype 2 clusters with genotype 3 rather than genotype 1 in the region 1857–2069 (P=0.097 and 0.104 for the KH test and the SH test, respectively).
The recombinant was revealed by observing the ORF structure of HEV sequences. The ORF structure drew our attention for several reasons. Firstly, the ORF structure is closely related to the commonly used genotyping methods based on nucleotide identities/genetic distances, deduced amino acid similarities and phylogenetic trees. The different lengths of the same ORF in different genotypes generate gaps, which increase the distance and in turn affect the inferred phylogenetic trees. The gap regions between genotypes also decrease the nucleotide and amino acid identities. It has been shown that several new genotypes suggested by genetic distances, sequence identities and phylogenetic trees are not sufficiently distinct to constitute separate genotypes (Schlauder et al., 1999; Schlauder & Mushahwar, 2001; Wang et al., 2000). Therefore the ORF structure can be a new criterion to define HEV genotypes. The small SD values in Table 1 mean that there is little variance among the ORF structures of strains from the same genotype, which indicates the feasibility of using the ORF structure as a criterion for genotyping.
Secondly, because ORF structures differ among genotypes, the regions of genotype-specific gaps can be used to design genotype-specific primers. Two regions contain the most genotype-specific variance for primer design: one is in the hypervariable region and the other is located at the overlapping region of ORF1, 2 and 3. Compared with the restriction endonuclease genotyping method, primers can provide more flexibility, reliability and accuracy, considering that the usual lengths of restriction endonuclease sites and primers are 6–8 bp and 25–30 bp, respectively.
Finally, the target organisms that HEVs can infect differ among genotypes (Supplementary Table S1). Genotypes 1 and 2 infect only primates (Meng et al., 1998), genotype 4 can infect swine and humans and genotype 3 has the widest target range. In addition, genotypes 3 and 4 are believed to be attenuated (Emerson & Purcell, 2003), which is thought to be why genotype 3 and 4 strains normally cause sporadic cases whereas genotype 1 usually causes epidemic infections. Considering the different ORF structures observed among genotypes, several hypotheses can be made. For example, (i) does the longer ORF1 give the ability to infect organisms other than primates, at the cost of losing the ability to cause epidemic infections? (ii) Does the longer ORF3 provide genotype 3 with the ability to infect animals other than swine?
In this research, the ORF structure has been shown to be helpful in indicating a putative inter-genotype recombinant which was confirmed by phylogenetic analyses. However, this indication is limited to recombinants where recombination events change the ORF structure.
In the bootscanning analysis, the Mexican strain was indicated to be another recombinant. The phylogenetic analyses supported this hypothesis with a high bootstrap value (Fig. 2). HEV appears to have a low inter-genotype recombination rate, because there has been only one confirmed recombinant since the first full-length HEV sequence became available in 1991 (Tam et al., 1991). The recombinant hypothesis for the Mexican strain may partly explain why none of the recently fully sequenced strains belongs to genotype 2. The fact that the Mexican strain, which could be a recombinant of genotypes 1 and 3, was allocated to a separate genotype before the US strains (genotype 3) were defined as the prototype sequences of the third genotype further supports this recombinant hypothesis.
However, the statistical tests do not support the Mexican strain clustering with genotype 3 rather than genotype 1 in the region 1857–2069. The putative breakpoints are located in the hypervariable region in ORF1. Recombination can be disguised by regions with a high mutational rate (Anderson et al., 2000), and vice versa. The long branch lengths of the middle region in Fig. 2 indicate that it is a region with a high mutational rate. Nevertheless, genotype 2 may still be a recombinant because both the KH and SH tests are conservative. To determine whether genotype 2 is a pure genotype, more full-length genotype 2 sequences are required in order to obtain more reliable results.
Detection of positive selection in HEV was also carried out in this study using the CodeML program in the PAML package (Yang et al., 2005). There are two, five and four sites under positive selection across the genome for genotypes 1, 3 and 4, respectively. Both the estimated value (marginally greater than 1) and the number of selected sites indicate that evolution in HEV is mainly due to point mutations. Antibody against HEV and an HEV vaccine are effective in the prevention of hepatitis E (Bryan et al., 1994; Shrestha et al., 2007), which indicates that neither positive selection nor recombination has generated vaccine escape mutations. Recombination is therefore not necessary for the virus, which is a plausible explanation of why only a few HEV recombinants have been observed.
In conclusion, the ORF structure can be a good criterion for genotyping HEV. An inter-genotype HEV recombinant, DQ450072, was identified by the ORF scheme. The genotype 2 Mexican strain is a suspected inter-genotype recombinant; more HEV sequences are required to confirm this.
| ACKNOWLEDGEMENTS |
|---|
| REFERENCES |
|---|
Arankalle, V. A., Paranjape, S., Emerson, S. U., Purcell, R. H. & Walimbe, A. M. (1999). Phylogenetic analysis of hepatitis E virus isolates from India (1976–1993). J Gen Virol 80, 1691–1700.
Aye, T. T., Uchida, T., Ma, X. Z., Iida, F., Shikata, T., Zhuang, H. & Win, K. M. (1992). Complete nucleotide sequence of a hepatitis E virus isolated from the Xinjiang epidemic (1986–1988) of China. Nucleic Acids Res 20, 3512.
Bradley, D. W. (1990a). Enterically-transmitted non-A, non-B hepatitis. Br Med Bull 46, 442–461.
Bradley, D. W. (1990b). Hepatitis non-A, non-B viruses become identified as hepatitis C and E viruses. Prog Med Virol 37, 101–135.
Bryan, J. P., Tsarev, S. A., Iqbal, M., Ticehurst, J., Emerson, S., Ahmed, A., Duncan, J., Rafiqui, A. R. & other authors (1994). Epidemic hepatitis E in Pakistan: patterns of serologic response and evidence that antibody to hepatitis E virus protects against disease. J Infect Dis 170, 517–521.
Buisson, Y., Grandadam, M., Nicand, E., Cheval, P., van Cuyck-Gandre, H., Innis, B., Rehel, P., Coursaget, P., Teyssou, R. & Tsarev, S. (2000). Identification of a novel hepatitis E virus in Nigeria. J Gen Virol 81, 903–909.
Chatterjee, R., Tsarev, S., Pillot, J., Coursaget, P., Emerson, S. U. & Purcell, R. H. (1997). African strains of hepatitis E virus that are distinct from Asian strains. J Med Virol 53, 139–144.
Coursaget, P., Buisson, Y., N'Gawara, M. N., Van Cuyck-Gandre, H. & Roue, R. (1998). Role of hepatitis E virus in sporadic cases of acute and fulminant hepatitis in an endemic area (Chad). Am J Trop Med Hyg 58, 330–334.
Cubitt, D., Bradley, D. W., Carter, M. J., Chiba, S., Estes, M. K., Saif, L. J., Schaffer, F. L., Smith, A. W., Studdert, M. J. & Thiel, H. J. (1995). Family Caliciviridae. In Virus Taxonomy: Classification and Nomenclature of Viruses: Sixth Report of the International Committee on Taxonomy of Viruses [for the] Virology Division, International Union of Microbiological Societies, pp. 359–363. Edited by F. A. Murphy, C. M. Fauquet, D. H. L. Bishop, S. A. Ghabrial, A. W. Jarvis, G. P. Martelli, M. A. Mayo & M. D. Summers. Wien, New York: Springer-Verlag.
Emerson, S. U. & Purcell, R. H. (2003). Hepatitis E virus. Rev Med Virol 13, 145–154.
Emerson, S. U., Anderson, D., Arankalle, A., Meng, X. J., Purdy, M., Schlauder, G. G. & Tsarev, S. A. (2005). Hepevirus. In Virus Taxonomy: Classification and Nomenclature of Viruses: Eighth Report of the International Committee on the Taxonomy of Viruses, pp. 853–857. Edited by C. M. Fauquet, M. A. Mayo, J. Maniloff, U. Desselberger & L. A. Ball. London; San Diego, CA: Elsevier/Academic Press.
Erker, J. C., Desai, S. M., Schlauder, G. G., Dawson, G. J. & Mushahwar, I. K. (1999). A hepatitis E virus variant from the United States: molecular characterization and transmission in cynomolgus macaques. J Gen Virol 80, 681–690.
Feinstone, S. M. & Purcell, R. H. (1978). Non-A, non-B hepatitis. Annu Rev Med 29, 359–366.
Gouvea, V., Hoke, C. H. & Innis, B. L. (1998). Genotyping of hepatitis E virus in clinical specimens by restriction endonuclease analysis. J Virol Methods 70, 71–78.
Hsieh, S. Y., Yang, P. Y., Ho, Y. P., Chu, C. M. & Liaw, Y. F. (1998). Identification of a novel strain of hepatitis E virus responsible for sporadic acute hepatitis in Taiwan. J Med Virol 55, 300–304.
Huang, C.-C., Nguyen, D., Fernandez, J., Yun, K. Y., Fry, K. E., Bradley, D. W., Tam, A. W. & Reyes, G. R. (1992). Molecular cloning and sequencing of the Mexico isolate of hepatitis E virus (HEV). Virology 191, 550–558.
Koonin, E. V., Gorbalenya, A. E., Purdy, M. A., Rozanov, M. N., Reyes, G. R. & Bradley, D. W. (1992). Computer-assisted assignment of functional domains in the nonstructural polyprotein of hepatitis E virus: delineation of an additional group of positive-strand RNA plant and animal viruses. Proc Natl Acad Sci U S A 89, 8259–8263.
Maydt, J. & Lengauer, T. (2006). Recco: recombination analysis using cost optimization. Bioinformatics 22, 1064–1071.
Meng, X. J., Halbur, P. G., Haynes, J. S., Tsareva, T. S., Bruna, J. D., Royer, R. L., Purcell, R. H. & Emerson, S. U. (1998). Experimental infection of pigs with the newly identified swine hepatitis E virus (swine HEV), but not with human strains of HEV. Arch Virol 143, 1405–1415.
Posada, D. & Crandall, K. A. (1998). MODELTEST: testing the model of DNA substitution. Bioinformatics 14, 817–818.
Schlauder, G. G. & Mushahwar, I. K. (2001). Genetic heterogeneity of hepatitis E virus. J Med Virol 65, 282–292.
Schlauder, G. G., Dawson, G. J., Erker, J. C., Kwo, P. Y., Knigge, M. F., Smalley, D. L., Rosenblatt, J. E., Desai, S. M. & Mushahwar, I. K. (1998). The sequence and phylogenetic analysis of a novel hepatitis E virus isolated from a patient with acute hepatitis reported in the United States. J Gen Virol 79, 447–456.
Schlauder, G. G., Desai, S. M., Zanetti, A. R., Tassopoulos, N. C. & Mushahwar, I. K. (1999). Novel hepatitis E virus (HEV) isolates from Europe: evidence for additional genotypes of HEV. J Med Virol 57, 243–251.
Shrestha, M. P., Scott, R. M., Joshi, D. M., Mammen, M. P., Jr, Thapa, G. B., Thapa, N., Myint, K. S., Fourneau, M., Kuschner, R. A. & other authors (2007). Safety and efficacy of a recombinant hepatitis E vaccine. N Engl J Med 356, 895–903.
Swofford, D. (1999). PAUP* phylogenetic analysis using parsimony (*and other methods). Sunderland, MA: Sinauer Associates.
Takahashi, M., Nishizawa, T., Yoshikawa, A., Sato, S., Isoda, N., Ido, K., Sugano, K. & Okamoto, H. (2002). Identification of two distinct genotypes of hepatitis E virus in a Japanese patient with acute hepatitis who had not travelled abroad. J Gen Virol 83, 1931–1940.
Tam, A. W., Smith, M. M., Guerra, M. E., Huang, C.-C., Bradley, D. W., Fry, K. E. & Reyes, G. R. (1991). Hepatitis E virus (HEV): molecular cloning and sequencing of the full-length viral genome. Virology 185, 120–131.
Thompson, J. D., Higgins, D. G. & Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22, 4673–4680.
Wang, Y., Ling, R., Erker, J. C., Zhang, H., Li, H., Desai, S., Mushahwar, I. K. & Harrison, T. J. (1999). A divergent genotype of hepatitis E virus in Chinese patients with acute hepatitis. J Gen Virol 80, 169–177.
Wang, Y., Zhang, H., Ling, R., Li, H. & Harrison, T. J. (2000). The complete sequence of hepatitis E virus genotype 4 reveals an alternative strategy for translation of open reading frames 2 and 3. J Gen Virol 81, 1675–1686.
Worm, H. C., van der Poel, W. H. M. & Brandstatter, G. (2002). Hepatitis E: an overview. Microbes Infect 4, 657–666.
Yang, Z., Wong, W. S. W. & Nielsen, R. (2005). Bayes empirical Bayes inference of amino acid sites under positive selection. Mol Biol Evol 22, 1107–1118.
Zanetti, A. R., Schlauder, G. G., Romano, L., Tanzi, E., Fabris, P., Dawson, G. J. & Mushahwar, I. K. (1999). Identification of a novel variant of hepatitis E virus in Italy. J Med Virol 57, 356–360.
Zhai, L., Dai, X. & Meng, J. (2006). Hepatitis E virus genotyping based on full-length genome and partial genomic regions. Virus Res 120, 57–69.
Received 5 December 2008;
accepted 17 February 2009.