Short Communication
Victorian Infectious Diseases Reference Laboratory, 10 Wreckyn Street, North Melbourne, Victoria 3051, Australia
Correspondence
Doris Chibo
Doris.chibo{at}mh.org.au
| ABSTRACT |
|---|
The GenBank/EMBL/DDBJ accession numbers for the sequences reported in this paper are DQ243940–DQ243962 and DQ243964–DQ243987.
A figure showing the phylogenetic analysis using PHYLIP on HCoV-229E isolates and strains identified between 1972 and 2004, and tables showing primers used and their nucleotide position relative to the HCoV-229E complete genome and the summary of amino acid changes in the S and N proteins of HCoV-229E variants studied are available as supplementary material in JGV Online.
| MAIN TEXT |
|---|
Coronaviruses contain four structural proteins, spike (S), membrane (M), small envelope (E) and nucleocapsid (N) (Brian & Baric, 2005). A fifth, the haemagglutinin-esterase (HE) protein, exists in some serogroup 2 coronaviruses (Holmes & Lai, 1996). Limited functional and structural information exists for the HCoV-229E S protein, although receptor-binding activity is associated with the S1 subunit (Bonavia et al., 2003). In the related non-human Infectious bronchitis virus, S1 is involved in the induction of neutralizing, serotype-specific and haemagglutination-inhibiting antibodies (Cavanagh et al., 1988).
Both recombination and point mutations have the capacity to drive coronavirus evolution (Navas-Martin & Weiss, 2003). Although recombination has not been detected for HCoV-229E or HCoV-OC43, a recombination breakpoint in the SARS-CoV polymerase has been described (Rest & Mindell, 2003). Few studies on variation in the human coronavirus S protein have been reported. One study of three geographically and chronologically distinct HCoV-229E isolates found limited variation within S gene nucleotide sequences (Hays & Myint, 1998). In contrast, two distinct variants of HCoV-OC43 circulated in Belgium between 2003 and 2004 (Vijgen et al., 2005).
Prior to 2001, coronaviruses were identified in our laboratory by isolation in human fibroblasts, followed by confirmatory immune electron microscopy where these agents were suspected. Virus isolation attempts on respiratory specimens were discontinued in 2001 and replaced by an RT-PCR capable of differentiating between HCoV-229E and HCoV-OC43 (Birch et al., 2005). RNA was extracted from virus isolates and clinical material by using a High Pure Viral Nucleic Acid kit (Roche Diagnostics) and reverse transcribed with random hexamers using avian myeloblastosis virus reverse transcriptase (Promega) at 42 °C for 1 h. The full S and N gene sequences of HCoV-229E were amplified in a nested PCR (Supplementary Table S1 available in JGV Online) using a Qiagen Taq DNA polymerase kit (Qiagen). The cycling programmes consisted of an initial denaturation of 4 min at 94 °C, followed by 40 cycles (first round) or 25 cycles (second round) of 30 s at 94 °C, 30 s at 60 °C and 130 s (S gene) or 90 s (N gene) at 72 °C, with a final extension of 7 min at 72 °C. RT-PCR products were purified, sequenced in both directions using a cycle sequencing reaction (ABI Prism Big Dye Terminator Cycle Sequencing Ready Reaction kit; Perkin-Elmer) and analysed by using an ABI 3730S capillary sequencer.
Nucleotide sequences were analysed and amino acid sequences determined using BIOEDIT sequence alignment editor version 7.0.1 (Hall, 1999). Alignments of nucleotide sequences were made with Multalin (Corpet, 1988) and manual editing of alignments was performed using Genedoc version 2.5 (Nicholas & Nicholas, 1997). Expected transition/transversion ratios and gamma distribution parameter alpha were estimated using TREEPUZZLE version 5.2 (Schmidt et al., 2002). Phylogenetic trees based on the optimum alignment were constructed using DNAdist and the neighbour-joining method in PHYLIP version 3.63 (Felsenstein, 1993), with parameters estimated from TREEPUZZLE. Unrooted phylograms were drawn with TREEVIEW version 1.5 (Page, 1996). Tests of selection were conducted using MEGA version 3.1 (Kumar et al., 2004). The codon-based model of Nei–Gojobori was used to compare synonymous (dS) and non-synonymous (dN) distances. Assumptions tested were purifying (dN<dS), neutral (dN=dS) and positive (dN>dS) selection. A hypothesis was rejected at the 5 % level when the computed probability was <0·05. Potential N-glycosylation sites in the HCoV-229E S protein were predicted using the NetNGlyc 1.0 Server at the Centre for Biological Sequence Analysis (http://www.cbs.dtu.dk/services/NetNGlyc/).
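As a rough illustration of what an N-glycosylation prediction starts from, the sketch below scans a protein sequence for the N-X-S/T sequon (X being any residue except proline). It is only a motif search: NetNGlyc additionally scores each sequon, which is not reproduced here. The function name `find_sequons` and the toy peptide are assumptions made for illustration, not HCoV-229E data.

```python
import re

# Sequon N-X-S/T, where X is any residue except proline. A lookahead is used
# so that overlapping sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Toy peptide (hypothetical, not an HCoV-229E sequence).
toy_peptide = "MNGSANPTQNNTS"
print(find_sequons(toy_peptide))
```

A sequon is necessary but not sufficient for glycosylation, so positions returned by a scan like this are candidates only.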
Between 1979 and 2004, HCoV-229E was isolated or detected by RT-PCR in 25 patients aged from 0·3 to 58 years (Table 1). Patients suffered from symptoms varying from mild upper respiratory tract infection to pneumonia, and three (patients 12, 18 and 19) were hospitalized. Several presented with fever, fatigue and cough, consistent with a case definition of influenza-like illness (Clothier et al., 2005). A total of 25 HCoV-229E sequences were analysed. Twenty-one were full-length S gene sequences, 13 from virus isolates and eight directly from clinical material. Partial S gene sequences were obtained from two samples of clinical material (patients 19 and 25 in Table 1). S gene sequences could not be obtained from the viruses detected in patients 18 and 24. All S gene sequences available were compared to an ATCC prototype strain (HCoV-229E 1973; GenBank accession no. DQ243963, containing 3522 nt, 1173 aa).
The entire sequence of the N gene was obtained for 23 of 25 HCoV-229E variants; compared with the S gene, there was less heterogeneity in this gene and the most divergent variant (from patient 22 in Table 1) differed by only 25 nt (9 aa) from the 1973 prototype strain (HCoV-229E 1973; GenBank accession no. DQ243939). Seven amino acid changes, four of which were non-conservative, were shared by each variant. Four changes clustered in a hot spot between aa 224 and 229 (Supplementary Table S3 available in JGV Online). Deletions were detected in only two N gene sequences. The first was a three-base (GGT) in-frame deletion at nt position 480 in the virus from patient 3, resulting in the deletion of glycine at aa 160. The second was the deletion of a single A from a string of five As starting at nt 611 in the variant from patient 14, producing a frameshift mutation and generating a stop codon 52 aa downstream. This sequence was generated by direct RT-PCR from a clinical specimen and may be the result of amplification of non-infectious viral RNA.
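The difference between the two kinds of deletion can be sketched in a few lines: removing three bases drops one residue but leaves the reading frame intact, whereas removing a single base shifts the frame and can expose a premature stop codon. The sequence below is a hypothetical toy ORF, not the HCoV-229E N gene, and the helper names `translate` and `delete_bases` are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(nt: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(nt) - 2, 3):
        aa = CODON_TABLE[nt[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete_bases(nt: str, start: int, length: int) -> str:
    """Remove `length` bases beginning at 1-based position `start`."""
    return nt[:start - 1] + nt[start - 1 + length:]


# Hypothetical toy ORF: ATG GCT GGT AAA AAC CTG ATC GTA TAA
toy_orf = "ATGGCTGGTAAAAACCTGATCGTATAA"

in_frame = delete_bases(toy_orf, 7, 3)     # removes the GGT codon (Gly)
frameshift = delete_bases(toy_orf, 10, 1)  # removes one A from the run of five As

print(translate(toy_orf))     # full-length toy protein
print(translate(in_frame))    # one residue shorter, same frame
print(translate(frameshift))  # frame shifted, truncated at an early stop
```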
Phylogenetic analysis of the S gene assigned the 23 variants to four distinct groups, each containing temporally associated viruses (Fig. 1). The groups comprised viruses circulating during the periods 1979–1982, 1982–1984, 1990–1992 and 2001–2004. The expected transition/transversion ratio and gamma distribution parameter alpha were estimated as 1·62 and 0·16, respectively, from the dataset using TREEPUZZLE (Schmidt et al., 2002). These values were used in DNAdist (Kimura) to produce a more accurate tree topology. A strong correlation between bootstrap values and these groupings was found. The expected transition/transversion ratio and gamma distribution parameter alpha values estimated using the N gene dataset were 1·81 and 0·19, respectively. Phylogenetic analysis of the N gene showed clustering of variants similar to that obtained with the S gene. However, less sequence variation in the N gene made it difficult to resolve the viruses into four groups. Rather, two major clusters comprising groups 1 and 2, and groups 3 and 4, respectively, were identified (Supplementary Fig. S1 available in JGV Online).
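To make the transition/transversion ratio concrete, the sketch below simply counts observed transitions (purine to purine, or pyrimidine to pyrimidine) and transversions between two aligned sequences. This is not how TREEPUZZLE arrives at its expected ratio, which is a model-based estimate over the whole dataset; the raw pairwise count is shown only to illustrate the quantity. The function name and the toy sequences are assumptions.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Count observed transitions and transversions between aligned sequences.

    Columns containing gaps or ambiguity codes are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Hypothetical toy alignment, not study data.
ts, tv = count_ts_tv("AAGCT-", "GACTA-")
print(ts, tv)
```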
Alignment of S protein sequences and phylogenetic analysis of the S gene revealed several discrete time points where amino acid substitutions occurred and were then retained during subsequent years. These mutations may represent the outcome of natural evolution of the virus in the context of antibody pressure. Comparing rates of dS and dN nucleotide substitutions can test three forms of evolutionary selection on a gene. Equal dS and dN substitution rates demonstrate an absence of selection, whereas an excess of dS substitutions indicates purifying selection, operating to preserve protein structure and function. Excess dN substitution rates indicate positive selection and occur mainly in genes with adaptive functions (Bush, 2001). Most positively selected genes previously identified have been pathogen surface proteins (Yang & Bielawski, 2000). To avoid recognition by antibodies generated in response to prior antigen exposure, surface proteins of pathogens must change their structure. Hence, it is likely that evasion of the host immune system drives repeated amino acid replacements in surface proteins (Bush, 2001). Using the Nei–Gojobori method for estimating statistical differences between dS and dN substitution rates (Kumar et al., 2004), we confirmed that the HCoV-229E S gene sequences had undergone positive selection over time. Probabilities of 0·00 were obtained for purifying and neutral selection and both hypotheses were rejected. A probability of 1·0 was obtained when testing for positive selection and this hypothesis was therefore accepted.
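The dS and dN quantities can be illustrated with a simplified counting sketch in the spirit of the Nei–Gojobori approach. It is not the MEGA implementation and makes several labelled simplifications: codon pairs differing at more than one position are skipped rather than averaged over mutational pathways, changes to stop codons are counted as non-synonymous when tallying sites, and no significance test is performed. All function names and the toy sequences are assumptions.

```python
from itertools import product
from math import log

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon: str) -> float:
    """Synonymous sites in a codon: per position, the fraction of the three
    possible base changes that leave the amino acid unchanged."""
    s = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                s += 1 / 3
    return s


def count_sites_and_diffs(seq_a: str, seq_b: str) -> dict:
    """Tally sites (S, N) and differences (Sd, Nd) over in-frame codon pairs.

    Simplification: pairs with gaps, ambiguity codes or more than one
    differing base are skipped and reported in 'skipped'.
    """
    S = Sd = Nd = 0.0
    used = skipped = 0
    for i in range(0, min(len(seq_a), len(seq_b)) - 2, 3):
        ca, cb = seq_a[i:i + 3], seq_b[i:i + 3]
        if ca not in CODON_TABLE or cb not in CODON_TABLE:
            skipped += 1
            continue
        diffs = sum(x != y for x, y in zip(ca, cb))
        if diffs > 1:
            skipped += 1
            continue
        used += 1
        S += (syn_sites(ca) + syn_sites(cb)) / 2
        if diffs == 1:
            if CODON_TABLE[ca] == CODON_TABLE[cb]:
                Sd += 1
            else:
                Nd += 1
    return {"S": S, "N": 3 * used - S, "Sd": Sd, "Nd": Nd, "skipped": skipped}


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor correction of a proportion of differences (p < 0.75)."""
    return -0.75 * log(1 - 4 * p / 3)


# Hypothetical toy codon alignment (upper-case DNA), not study data.
toy_a = "ATGGGTAAA"
toy_b = "ATGGGCAGA"
counts = count_sites_and_diffs(toy_a, toy_b)
dS = jukes_cantor(counts["Sd"] / counts["S"])
dN = jukes_cantor(counts["Nd"] / counts["N"])
print(counts, dN, dS)
```

In this toy pair one synonymous and one non-synonymous difference are spread over far fewer synonymous than non-synonymous sites, so dS exceeds dN; whether dN exceeds dS significantly in real data requires a variance estimate and test such as the one provided by MEGA.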
We could not show evidence for the emergence of distinct subtypes of HCoV-229E as recently reported for HCoV-OC43, where strains circulating in Belgium between 2003 and 2004 showed sufficient unrelatedness at the amino acid level (3·1 %) to suggest the existence of two genetically distinct strains (Vijgen et al., 2005). For the HCoV-229E variants we studied, the overall amino acid difference between the 1979 strain and the 2004 strain was 3·3 %. When apparent antigenic drift occurred, the extent of amino acid difference between the old variant and its replacement was less than 1 %.
When the HCoV-229E S gene sequences from this study were compared to 26 sequences available through GenBank, only two variants were similar. The first, strain A162 (accession no. Y10051), was isolated in 1995 from an adult in Ghana (Hays & Myint, 1998). While the submitted sequence was 483 nt short of full-length, it grouped phylogenetically with coronaviruses circulating in Australia between 1990 and 1992 (Fig. 1). Also similar was strain 1318, first isolated in 1999 in the USA. Although the sequence available was less than half that of the complete S gene, this virus grouped with variants circulating between 2001 and 2004 in Australia (Fig. 1). Hence, the time of circulation of distinct HCoV-229E variants in south-eastern Australia generally coincided with that of variants circulating elsewhere.
Overall, sequencing and analysis of the S and N gene products of HCoV-229E strains circulating in Victoria, Australia, between 1979 and 2004 has provided the first evidence for genetic drift and positive selection as part of the evolution of this virus. The similarity between those strains circulating in Victoria and a small number of strains identified in other geographical locations at similar times indicates that HCoV-229E, despite having the potential, has not undergone major recombination events since it was first isolated in 1967.
| REFERENCES |
|---|
Bonavia, A., Zelus, B. D., Wentworth, D. E., Talbot, P. J. & Holmes, K. V. (2003). Identification of a receptor-binding domain of the spike glycoprotein of human coronavirus HCoV-229E. J Virol 77, 2530–2538.
Brian, D. A. & Baric, R. S. (2005). Coronavirus genome structure and replication. Curr Top Microbiol Immunol 287, 1–30.
Bush, R. M. (2001). Predicting adaptive evolution. Nat Rev Genet 2, 387–392.
Cavanagh, D., Davis, P. J. & Mockett, A. P. (1988). Amino acids within hypervariable region 1 of avian coronavirus IBV (Massachusetts serotype) spike glycoprotein are associated with neutralization epitopes. Virus Res 11, 141–150.
Clothier, H. J., Fielding, J. E. & Kelly, H. A. (2005). An evaluation of the Australian Sentinel Practice Research Network (ASPREN) surveillance for influenza-like illness. Commun Dis Intell 29, 231–247.
Corpet, F. (1988). Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res 16, 10881–10890.
Drosten, C., Gunther, S., Preiser, W. & 23 other authors (2003). Identification of a novel coronavirus in patients with severe acute respiratory syndrome. N Engl J Med 348, 1967–1976.
Felsenstein, J. (1993). PHYLIP: phylogeny inference package (version 3.5c). Department of Genetics, University of Washington, Seattle, WA, USA.
Hall, T. A. (1999). BIOEDIT: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41, 95–98.
Hays, J. P. & Myint, S. H. (1998). PCR sequencing of the spike genes of geographically and chronologically distinct human coronaviruses 229E. J Virol Methods 75, 179–193.
Holmes, K. V. (2001). Coronaviruses. In Fields Virology, 3rd edn, pp. 1187–1203. Edited by B. N. Fields, D. M. Knipe & P. M. Howley. Philadelphia: Lippincott Williams & Wilkins.
Holmes, K. V. & Lai, M. M. C. (1996). Coronaviridae: the viruses and their replication. In Fields Virology, 3rd edn, vol. 1, pp. 1075–1093. Edited by B. N. Fields, D. M. Knipe & P. M. Howley. Philadelphia: Lippincott Raven.
Kumar, S., Tamura, K. & Nei, M. (2004). MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief Bioinform 5, 150–163.
Lai, M. M. (1990). Coronavirus: organization, replication and expression of genome. Annu Rev Microbiol 44, 303–333.
McIntosh, K. (1996). Coronaviruses. In Fields Virology, 3rd edn, vol. 1, pp. 1095–1103. Edited by B. N. Fields, D. M. Knipe & P. M. Howley. Philadelphia: Lippincott Raven.
Navas-Martin, S. & Weiss, S. R. (2003). SARS: lessons learned from other coronaviruses. Viral Immunol 16, 461–474.
Nicholas, K. B. & Nicholas, H. B., Jr (1997). Genedoc: a tool for editing and annotating multiple sequence alignments. Distributed by author, 2·6·002 edn. http://www.psc.edu/biomed/genedoc
Page, R. D. M. (1996). TREEVIEW: an application to display phylogenetic trees on personal computers. Comput Appl Biosci 12, 357–358.
Pene, F., Merlat, A., Vabret, A., Rozenberg, F., Buzyn, A., Dreyfus, F., Cariou, A., Freymuth, F. & Lebon, P. (2003). Coronavirus 229E-related pneumonia in immunocompromised patients. Clin Infect Dis 37, 929–932.
Rest, J. S. & Mindell, D. P. (2003). SARS associated coronavirus has a recombinant polymerase and coronaviruses have a history of host-shifting. Infect Genet Evol 3, 219–225.
Schmidt, H. A., Strimmer, K., Vingron, M. & von Haeseler, A. (2002). TREEPUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18, 502–504.
Siddell, S. G. (1995). The Coronaviridae: an introduction. In The Coronaviridae. Edited by S. G. Siddell. New York: Plenum Press.
Snijder, E. J., Bredenbeek, P. J., Dobbe, J. C. & 7 other authors (2003). Unique and conserved features of genome and proteome of SARS-coronavirus, an early split-off from the coronavirus group 2 lineage. J Mol Biol 331, 991–1004.
Vabret, A., Mourez, T., Gouarin, S., Petitjean, J. & Freymuth, F. (2003). An outbreak of coronavirus OC43 respiratory infection in Normandy, France. Clin Infect Dis 36, 985–989.
van der Hoek, L., Pyrc, K., Jebbink, M. F. & 7 other authors (2004). Identification of a new human coronavirus. Nat Med 10, 368–373.
Vijgen, L., Keyaerts, E., Lemey, P., Moes, E., Li, S., Vandamme, A. M. & Van Ranst, M. (2005). Circulation of genetically distinct contemporary human coronavirus OC43 strains. Virology 337, 85–92.
Woo, P. C., Lau, S. K., Chu, C. M. & 12 other authors (2005). Characterization and complete genome sequence of a novel coronavirus, coronavirus HKU1, from patients with pneumonia. J Virol 79, 884–895.
Yang, Z. & Bielawski, J. P. (2000). Statistical methods for detecting molecular adaptation. Trends Ecol Evol 15, 496–503.
Received 7 November 2005;
accepted 16 January 2006.