1 Laboratory of Clinical Virology, Department of Medical Microbiology, Center for Infection and Immunity Amsterdam (CINIMA), Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands
2 Laboratory of Experimental Virology, Department of Medical Microbiology, CINIMA, Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands
3 Department of Virology, Sanquin, Amsterdam, The Netherlands
4 Laboratory of Immunochemistry, D. I. Ivanovsky Institute of Virology, Russian Academy of Medical Sciences, Moscow, Russia
Correspondence
Vladimir V. Lukashov
v.lukashov{at}amc.uva.nl
| ABSTRACT |
|---|
A list of genotypes and GenBank accession numbers for the human HBV sequences used in this study is available with the online version of this paper.
| INTRODUCTION |
|---|
However, for HBV to persist successfully and to overcome new challenges, adaptive evolution of both the overlapping P and S genes is essential. In particular, adaptive evolution of the P gene in response to antiviral therapy renders the virus resistant to antiviral drugs, and mutations in the S gene allow virus escape from neutralizing antibodies (Cooreman et al., 2001; Hannoun et al., 2000; Moskovitz et al., 2005). In this study, we analysed the possibility that the HBV polymerase and surface proteins evolve independently, despite being encoded by the same nucleotide sequence.
We hypothesized that HBV may employ a mechanism that allows the independent adaptive evolution of both proteins encoded by the same sequence. In the overlapping, frame-shifted polymerase/surface region of the HBV genome, the first nucleotide position in a P codon corresponds to the third position in the S codon (p1/s3), the second position in a P codon to the first position in the S codon (p2/s1) and the third position in a P codon to the second position in the S codon (p3/s2) (Fig. 2). The position of a nucleotide substitution within a codon strongly influences the chance for the substitution to be synonymous or not. A nucleotide substitution in the first codon position causes an amino acid change in 60 of 64 cases, in the second position in 63 of 64 cases and in the third position in only 16 of 64 cases. Hence, nucleotide substitution in the first position of a P codon (p1/s3) is likely to cause amino acid changes in P, but not in S, whereas substitutions in the third position of a P codon (p3/s2) are probably synonymous in P, but non-synonymous in S. Thus, an adaptive, non-synonymous nucleotide substitution in the first position of a P codon is likely to remain synonymous in S, whereas an adaptive, non-synonymous nucleotide substitution in the second position of an S codon is likely to remain synonymous in P. On the other hand, nucleotide substitutions in the p2/s1 positions in most cases will be non-synonymous in both genes. We predicted that adaptive evolution of HBV occurs via p1/s3 mutations in the P gene and via p3/s2 mutations in the S gene, and that p2/s1 mutations are rare.
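The frame relationship and the position dependence of synonymous change described above can be illustrated with a minimal Python sketch. It is not the calculation used in this study: the function names are hypothetical, the standard genetic code is assumed, and every single-nucleotide change in all 64 codons is counted (with stop codons treated as a residue), so the fractions it prints depend on that counting convention and need not match the per-position figures quoted in the text.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def s_position(p_position: int) -> int:
    """Map a P codon position (1-3) to the S codon position it shares
    (p1 -> s3, p2 -> s1, p3 -> s2), as described in the text."""
    return (p_position - 2) % 3 + 1


def nonsynonymous_fraction(position: int) -> float:
    """Fraction of single-nucleotide changes at a codon position (1-3)
    that alter the encoded residue. Assumption: all 64 codons are used
    and a stop codon counts as a distinct residue."""
    changed = total = 0
    for codon, aa in CODON_TABLE.items():
        for base in BASES:
            if base == codon[position - 1]:
                continue  # not a substitution
            mutant = codon[:position - 1] + base + codon[position:]
            total += 1
            changed += CODON_TABLE[mutant] != aa
    return changed / total


for p in (1, 2, 3):
    s = s_position(p)
    print(f"p{p}/s{s}: non-synonymous fraction in P "
          f"{nonsynonymous_fraction(p):.2f}, in S {nonsynonymous_fraction(s):.2f}")
```

Under this convention, second-position changes are the most likely to alter the residue and third-position changes the least, which is the asymmetry the hypothesis relies on.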
| METHODS |
|---|
The overall mean synonymous (Ds) and non-synonymous (Dn) distances in the overlapping region of the P and S genes were calculated by using the MEGA 3 software (Kumar et al., 2004). The Nei–Gojobori method with Jukes–Cantor correction was used; standard error was calculated by using bootstrap resampling with 1000 replications.
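As an illustration of the Jukes–Cantor correction applied to Nei–Gojobori proportions, the following Python sketch converts observed proportions of synonymous and non-synonymous differences into corrected distances. It is not MEGA's implementation; the function names and the example counts are hypothetical, and the counting of synonymous and non-synonymous sites is assumed to have been done beforehand.

```python
import math


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance for a proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def corrected_distances(syn_diffs, syn_sites, nonsyn_diffs, nonsyn_sites):
    """Return (Dn, Ds) from counts of differences and sites.

    The proportions pS = Sd/S and pN = Nd/N are each corrected
    with the Jukes-Cantor formula.
    """
    ds = jukes_cantor(syn_diffs / syn_sites)
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    return dn, ds


# Hypothetical counts for one pair of sequences (not data from this study).
dn, ds = corrected_distances(syn_diffs=12, syn_sites=150,
                             nonsyn_diffs=10, nonsyn_sites=528)
print(f"Dn = {dn:.4f}, Ds = {ds:.4f}, Dn/Ds = {dn / ds:.3f}")
```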
The CODEML module of PAML 3.15 (Yang, 1997; Yang et al., 2000) was applied for estimation of the rates of synonymous and non-synonymous nucleotide substitutions at sites in the overlapping reading frames of P and S separately by maximum-likelihood (ML) approximation. The nested site models (1 and 2, 7 and 8) of PAML were run on an array of 680 computers called LISA, hosted by SARA Computing and Networking Services, Amsterdam, The Netherlands. Each computer (node) in the array was provided with two Intel Xeon processors working at 3.4 GHz on 24 GB EM64T memory under OS Debian Linux. Jobs submitted individually consisted of a single model preceded by model 0 in order to trace inconsistency of input data. The approach allowed for the simultaneous estimation of Dn/Ds parameters by parallel jobs assigned to different nodes of the array. When parameter trapping near the border of the parameter space did not occur, it took the most complex model, 8 [default settings (Yang et al., 2000)], 9–13 h to converge to an optimal likelihood estimate, given 400 HBV sequences of 226 codons each. For proper convergence, the presence of identical sequences (Dn/Ds=0/0) should be avoided. In addition, the use of an initial input tree with ML-estimated branch lengths (copied from models 0, 1 or 7 into models 2 or 8) may prevent parameter trapping and hence a very slow convergence of model 2 and 8 values for ML to those of models 1 and 7. Likelihood-ratio tests (LRT) and Bayes Empirical Bayes (BEB) statistics (Anisimova et al., 2001, 2002; Yang et al., 2005) were applied as described in the PAML manual (http://abacus.gene.ucl.ac.uk/software/paml.html).
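The likelihood-ratio test for a pair of nested site models can be sketched as follows. This is a minimal illustration, not PAML code: the function names and the log-likelihood values are hypothetical, and two degrees of freedom are assumed, which is the value commonly used when comparing model 1 with model 2 or model 7 with model 8. For two degrees of freedom the chi-squared tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Twice the log-likelihood difference between nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_df2(x: float) -> float:
    """Upper-tail probability of a chi-squared variable with 2 d.f."""
    return math.exp(-x / 2.0) if x > 0 else 1.0


# Hypothetical log-likelihoods for model 7 (null) and model 8 (alternative).
lnl_m7, lnl_m8 = -5230.4, -5221.9
stat = lrt_statistic(lnl_m7, lnl_m8)
print(f"2*delta(lnL) = {stat:.2f}, P = {chi2_sf_df2(stat):.2e}")
```

A small P value favours the model that allows a class of sites with Dn/Ds >1; the individual sites are then identified from the BEB posterior probabilities reported by CODEML.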
Analysis of the rate of synonymous substitutions (the number of synonymous substitutions per synonymous site, Ks) and non-synonymous substitutions (the number of non-synonymous substitutions per non-synonymous site, Ka), as well as their ratios (Ka/Ks), in a sliding window was performed by using the SWAAP 1.0.2 software (http://www.bacteriamuseum.org/SWAAP/SwaapPage.htm). The Nei–Gojobori distance estimation method was used for a sliding window (window length, 15 nt; window step, 3 nt, which is the maximal resolution of the program). Due to the limited number of sequences that can be analysed simultaneously by the program, 99 sequences, selected randomly from the total of 450, were analysed each time. So, for the total of 450 sequences, four sets of 99 sequences and one set of 54 sequences were analysed and the data were pooled, together giving 1110 window comparisons.
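The window and batch arithmetic can be checked with a short Python sketch. The helper names are hypothetical, and the sketch assumes that the sliding-window analysis covered the same 226-codon (678 nt) overlapping region that was used for the PAML analysis.

```python
def window_starts(length_nt: int, window_nt: int = 15, step_nt: int = 3):
    """0-based start positions of all full windows along the region."""
    return list(range(0, length_nt - window_nt + 1, step_nt))


def batch_sizes(total: int, max_per_batch: int = 99):
    """Split `total` sequences into batches of at most `max_per_batch`."""
    full, rest = divmod(total, max_per_batch)
    return [max_per_batch] * full + ([rest] if rest else [])


region_nt = 226 * 3                      # assumed length of the overlap
windows = window_starts(region_nt)       # 15 nt windows, 3 nt step
batches = batch_sizes(450)               # program limit of 99 sequences

print(len(windows), batches, len(windows) * len(batches))
```

With these assumptions there are 222 windows per set and five sets, which reproduces the 1110 pooled window comparisons.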
| RESULTS |
|---|
To establish whether the difference in variation of the p1/s3 and p3/s2 versus p2/s1 positions is related to HBV diversification within or among genotypes and to confirm that our findings are not limited to certain HBV genotypes, we subsequently analysed nucleotide variation at the p1/s3, p2/s1 and p3/s2 positions among sequences belonging to each genotype and among unassigned sequences, as well as among genotype consensus sequences of genotypes A–H (Table 1). For every genotype from A to H, as well as for unassigned sequences, the cumulative entropy values for the p2/s1 positions were markedly lower than those for the p1/s3 and p3/s2 positions. For instance, for genotype B, which was represented by 120 sequences (118 plus two reference sequences), the cumulative entropy value for the p2/s1 positions was 2.3, compared with the values of 9.8 and 9.5 for the p1/s3 and p3/s2 positions, respectively. The same trend was observed for genotype E, which was represented by five sequences (four plus one reference sequence): 1.2, compared with 3.0 and 2.2. This pattern of markedly lower variation at the p2/s1 than at the p1/s3 and p3/s2 positions was also observed when genotype consensus sequences were compared with each other (Table 1).
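One way to obtain cumulative entropy values per position class is sketched below in Python. This is an illustration under stated assumptions rather than the procedure used here: Shannon entropy with the natural logarithm is assumed, the alignment is assumed to start at the first position of a P codon and to contain no gaps, and the function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter

LABELS = ("p1/s3", "p2/s1", "p3/s2")


def column_entropy(column) -> float:
    """Shannon entropy (natural log) of one alignment column."""
    n = len(column)
    return -sum(c / n * math.log(c / n) for c in Counter(column).values())


def cumulative_entropy_by_class(sequences):
    """Sum column entropies separately for the three position classes.

    Assumption: column 0 of the alignment is the first position of a P codon.
    """
    totals = dict.fromkeys(LABELS, 0.0)
    for i, column in enumerate(zip(*sequences)):
        totals[LABELS[i % 3]] += column_entropy(column)
    return totals


# Toy alignment of two P codons (made-up sequences, not HBV data).
toy = ["ATGCTA", "CTGCTA", "ATGCTG", "CTGCTG"]
for label, value in cumulative_entropy_by_class(toy).items():
    print(f"{label}: {value:.3f}")
```

In the toy alignment, variation is confined to one p1/s3 column and one p3/s2 column, so the p2/s1 total is zero, mirroring the direction of the pattern in Table 1.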
β-distribution the Dn/Ds ratio as a function of the proportion of sites with a certain ratio. A skewed distribution indicates the proportion of sites that are either highly conserved or nearly neutral. The values for p and q for P and S pointed to a similarly shaped β-distribution.
| DISCUSSION |
|---|
The evolution of a genetic region containing overlapping, frame-shifted genes is subjected to extra constraints. A slower evolution rate of overlapping compared with non-overlapping genome regions has been demonstrated for a number of viruses, including HBV (Mizokami et al., 1997). The extra constraints in the evolution of overlapping compared with non-overlapping genetic regions are related to the fact that a neutral or beneficial substitution in one reading frame could be deleterious in the other reading frame. Therefore, synonymous substitutions and adaptive amino acid changes in one reading frame, which would be evolutionarily neutral or beneficial in non-overlapping genetic regions, will nevertheless be selected against, as they could be deleterious in the other reading frame. As a result, independent evolution of overlapping genes is restricted. The general trend demonstrated for the evolution of the overlapping regions is that one of the two overlapping genes is subjected to positive selection (adaptive evolution), whilst the other is subjected to purifying selection. In particular, this has been shown for papillomaviruses, in which the adaptive evolution of the E2 region and purifying selection in the overlapping E4 region have been demonstrated (Hughes & Hughes, 2005; Narechania et al., 2005). The same phenomenon was demonstrated for the evolution of potato leafroll virus (Guyader & Ducray, 2002). For members of the family Microviridae, in which the A and B as well as D and E genes are overlapping, in both overlapping regions, one gene was subjected to positive selection (Ka/Ks ratio >1, B and E genes), whilst in the other gene, purifying selection was operational (Ka/Ks ratio <1, A and D genes) (Pavesi, 2006). Similarly, in SIV, the tat gene was found to be the most variable among the nine virus genes at the amino acid level, whereas the overlapping vpr gene appeared to be one of the most conserved (Hughes et al., 2001). Moreover, for SIV, the adaptive evolution of tat mirrored by purifying selection in vpr has been demonstrated in vivo in experimentally infected monkeys (Hughes et al., 2001).
Among viruses with overlapping genes, HBV provides a striking example, with 50 % of its genome containing overlapping reading frames (Fig. 1). For many other viruses, the overlapping of reading frames is partial and adaptive evolution of both genes can occur in non-overlapping regions. In contrast, the HBV surface protein gene S is overlapped completely by the polymerase gene P. Whilst the independent adaptive evolution of both P and S genes was shown to be constrained (Mizokami et al., 1997), it should not be precluded, as both genes must adapt to the versatile environment. We hypothesized that HBV may employ a mechanism that allows the independent adaptive evolution of both overlapping genes, by which the evolution of P occurs via p1/s3 non-synonymous substitutions, which are synonymous in S, and the evolution of S mainly occurs via p3/s2 non-synonymous substitutions, which are synonymous in P (Fig. 2). To test this hypothesis, we analysed variation of nucleotides and amino acids among 450 internationally obtained HBV genomes in the overlapping P and S gene region.
We demonstrated that the evolution of the overlapping region of the P and S genes indeed occurs mainly via p1/s3 and p3/s2 substitutions, respectively, and that substitutions at the p2/s1 positions, which would affect the amino acids of both proteins, are rare (Fig. 3a). This mechanism was operational in HBV evolution both within and among genotypes (Table 1). The Dn/Ds ratio of <1 for the whole gene does not mean that adaptive evolution is not operational, as adaptive mutations could have accumulated in short regions of the gene, or even in a few nucleotide positions. By using PAML analysis, we identified positions of both the P and S genes where adaptive evolution is operational (Fig. 3c). Sliding-window analysis demonstrated that, whilst significant parts of the P or S genes were subjected to positive selection, with the Ka/Ks ratio for either P or S gene being >1, there were only a few regions where the Ka/Ks ratios in both genes were >1 (Fig. 4).
Whilst HBV is a rather unique example of overlapping reading frames, this mechanism of independent evolution of the overlapping regions could also apply to other viruses. This point is supported by the observation of an increased frequency of amino acid residues with a high level of degeneracy (arginine, leucine and serine) in the proteins encoded by overlapping genes of several viruses (Pavesi et al., 1997).
| ACKNOWLEDGEMENTS |
|---|
| REFERENCES |
|---|
Anisimova, M., Bielawski, J. P. & Yang, Z. (2002). Accuracy and power of Bayes prediction of amino acid sites under positive selection. Mol Biol Evol 19, 950–958.
Ardell, D. H. & Sella, G. (2001). On the evolution of redundancy in genetic codes. J Mol Evol 53, 269–281.
Barrell, B. G., Air, G. M. & Hutchison, C. A., III (1976). Overlapping genes in bacteriophage φX174. Nature 264, 34–41.
Bartholomeusz, A. & Schaefer, S. (2004). Hepatitis B virus genotypes: comparison of genotyping methods. Rev Med Virol 14, 3–16.
Chenna, R., Sugawara, H., Koike, T., Lopez, R., Gibson, T. J., Higgins, D. G. & Thompson, J. D. (2003). Multiple sequence alignment with the CLUSTAL series of programs. Nucleic Acids Res 31, 3497–3500.
Cooreman, M. P., Leroux-Roels, G. & Paulij, W. P. (2001). Vaccine- and hepatitis B immune globulin-induced escape mutations of hepatitis B virus surface antigen. J Biomed Sci 8, 237–247.
Gibbs, A. & Keese, P. K. (1995). In search of the origin of viral genes. In Molecular Basis of Virus Evolution, pp. 76–90. Edited by A. Gibbs, C. H. Calisher & F. Garcia-Arenal. Cambridge, UK: Cambridge University Press.
Guyader, S. & Ducray, D. G. (2002). Sequence analysis of potato leafroll virus isolates reveals genetic stability, major evolutionary events and differential selection pressure between overlapping reading frame products. J Gen Virol 83, 1799–1807.
Hall, T. A. (1999). BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41, 95–98.
Hannoun, C., Horal, P. & Lindh, M. (2000). Long-term mutation rates in the hepatitis B virus genome. J Gen Virol 81, 75–83.
Hughes, A. L. & Hughes, M. A. (2005). Patterns of nucleotide difference in overlapping and non-overlapping reading frames of papillomavirus genomes. Virus Res 113, 81–88.
Hughes, A. L., Westover, K., da Silva, J., O'Connor, D. H. & Watkins, D. I. (2001). Simultaneous positive and purifying selection on overlapping reading frames of the tat and vpr genes of simian immunodeficiency virus. J Virol 75, 7966–7972.
Johnson, Z. I. & Chisholm, S. W. (2004). Properties of overlapping genes are conserved across microbial genomes. Genome Res 14, 2268–2272.
Kumar, S., Tamura, K. & Nei, M. (2004). MEGA3: integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment. Brief Bioinform 5, 150–163.
Mizokami, M., Orito, E., Ohba, K., Ikeo, K., Lau, J. Y. & Gojobori, T. (1997). Constrained evolution with respect to gene overlap of hepatitis B virus. J Mol Evol 44 (Suppl. 1), S83–S90.
Moskovitz, D. N., Osiowy, C., Giles, E., Tomlinson, G. & Heathcote, E. J. (2005). Response to long-term lamivudine treatment (up to 5 years) in patients with severe chronic hepatitis B, role of genotype and drug resistance. J Viral Hepat 12, 398–404.
Myers, R., Clark, C., Khan, A., Kellam, P. & Tedder, R. (2006). Genotyping hepatitis B virus from whole- and sub-genomic fragments using position-specific scoring matrices in HBV STAR. J Gen Virol 87, 1459–1464.
Narechania, A., Terai, M. & Burk, R. D. (2005). Overlapping reading frames in closely related human papillomaviruses result in modular rates of selection within E2. J Gen Virol 86, 1307–1313.
Pavesi, A. (2006). Origin and evolution of overlapping genes in the family Microviridae. J Gen Virol 87, 1013–1017.
Pavesi, A., De Iaco, B., Granero, M. I. & Porati, A. (1997). On the informational content of overlapping genes in prokaryotic and eukaryotic viruses. J Mol Evol 44, 625–631.
Yang, Z. (1997). PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci 13, 555–556.
Yang, Z., Nielsen, R., Goldman, N. & Pedersen, A. M. (2000). Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155, 431–449.
Yang, Z., Wong, W. S. & Nielsen, R. (2005). Bayes empirical Bayes inference of amino acid sites under positive selection. Mol Biol Evol 22, 1107–1118.
Received 6 February 2007; accepted 19 April 2007.