The University of Nottingham, Institute of Infection, Immunity and Inflammation, School of Molecular Medical Sciences, Division of Microbiology and Infectious Diseases, Queens Medical Centre, A Floor, West Block, Nottingham NG7 2UH, UK
Correspondence
Richard J. P. Brown
richard.brown{at}nottingham.ac.uk
| ABSTRACT |
|---|
The GenBank/EMBL/DDBJ accession numbers for the sequence data reported here are EF043079–EF043105.
| INTRODUCTION |
|---|
HCV exhibits extensive genetic heterogeneity, such that isolates can be assigned to one of six phylogenetically discrete genotypes (Robertson et al., 1998). These six distinct genotypes can be divided further into numerous subtypes. HCV genotypes/subtypes are frequently clustered according to geographical location, with the exponential spread of particular subtypes being associated epidemiologically with specific regions and modes of transmission (Pybus et al., 2003, 2005; Simmonds, 2001a, b, 2004; Simmonds et al., 2005). Even within the same patient, extensive virus genetic heterogeneity is observed, such that the population exists as a swarm of genetically related yet distinct variants called a quasispecies (Bukh et al., 1995; Martell et al., 1992).
Over the complete genome (approximately 9400 bp), HCV displays >30 % nucleotide variability between genotypes, with the envelope glycoprotein genes E1 and E2 exhibiting the highest degree of genetic heterogeneity (Simmonds, 2004; Simmonds et al., 2005). Neutral genetic drift is reportedly responsible for much of the variability observed between the geographically isolated genotypes of HCV (Simmonds, 2004). However, the continuing persistence and transmission of HCV in human populations are, in part, facilitated by molecular adaptation arising from cell- and antibody-mediated immunity, although many other contributing factors are involved.
The E2 glycoprotein is the main target for neutralizing-antibody responses and there is increasing evidence that mutations occurring within the first hypervariable region (HVR1), located at the N terminus of E2, lead to immune escape and subsequent development of chronic infection (Farci et al., 2000; Frasca et al., 1999; Majid et al., 1999; Ray et al., 1999; Wang & Eckels, 1999). The extensive genetic variability observed within E1E2 poses a significant challenge for the development of antibody-based vaccines. Successful vaccine design demands a comprehensive knowledge of HCV genotype/subtype E1E2 molecular evolution, enabling the specific testing of immunogens targeting neutralizing epitopes that exhibit conservation across genotypes. Previously, such analyses have not been possible, due to the limited number of full-length (FL) E1E2 sequences representative of some genotypes, particularly genotypes 3–5. Consequently, studies of adaptive molecular evolution in HCV have, thus far, been restricted to a small number of analyses on HCV quasispecies, and often to only HVR1 and flanking sequences. In addition, much of our current understanding of E1E2 adaptive evolution is based on estimates of synonymous (dS) and non-synonymous (dN) nucleotide-substitution rates averaged across the genomic region under study. This averaging approach is often unable to detect positive selection if it is restricted to a small number of sites in a region where the majority of sites are under strong purifying selection. However, recently developed models enable estimation of dN : dS ratios (ω) for individual amino acid positions within an alignment by assessing competing models of codon substitution within a maximum-likelihood (ML) framework (Yang & Bielawski, 2000). A value of ω>1 is the molecular signature of diversifying selection (positive selection or adaptive evolution), ω=1 is suggestive of neutral evolution (genetic drift) and ω<1 is indicative of purifying (negative) selection. Recently, we have used this approach to show that adaptive evolution within E1E2 during chronic HCV infection is patient-specific, with selected sites generally restricted to regions thought to be involved in immune evasion and receptor binding (Brown et al., 2005). Site-specific methods have also been applied to analyse selection in partial E1E2 sequences obtained from individuals in the acute phase of HCV infection, highlighting a correlation between selective pressure and disease outcome (Sheridan et al., 2004).
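The interpretation of the dN : dS ratio given above can be written down as a small decision rule. The following is only an illustrative sketch: the function name, the tolerance used to treat a value as equal to 1 and the per-site values are assumptions made for the example and are not taken from the study.

```python
def classify_omega(omega: float, tol: float = 1e-9) -> str:
    """Map a dN:dS ratio (omega) to the selection regime it suggests."""
    if omega < 0:
        raise ValueError("omega is a ratio of rates and cannot be negative")
    if abs(omega - 1.0) <= tol:
        return "neutral"          # omega = 1: genetic drift
    if omega > 1.0:
        return "diversifying"     # omega > 1: positive selection
    return "purifying"            # omega < 1: negative selection


# Hypothetical per-codon estimates, keyed by an arbitrary codon index.
site_omegas = {1: 3.2, 2: 0.05, 3: 1.0}
regimes = {site: classify_omega(w) for site, w in site_omegas.items()}
```

In practice a point estimate of ω is never read in isolation; as described in the Methods, support comes from a likelihood-ratio test and from posterior probabilities for individual sites.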
Assessment of inter- and intra-genotypic evolution will have an important role in informing broadly neutralizing vaccine development. Here, we report the generation of a panel of 45 FL E1E2 gene sequences corresponding to under-represented genotypes in international sequence databases. This sequence dataset was then combined with FL E1E2 sequences available through the Los Alamos National Laboratory (LANL) HCV database (incorporating NCBI, EMBL and DDBJ; http://hcv.lanl.gov/content/immuno/immuno-main.html) to enable a detailed analysis of extant inter- and intra-genotype diversity. Quantification and mapping of evolutionary processes contributing to HCV diversity revealed both common and differential patterns of molecular adaptation apparent within and between the different HCV genotypes. These data also show that, despite the observed heterogeneity, the majority of codons within HCV E1E2 are under strong purifying selection, a finding that has important implications concerning antibody-based vaccine design.
| METHODS |
|---|
Sequence generation and analysis.
In total, 45 FL E1E2 PCR products (1722–1752 bp) were generated from a cross-genotype panel of HCV-infected serum samples by using primers and methods described previously (Brown et al., 2005; Lavillette et al., 2005). FL E1E2 amplification products were ligated into either pENTR or pcDNA3.1 V5/His directional TOPO vectors (Invitrogen) and subsequently sequenced via BigDye terminator chemistry using an ABI 3100 capillary sequencer (Applied Biosystems). UKN FL E1E2 sequences generated for this study have been submitted to GenBank under the accession numbers EF043079–EF043105. LANL HCV database sequences utilized in this analysis are deposited under the GenBank accession numbers AY070174, M58335, AF176573, U16362, D10934, AY587016, AJ000009, AF165053, AF483269, U01214, M84754, AY051292, AY651061, D14853, AB047639, AF169005, AF177036, AF238481, AF238482, AF238483, AF238484, AF238485, AY746460, D00944, AF238486, AY232730, AY232736, AY232738, AY232740, AY232746, AY232748, D10988, D10749, M62321, AY695437, D50409, AB031663, AF046866, D17763, D28917, X76918, D63821, D49374, Y11604, AF064490, Y13184, AY859526, D63822, D84262, D84263, D84264, D84265 and Y12083.
Nucleotide sequences were aligned by using the CLUSTAL_W option of MEGA v. 3.1 (Kumar et al., 2004) and were adjusted manually, based on alignments of the deduced amino acid sequences, to ensure maintenance of the polyprotein open reading frame. Alignments were generated separately for each genotype, in conjunction with a combined alignment containing all sequences. Regions of dubious/ambiguous alignment were removed to preclude comparison of non-homologous sites and false-positive identification of adaptive mutations (ω>1) in subsequent analyses. Genotype-specific indels were removed from the combined alignment prior to phylogenetic reconstruction. Consensus amino acid sequences were generated for genotype-specific datasets by using the consensus sequence tool available in the LANL HCV sequence database.
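Maintenance of the open reading frame in a codon alignment can be checked mechanically. The sketch below is not the procedure used in this study (where adjustment was manual); it simply flags the three conditions that would break a coding alignment: a length that is not a multiple of three, a codon containing a partial gap, and an in-frame stop codon. The function name and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def frame_problems(aligned_seq: str) -> list[str]:
    """List the reasons why an aligned coding sequence breaks the reading frame."""
    problems = []
    remainder = len(aligned_seq) % 3
    if remainder != 0:
        problems.append("length is not a multiple of 3")
    usable = len(aligned_seq) - remainder
    codons = [aligned_seq[i:i + 3].upper() for i in range(0, usable, 3)]
    for index, codon in enumerate(codons, start=1):
        if "-" in codon and codon != "---":
            # Gaps must remove whole codons to keep downstream codons in frame.
            problems.append(f"codon {index} contains a partial gap")
        elif codon in STOP_CODONS:
            problems.append(f"codon {index} is a stop codon")
    return problems


in_frame = frame_problems("ATGGCC---TGG")   # whole-codon gap only
broken = frame_problems("ATGG-CTAA")        # partial gap, then a stop codon
```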
Phylogenetic reconstruction.
Molecular phylogenies were generated by using the ML criterion implemented in PAUP* version 4.0b10 (Swofford, 2003). For each dataset, the best-fit model of nucleotide substitution for the observed data was selected from 56 competing models by hierarchical likelihood-ratio tests (LRTs) using MODELTEST version 3.7 (Posada & Crandall, 1998). The proportion of invariant sites (I) (Sullivan et al., 1999), the α shape parameter of the gamma distribution (Γ4, four discrete rate categories) (Yang, 1994b), nucleotide frequencies and base-exchangeability parameters were all estimated from an initial neighbour-joining (NJ) tree (Saitou & Nei, 1987) and subsequently fixed during heuristic optimization using the TBR (tree bisection–reconnection) branch-swapping algorithm. The best-fit substitution models used to generate the genotype-specific and combined phylogenies are detailed in Table 2. The TrN+I+Γ4 model appears to best describe the complex patterns of intra-genotype nucleotide substitution that gave rise to datasets Gt1–Gt6. The TrN model incorporates variable base frequencies, equal transversion frequencies and variable transitional frequencies between purines and pyrimidines (Tamura & Nei, 1993). Invariant sites and sites with variable evolutionary rates are accounted for by the parameters I and Γ4, respectively; however, for dataset Gt5, the proportion of invariant sites is described adequately by the tail of the Γ distribution. The more complex inter- and intra-genotype evolutionary processes that gave rise to the combined topology (dataset A) are best described by the higher-resolution substitution model GTR+I+Γ4 (Fig. 1). The GTR model (Yang, 1994a) incorporates variable base frequencies in conjunction with a symmetrical substitution matrix incorporating six parameters (representing the relative rates of the substitutions A↔C, A↔G, A↔T, C↔G, C↔T and G↔T). Again, static, unchanging sites and rate heterogeneity are accounted for by the parameters I and Γ4. Statistical confidence limits to infer the robustness of internal nodes were estimated by using the bootstrap approach (Felsenstein, 1985). Bootstrap values assigned to ML tree nodes were estimated from bootstrap consensus trees and are given as percentages derived from 1000 replicate NJ trees estimated under the best-fit substitution model. Informative site tests (Robertson et al., 1995) were performed by using SimPlot version 3.5.1 (Lole et al., 1999) to identify any putative HCV E1E2 recombinant/mosaic sequences, the inclusion of which may result in false-positive detection of positive selection (Anisimova et al., 2003). No significant evidence for E1E2 recombination/mosaicism was observed.
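The relationship between the GTR and TrN models described above can be made concrete by building the instantaneous rate matrix from its parts. The sketch below assumes the standard construction in which each off-diagonal rate is the product of a symmetric exchangeability and the frequency of the target base; the numerical values are hypothetical and are not the parameter estimates of this study (those are in Table 2). Rate heterogeneity (I and Γ4) is not modelled here.

```python
BASES = "ACGT"


def gtr_rate_matrix(exchange, freqs):
    """Build an unnormalised GTR rate matrix Q as a nested dict.

    exchange: six relative rates keyed by sorted base pair ("AC", "AG", ...).
    freqs: equilibrium base frequencies, which should sum to 1.
    """
    q = {x: {y: 0.0 for y in BASES} for x in BASES}
    for x in BASES:
        for y in BASES:
            if x != y:
                pair = "".join(sorted(x + y))
                q[x][y] = exchange[pair] * freqs[y]
        # Each row of a rate matrix sums to zero.
        q[x][x] = -sum(q[x][y] for y in BASES if y != x)
    return q


def trn_exchange(purine_ts, pyrimidine_ts, tv=1.0):
    """TrN as a special case of GTR: one transversion rate, two transition rates."""
    return {"AG": purine_ts, "CT": pyrimidine_ts,
            "AC": tv, "AT": tv, "CG": tv, "GT": tv}


# Hypothetical frequencies and transition rates, for illustration only.
freqs = {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2}
q = gtr_rate_matrix(trn_exchange(4.0, 6.0), freqs)
```

Because the exchangeabilities are symmetric, the resulting process is time-reversible: the flux from base x to base y at equilibrium equals the flux from y to x.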
[…] ω ratios across codon sites (Yang & Bielawski, 2000). […] (ω>1) (Anisimova et al., 2001). […] β(p,q), approximated by 10 discrete categories. Variable ω rates are allowed depending on the values of p and q, but are always bounded between 0 and 1, thus M7 does not permit positive selection. The M8 (alternative) model is identical to M7, with 10 discrete categories bounded between 0 and 1 (the p0 class of sites). However, there is an additional class of sites (p1) that have a free parameter, ω1, which is not constrained between 0 and 1. Thus, M8 permits a class of sites with ω>1 if positive selection is occurring at specific codon sites. When M8 suggests the occurrence of sites under diversifying selection, an empirical Bayes method is used to calculate the posterior probabilities of the assignment of ω ratios to sites. M7 can then be compared with M8 via an LRT. The significance of the LRT can then be assessed via consultation of standard χ² tables. When the M7/M8 LRT is significant (df=2) and sites are identified with ω>1 by M8 with significant (>95 %) Bayesian posterior probabilities, this is indicative of the action of site-specific molecular adaptation (Yang & Bielawski, 2000). […] χ² test. These analyses are available from the authors on request.
| RESULTS |
|---|
[…] 99 %) bootstrap values. However, the basal splits within the tree, which represent the points of diversification from a founder virus, were not well supported. Patterns of clade diversity observed in subtypes 1a, 1b, 2a, 2b, 3a, 4a and 5a are indicative of recent seedings of the virus into specific transmission networks (Simmonds et al., 2005).
Synonymous site and amino acid diversity
Mean pairwise (p) distances, calculated for translated HCV protein sequences spanning the entire E1E2 coding region, highlighted regions of high and low variability distributed throughout the primary amino acid sequence (data not shown). Although high levels of amino acid divergence were apparent, p distances calculated at synonymous sites revealed that a considerable proportion of genetic variation observed in HCV genotypes 1–6 was due to synonymous substitution (data not shown). The synonymous site/amino acid diversity plots generated are not presented here, but are available from the authors on request. The high levels of synonymous site diversity observed throughout E1 and E2 indicate that these genes are under strong purifying selection. Indeed, consensus sequences generated for each genotype-specific sequence dataset exhibited complete conservation at 51 % of amino acid positions (Fig. 2). Genotype-specific consensus sequences demonstrated that E1 glycosylation motifs were highly conserved across all genotypes. However, additional potential glycosylation motifs unique to all genotype 6 and 1b isolates at positions 250–252, and to all 2b isolates at positions 299–301, were observed. In the E2 protein ectodomain, the glycosylation sites encompassing residues 476–478 were absent in all genotype 3 and 6 and subtype 1b sequences, respectively. Amino acid variability at cysteine residues was confined to the E2 transmembrane cysteines; residue 732 was either cysteine or serine in genotype 4, whilst cysteine was replaced by alanine at position 734 in all genotype 3 isolates (Fig. 2).
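The proportion of completely conserved positions reported above is a column-wise statistic over aligned consensus sequences. As a minimal sketch, assuming the consensus sequences are already aligned to equal length, it could be computed as follows; the function name and the toy sequences are hypothetical and do not reproduce the E1E2 consensus sequences of Fig. 2.

```python
def fraction_conserved(consensus_seqs):
    """Fraction of alignment columns where every sequence carries the same residue."""
    lengths = {len(seq) for seq in consensus_seqs}
    if len(lengths) != 1:
        raise ValueError("consensus sequences must be aligned to the same length")
    columns = list(zip(*consensus_seqs))
    # A column of identical gap characters is not counted as a conserved residue.
    conserved = sum(1 for col in columns if len(set(col)) == 1 and col[0] != "-")
    return conserved / len(columns)


# Toy aligned consensus sequences (hypothetical): columns 2 and 5 vary.
toy_consensus = ["CNCSIY", "CNCSLY", "CDCSIY"]
toy_fraction = fraction_conserved(toy_consensus)
```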
[…] ω values for all datasets were close to 0, confirming that purifying selection, due to extreme functional/structural constraint, is the main evolutionary force acting on HCV E1E2. However, positive selection at numerous positions was apparent in all datasets, verifying that methods that yield average ω values lack the power to detect positive selection in these datasets (Table 3). […] ω ratios in the E1 protein. All positively selected sites identified mapped to the E2 segment of the E1E2 sequences. Individual residues identified as being under positive selection by site-specific analyses exhibited multiple substitutions on terminal branches, indicating convergent selective changes that have arisen independently in divergent isolates. Comparative analysis of genotype-specific datasets Gt1–Gt6 showed that each HCV genotype has been subject to subtly different selective pressures.
[…] (χ², df=2; P<0.0001) excess of codons exhibiting positive selection at homologous positions in more than one genotype. Positive selection was concentrated in HVR1 in our data and the frequency of hits per site within the 27 codons (aa 384–410; f=1.519) was considerably greater than that observed downstream (aa 411–745; f=0.025), suggesting that this region was responsible for the significant excess of sites with multiple hits. Analysing HVR1 alone, a significant (χ², df=4; P=0.0001) deviation from the expected distribution was seen; an excess of sites with no hits and an excess of sites with more than three hits was evident. Indeed, amino acid positions 384, 391, 395, 397, 404 and 405 were identified as being positively selected sites in at least four genotypes. This result demonstrates that, within HVR1, positive selection is restricted to a specific subset of sites. The 41 positively selected sites identified within this 27 residue domain (datasets Gt1–Gt6) were confined to only 13 positions (Fig. 2).
Crucially, however, our data suggest differential selection pressures unique to each HCV genotype downstream of (and to a lesser extent within) HVR1. Between positions 411 and 745 of E2, only a single position (aa 444) exhibited evidence for positive selection in two divergent genotypes (1 and 4), which is not significantly greater than the expected distribution (χ², df=2; P=0.09). The 16 positively selected sites identified within the remainder of E2 are distributed across 15 positions. Thus, a heterogeneous genotype-specific distribution of positive selection was observed in this region. The vast majority of adaptive mutations were located in regions of the E2 protein having known or suspected functional importance, e.g. CD81-binding domains or regions targeted by cytotoxic T lymphocytes (CTLs), helper T cells or antibodies capable of inhibiting receptor binding or virus neutralization (Fig. 2).
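The hits-per-site frequency quoted for HVR1 follows directly from the counts given in the text: 41 positively selected sites, summed over datasets Gt1–Gt6, spread over the 27 codons spanning aa 384–410. The short check below reproduces only that figure; the function name is an assumption made for the example, and the way in which the downstream frequency was derived is not stated in the text and is therefore not reproduced.

```python
def hits_per_site(n_selected_sites: int, first_aa: int, last_aa: int) -> float:
    """Mean number of positively selected sites per codon in an inclusive region."""
    n_codons = last_aa - first_aa + 1
    return n_selected_sites / n_codons


# Counts taken from the text: 41 selected sites across HVR1 (aa 384-410).
hvr1_codons = 410 - 384 + 1
hvr1_f = hits_per_site(41, 384, 410)
```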
| DISCUSSION |
|---|
Detection of positive selection by using site-specific ML methods is dependent, to a degree, on the information content of each dataset, which is a function of the number and length of the sequences contained in each dataset, plus the levels of diversity apparent between them. Datasets containing highly similar or highly divergent sequences are not informative. The LRTs for the genotype-specific null/alternative comparisons are all significant, suggesting that specific sites with ω>1 exist in each dataset. To achieve an unbiased analysis, each dataset should ideally contain the same number of sequences, sampling a similar proportion of the global diversity apparent in each genotype. However, in practice, this was not possible due to limited availability of sequences/patient sera. For example, datasets Gt1 and Gt2 each contained 27 sequences, whilst dataset Gt6 contained only eight sequences. The comparative paucity of sequence data available for genotype 6 may affect the ability of CODEML to detect selected sites and, indeed, dataset Gt6 contained the fewest sites with ω>1, raising the possibility that these are linked. However, Gt6 contained the highest intra-genotype genetic diversity (Table 2), suggesting that the relatively low proportion of selected sites identified in dataset Gt6 was not due to low information content. Conversely, dataset Gt5 contained 10 sequences and the lowest intra-genotype genetic diversity (Table 2), yet a relatively high proportion of selected sites (10) (Table 3). It is therefore possible that the relatively low proportion of selected sites identified in dataset Gt6 is due to genotype 6 being an endemic strain prevalent in South-East Asia, rather than a result of low information content. In contrast, all other genotypes display patterns of cladogenesis indicative of recent seedings of the virus into specific transmission networks/populations (Fig. 1). These relatively recent virus introductions into previously unexposed, geographically discrete human populations could account for the higher levels of molecular adaptation observed in genotypes 1–5.
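The null/alternative LRTs referred to above compare twice the difference in log-likelihood between the two nested models against a χ² distribution with df=2, as stated in the Methods. For two degrees of freedom the χ² tail probability has the closed form exp(−x/2), so the test can be sketched without a statistics library. The function name and the log-likelihood values below are hypothetical; they are not results from this study.

```python
import math


def lrt_df2(lnl_null: float, lnl_alt: float):
    """Likelihood-ratio test for nested models differing by two free parameters.

    Returns the test statistic 2*(lnL_alt - lnL_null) and its chi-squared
    (df=2) p-value, which equals exp(-statistic / 2).
    """
    statistic = 2.0 * (lnl_alt - lnl_null)
    # A nested alternative cannot truly fit worse; a negative value signals an
    # optimisation problem, so it is treated here as no improvement.
    statistic = max(statistic, 0.0)
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical log-likelihoods for a null (M7) and alternative (M8) fit.
stat, p = lrt_df2(lnl_null=-10250.0, lnl_alt=-10240.0)
significant = p < 0.05
```

A significant test on its own only indicates that a class of sites with ω>1 improves the fit; identifying which sites belong to that class relies on the posterior probabilities described in the Methods.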
Site-specific selection within HVR1
HVR1 is known to be a target for strain-specific neutralizing antibodies (Farci et al., 1994, 1996; Shimizu et al., 1994, 1996), but our observation of molecular adaptation at the same restricted subset of residues, irrespective of genotype, suggests a common mechanism for escape. Specific residues within HVR1 have previously been reported to be under strong selective pressure at the quasispecies level during both acute (Sheridan et al., 2004) and chronic (Brown et al., 2005) infection (Table 3). The exact role of HVR1 in cell entry is unclear, but current evidence shows that HVR1 is pivotal in E2–SR-BI binding (Scarselli et al., 2002), high-density lipoprotein-enhanced pseudoparticle entry and resistance to neutralizing antibodies (Bartosch et al., 2005). HVR1 has an overall basic charge, and recent analysis of 1489 natural HVR1 variants showed that basic residues at specific positions influenced retrovirus pseudoparticle entry, with increasing numbers of basic residues associated with enhanced infectivity (Callens et al., 2005; Penin et al., 2001). Strikingly, the majority of positively selected sites within HVR1 are located at those sites where basic residues have been shown to facilitate/enhance cellular entry (Table 3). This novel finding suggests that residues implicated in cellular entry within HVR1 may be subject to opposing evolutionary mechanisms, i.e. maintenance or acquisition of basic residues to facilitate efficient entry, balanced against substitution leading to immune escape and concomitant decreased infectivity. A similar evolutionary dynamic is observed in human immunodeficiency virus (HIV) gp120, where mutations in receptor-binding domains lead to immune escape, but at the cost of efficient cell entry (Beaumont et al., 2004; Pinter et al., 2004; Pugach et al., 2004). The observation that positively selected sites are correlated with homologous positions in HVR1 between HCV genotypes indicates broad-spectrum host immune targeting of the same region, irrespective of infecting genotype. Importantly, however, discrete patterns of adaptive mutation distinctive to each genotype are still observed in this region (Fig. 2).
Site-specific selection outside HVR1
In contrast to previous analyses conducted on acute patients' quasispecies (Sheridan et al., 2004), no evidence for positive selection within the E1 protein was uncovered via our site-specific analyses. This observation correlates with the reported low immunogenicity of E1, with antibody responses mainly targeting the E2 protein (Fournillier et al., 2001). A genotype-specific distribution of positively selected sites was apparent downstream of HVR1 in E2, with no tendency for positive selection at homologous residues. These data suggest that, outside HVR1, subtly different selective pressures help to shape present-day genotype diversity. The fact that selected sites were located in regions implicated in receptor binding, known human leukocyte antigen (HLA)-restricted T-cell epitopes and regions targeted by antibodies capable of neutralization supports a role for both cellular- and humoral-driven selection in HCV evolution. Why the location of adaptive mutations differs between genotypes is less clear. For example, adaptation within the C terminus of the 412–447 region, which is known to be a target for antibodies that inhibit CD81 binding (Fig. 2), is restricted to genotypes 1, 2, 4 and 5. Host-specific factors, such as HLA haplotype, are unlikely to explain this discrepancy, as the host range of genotype 1 and 3 infections overlaps. Instead, this region either is poorly exposed in genotypes 3 and 6 or is less important in facilitating CD81 binding. Both of these scenarios could potentially lead to reduced immune targeting.
E1 and E2 are highly glycosylated proteins, with four to six potential N-linked glycosylation motifs in E1 (E1N1–E1N5) and up to 11 potential sites in E2 (E2N1–E2N11) (Goffard & Dubuisson, 2003; Meunier et al., 1999; Fig. 2). Glycosylation in HCV envelope proteins is necessary for correct glycoprotein processing, folding and cellular entry (Goffard & Dubuisson, 2003; Goffard et al., 2005; Huang et al., 1997; Li et al., 1993; Wu et al., 1995). Glycosylation motifs may mask important epitopes from host antibody responses (Schønning et al., 1996; Wei et al., 2003) and comparative analysis of HIV subtype env sequences reveals a significant tendency for the occurrence of adaptive mutations at N-linked glycosylation sites (Choisy et al., 2004). Whilst the majority of glycosylation motifs exhibited no evidence for positive selection in E2, selected sites associated with glycosylation motifs are observed at positions 476–478 (E2N5), 532–534 (E2N6) and 576–578 (E2N9) in genotypes 1, 5 and 4, respectively. Both E2N5 and E2N6 reside in putative CD81-binding domains (Clayton et al., 2002; Flint et al., 1999; Owsianka et al., 2005; Roccasecca et al., 2003) and E2N5 is also located within the HVR2 region. Interestingly, previous quasispecies analysis failed to detect any positive selection associated with N-linked glycosylation motifs or HVR2 (Sheridan et al., 2004), possibly due to the comparatively limited virus diversity observed at the intra-patient level. We also observed a three-codon indel in close proximity to an N-linked glycosylation motif that is unique to genotype 3 isolates and appears to be subject to the action of intense diversifying selection (Table 3; Fig. 2). Overall, however, these results suggest that the major function of glycosylation in HCV E1E2 is to ensure correct conformation and to facilitate cellular entry, in contrast to HIV, where a major function is to act as a glycan shield abrogating neutralization (Choisy et al., 2004).
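Potential N-linked glycosylation motifs of the kind discussed above are conventionally located by scanning for the sequon N-X-S/T, where X is any residue except proline. The sketch below illustrates that scan; it is not the tool used in this study, and the function name, the numbering offset and the test peptide are hypothetical. A lookahead is used so that overlapping motifs are all reported.

```python
import re

# N, then any residue except proline, then serine or threonine.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str, offset: int = 1):
    """Return (start, end, motif) for each potential N-linked glycosylation motif.

    Positions are inclusive; `offset` is the residue number assigned to the
    first character, so that numbering can follow a reference polyprotein.
    """
    return [
        (m.start() + offset, m.start() + offset + 2, m.group(1))
        for m in SEQUON.finditer(protein.upper())
    ]


# Hypothetical peptide: NGT and NNS match, NPS is excluded by the proline.
motifs = find_sequons("ANGTNPSNNSA")
```

Note that the pattern is applied to ungapped sequence here; alignment gap characters would need to be removed first, with positions mapped back to the reference numbering.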
Selection in HLA-restricted T-cell epitopes
A proportion of the adaptive mutations identified by the site-specific analyses map to experimentally defined regions in E2 associated with human CTL and helper T-cell epitopes (LANL HCV Immunology Database). Genotype-specific selected sites are located in regions that are targeted by CTL-restricted HLA types A2 (Gt1–Gt5, aa 398–407), H2-Dd (Gt1–Gt3, Gt5, aa 405–415), A2 (Gt1–Gt5, aa 401–411), B53 (Gt4, aa 459–469), B53 (Gt4, aa 460–469), B51 (Gt3, aa 489–496), B35 (Gt3, Gt5, aa 497–507), B60 (Gt5, aa 530–539), B50 (Gt3, Gt4, aa 565–578) and a restricted helper T-cell epitope recognized by HLA type DRB1*1101 (Gt1–Gt6, aa 393–410) (Fig. 2). Indeed, cell-mediated immune targeting has been demonstrated to be the principal cause of virus lineage-specific substitution fixation due to host–virus co-evolution in HIV-1 infection (Kiepiela et al., 2004; Moore et al., 2002). Also, we conducted analyses similar to those presented here on patient-specific HCV quasispecies, highlighting at least one example of a selective change that was highly suggestive of virus escape from an epitope-specific CD8 T-cell response (Brown et al., 2005). However, whilst HLA class I- and II-restricted T-cell responses may have constituted one of the driving forces responsible for the observed differential molecular adaptation in HCV genotypes, knowledge of host HLA haplotype (which differs between individuals and populations) in conjunction with infecting virus genotype/subtype (which is frequently associated with specific geographical localities and modes of transmission) is required to substantiate this observation.
Implications for vaccine design
Despite the obvious nucleotide and amino acid variation observed, our analyses demonstrate that purifying selection is the major evolutionary force acting on HCV E1E2 (Table 3; Fig. 2). This observation has important implications concerning preventative vaccine design. Focusing the immune response on protective determinants that exhibit minimal cross-genotype amino acid variability (i.e. are under intense purifying selection) will provide the highest chance of developing a vaccine with broad potency. The induction of cross-genotype immunity raised to HCV genotypes 1–4 in chimpanzees has been reported, demonstrating the feasibility of developing a broadly neutralizing HCV vaccine (Lanford et al., 2004). Regions targeted by monoclonal antibodies (mAbs) that inhibit CD81 binding and/or abrogate HCV pseudoparticle infectivity in vitro were mapped onto the alignment of genotype consensus amino acid sequences relative to the H77 reference strain (Fig. 2). mAbs included 9/27 (Hsu et al., 2003; Owsianka et al., 2001), AP33 (Owsianka et al., 2001, 2005; Tarr et al., 2006), 3/11 (Flint et al., 1999; Hsu et al., 2003; Owsianka et al., 2001; Tarr et al., 2006; Triyatni et al., 2002), 2/69a (Flint et al., 1999; Hsu et al., 2003; Owsianka et al., 2001; Tarr et al., 2006; Triyatni et al., 2002), 11/20c (Hsu et al., 2003; Owsianka et al., 2001; Zhang et al., 2004) and 9/75 (Flint et al., 1999; Hsu et al., 2003; Owsianka et al., 2005; Triyatni et al., 2002). Regions targeted by neutralizing mAbs for which conflicting neutralization/binding data have been published, e.g. 64/1a (Hsu et al., 2003; Owsianka et al., 2001), or which target conformational epitopes that have yet to be mapped, e.g. CBH2, CBH5 and CBH7 (Cocquerel et al., 2003; Hadlock et al., 2000; Heo et al., 2004), were omitted.
The regions targeted by characterized neutralizing mAbs exhibited varying degrees of cross-genotype amino acid variability. The residues critical for E2 binding for mAbs 9/27, 11/20c, 2/69a and 9/75c have yet to be mapped precisely, although the epitopes targeted by these neutralizing antibodies contain high levels of amino acid variability and/or genotype-specific positively selected sites (Fig. 2). This observation suggests that virus escape from these antibodies is likely, and they are consequently unlikely to be broadly neutralizing. However, we have recently shown that mAbs AP33 and 3/11, which recognize overlapping epitopes within the highly conserved region encompassing residues 412–423, are broadly neutralizing, with mAb AP33 having a greater neutralizing potency than mAb 3/11 (Owsianka et al., 2005). Importantly, our selected-site analyses show that the specific residues recognized by mAbs AP33 (residues 413, 415, 418 and 420 in H77) and 3/11 (415, 420 and 421 in H77) (Tarr et al., 2006) are under strong purifying selection, presumably due to functional constraint, with ω values close to 0 (data not shown). As such, the likelihood of escape from these antibodies and others targeting the same residues/region is low. Consequently, such antibodies could have great therapeutic potential and their epitope will be an important future vaccine target.
In conclusion, molecular adaptation by E1E2 in response to host humoral/cellular targeting contributes to the continued global spread of HCV. The elucidation of site-specific selective pressures in HCV serves as a useful tool to inform future phenotypic investigation and vaccine design. Positively selected sites were located, on the whole, in biologically meaningful domains of functional significance. Vaccine candidates should be targeted at regions that are under strong purifying selection and exhibit low amino acid variability, and avoid regions with high observed frequencies of adaptive mutations.
| ACKNOWLEDGEMENTS |
|---|
| REFERENCES |
|---|
Anisimova, M., Bielawski, J. P. & Yang, Z. (2001). Accuracy and power of the likelihood ratio test in detecting adaptive molecular evolution. Mol Biol Evol 8, 15851592.
Anisimova, M., Nielsen, R. & Yang, Z. (2003). Effect of recombination on the accuracy of the likelihood method for detecting positive selection at amino acid sites. Genetics 164, 12291236.
Bartosch, B., Verney, G., Dreux, M., Donot, P., Morice, Y., Penin, F., Pawlotsky, J. M., Lavillette, D. & Cosset, F. L. (2005). An interplay between hypervariable region 1 of the hepatitis C virus E2 glycoprotein, the scavenger receptor BI, and high-density lipoprotein promotes both enhancement of infection and protection against neutralizing antibodies. J Virol 79, 82178229.
Beaumont, T., Quakkelaar, E., van Nuenen, A., Pantophlet, R. & Schuitemaker, H. (2004). Increased sensitivity to CD4 binding site-directed neutralization following in vitro propagation on primary lymphocytes of a neutralization-resistant human immunodeficiency virus IIIB strain isolated from an accidentally infected laboratory worker. J Virol 78, 56515657.
Brown, R. J. P., Juttla, V. S., Tarr, A. W., Finnis, R., Irving, W. L., Hemsley, S., Flower, D. R., Borrow, P. & Ball, J. K. (2005). Evolutionary dynamics of hepatitis C virus envelope genes during chronic infection. J Gen Virol 86, 19311942.
Bukh, J., Miller, R. H. & Purcell, R. H. (1995). Genetic heterogeneity of hepatitis C virus: quasispecies and genotypes. Semin Liver Dis 15, 4163.[Medline]
Callens, N., Ciczora, Y., Bartosch, B., Vu-Dac, N., Cosset, F. L., Pawlotsky, J. M., Penin, F. & Dubuisson, J. (2005). Basic residues in hypervariable region 1 of hepatitis c virus envelope glycoprotein E2 contribute to virus entry. J Virol 79, 1533115341.
Choisy, M., Woelk, C. H., Guegan, J. F. & Robertson, D. L. (2004). Comparative study of adaptive molecular evolution in different human immunodeficiency virus groups and subtypes. J Virol 78, 1962–1970.
Clayton, R. F., Owsianka, A., Aitken, J., Graham, S., Bhella, D. & Patel, A. H. (2002). Analysis of antigenicity and topology of E2 glycoprotein present on recombinant hepatitis C virus-like particles. J Virol 76, 7672–7682.
Cocquerel, L., Quinn, E. R., Flint, M., Hadlock, K. G., Foung, S. K. & Levy, S. (2003). Recognition of native hepatitis C virus E1E2 heterodimers by a human monoclonal antibody. J Virol 77, 1604–1609.
Farci, P., Alter, H. J., Wong, D. C., Miller, R. H., Govindarajan, S., Engle, R., Shapiro, M. & Purcell, R. H. (1994). Prevention of hepatitis C virus infection in chimpanzees after antibody-mediated in vitro neutralization. Proc Natl Acad Sci U S A 91, 7792–7796.
Farci, P., Shimoda, A., Wong, D., Cabezon, T., De Gioannis, D., Strazzera, A., Shimizu, Y., Shapiro, M., Alter, H. J. & Purcell, R. H. (1996). Prevention of hepatitis C virus infection in chimpanzees by hyperimmune serum against the hypervariable region 1 of the envelope 2 protein. Proc Natl Acad Sci U S A 93, 15394–15399.
Farci, P., Shimoda, A., Coiana, A., Diaz, G., Peddis, G., Melpolder, J. C., Strazzera, A., Chien, D. Y., Munoz, S. J. & other authors (2000). The outcome of acute hepatitis C predicted by the evolution of the viral quasispecies. Science 288, 339–344.
Felsenstein, J. (1985). Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 781–783.
Flint, M., Maidens, C., Loomis-Price, L. D., Shotton, C., Dubuisson, J., Monk, P., Higginbottom, A., Levy, S. & McKeating, J. A. (1999). Characterization of hepatitis C virus E2 glycoprotein interaction with a putative cellular receptor, CD81. J Virol 73, 6235–6244.
Fournillier, A., Wychowski, C., Boucreux, D., Baumert, T. F., Meunier, J. C., Jacobs, D., Muguet, S., Depla, E. & Inchauspe, G. (2001). Induction of hepatitis C virus E1 envelope protein-specific immune response can be enhanced by mutation of N-glycosylation sites. J Virol 75, 12088–12097.
Frasca, L., Del Porto, P., Tuosto, L., Marinari, B., Scotta, C., Carbonari, M., Nicosia, A. & Piccolella, E. (1999). Hypervariable region 1 variants act as TCR antagonists for hepatitis C virus-specific CD4+ T cells. J Immunol 163, 650–658.
Goffard, A. & Dubuisson, J. (2003). Glycosylation of hepatitis C virus envelope proteins. Biochimie 85, 295–301.
Goffard, A., Callens, N., Bartosch, B., Wychowski, C., Cosset, F. L., Montpellier, C. & Dubuisson, J. (2005). Role of N-linked glycans in the functions of hepatitis C virus envelope glycoproteins. J Virol 79, 8400–8409.
Hadlock, K. G., Lanford, R. E., Perkins, S., Rowe, J., Yang, Q., Levy, S., Pileri, P., Abrignani, S. & Foung, S. K. (2000). Human monoclonal antibodies that inhibit binding of hepatitis C virus E2 protein to CD81 and recognize conserved conformational epitopes. J Virol 74, 10407–10416.
Heo, T. H., Chang, J. H., Lee, J. W., Foung, S. K., Dubuisson, J. & Kang, C. Y. (2004). Incomplete humoral immunity against hepatitis C virus is linked with distinct recognition of putative multiple receptors by E2 envelope glycoprotein. J Immunol 173, 446–455.
Hsu, M., Zhang, J., Flint, M., Logvinoff, C., Cheng-Mayer, C., Rice, C. M. & McKeating, J. A. (2003). Hepatitis C virus glycoproteins mediate pH-dependent cell entry of pseudotyped retroviral particles. Proc Natl Acad Sci U S A 100, 7271–7276.
Huang, X., Barchi, J. J., Jr, Lung, F. D., Roller, P. P., Nara, P. L., Muschik, J. & Garrity, R. R. (1997). Glycosylation affects both the three-dimensional structure and antibody binding properties of the HIV-1IIIB GP120 peptide RP135. Biochemistry 36, 10846–10856.
Kiepiela, P., Leslie, A. J., Honeyborne, I., Ramduth, D., Thobakgale, C., Chetty, S., Rathnavalu, P., Moore, C., Pfafferott, K. J. & other authors (2004). Dominant influence of HLA-B in mediating the potential co-evolution of HIV and HLA. Nature 432, 769–775.
Kumar, S., Tamura, K. & Nei, M. (2004). MEGA3: integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment. Brief Bioinform 5, 150–163.
Lanford, R. E., Guerra, B., Chavez, D., Bigger, C., Brasky, K. M., Wang, X. H., Ray, S. C. & Thomas, D. L. (2004). Cross-genotype immunity to hepatitis C virus. J Virol 78, 1575–1581.
Lavillette, D., Tarr, A. W., Voisset, C., Donot, P., Bartosch, B., Bain, C., Patel, A. H., Dubuisson, J., Ball, J. K. & Cosset, F. L. (2005). Characterization of host-range and cell entry properties of the major genotypes and subtypes of hepatitis C virus. Hepatology 41, 265–274.
Li, Y., Luo, L., Rasool, N. & Kang, C. Y. (1993). Glycosylation is necessary for the correct folding of human immunodeficiency virus gp120 in CD4 binding. J Virol 67, 584–588.
Lindenbach, B. D. & Rice, C. M. (2001). Flaviviridae: the viruses and their replication. In Fields Virology, 4th edn, pp. 991–1041. Edited by B. N. Fields, D. M. Knipe & P. M. Howley. New York: Lippincott Williams & Wilkins.
Lole, K. S., Bollinger, R. C., Paranjape, R. S., Gadkari, D., Kulkarni, S. S., Novak, N. G., Ingersoll, R., Sheppard, H. W. & Ray, S. C. (1999). Full-length human immunodeficiency virus type 1 genomes from subtype C-infected seroconverters in India, with evidence of intersubtype recombination. J Virol 73, 152–160.
Majid, A., Jackson, P., Lawal, Z., Pearson, G. M. J., Parker, H., Alexander, G. J. M., Allain, J.-P. & Petrik, J. (1999). Ontogeny of hepatitis C virus (HCV) hypervariable region 1 (HVR1) heterogeneity and HVR1 antibody responses over a 3 year period in a patient infected with HCV type 2b. J Gen Virol 80, 317–325.
Martell, M., Esteban, J. I., Quer, J., Genesca, J., Weiner, A., Esteban, R., Guardia, J. & Gomez, J. (1992). Hepatitis C virus (HCV) circulates as a population of different but closely related genomes: quasispecies nature of HCV genome distribution. J Virol 66, 3225–3229.
Meunier, J.-C., Fournillier, A., Choukhi, A., Cahour, A., Cocquerel, L., Dubuisson, J. & Wychowski, C. (1999). Analysis of the glycosylation sites of hepatitis C virus (HCV) glycoprotein E1 and the influence of E1 glycans on the formation of the HCV glycoprotein complex. J Gen Virol 80, 887–896.
Moore, C. B., John, M., James, I. R., Christiansen, F. T., Witt, C. S. & Mallal, S. A. (2002). Evidence of HIV-1 adaptation to HLA-restricted immune responses at a population level. Science 296, 1439–1443.
Muller, R. (1996). The natural history of hepatitis C: clinical experiences. J Hepatol 24 (Suppl. 2), 52–54.
Owsianka, A., Clayton, R. F., Loomis-Price, L. D., McKeating, J. A. & Patel, A. H. (2001). Functional analysis of hepatitis C virus E2 glycoproteins and virus-like particles reveals structural dissimilarities between different forms of E2. J Gen Virol 82, 1877–1883.
Owsianka, A., Tarr, A. W., Juttla, V. S., Lavillette, D., Bartosch, B., Cosset, F. L., Ball, J. K. & Patel, A. H. (2005). Monoclonal antibody AP33 defines a broadly neutralizing epitope on the hepatitis C virus E2 envelope glycoprotein. J Virol 79, 11095–11104.
Penin, F., Combet, C., Germanidis, G., Frainais, P. O., Deleage, G. & Pawlotsky, J.-M. (2001). Conservation of the conformation and positive charges of hepatitis C virus E2 envelope glycoprotein hypervariable region 1 points to a role in cell attachment. J Virol 75, 5703–5710.
Pinter, A., Honnen, W. J., He, Y., Gorny, M. K., Zolla-Pazner, S. & Kayman, S. C. (2004). The V1/V2 domain of gp120 is a global regulator of the sensitivity of primary human immunodeficiency virus type 1 isolates to neutralization by antibodies commonly induced upon infection. J Virol 78, 5205–5215.
Posada, D. & Crandall, K. A. (1998). MODELTEST: testing the model of DNA substitution. Bioinformatics 14, 817–818.