J Gen Virol 87 (2006), 2203-2215; DOI 10.1099/vir.0.81752-0

© 2006 Society for General Microbiology

Identification of Hepatitis B virus putative intergenotype recombinants by using fragment typing

Jie Yang1,2, Ke Xing1, Riqiang Deng1, Jinwen Wang1 and Xunzhang Wang1

1 State Key Laboratory for Biocontrol, School of Life Science, Sun Yat-sen (Zhongshan) University, Guangzhou 510275, People's Republic of China
2 Department of Infectious Diseases, Nanfang Hospital, Guangzhou 510515, People's Republic of China

Correspondence
Xunzhang Wang
wxz@zsu.edu.cn


   ABSTRACT
Eight hundred and thirty-seven human Hepatitis B virus (HBV) genomes were categorized into pure genotypes and potential intergenotypes according to their fragment types, which were determined by similarity and phylogenetic analyses of 13 contrived fragments of 250 bp against the corresponding fragments of the consensus sequences of genotypes A–H. Twenty-five intergenotypes, including 171 genomes, were revealed from the potential intergenotype recombinants by phylogenetic analysis of the precisely derived mosaic fragments. Among these, four new intergenotypes were discovered. Many genomes were revealed as putative intergenotype recombinants for the first time. About 87 % of the putative recombinants were B/C (120) and A/D (29) hybrids. The other recombinants comprised A/B/C, A/C, A/E, A/G, C/D, C/F, C/G, C/U (U for unknown genotype) and B/C/U hybrids. Genotypes A and C showed a higher recombination tendency than did other genotypes. The results also demonstrated region priority and breakpoint hot spots in the intergenotype recombination. Recombination breakpoints were found to be concentrated mainly in the vicinity of the DR1 region (nt 1640–1900), the preS1/S2 region (nt 3150–100), the 3'-end of the C gene (nt 2330–2450) and the 3'-end of the S gene (nt 650–830). These results support the suggestion that intergenotype recombinants may result from co-infection with different genotypes.


   INTRODUCTION
Human Hepatitis B virus (HBV), which is the prototype member of the family Hepadnaviridae, is a circular, partially double-stranded DNA virus of approximately 3200 bp with four overlapping ORFs encoding the polymerase (P), core (C), surface (S) and X proteins (Lee, 1997). Based on an intergroup divergence of 8 % or more in the complete nucleotide sequence, HBV sequences have been classified into genotypes A–F (Okamoto et al., 1988; Norder et al., 1992, 1994; Magnius & Norder, 1995). Genotypes G and H were suggested later (Stuyver et al., 2000; Arauz-Ruiz et al., 2002). Generally, different HBV genotypes have distinct geographical distributions. For example, genotype A is prevalent in northern and central Europe and is also found in North and South America and Africa. Genotypes B and C are common in south and east Asia. Genotype D is distributed worldwide, but predominates in the Mediterranean area. Genotype E is found mainly in the western part of sub-Saharan Africa, while genotype F is found in aboriginal populations of the Americas (Norder et al., 1993). Genotype G is found in the USA and France (Stuyver et al., 2000). Genotype H is mainly recorded in North and South America (Arauz-Ruiz et al., 2002). It has been assumed that the HBV genotype may influence the biology of HBV and clinical disease in hosts (Mayerat et al., 1999). Genotype C induces more severe liver disease than genotype B in Asia (Bowyer et al., 1997; Kao et al., 2000), while genotype A is associated with chronic infection more frequently than genotype D in Europe (Mayerat et al., 1999).

HBV has a high rate of diversity. The mutation rate of the genome is much higher than in other DNA viruses (Okamoto et al., 1987; Georgi-Geisberger et al., 1992) because HBV, like retroviruses, replicates through reverse transcription of an RNA intermediate, a process that lacks a proofreading function (Summers & Mason, 1982). Homologous recombination between different genotypes is a further source of variation. Accumulating data show that DNA recombination is frequent in HBV infections. Recombination between HBV genotypes B and C (B/C hybrid) is prevalent in south and east Asia (Morozov et al., 2000; Sugauchi et al., 2002, 2003; Luo et al., 2004). A/D hybrids have been described in Italy and South Africa (Morozov et al., 2000; Owiredu et al., 2001). Other HBV intergenotype hybrids have also been discovered, such as C/D hybrids in China (Cui et al., 2002; Wang et al., 2005), an A/E hybrid in Cameroon (Kurbanov et al., 2005) and a C/G hybrid in Thailand (Suwannakarn et al., 2005). Intergenotype recombination may lead to different biological characteristics of the virus and a different clinical outcome in patients. For instance, recombination of genotype B with genotype C (B/C hybrid) might affect the severity of clinical disease (Sugauchi et al., 2002). This raises several questions: how many kinds of intergenotypic recombinants exist in published HBV data; where are the recombination breakpoints located; and are there any rules for the recombination? Answering these questions may lead to an understanding of the mechanism of HBV intergenotype recombination. In this paper we report 25 intergenotypes comprising 171 putative intergenotype recombinants found in 837 complete HBV genome sequences.


   METHODS
HBV data.
Complete human HBV genome sequences were collected and updated every month from GenBank, EMBL and DDBJ by BLASTN searches using the complete HBV genome sequences of genotypes A–H. Unique accession numbers were collected by eliminating duplicates in a text file. Sequences were then downloaded from GenBank using the Batch Entrez tool on the NCBI (National Center for Biotechnology Information) website (www.ncbi.nlm.nih.gov). After removing identical entries, patents, artificial mutants, partial sequences and animal HBV sequences, genome sequences with lengths between 3000 and 3300 bp were rearranged so that they started at the traditional hypothetical EcoRI site. Finally, 837 genomes consisting of 146, 152, 292, 142, 40, 42, 13 and 10 of genotypes A–H, respectively, were obtained in March 2006 (see Table 2 for a complete list of accession numbers).
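The curation steps above (length filter, rotation of the circular genome to a common start, removal of identical entries) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and it assumes a literal GAATTC motif is present and that the rewritten sequence starts at the first base of that motif. Because the EcoRI site is described as hypothetical, genomes lacking the literal motif would in practice have to be positioned by alignment to a reference; the sketch simply skips them.

```python
from typing import Dict, Optional

ECORI = "GAATTC"  # EcoRI recognition sequence (assumed to occur literally)

def rotate_to_ecori(seq: str) -> Optional[str]:
    """Rewrite a circular genome so that it starts at its first EcoRI motif."""
    seq = seq.upper()
    # Append the first 5 bases so that a motif spanning the linear join is found.
    doubled = seq + seq[:len(ECORI) - 1]
    i = doubled.find(ECORI)
    if i == -1:
        return None  # no literal site; would need alignment to a reference
    return seq[i:] + seq[:i]

def curate(records: Dict[str, str], min_len: int = 3000,
           max_len: int = 3300) -> Dict[str, str]:
    """Keep genomes in the length range, rotate them, drop identical entries."""
    kept: Dict[str, str] = {}
    seen = set()
    for accession, seq in records.items():
        if not (min_len <= len(seq) <= max_len):
            continue  # partial or over-long entry
        rotated = rotate_to_ecori(seq)
        if rotated is None or rotated in seen:
            continue  # simplification: unrotatable or duplicate sequence
        seen.add(rotated)
        kept[accession] = rotated
    return kept
```

Rotating before comparing means that two database entries of the same circular genome, deposited with different start positions, are recognised as identical.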


Table 2. Fragment type analysis of the 837 HBV complete genome sequences

Accession numbers and countries are listed alphabetically according to the fragment types. Continuous accession numbers were combined for brevity: for instance, ‘AB116076–94’ stands for 19 genomes from AB116076 to AB116094. The subsequently identified 25 intergenotypes are indicated in the second column (IG, intergenotype). C/Gi, Genotype C and gibbon hybrid suggested by Simmonds & Midgley (2005).

 
Creation of consensus sequences for genotypes A–H.
All the genome sequences were classified into genotypes A–H by using BLASTN (Altschul et al., 1990) and phylogenetic analysis against 24 representatives of genotypes A–H which were selected at random, three for each genotype, from HBV full-length sequences of genotypes A–H, according to Norder et al. (2004). Consensus sequences of all genotypes except genotype B were then created from all the full-length sequences of the corresponding genotypes. Genotype B consists of subgenotypes Bj (j for Japan) and Ba (a for Asia) (Sugauchi et al., 2002). Because the majority of genotype B sequences are of subgenotype Ba, which is considered to be a B/C hybrid with genotype C clustered in the preC/C region, the consensus sequence for genotype B was created using full-length subgenotype Bj sequences.
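How the consensus was computed from each alignment is not specified here. As a rough sketch, assuming a simple plurality rule per alignment column (with ties broken alphabetically, an arbitrary choice made only for determinism), a consensus could be derived as below; the function name is hypothetical.

```python
from collections import Counter
from typing import List

def consensus(aligned: List[str]) -> str:
    """Plurality-rule consensus of equal-length aligned sequences."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all aligned sequences must have the same length")
    out = []
    for column in zip(*aligned):
        counts = Counter(column)
        # sorted() makes ties resolve to the alphabetically first symbol
        out.append(max(sorted(counts), key=counts.get))
    return "".join(out)

print(consensus(["ACGT", "ACGA", "TCGA"]))  # prints ACGA
```

Restricting the input for genotype B to Bj sequences, as described above, would then be a matter of which sequences are passed in, not of the rule itself.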

Primary fragment typing.
Consensus sequences of genotypes A–H were aligned using CLUSTAL W (Thompson et al., 1994). Starting with the traditional hypothetical EcoRI site, the alignment was split into 13 subalignments of 250 bp (the last one 257 bp) and the fragments from the 13 subalignments were formatted as a database for BLASTN. Genotypes of the 13 fragments of each genome (called ‘fragment types’ in this report) were determined by BLASTN of the 837 complete HBV genome sequences against this database. According to the BLASTN result, each genome was represented by a string of 13 letters (from A to H), denoting the types of the corresponding fragments. Genomes of the pure genotypes are denoted with 13 symbols of only one letter, for example AAAAAAAAAAAAA for genotype A. Genomes of potential intergenotype recombinants are denoted with 13 symbols of different letters, for example ADADAAAAAAAAD; letters that differ from the dominant letter imply potential mosaic fragments in the genome. A Perl script was written to analyse and record the fragment types of the 837 HBV complete genome sequences.
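The idea of primary fragment typing can be illustrated with a short sketch. It deliberately swaps BLASTN for a plain per-fragment identity comparison and assumes that the query has already been placed in the same alignment coordinates as the consensus sequences; all function names are hypothetical, and this is not the Perl script mentioned above.

```python
from typing import Dict, List

def split_fragments(seq: str, size: int = 250, n: int = 13) -> List[str]:
    """Split into n fragments; the last one takes whatever remains."""
    bounds = [(i * size, (i + 1) * size) for i in range(n - 1)]
    bounds.append(((n - 1) * size, len(seq)))
    return [seq[a:b] for a, b in bounds]

def identity(a: str, b: str) -> float:
    """Fraction of matching positions, ignoring columns with a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)

def fragment_type(query: str, consensus_by_genotype: Dict[str, str],
                  size: int = 250, n: int = 13) -> str:
    """Return one genotype letter per fragment, e.g. 'ADADAAAAAAAAD'."""
    refs = {g: split_fragments(s, size, n)
            for g, s in consensus_by_genotype.items()}
    letters = []
    for i, frag in enumerate(split_fragments(query, size, n)):
        # best-matching genotype for this fragment (ties: alphabetical)
        best = max(sorted(refs), key=lambda g: identity(frag, refs[g][i]))
        letters.append(best)
    return "".join(letters)

def is_pure(ftype: str) -> bool:
    """A pure fragment type uses a single letter throughout."""
    return len(set(ftype)) == 1
```

A genome for which `is_pure` is false is only a potential recombinant at this stage; the calibration described next decides whether the deviating fragments are genuinely mosaic.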

Calibration of the fragment types.
Genomes of pure fragment type were considered to have little probability of being intergenotype recombinants and needed no further consideration. Only genomes of impure fragment type were probably recombinants. Therefore, potential mosaic fragments from impure fragment type genomes were collected according to their position with respect to each 250 bp region and aligned with correlated fragments from representative genomes with minor manual adjustments. The alignments were fed into the PHYLIP software package, version 3.62 (Felsenstein, 1993). Genetic distances were estimated by the Kimura two-parameter model and phylogenetic trees were reconstructed by the neighbour-joining method; the reliability of topologies was estimated by performing bootstrap resampling and reconstruction with 1000 replicates, then the CONSENSE program in the PHYLIP package was used to compute majority-rule consensus trees. Potential mosaic fragments were retyped according to the phylogenetic trees. Fragments not clustered with any of the A–H genotypes were retyped with ‘N’, meaning uncertain, and the fragments with a deletion longer than 125 bp were marked with ‘-’. As a result, the fragment types of some genomes were modified, changing some originally impure fragment type genomes to pure genotypes and others to different impure fragment types.
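For readers unfamiliar with the Kimura two-parameter model, the textbook closed-form distance is d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences. The sketch below implements that closed form only; PHYLIP's own distance program may arrive at its estimate differently (for instance by using a fixed transition/transversion ratio), so the numbers are illustrative and not a reproduction of the analysis.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(a: str, b: str) -> float:
    """Closed-form Kimura two-parameter distance between aligned sequences."""
    n = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguity codes
        n += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine<->pyrimidine
    if n == 0:
        raise ValueError("no comparable sites")
    p, q = transitions / n, transversions / n
    arg = (1 - 2 * p - q) * math.sqrt(1 - 2 * q)
    if arg <= 0:
        raise ValueError("sequences too divergent for the K2P formula")
    return -0.5 * math.log(arg)
```

For short fragments of about 250 bp a handful of substitutions already changes the distance noticeably, which is one reason the bootstrap support mentioned above matters.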

Identification of putative recombinants.
For the genomes of impure fragment types, recombination breakpoints were determined with the SimPlot program (Lole et al., 1999) and by bootscanning analysis, according to Robertson et al. (1995), Lole et al. (1999) and Sugauchi et al. (2002). The SimPlot program version 3.5.1 was obtained from http://sray.med.som.jhmi.edu/SCRoftware/ and was used to identify phylogenetically informative sites supporting alternative tree topologies. This was performed by considering four sequences at a time: one putative recombinant sequence, two consensus sequences of the parental genotypes and one consensus sequence of another genotype as an outgroup. Each informative site supports one of three possible phylogenetic relationships among the four taxa. Bootscanning and cluster analysis maximizing χ² were used to identify the breakpoints. P values for the subsequent division of the sequence into genotypes were calculated by using Fisher's exact test. After determination of recombination breakpoints, mosaic fragments were derived precisely from the recombinants, according to the breakpoints. Mosaic fragments with the same breakpoints were collected and aligned with corresponding fragments of genotype A–H representatives. Phylogenetic trees were reconstructed based on these alignments and compared with the trees reconstructed based on the correlated complete genome sequences. A 75 % bootstrap value cut-off was used to identify the mosaicism of each fragment.
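The informative-site idea can be made concrete with a small sketch. It is written in the spirit of the four-taxon analysis described above and is not SimPlot's implementation: function names are hypothetical, only a single breakpoint is searched for (real mosaic fragments have two), and Fisher's exact test is omitted. A site is informative when two of the four sequences share one base and the other two share a different base.

```python
from typing import List, Optional, Tuple

Site = Tuple[int, str]

def informative_sites(query: str, parent1: str, parent2: str,
                      outgroup: str) -> List[Site]:
    """Label each informative site by the sequence the query groups with."""
    sites = []
    columns = zip(query, parent1, parent2, outgroup)
    for pos, (q, a, b, o) in enumerate(columns, start=1):
        if "-" in (q, a, b, o):
            continue
        if q == a and b == o and q != b:
            sites.append((pos, "1"))   # query clusters with parent 1
        elif q == b and a == o and q != a:
            sites.append((pos, "2"))   # query clusters with parent 2
        elif q == o and a == b and q != a:
            sites.append((pos, "O"))   # query clusters with the outgroup
    return sites

def chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def max_chi2_breakpoint(sites: List[Site]) -> Tuple[float, Optional[Tuple[int, int]]]:
    """Split point between informative sites that maximises chi-square."""
    labelled = [(p, l) for p, l in sites if l in "12"]
    best: Tuple[float, Optional[Tuple[int, int]]] = (0.0, None)
    for k in range(1, len(labelled)):
        left = [l for _, l in labelled[:k]]
        right = [l for _, l in labelled[k:]]
        chi = chi2_2x2(left.count("1"), left.count("2"),
                       right.count("1"), right.count("2"))
        if chi > best[0]:
            # breakpoint lies between these two informative positions
            best = (chi, (labelled[k - 1][0], labelled[k][0]))
    return best
```

The returned pair of positions brackets the breakpoint; the exact location between two informative sites cannot be resolved from the sites alone.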


   RESULTS
HBV genomes: classification and creation of consensus sequences
Genotypes of most HBV genomes could be determined by using BLASTN against the 24 representatives of genotypes A–H shown in Table 1 and by analysis of the BLASTN results based on a nucleotide sequence identity cut-off (>92 %). Others required further phylogenetic analysis. Of the 837 HBV genomes, 146, 152, 292, 142, 40, 42, 13 and 10 were categorized in genotypes A–H, respectively. Of the 837 HBV genomes, 631 were found to have a standard genotype length, including 96, 128, 215, 105, 33, 37, 9 and 8 genomes for genotypes A–H, respectively, and these were chosen for consensus sequence creation. Consensus sequences for genotypes except genotype B were then created from alignments of genome sequences of the corresponding genotypes. For the reason explained previously, the consensus sequence of genotype B was obtained by the alignment of 28 Bj subgenotype sequences which were selected according to Sugauchi et al. (2002). The resulting consensus sequences were used as reference sequences for genotypes A–H in the subsequent fragment typing and SimPlot bootscanning analyses.
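The identity cut-off rule can be written down in a few lines. The sketch assumes that the best percentage identity against the representatives of each genotype has already been extracted from the BLASTN output; the function name and the input layout are hypothetical.

```python
from typing import Dict, Optional

def assign_genotype(best_identity: Dict[str, float],
                    cutoff: float = 92.0) -> Optional[str]:
    """Return the best-matching genotype if it exceeds the identity cut-off.

    best_identity maps a genotype letter to the highest % identity observed
    against that genotype's representatives. None means the genome needs
    further phylogenetic analysis.
    """
    best = max(sorted(best_identity), key=best_identity.get)
    return best if best_identity[best] > cutoff else None

print(assign_genotype({"A": 95.1, "D": 90.2}))  # prints A
print(assign_genotype({"A": 91.0, "D": 90.2}))  # prints None
```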


Table 1. Twenty-four full-length representatives of genotypes A–H

Subgenotypes of some representatives are indicated in parentheses.

 
Potential intergenotype identification using fragment typing
Using BLASTN against the consensus fragments which were derived by splitting the alignment of consensus genome sequences into 13 subalignments of 250 bp, 633 genomes were identified to be pure genotypes A–H, and 204 to be potential intergenotypes, implying intergenotype recombinants. Fragment types of the 204 potential intergenotype genomes were determined further by phylogenetic analysis of potential mosaic fragments. Phylogenetic trees were reconstructed based on alignments of the potential mosaic fragments together with correlated fragments from representative genomes. Mosaic fragments that were not supported by the phylogenetic trees were modified according to the phylogenetic results, changing some originally impure fragment type genomes to pure types and others to different fragment types. The fragment types of the 837 HBV genome sequences determined with the above methods are shown in Table 2.

Identification of putative recombinants
By SimPlot analysis, the recombination breakpoints of the potential recombinants were determined precisely. Mosaic fragments were derived from the intergenotype recombinants according to the breakpoints. A phylogenetic tree was inferred to determine further the genotype of each precisely derived mosaic fragment using a 75 % bootstrap value cut-off. For example, eight genomes with the fragment type ‘ADADAAAAAAAAD’ and four genomes with the fragment type ‘DDADAAAAAAAN-’ belonged to genotype A in the whole genome, and all had a mosaic fragment between nt 209 and 526. A phylogenetic tree was then inferred, based on the alignment of these mosaic fragments and corresponding fragments of representative genomes of genotypes A–H. All the 12 mosaic fragments clustered with the corresponding representative fragments of genotype D with a bootstrap value of 93 % in the tree (Fig. 1a). A similar tree was also inferred based on alignment of the corresponding whole genomes. All 12 genome sequences clustered with the representative genomes of genotype A with a bootstrap value of 100 % (Fig. 1b). Phylogenetic trees based on other mosaic fragments were also reconstructed to demonstrate the mosaicism using a 75 % bootstrap value cut-off (data not shown). The mosaicism of the precisely derived fragments had much stronger bootstrap support than did that of the contrived fragments.


Fig. 1. Phylogenetic analysis of mosaic fragments within nt 209–526 derived from genomes with fragment types ‘ADADAAAAAAAAD’ (IG-1) and ‘DDADAAAAAAAU-’ (IG-2). GenBank accession numbers are shown on each tree. The letter before each accession number indicates the genotype: upper-case letters for reference sequences of genotypes A–H and lower-case letters for putative recombinants. Upper-case letters following accession numbers indicate putative hybrids. Numbers at nodes indicate the percentage of bootstrap replications supporting the clusters (values of 75 % and higher are shown). (a) Phylogenetic tree (1000 sets) inferred from mosaic fragments within nt 209–526. (b) Phylogenetic tree (1000 sets) reconstructed based on the correlated complete genome sequences.

 
Finally, 166 genomes were revealed as putative intergenotype recombinants supported by phylogenetic trees. According to the fragment types, these genomes were classified into 23 intergenotypes. For convenience of discussion, the 23 intergenotypes were named intergenotype 1–23 (IG-1 to IG-23 as indicated in Table 2 and Table 3). Fig. 2 shows the SimPlot bootscanning results and Table 3 lists the breakpoints of representative genome sequences of the 23 intergenotypes. The bootstrap values for each mosaic fragment are also listed in Table 3. Sequences in the same intergenotype had similar breakpoints. The 5' terminal breakpoints and the 3' terminal breakpoints of the 117 genomes in IG-6 were located within nt 1730–1846 and 2244–2485, respectively. Other intergenotypes had breakpoints with variations of only tens of base pairs.


Table 3. Identification of mosaic fragments derived from representatives of each intergenotype

Intergenotypes marked with ‘*’ are revealed for the first time in this report, and references are listed for those reported before. G1 and G2 refer to the original and alternate parent genotypes of each intergenotype, respectively: consensus sequences were used as references. Mosaic fragments of each intergenotype representative were derived precisely from the recombinants according to the breakpoints (column ‘Fragments’). G1 (nt) and G2 (nt) refer to the number of nucleotides within each mosaic fragment which differ from G1 and G2, respectively. The similarity between each mosaic fragment and the correlated fragments from G1 or G2 was determined as a percentage (columns ‘G1’ and ‘G2’ under identity, respectively), and differences in similarity are also indicated (column ‘G2–G1’). Each mosaic fragment was characterized by a bootstrap tree and the bootstrap values are listed.

 

Fig. 2. Resolution of recombinant events in representatives of 24 intergenotypes determined using the SimPlot program. Consensus sequences of genotypes A–H were used as reference sequences (indicated with a single upper-case letter in the plot). Two parent sequences and one outgroup sequence were compared against the example sequence over the entire genome with a window size of 250 bp and a step size of 20 bp (gap strip off; 100 bootstrap replicates and neighbour-joining tree analysis). To show detailed information of the whole-genome region, a 400 bp repeat of nt 1–400 was added behind each sequence. The maximum χ² method was used to identify breakpoints. No plot is presented for IG-24 because there was no complete genome for the unknown genotype, and the consensus sequence of IG-24 was used as a parent reference sequence in the analysis of IG-25 (marked with U).
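The windowing scheme in the legend (250 bp windows, 20 bp steps, with nt 1–400 repeated after the end so that windows can run across the origin of the circular genome) can be sketched as a small generator. The function name is hypothetical and the sketch only produces the windows; the per-window tree building and bootstrapping are not shown.

```python
from typing import Iterator, Tuple

def circular_windows(seq: str, window: int = 250, step: int = 20,
                     repeat: int = 400) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start, window sequence) over seq plus a leading repeat."""
    extended = seq + seq[:repeat]  # append nt 1..repeat behind the sequence
    for start in range(0, len(extended) - window + 1, step):
        yield start + 1, extended[start:start + window]

# Toy example: the last window spans the join between end and start.
print(list(circular_windows("ABCDEFGHIJ", window=4, step=2, repeat=2))[-1])
# prints (9, 'IJAB')
```

Without the appended repeat, breakpoints close to nt 1, such as those in the preS1/S2 region, would fall in windows that are cut off at the linear ends of the sequence.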

 
In addition to the above 166 genomes, constituting the 23 intergenotypes, there are five genomes that need further investigation and are currently assigned to two uncertain intergenotypes. Genomes AF241407–AF241409 and AB231908 were originally from Vietnam. These four genomes exhibited a complicated fragment type of ‘NAGGGECCCCCCN’ in the fragment typing stage, where N represents an uncertain genotype of the 250 bp contrived fragment, i.e. the fragment did not cluster with any of the representative fragments of the eight genotypes in the tree. In the following analysis stage with precisely derived mosaic fragments, the mosaic areas corresponding to A, G and E were not supported by bootstrap values higher than the 75 % cut-off. SimPlot analysis also showed that the corresponding area between nt 2866 and 1800 was significantly different from all known genotypes. Hannoun et al. (2000) had proposed that fragments like these were from an unknown genotype (U) or a new genotype. We considered the four genomes as recombinants between genotype C and an unknown genotype (C/U hybrid) and assigned another intergenotype (IG-24) to include these four recombinants. Another genome (AB231909) with fragment type ‘BBBGGECCCNBBB’ belonged to subgenotype Ba in the main part of the genome. SimPlot analysis showed that the fragment corresponding to the fragment type GGEC area between nt 729 and 1895 showed a high level of identity with the unknown genotype fragment in IG-24 (99.9 %). This genome was considered to be a Ba/U hybrid or a B/C/U hybrid and was assigned to intergenotype IG-25 (Tables 1 and 2, and Fig. 2).

Putative intergenotype recombinants
Among the 25 intergenotypes, four were identified for the first time (marked with ‘*’ in Table 3). In the other 21 intergenotypes already reported (see references in Table 3) many genomes were also revealed for the first time.

A/D hybrids.
Ten intergenotypes, including 29 HBV genomes, were revealed to be A/D hybrids. Five intergenotypes (IG-1 to IG-5) belonged to genotype A, including 20 genomes from India and 3 from South Africa. IG-14 to IG-18, including six genomes (four from Italy and two from India), belonged to genotype D. A/D hybrids from India mostly belonged to genotype A in the whole genome, while those from Italy all belonged to genotype D. Putative mosaic areas of A/D hybrids were located mainly in the long S and X/preC gene region.

B/C hybrids.
One hundred and seventeen genomes in IG-6 belonged to subgenotype Ba and the mosaic area just covered the preC/C region (Sugauchi et al., 2002). B/C hybrids made up about 79 % (120 out of 152) of all genotype B sequences. Genomes of pure fragment type BBBBBBBBBBBBB were found only in Japan. U87747 (IG-7) from South Africa and AB219430 (IG-23) from the Philippines were two different A/B/C hybrids. Both of them were actually hybrids of Ba recombined with a genotype A fragment at a different position: U87747 in the preS1/S2 region and AB219430 in the X gene region. AB231909 (IG-25, B/C/U hybrid) was also a hybrid of Ba recombined with an unknown genotype. The recombination breakpoints were just at the end of the S gene and at the beginning of the C gene.

Genotypes B and C are the two most prevalent genotypes in south and east Asia, but only three B/C hybrids (IG-9 to IG-11) were found from all 292 genotype C genomes: D16665 from Japan, AF233236 from China and AB031265 from Vietnam. Genome D16665 with fragment type CCCCCCCBBCCCC was a B/C hybrid with genotype B clustered in the preC/C region. The mosaic area of genomes AB031265 and AF233236 was at the end of the P gene and a part of the X gene.

C/D hybrids.
IG-12 and IG-13, including 10 HBV isolates, were C/D hybrids. They belonged to genotype C in the whole genome. All 10 sequences are from north-west China, as well as Tibet, where genotype D is rare and where the dominant strain of HBV is a C/D hybrid (Cui et al., 2002).

A/C hybrids.
Genome AY057947, the only A/C hybrid identified in this study, belongs to genotype C. The mosaic area was in the preS1/S2 region. This genome was found in China, where genotype A is seldom recorded: only one of the 127 genomes from China was recorded as genotype A.

Other hybrids.
Genome AB214516 from Bolivia was a putative C/F hybrid with a mosaic region at nt 1535–1866, including part of the X gene and the preC region. AB194949 from Cameroon is the only A/E hybrid, recently revealed by Kurbanov et al. (2005). The A/G hybrid genome AB056516 belonged to genotype G with a 200 bp genotype A mosaic fragment in the preS1/S2 region. A genome from Thailand, DQ078791, belonging to genotype G was also identified as a C/G hybrid. The mosaic area just covered the preC/C region (Suwannakarn et al., 2005). Two intergenotypes (IG-24 and IG-25) comprised mosaic areas of an unknown genotype.

To avoid missing some recombinants, we repeated the above analysis using the same methods, but with one change at the fragment typing stage: the boundaries of the 13 contrived fragments were shifted by half a fragment, so that the first fragment started at nt 126 and ended at nt 375 downstream of the traditional hypothetical EcoRI site. Although this yielded 238 potential intergenotype recombinants at the fragment typing stage, subsequent phylogenetic analysis of the precisely derived mosaic fragments identified the same 171 intergenotype recombinants (data not shown).
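The two fragmentation schemes differ only in an offset of 125 nt. A sketch of the boundary arithmetic is given below, using the alignment length of 3257 columns implied by the Methods (12 × 250 + 257). How the final fragment was treated in the shifted scheme is not stated; the sketch assumes that it wraps around the origin of the circular genome, and the function name is hypothetical.

```python
from typing import List, Tuple

def fragment_bounds(length: int, size: int = 250, n: int = 13,
                    offset: int = 0) -> List[Tuple[int, int]]:
    """1-based inclusive (start, end) of n fragments on a circular sequence.

    The last fragment absorbs the remainder and may wrap past the origin,
    in which case its end coordinate is smaller than its start.
    """
    bounds = []
    for i in range(n):
        start = offset + i * size  # 0-based, before wrapping
        # exclusive end; the final fragment runs to offset + length
        end = offset + (i + 1) * size if i < n - 1 else offset + length
        bounds.append((start % length + 1, (end - 1) % length + 1))
    return bounds

original = fragment_bounds(3257)
shifted = fragment_bounds(3257, offset=125)
print(original[0], original[-1])  # prints (1, 250) (3001, 3257)
print(shifted[0], shifted[-1])    # prints (126, 375) (3126, 125)
```

A mosaic region that straddles a boundary in one scheme sits near the middle of a fragment in the other, which is presumably why the shifted run flagged more candidates (238 versus 204) before phylogenetic confirmation.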


   DISCUSSION
Many HBV intergenotype recombinants have been revealed since the first report by Bollyky et al. (1996). Morozov et al. (2000) discovered six B/C hybrids (belonging to IG-6 or IG-11 in our study) and three A/D hybrids (IG-15 to IG-17) from 99 complete HBV sequences. Almost at the same time, Bowyer & Sim (2000) found three A/D hybrids and six B/C hybrids (IG-6) from 65 genome sequences. Several new hybrid types have been revealed subsequently. Recently, Simmonds & Midgley (2005) revealed 24 phylogenetically independent intergenotypes.

This study scanned 837 human HBV genomes currently available in GenBank for intergenotype recombinants. It was difficult to deal with such a large amount of data using previously described methods (Morozov et al., 2000; Bowyer & Sim, 2000; Simmonds & Midgley, 2005), so a different, two-step strategy was used in this study to identify potential recombinants from these genomes: (i) similarity analysis of 13 contrived fragments of 250 bp, and (ii) phylogenetic analysis of the precisely derived mosaic fragments. Results from the first step indicated that most genomes were of pure genotypes, only 204 genomes being potential intergenotype recombinants. Analysis at the second step revealed 171 putative recombinants, which were categorized into 25 intergenotypes according to their fragment types. Among the 25 intergenotypes, IG-6 contained 117 genomes; the other 24 intergenotypes contained 54 genomes.

Compared to the results reported by Simmonds & Midgley (2005), our study revealed five more intergenotypes: four (IG-2, IG-7, IG-23 and IG-25) are reported here for the first time, and the fifth, IG-22 (G/C hybrid), was reported recently by Suwannakarn et al. (2005). Excluding one intergenotype which was a recombinant of human and gibbon HBV, and two intergenotypes of chimpanzee HBV, the remaining 21 human HBV intergenotypes reported by Simmonds & Midgley (2005) are included in 20 of our 25 intergenotypes. In addition, our study has revealed many more human HBV genomes as intergenotype recombinants than did previous studies.

Among the intergenotypes presented in this study, IG-7 and IG-23 were two different A/B/C hybrids between subgenotype Ba and genotype A with different recombination positions, and IG-25 was a B/C/U hybrid between subgenotype Ba and an unknown genotype. Though almost identical to IG-1 in its main genome, IG-2 was a different intergenotype due to a 183 bp deletion in the preS1/S2 region and a longer mosaic area of genotype D in both flanks of the deletion as compared to IG-1. It should be mentioned that AF418684–AF418689 in IG-1 and AF418690–AF418692 in IG-2 are sequences of different clones from the same patient (patient 313) in India. Genomes AY161148–AY161149 in IG-1 and AY161147 in IG-2 are also from the same patient in India (according to GenBank). Genomes of different intergenotypes existing in serum samples from the same patient might help in understanding the mechanism of recombination.

This study scanned all the currently available human HBV genomes, and categorized all putative recombinants into different intergenotypes by fragment typing, presenting comprehensive information on human HBV intergenotypes and including a particular description for each intergenotype. Hence, a comparatively complete inspection of human HBV for intergenotype recombinants was completed, which should be helpful for finding hot spots of recombination, for investigating the geographical distribution of intergenotype recombinants and for understanding the tendency of different genotypes to recombine and the mechanism of intergenotype recombination in HBV.

Hot spots of recombination breakpoints
Different regions in the HBV genome showed different rates of recombination, and recombination breakpoints were found to be located frequently near gene boundaries (Simmonds & Midgley, 2005), such as at the start of the S1, S2, S, X, C and P genes, and at the end of the S, X, P and C genes. Of the breakpoints listed in Table 3, 54 % and 66 % are located within 60 bp and 80 bp, respectively, of one of these boundaries. Comparison between our data on intergenotype recombination and previous data revealed some hot spots for recombination breakpoints.

Sixty percent of the intergenotypes (15 out of 25), including A/D (4), A/B/C (2), A/E (1), B/C (4), C/F (1), C/G (1), C/U (1) and B/C/U (1) hybrids, contained breakpoints within nt 1640–1900, which lies in the vicinity of the DR1 region. It was shown that the recombination site density in the region covering nt 1600–2000 was almost fivefold higher than that of the remainder of the genome (Pineau et al., 1998). DR1 has been demonstrated to cover a hot spot (nt 1885–1915) for genomic recombination and has been proposed as a candidate site for intragenic recombination between HBV isolates of different genotypes (Hino et al., 1991; Sugauchi et al., 2002).

The second hot spot of recombination is located in the preS1/S2 region, covering nt 3150–100. Seven intergenotypes, including A/D (2), C/D (2), A/B/C (1), A/C (1) and A/G (1) hybrids, showed breakpoints in this region. Results of previous analyses of HBV partial sequences also supported this hot spot. Kato et al. (2002b) found that some isolates in genotype G had a mosaic area of genotype A in the preS1/S2 region. Chen et al. (2004, 2006) discovered that some HBV isolates of genotype B were integrated with genotype C sequences from patients co-infected with genotypes B and C. The recombination breakpoints were also mainly found in the preS1/S2 region.

The third recombination hot spot is located near the 3' end of the C gene, covering nt 2330–2485. Recombinants with breakpoints at this spot usually have associated breakpoints at the DR1 hot spot described above. About one-third (8 out of 25) of the intergenotypes, including A/D (2), A/B/C (2), B/C (3) and C/G (1) hybrids, were found to have breakpoints at this spot. Another region between nt 650 and 830 near the 3' end of the S gene was also a hot spot for recombination. Nine intergenotypes, including A/D (6), C/D (1), A/E (1) and B/C/U (1) hybrids, had breakpoints in this area. It was interesting that some of the A/D hybrids had breakpoints at the 6 bp insertion of genotype A (nt 2356–2361, IG-3) or at the 33 bp deletion of genotype D (nt 2855–2887, IG-2 and IG-4). This recalls the intergenotypes with in-frame deletions in certain areas, especially in the preS1/S2 region, such as IG-2, IG-4, IG-5, IG-21 and IG-11. The relationship between deletion and intergenotype recombination requires further investigation.
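Because the genome is circular, the preS1/S2 hot spot (nt 3150–100) wraps around the numbering origin, which is easy to mishandle when tallying breakpoints. The sketch below uses the four hot-spot ranges quoted in this section; the function names and the genome length of 3215 nt used in the usage lines are assumptions made for illustration (genome length differs between genotypes).

```python
from typing import Optional

# Hot-spot ranges (nt) as quoted in the text; the preS1/S2 range wraps.
HOT_SPOTS = {
    "DR1 vicinity": (1640, 1900),
    "preS1/S2": (3150, 100),
    "3' end of C gene": (2330, 2485),
    "3' end of S gene": (650, 830),
}

def in_circular_range(pos: int, start: int, end: int) -> bool:
    """True if pos lies in [start, end], allowing the range to wrap."""
    if start <= end:
        return start <= pos <= end
    return pos >= start or pos <= end  # range crosses the origin

def hot_spot_of(pos: int) -> Optional[str]:
    """Name of the hot spot containing a breakpoint position, if any."""
    for name, (start, end) in HOT_SPOTS.items():
        if in_circular_range(pos, start, end):
            return name
    return None

def circular_distance(a: int, b: int, length: int) -> int:
    """Shortest distance between two positions on a circular genome."""
    d = abs(a - b) % length
    return min(d, length - d)

print(hot_spot_of(3200), hot_spot_of(50))  # prints preS1/S2 preS1/S2
print(circular_distance(3200, 20, 3215))   # prints 35 (assumed 3215 nt genome)
```

The same circular distance would be the natural measure when asking how many breakpoints lie within 60 or 80 bp of a gene boundary.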

Recombination tendency of different genotypes
Genotypes of the 171 putative recombinants comprised genotypes A (24 genomes), B (120 genomes), C (18 genomes), D (6 genomes), F (1 genome) and G (2 genomes). Recombinants in genotypes E and H were not found in this study. Most of the putative recombinants were B/C (120) and A/D (29) hybrids. The other recombinants comprised A/B/C (2), A/C (1), A/E (1), A/G (1), C/D (10), C/F (1), C/G (1), C/U (4) and B/C/U (1) hybrids. Genotype A was found to recombine with genotypes B, C, D, E and G, and genotype C was found to recombine with genotypes A, B, D, F, G and U. However, the other genotypes were found to recombine only with genotypes A or C. It seems that genotypes A and C have a higher recombination tendency than the other genotypes do. Recently, genotype C was even found to have integrated into a chimpanzee HBV genome (Magiorkinis et al., 2005), while a human HBV genotype C isolate was suggested to contain a gibbon HBV mosaic fragment (Suwannakarn et al., 2005).
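The tallies above can be checked with a few lines of arithmetic. The sketch below recomputes the total, the combined share of B/C and A/D hybrids, and the set of partner genotypes for each genotype. The counts are those given in this section; the variable names are hypothetical, and treating every genotype in a hybrid label as a partner of every other is a simplifying assumption for three-way hybrids such as A/B/C and B/C/U.

```python
from collections import defaultdict
from itertools import permutations

# Number of putative recombinant genomes per hybrid type, as listed in the text.
hybrid_counts = {
    "B/C": 120, "A/D": 29, "A/B/C": 2, "A/C": 1, "A/E": 1, "A/G": 1,
    "C/D": 10, "C/F": 1, "C/G": 1, "C/U": 4, "B/C/U": 1,
}

total = sum(hybrid_counts.values())
major_share = (hybrid_counts["B/C"] + hybrid_counts["A/D"]) / total

# Partner genotypes: every genotype that shares a hybrid label with a given one.
# For three-way hybrids this is only a crude proxy for direct recombination.
partners = defaultdict(set)
for label in hybrid_counts:
    for first, second in permutations(label.split("/"), 2):
        partners[first].add(second)

if __name__ == "__main__":
    print(f"total recombinants: {total}")
    print(f"B/C + A/D share: {major_share:.1%}")
    for genotype in sorted(partners):
        print(genotype, sorted(partners[genotype]))
```

Under these assumptions, genotypes A and C have the largest partner sets, in line with the higher recombination tendency described above.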

It is interesting that partial sequences of one genotype tend to integrate into other genotypes in certain regions. Genotype C tends to recombine into other genotypes in the preC/C region (often covering part or all of the C gene) or the preS1/S2 region, as in IG-6, IG-7, IG-22 and IG-23 to IG-25 in the preC/C region, and the recombinants revealed by Chen et al. (2004, 2006) in the preC/C or preS1/S2 regions. Genotype A tends to integrate into other genotypes in the X/preC region (often covering part or all of the X gene) or the preS1/S2 region, as in IG-14 to IG-17 and IG-23 in the X/preC region, and IG-7, IG-8 and IG-21 in the preS1/S2 region. Kato et al. (2002b) also recorded the integration of genotype A into genotype G in the X/preC and preS1/S2 regions.

Geographical distribution of putative recombinants
The results showed that, in general, intergenotypes tend to have the same geographical distribution as their parent genotypes. A/D hybrids in genotype A are mainly recorded in India (IG-1 to IG-3) and South Africa (IG-4), while A/D hybrids in genotype D are only found in Italy (IG-14 to IG-17) and India (IG-18). B/C hybrids in genotype C (IG-9 to IG-11) can be found in China, Japan and Vietnam, but they are only sporadic examples. Other recombinants, including A/E (IG-19) in Cameroon, A/B/C (IG-7) in South Africa, A/C (IG-8) in China, C/F (IG-20) in Bolivia, A/G (IG-21) in America and C/G (IG-22) in Thailand, only occur sporadically.

However, the B/C hybrid in genotype B (IG-6) is one of the dominant subgenotypes in south and east Asia, whereas the prototype of genotype B has not been found outside Japan (Sugauchi et al., 2002). C/D hybrids (IG-12 to IG-13) are only found in the north-west of China, as well as Tibet, where C/D hybrids have even become the dominant HBV groups (Cui et al., 2002), but genotype D as a parent genotype is rarely found there.

Co-infection and recombination
How intergenotype recombination occurs remains unanswered. In general, recombination is supposed to result from co-infection with different genotypes (Fares & Holmes, 2002; Hannoun et al., 2002), though there is little chance for two HBV genomes to recombine, because only one pregenome is encapsidated for replication (Kato et al., 2002b). However, some cases of co-infection with different genotypes have been reported, such as co-infection with genotypes A and D (Bahn et al., 1997; Gerner et al., 1998), A and G (Kato et al., 2002a), and B and C (Chen et al., 2004, 2006). Intergenotype recombinations have been found to occur in co-infected sera, but in the cases reported so far, the sequences were only partial genome sequences. Our data have revealed a case with a complete genome sequence, an A/G hybrid found in sera co-infected by genotypes A and G.

Genotype G was found to co-infect frequently with other genotypes. All the isolates of genotype G discovered in San Francisco co-infected with genotype A (Kato et al., 2002a), while those from Canada co-infected with either genotype A or C (Osiowy & Giles, 2003). Genome AB056516 from San Francisco, an A/G hybrid in genotype G (IG-21), was shown to be derived from sera co-infected with isolates of genotypes G and A (Kato et al., 2002a). Therefore, it is suspected that co-infection caused the formation of AB056516.

Some intergenotypes of putative recombinants have become prevalent in certain areas, like B/C hybrids of genotype B (IG-6) in Asia (Sugauchi et al., 2002) and C/D hybrids of genotype C (IG-12 to IG-13) in north-west China (Cui et al., 2002; Wang et al., 2005), where one of the parent genotypes rarely exists. For these hybrids, recombination might have occurred a long time ago and co-infection with the parent genotypes is no longer found in those areas. For other hybrids, however, recombination might occur directly in patients co-infected with more than one genotype. Co-infections were found to occur frequently in certain patients, and such cases could be very helpful for understanding the mechanism of recombination.

Intergenotype recombination is a frequent event in HBV infection, especially in cases of co-infection. Further study is required to clarify the mechanism of recombination, the role of recombination in the heterogeneity of HBV and the outcome of recombinants, which might help in our understanding of HBV history and the variety of host responses.


   ACKNOWLEDGEMENTS
 
This work was funded by the Natural Science Foundation of Guangdong Province under Grant 04009785.


   REFERENCES
 
Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990). Basic local alignment search tool. J Mol Biol 215, 403–410.

Arauz-Ruiz, P., Norder, H., Robertson, B. H. & Magnius, L. O. (2002). Genotype H: a new Amerindian genotype of hepatitis B virus revealed in Central America. J Gen Virol 83, 2059–2073.

Bahn, A., Gerner, P., Martine, U., Bortolotti, F. & Wirth, S. (1997). Detection of different viral strains of hepatitis B virus in chronically infected children after seroconversion from HBsAg to anti-HBs indicating viral persistence. J Hepatol 27, 973–978.

Bollyky, P. L., Rambaut, A., Harvey, P. H. & Holmes, E. C. (1996). Recombination between sequences of hepatitis B virus from different genotypes. J Mol Evol 42, 97–102.

Bowyer, S. M. & Sim, J. G. M. (2000). Relationships within and between genotypes of hepatitis B virus at points across the genome: footprints of recombination in certain isolates. J Gen Virol 81, 379–392.

Bowyer, S. M., van Staden, L., Kew, M. C. & Sim, J. G. M. (1997). A unique segment of the hepatitis B virus group A genotype identified in isolates from South Africa. J Gen Virol 78, 1719–1729.

Chen, B. F., Kao, J. H., Liu, C. J., Chen, D. S. & Chen, P. J. (2004). Genotypic dominance and novel recombinations in HBV genotype B and C co-infected intravenous drug users. J Med Virol 73, 13–22.

Chen, B. F., Liu, C. J., Jow, G. M., Chen, P. J., Kao, J. H. & Chen, D. S. (2006). Evolution of Hepatitis B virus in an acute hepatitis B patient co-infected with genotypes B and C. J Gen Virol 87, 39–49.

Cui, C., Shi, J., Hui, L., Xi, H., Zhuoma, Quni, Tsedan & Hu, G. (2002). The dominant hepatitis B virus genotype identified in Tibet is a C/D hybrid. J Gen Virol 83, 2773–2777.

Fares, M. A. & Holmes, E. C. (2002). A revised evolutionary history of hepatitis B virus (HBV). J Mol Evol 54, 807–814.

Felsenstein, J. (1993). PHYLIP (phylogeny inference package), version 3.61. Department of Genetics, University of Washington, Seattle, USA.

Georgi-Geisberger, P., Berns, H., Loncarevic, I. F., Yu, Z.-Y., Tang, Z.-Y., Zentgraf, H. & Schroder, C. H. (1992). Mutations on free and integrated hepatitis B virus DNA in a hepatocellular carcinoma: footprint of homologous recombination. Oncology 49, 386–395.

Gerner, P. R., Friedt, M., Oettinger, R., Lausch, E. & Wirth, S. (1998). The hepatitis B virus seroconversion to anti-HBe is frequently associated with HBV genotype changes and selection of preS2-defective particles in chronically infected children. Virology 245, 163–172.

Hannoun, C., Norder, H. & Lindh, M. (2000). An aberrant genotype revealed in recombinant hepatitis B virus strains from Vietnam. J Gen Virol 81, 2267–2272.

Hannoun, C., Krogsgaard, K., Horal, P. & Lindh, M. (2002). Genotype mixtures of hepatitis B virus in patients treated with interferon. J Infect Dis 186, 752–759.

Hino, O., Tabata, S. & Hotta, Y. (1991). Evidence for increased in vitro recombination with insertion of human hepatitis B virus DNA. Proc Natl Acad Sci U S A 88, 9248–9252.

Kao, J. H., Chen, P. J., Lai, M. Y. & Chen, D. S. (2000). Hepatitis B genotypes correlate with clinical outcomes in patients with chronic hepatitis B. Gastroenterology 118, 554–559.

Kato, H., Orito, E., Gish, R. G., Sugauchi, F., Suzuki, S., Ueda, R., Miyakawa, Y. & Mizokami, M. (2002a). Characteristics of hepatitis B virus isolates of genotype G and their phylogenetic differences from the other six genotypes (A through F). J Virol 76, 6131–6137.

Kato, H., Orito, E., Gish, R. G. & 7 other authors (2002b). Hepatitis B e antigen in sera from individuals infected with hepatitis B virus of genotype G. Hepatology 35, 922–929.

Kurbanov, F., Tanaka, F., Fujiwara, K. & 11 other authors (2005). A new subtype (subgenotype) Ac (A3) of hepatitis B virus and recombination between genotypes A and E in Cameroon. J Gen Virol 86, 2047–2056.

Lee, W. M. (1997). Hepatitis B virus infection. N Engl J Med 337, 1733–1745.

Lole, K. S., Bollinger, R. C., Paranjape, R. S., Gadkari, D., Kulkarni, S. S., Novak, N. G., Ingersoll, R., Sheppard, H. W. & Ray, S. C. (1999). Full-length human immunodeficiency virus type 1 genomes from subtype C-infected seroconverters in India, with evidence of intersubtype recombination. J Virol 73, 152–160.

Luo, K., Liu, Z., He, H., Peng, J., Liang, W., Dai, W. & Hou, J. (2004). The putative recombination of hepatitis B virus genotype B with pre-C/C region of genotype C. Virus Genes 29, 31–41.

Magiorkinis, E. N., Magiorkinis, G. N., Paraskevis, D. N. & Hatzakis, A. E. (2005). Re-analysis of a human hepatitis B virus (HBV) isolate from an East African wild born Pan troglodytes schweinfurthii: evidence for interspecies recombination between HBV infecting chimpanzee and human. Gene 349, 165–171.

Magnius, L. O. & Norder, H. (1995). Subtypes, genotypes and molecular epidemiology of the hepatitis B virus as reflected by sequence variability of the S-gene. Intervirology 38, 24–34.

Mayerat, C., Mantegani, A. & Frei, P. C. (1999). Does hepatitis B virus (HBV) genotype influence the clinical outcome of HBV infection? J Viral Hepat 6, 299–304.

Morozov, V., Pisareva, M. & Groudinin, M. (2000). Homologous recombination between different genotypes of hepatitis B virus. Gene 260, 55–65.

Norder, H., Couroucé, A.-M. & Magnius, L. O. (1992). Molecular basis of hepatitis B virus serotype variations within the four major subtypes. J Gen Virol 73, 3141–3145.

Norder, H., Hammas, B., Lee, S. D., Bile, K., Couroucé, A. M., Mushahwar, I. K. & Magnius, L. (1993). Genetic relatedness of hepatitis B viral strains of diverse geographical origin and natural variations in the primary structure of the surface antigen. J Gen Virol 74, 1341–1348.

Norder, H., Couroucé, A.-M. & Magnius, L. O. (1994). Complete genomes, phylogenetic relatedness, and structural proteins of six strains of the hepatitis B virus, four of which represent two new genotypes. Virology 198, 489–503.

Norder, H., Couroucé, A.-M., Coursaget, P., Echevarria, J. M., Lee, S.-D., Mushahwar, I. K., Robertson, B. H., Locarnini, S. & Magnius, L. O. (2004). Genetic diversity of hepatitis B virus strains derived worldwide: genotypes, subgenotypes, and HBsAg subtypes. Intervirology 47, 289–309.

Okamoto, H., Imai, M., Kametani, M., Nakamura, T. & Mayumi, M. (1987). Genomic heterogeneity of hepatitis B virus in a 54-year-old woman who contracted the infection through materno-fetal transmission. Jpn J Exp Med 57, 231–236.

Okamoto, H., Tsuda, F., Sakugawa, H., Sastrosoewignjo, R. I., Imai, M., Miyakawa, Y. & Mayumi, M. (1988). Typing hepatitis B virus by homology in nucleotide sequence: comparison of surface antigen subtypes. J Gen Virol 69, 2575–2583.

Osiowy, C. & Giles, E. (2003). Evaluation of the INNO-LiPA HBV genotyping assay for determination of hepatitis B virus genotype. J Clin Microbiol 41, 5473–5477.

Owiredu, W., Kramvis, A. & Kew, M. (2001). Hepatitis B virus DNA in serum of healthy black African adults positive for hepatitis B surface antibody alone: possible association with recombination between genotypes A and D. J Med Virol 64, 441–454.

Pineau, P., Marchio, A., Mattei, M.-G., Kim, W.-H., Youn, J.-K., Tiollais, P. & Dejean, A. (1998). Extensive analysis of duplicated inverted hepatitis B virus integrations in human hepatocellular carcinoma. J Gen Virol 79, 591–600.

Robertson, D. L., Hahn, B. H. & Sharp, P. M. (1995). Recombination in AIDS viruses. J Mol Evol 40, 249–259.

Simmonds, P. & Midgley, S. (2005). Recombination in the genesis and evolution of hepatitis B virus genotypes. J Virol 79, 15467–15476.

Stuyver, L., De Gendt, S., Van Geyt, C., Zoulim, F., Fried, M., Schinazi, R. F. & Rossau, R. (2000). A new genotype of hepatitis B virus: complete genome and phylogenetic relatedness. J Gen Virol 81, 67–74.

Sugauchi, F., Orito, E., Ichida, T. & 9 other authors (2002). Hepatitis B virus of genotype B with or without recombination with genotype C over the precore region plus the core gene. J Virol 76, 5985–5992.

Sugauchi, F., Orito, E., Ichida, T. & 10 other authors (2003). Epidemiologic and virologic characteristics of hepatitis B virus genotype B having the recombination with genotype C. Gastroenterology 124, 925–932.

Summers, J. & Mason, W. S. (1982). Replication of the genome of a hepatitis B-like virus by reverse transcription of an RNA intermediate. Cell 29, 403–415.

Suwannakarn, K., Tangkijvanich, P., Theamboonlers, A., Abe, K. & Poovorawan, Y. (2005). A novel recombinant of Hepatitis B virus genotypes G and C isolated from a Thai patient with hepatocellular carcinoma. J Gen Virol 86, 3027–3030.

Thompson, J. D., Higgins, D. G. & Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22, 4673–4680.

Wang, Z., Liu, Z., Zeng, G., Wen, S., Qi, Y., Ma, S., Naoumov, N. V. & Hou, J. (2005). A new intertype recombinant between genotypes C and D of hepatitis B virus identified in China. J Gen Virol 86, 985–990.

Yuasa, R., Takahashi, K., Dien, B. V. & 9 other authors (2000). Properties of hepatitis B virus genome recovered from Vietnamese patients with fulminant hepatitis in comparison with those of acute hepatitis. J Med Virol 61, 23–68.

Received 9 December 2005; accepted 30 March 2006.

