Review article
Molecular Virology Laboratory, Department of Medical Microbiology, Center of Infectious Diseases, Leiden University Medical Center, LUMC P4-26, PO Box 9600, 2300 RC Leiden, The Netherlands
Correspondence
Eric J. Snijder
e.j.snijder{at}lumc.nl
Published online ahead of print on 23 February 2006 as DOI 10.1099/vir.0.81611-0.
*When its definition is followed to the letter, the term transcription (i.e. the process by which genetic information encoded in one strand of DNA is copied into a complementary RNA strand) does not apply to the synthesis of sg mRNAs by nidoviruses and other RNA viruses. Nevertheless, there is a clear functional parallel (production of RNA templates for protein synthesis) and the term transcription has been used in studies on coronavirus sg mRNA synthesis from the very start. Consequently, for the purpose of this review and regardless of the lack of a DNA template, we will use the term transcription for the synthesis of sg plus strands (sg mRNAs). Genome amplification, which results in the production of a full-length mRNA, will be referred to as replication.
The increasing complexity of the nidovirus group
Nidoviruses are a group of enveloped positive-stranded RNA viruses. Currently known representatives mostly infect mammals (coronaviruses, toroviruses and arteriviruses), but also have avian (coronaviruses) or invertebrate (roniviruses) hosts. Nidoviruses cause a variety of diseases, the outcome of which can range from an asymptomatic, persistent carrier-state to a sometimes fatal infection. The severity of coronavirus infection is exemplified by severe acute respiratory syndrome (SARS) in humans, which was caused by a newly emerged coronavirus that gripped worldwide attention in 2003 (Drosten et al., 2003; Ksiazek et al., 2003; Peiris et al., 2003). In the wake of the SARS outbreak, several other novel coronaviruses, including two that infect humans (van der Hoek et al., 2004; Fouchier et al., 2004; Woo et al., 2005), were identified and added to the growing list of nidoviruses that were first characterized during the past two decades.
During that same period of time, the systematic sequence analysis of virus genomes has changed the face of virus taxonomy. With the rise of virus genetics and molecular virology, it has become clear that comparative sequence analysis will provide the most solid basis for future systems for virus classification. In addition, common strategies underlying the organization of viral genomes and common mechanisms for the regulation of viral genome expression have been recognized and have strengthened the case for using a genetic basis for virus taxonomy.
The current order Nidovirales is a perfect example of these developments. Since 1996, it officially unites the families Coronaviridae (genus Coronavirus and genus Torovirus) and Arteriviridae, which were initially considered to be completely unrelated. More recently, the new family Roniviridae was included, expanding the order into the domain of invertebrate hosts (prawns). The taxon derives its name from the common nidovirus strategy to express all genes located downstream of the replicase gene from a 3' co-terminal nested set of subgenomic (sg) mRNAs (nidus in Latin means nest; Fig. 1
). However, the most compelling reason for nidovirus unification was found in the large replicase gene itself. In phylogenetic analyses of key replicase domains, including the RNA-dependent RNA polymerase (RdRp) and helicase, different nidovirus subgroups were found to cluster, suggesting that they share a common ancestor (Gorbalenya et al., 1989
; Snijder et al., 1990a
; den Boon et al., 1991
; Cowley et al., 2000
; for reviews see Cavanagh, 1997
; de Vries et al., 1997
; Gonzalez et al., 2003
; Snijder et al., 2005
).
Some 15 years ago, the detection of 3' co-terminal nested sets of sg mRNAs in cells infected with the arterivirus Equine arteritis virus (EAV; de Vries et al., 1990) and the torovirus Berne virus [now known as Equine torovirus (EToV); Snijder et al., 1990b] provided the first indication for a connection to coronaviruses, a family whose intriguing mechanism for sg mRNA transcription* had attracted attention for quite some time. Transcription is essentially the functional connection between the homologous and non-homologous parts of the nidovirus genome. By directing the synthesis of sg mRNAs, the conserved replicase gene controls the expression of the variable set of downstream genes. Studies into the mechanism of nidovirus transcription have been based on a combination of biochemical, genetic and molecular biological approaches and were stimulated by the development of systems allowing mutagenesis of regulatory sequences. These studies have clearly benefited from the characterization of virus groups distantly related to coronaviruses, but were also complicated by it. For example, the recent finding that, in contrast to coronavirus and arterivirus sg mRNAs, the sg transcripts of toroviruses and roniviruses do not have a common 5' leader sequence illustrates that many details of sg mRNA production remain to be explored. The purpose of this review is to compare and contrast regulatory RNA elements and the molecular mechanisms of transcription in different nidoviruses, and to discuss these in the context of the current models for transcription.
Prototypic nidoviruses and their life cycle
For obvious reasons, the best-studied nidoviruses are those that either cause disease in humans, rodents or livestock and/or are most easily amplified in cell culture. An example is the arterivirus prototype EAV, which has the smallest (known) arterivirus/nidovirus genome (12.7 kb). This was a clear advantage for the development in 1996 of a reverse genetics system based upon a full-length cDNA clone from which infectious RNA can be produced (van Dinten et al., 1997
). In the family Coronaviridae (Lai & Cavanagh, 1997
; Lai & Holmes, 2001
), Murine hepatitis virus (MHV) has been extensively studied in terms of its molecular biology and, in particular, its transcription mechanism, a research area that will certainly benefit from the availability of MHV-reverse genetics systems recently developed in the laboratories of Ralph Baric (Yount et al., 2002
), and Stuart Siddell and Volker Thiel (Coley et al., 2005
). The transcription of Transmissible gastroenteritis virus (TGEV) of swine, for which Luis Enjuanes and co-workers developed the first coronavirus full-length cDNA clone (Almazan et al., 2000
), has also been studied in some detail. EToV (Snijder & Horzinek, 1993
) is the sole torovirus representative, thus far, that can be amplified and studied in cultured cells. In the new family Roniviridae (Roni stands for rod-shaped nidovirus), only the genome of Gill-associated virus (GAV) (Cowley et al., 2000
, 2002
) has been fully sequenced and it is this virus for which an initial analysis of transcription was performed. Our knowledge of nidovirus transcription is largely based on studies with the limited number of model viruses mentioned above. Also, given the relatively large evolutionary distances between nidovirus subgroups, it is clear that data should be interpreted with caution when translating them to distant nidovirus clusters. This is true in particular for arteriviruses, in view of the size difference between their genome/replicase gene and those of other nidoviruses. For these reasons, the descriptions and conclusions that follow cannot be considered as definitive for all nidoviruses.
Following entry and uncoating, translation of the nidovirus genome (Fig. 1
) is initiated at the replicase ORF1a start codon. The replicase gene comprises two large open reading frames (1a and 1b) that are connected by a -1 ribosomal frameshift site. Ribosomal frameshifting, promoted and coordinated by specific RNA signals, results in the C-terminal extension of a relatively small fraction of ORF1a-encoded polypeptides (pp1a) with the ORF1b-encoded polypeptide, which contains the most conserved enzymic functions, including the RdRp. The pp1a and pp1ab replicase polyproteins are co- and post-translationally processed by two to four autoproteinases that reside in pp1a. The number of mature replicase products (non-structural proteins or nsps) ranges from 12 or 13 in arteriviruses to 16 in most coronaviruses (see Ziebuhr et al., 2000
; Snijder et al., 2003
and references therein). Guided by a number of ORF1a-encoded subunits that contain hydrophobic domains, most of the nidovirus replicase proteins associate with modified intracellular membranes to form a membrane-bound complex for RNA synthesis (for reviews see Lai & Holmes, 2001
; Siddell et al., 2005
and references therein), a common feature of animal positive-strand RNA viruses.
Recognition of RNA signals near the 3' end of the viral genome by the RdRp complex precedes the initiation of synthesis of full-length minus-strand RNA (or anti-genome), which in turn is the template for the synthesis of novel genome RNA. For the latter process, recognition signals present near the 3' end of the anti-genome must be used (for reviews see Lai & Holmes, 2001
; Siddell et al., 2005
and references therein). Viral RNA synthesis is asymmetric and produces much more plus- than minus-strand RNA (Sawicki et al., 2001
). In addition to being utilized for boosting replicase expression and its own amplification, the newly synthesized nidovirus genome is also believed to be the template for the production of sg-length minus strands that are used as templates for transcription (see below). Also the incoming genomic RNA may serve as template for the production of sg-length minus strands, as it was possible to experimentally uncouple transcription from replication (Schelle et al., 2005
). Elegant biochemical studies by Stanley and Dorothea Sawicki and coworkers (Sawicki et al., 2001
), using the coronavirus MHV as a model, showed that both anti-genome and sg-length minus strands are produced very early in infection. Each sg mRNA is produced from a corresponding transcription intermediate that contains the sg-length minus-strand template. These complexes synthesize the various sg mRNAs in non-equimolar, but relatively constant amounts. Also the ratio of the synthesis of genome to sg mRNAs is constant throughout the replication cycle (Sawicki et al., 2001
; Sawicki & Sawicki, 2005
).
Nidovirus sg transcripts serve to express the structural proteins (and in the case of coronaviruses a variety of accessory proteins) from genes located in the 3'-proximal third of the genome, which are not accessible for ribosomes engaged in genome translation (Fig. 1
). Although each mRNA, with the exception of the smallest one, is structurally polycistronic, they are functionally monocistronic and, with some exceptions, only the 5'-proximal ORF is translated. Ultimately, the newly synthesized genomes are encapsidated by the nucleocapsid (N) protein and progeny virions acquire their envelope by budding into the lumen of membranes of the endoplasmic reticulum-to-Golgi pathway (for a general overview of the nidovirus life cycle see Snijder et al., 2005
and references therein).
The structure of nidovirus sg mRNAs: having a leader or not...
A principal feature of members of Nidovirales, although not unique for this virus order alone, is the generation of an extensive 3' co-terminal nested set of mRNAs from which the 3'-proximal region of the polycistronic genome is expressed. It was an early discovery that coronavirus sg transcripts are not only 3' co-terminal but also contain a common 5' leader sequence of about 65–100 nt, which is derived from the 5' end of the genome (Lai et al., 1982a
, 1983
; Spaan et al., 1982
; Fig. 1
). In some respects their mosaic nature resembles that of eukaryotic mRNAs generated by splicing. However, coronavirus replication occurs in the cytoplasm of the infected cell, and consequently the generation of sg mRNAs by fusion of sequences that are not contiguous in the genome was an unexpected feature. UV transcription inactivation studies essentially ruled out that the bulk of coronavirus sg RNA molecules were formed post-transcriptionally by cis-splicing of a genome-length precursor molecule (Jacobs et al., 1981
; Stern & Sefton, 1982
; Yokomori et al., 1992
).
Following the identification of a short conserved sequence (later termed transcription-regulating sequence, TRS; Fig. 1
), which is located at (i) the 3' end of the common leader, (ii) the 5' end of each mRNA body segment, and (iii) the leader-to-body fusion site in the sg mRNA, it became apparent that base pairing between plus- and minus-sense copies of this regulatory sequence might direct the co-transcriptional fusion of sg mRNA leader and body in a process of discontinuous transcription (Baric et al., 1983
; Spaan et al., 1983
; Lai et al., 1984
). The initial studies on arterivirus transcription essentially corroborated these findings (de Vries et al., 1990
; den Boon et al., 1995
, 1996
), although the arterivirus leader sequence and TRS were substantially longer and shorter, respectively, than the corresponding elements in coronaviruses.
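The structure described above (a common 5' leader ending in the leader TRS, fused to a body segment that starts at a downstream body TRS) can be illustrated with a minimal Python sketch. The toy genome, the 7 nt motif and the function names are hypothetical and chosen for illustration only; real leader and TRS lengths differ between viruses, and the sketch says nothing about how the fusion is achieved.

```python
def find_all(seq: str, motif: str) -> list[int]:
    """Return the start index of every occurrence of motif in seq."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def nested_sg_mrnas(genome: str, trs: str) -> list[str]:
    """Build toy sg mRNAs: leader (up to and including the first TRS copy)
    fused to the sequence downstream of each later (body) TRS copy."""
    sites = find_all(genome, trs)
    if not sites:
        return []
    leader = genome[: sites[0] + len(trs)]
    return [leader + genome[pos + len(trs):] for pos in sites[1:]]


# Hypothetical toy genome: leader, replicase stand-in, two downstream genes.
TRS = "UCUAAAC"  # illustrative motif, not a claim about any specific virus
TOY_GENOME = "ACGG" + TRS + "GGGGGGGG" + TRS + "AAAA" + TRS + "CCCC" + "UUUU"
SG_MRNAS = nested_sg_mrnas(TOY_GENOME, TRS)

for mrna in SG_MRNAS:
    print(mrna)
```

Every product shares the same 5' leader and the same 3' end, and each is shorter than the previous one, which is the nested-set arrangement in miniature.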
Although a preliminary analysis of torovirus mRNAs (EToV) had already suggested the absence of common 5'-terminal sequences (Snijder et al., 1990b
), it was the analysis of ronivirus sg transcripts and the more detailed analysis of EToV transcription, which firmly established that, apparently, discontinuous transcription is not a universal feature of nidoviruses. A leader sequence derived from the genomic 5' end was not found on the sg mRNAs of the ronivirus GAV (Cowley et al., 2002
). Even more surprising was the outcome of a thorough study on the 5'-terminal sequences of EToV mRNAs by Raoul de Groot and co-workers: whereas sg mRNAs 3, 4 and 5 of this virus lack a common 5' end, sg mRNA2 was shown to possess a short 5' leader sequence identical to the 5' end of the genome (van Vliet et al., 2002
; see below).
The discontinuous step in coronavirus and arterivirus transcription: during plus- or minus-strand RNA synthesis?
Over the years, different models have been proposed to explain the fusion of the common 5' leader sequence to the different 3' body segments present in arterivirus and coronavirus sg mRNAs (previously reviewed by e.g. Sawicki & Sawicki, 1995
, 2005
; van der Most & Spaan, 1995
; Lai & Cavanagh, 1997
; Brian & Spaan, 1997
; Snijder & Meulenberg, 1998
; Lai & Holmes, 2001
). Almost all models assume co-transcriptional fusion of leader and body, but the controversy whether the discontinuous step in RNA synthesis operates during plus- or minus-strand synthesis has kept the field busy and divided for years. In the two most prominent, but opposing models (commonly referred to as leader-primed transcription and discontinuous extension of minus-strand RNA synthesis), the TRS elements play a key role (Fig. 2
). Both models include a base-pairing interaction between the sense copy of the TRS in the genomic leader (leader TRS) and the antisense copy of the TRSs present at the 5' end of each of the sg mRNA body segments (anti-body TRS). According to the leader-primed transcription model (Fig. 2a
; Baric et al., 1983
; Spaan et al., 1983
; Lai et al., 1984
), transcription is initiated from the 3' end of the anti-genome to produce a leader primer of which the 3'-terminal leader TRS would base pair to the various anti-body TRSs in the anti-genome. Subsequently, the leader transcript would be extended to complete the sg mRNA. Thus, this model proposes that the discontinuous step takes place during plus-strand synthesis and that the body TRS complements in the anti-genome essentially act as promoters for transcription.
Subsequently, Stanley and Dorothea Sawicki (Sawicki & Sawicki, 1995
) proposed discontinuous extension of minus-strand RNA synthesis as an alternative (Fig. 2b
). Not plus-, but minus-strand sg RNA synthesis was proposed to be discontinuous, with attenuation of RNA synthesis occurring in the different body TRS regions of the genomic template. The nascent sg-length minus strand, having an anti-body TRS at its 3' end, would then base pair with the leader TRS, be completed by extension with the anti-leader sequence and subsequently serve as template for transcription. In recent years, this model has gained considerable experimental support from both biochemical and genetic studies (van Marle et al., 1999a
; Baric & Yount, 2000
; de Vries et al., 2001
; Sawicki et al., 2001
; Pasternak et al., 2001
; Zuniga et al., 2004
).
It should be noted that leader-priming and attenuation of minus-strand synthesis at body TRSs are not mutually exclusive. Formally, the possibility that sg mRNAs are formed by leader-primed transcription from the attenuated sg-length minus-strand templates cannot be ruled out (van der Most et al., 1994
).
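As a rough illustration of the discontinuous minus-strand extension model, the Python sketch below copies a toy plus-strand template from its 3' end up to a body TRS, transfers the nascent strand to the leader TRS, and extends it with the anti-leader. The sequences and the function `sg_minus_strand` are hypothetical; the sketch only tracks which template segments end up in the product and makes no claim about the enzymology.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def revcomp(rna: str) -> str:
    """Reverse complement, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def sg_minus_strand(genome: str, trs: str, body_site: int) -> str:
    """Toy discontinuous extension of a minus strand (returned 5'->3')."""
    if genome[body_site:body_site + len(trs)] != trs:
        raise ValueError("body_site must point at a copy of the TRS")
    leader_site = genome.find(trs)  # first TRS copy is taken as the leader TRS
    # Step 1: copy the template from its 3' end up to and including the body TRS.
    nascent = revcomp(genome[body_site:])
    # The 3' end of the nascent strand is now the anti-body TRS.
    assert nascent.endswith(revcomp(trs))
    # Step 2: after pairing with the leader TRS, extend with the anti-leader.
    anti_leader = revcomp(genome[:leader_site])
    return nascent + anti_leader


TRS = "UCUAAAC"  # illustrative motif only
GENOME = "ACGG" + TRS + "GGGGGGGG" + TRS + "AAAA" + TRS + "CCCC" + "UUUU"
MINUS = sg_minus_strand(GENOME, TRS, body_site=GENOME.rfind(TRS))
SG_MRNA = revcomp(MINUS)  # plus strand transcribed from the sg-length template
print(SG_MRNA)
```

In this toy setting the plus strand copied from the sg-length minus strand carries the leader, a single TRS copy at the junction, and the body, which is the product both models have to account for.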
Attenuation of minus-strand RNA synthesis: the common step in nidovirus transcription?
The transcription mechanism involving a discontinuous step, as described above for coronaviruses and arteriviruses, is unique and contrasts with the mechanism used by many other positive-strand RNA viruses that produce sg mRNAs by internal initiation from 'promoters' in the anti-genome (Miller et al., 1985
; reviewed by Miller & Koev, 2000
). However, similarities may exist to the mechanism used by a smaller group of viruses, exemplified by Flock house virus, Tomato bushy stunt virus and Red clover necrotic mosaic virus, that employ premature termination (attenuation) of minus-strand RNA synthesis (reviewed by White, 2002
). These viruses also produce sg-length minus strands that serve as templates for transcription, but discontinuous extension of the nascent minus-strand RNA does not occur.
Strikingly, this is exactly the mechanism that may apply to the transcription of ronivirus sg mRNAs and all but the largest of the torovirus sg mRNAs (Fig. 3
). Although reports on the minus-strand RNAs (genome length or sg-length) produced by these two virus groups have not yet been published, one might speculate that minus-strand RNA synthesis is attenuated, as in coronaviruses and arteriviruses, to produce sg-length templates for transcription. This would explain the conservation in torovirus and ronivirus genomes of the apparent equivalents of the body TRSs found in coronaviruses and arteriviruses (Snijder et al., 1990b
; Cowley et al., 2002
; van Vliet et al., 2002
). Instead of extension of the nascent minus strand with the anti-leader sequence (Fig. 3a
), roniviruses and toroviruses may directly use the attenuated minus-strand products as templates for transcription (Fig. 3b
). Consequently, if sg-length minus strands indeed exist in torovirus- and ronivirus-infected cells, it may be the attenuation step during minus-strand RNA synthesis that is the common denominator in nidovirus transcription.
Although representative to a certain extent, such replicon systems may only partly reflect the transcriptional processes as they occur in the coronavirus-infected cell. The first and most important drawback is the generally low transcriptional activity of DI RNA-based body TRSs. The body TRSs in the coronavirus DI RNA-based replicon systems reported to date were generally unable to promote the transcription of amounts of sg mRNA comparable to those produced from the same body TRSs residing in the full-length genome of the parental virus. In fact, most DI RNA-based body TRSs produce one or two orders of magnitude less sg mRNAs. Consequently, it is not entirely clear if and how the conclusions obtained with such replicon systems can be extrapolated to transcription from the full-length virus genome. Whereas most point mutants of an MHV body TRS introduced into a DI RNA-based replicon supported transcription levels approaching those of the original TRS (Joo & Makino, 1992
; van der Most et al., 1994
), TRS point mutations could decrease transcription about 100-fold in the context of the EAV full-length cDNA clone system (van Marle et al., 1999a
; Pasternak et al., 2001
) and up to 1000-fold in the TGEV infectious clone system (see below; Zuniga et al., 2004
). Secondly, due to the high frequency of RNA recombination with the helper virus genome in coronavirus DI RNA replicon systems, it was not possible to test the influence of leader TRS mutations on transcription, a problem that was readily overcome by the development of full-length molecular clones (van Marle et al., 1999a
; Pasternak et al., 2001
, 2003
; Zuniga et al., 2004
). Third, although a number of studies have addressed the effects of flanking sequences on body TRS activity in DI RNA replicon systems, the overall structural context of these regulatory sequences, and not just the immediate flanking sequences, may be important for their transcriptional activity (see below).
An alternative to the DI RNA replicon-based studies of coronavirus transcription was developed by Paul Masters and co-workers. Targeted homologous recombination was used to introduce an additional body TRS into the MHV genome (Hsue & Masters, 1999
). However, compared with modification of an infectious cDNA clone, this system is quite laborious, does not allow modification of the genomic leader and its TRS either, and has the important disadvantage that only viable recombinants can be analysed, since recombinant virus needs to be grown and passaged prior to analysis. Consequently, a first cycle analysis of transcription is not possible, making this system impractical to e.g. support large-scale TRS mutagenesis studies.
Clearly, the construction of full-length cDNA clones for different nidoviruses (van Dinten et al., 1997
; Meulenberg et al., 1998
; de Vries et al., 2001
; Almazan et al., 2000
; Thiel et al., 2001
; Casais et al., 2001
; Yount et al., 2002
, 2003
; Coley et al., 2005
) was a major step towards development of more straightforward systems for transcription research. The reverse genetics system for the arterivirus EAV, for example, permits direct transfection of in vitro synthesized EAV full-length RNA into BHK-21 cells with reasonable efficiency, permitting first cycle analysis of viral RNA synthesis and rapid screening for specific virus phenotypes using conventional biochemical assays. Full-length cDNA copies of the TGEV genome were used for the first extensive mutagenesis studies targeting coronavirus transcription in the context of the full-length genome (Zuniga et al., 2004
; Curtis et al., 2004
; Sola et al., 2005
). Although the currently available systems for coronavirus reverse genetics frequently require additional passaging of mutant viruses before analysis, this problem has been circumvented by targeting the TRSs of sg mRNAs whose translation product is not essential for virus replication in cell culture.
The TRS: key regulatory element in nidovirus transcription
The nidovirus TRS is an AU-rich element that was often called 'intergenic sequence' or 'leader–body junction site' in the older literature on coronaviruses and arteriviruses, respectively. The extent of sequence identity between the coronavirus leader TRS region and the different body TRSs ranges from 7 to 18 nt, whereas the corresponding sequence of arteriviruses usually is between 5 and 8 nt long. In toroviruses, a conserved 12 nt long sequence element is present upstream of ORFs 3, 4 and 5. Although, as explained above, there are important mechanistic differences with coronaviruses and arteriviruses, it was recently shown that a 16 nt cassette containing this torovirus TRS can direct the transcription of an sg mRNA when inserted into an EToV DI RNA-based replicon (Smits et al., 2005
).
As discussed above, in the 1990s, similar DI RNA-based systems formed the platform for the initial mutagenesis studies aimed at understanding the function of coronavirus body TRSs and the regulation of their activity. Shinji Makino and co-workers (Makino et al., 1991
) inserted the MHV RNA7 body TRS and its flanking sequences into a DI RNA replicon, resulting in the transcription of a DI RNA-derived sg mRNA. This system was used for site-directed mutagenesis of body TRS and flanking sequences to delimitate cis-acting elements regulating transcription. Using this system, Joo & Makino (1992)
identified the leader–body fusion sites of DI RNA-derived sg mRNAs. The MHV body TRSs are 5'-AAUCUAAAC-3' (or a closely related sequence), whereas the 3' end of the leader contains two to four 5'-UCUAA-3' repeats (depending on the virus strain; Makino et al., 1988
). The mRNAs were found to contain two to four 5'-UCUAA-3' repeats, depending on alternative base-pairing possibilities within the duplex (Makino et al., 1988
). Joo & Makino (1992)
found that the first 5'-UCUAA-3' sequence was contributed by the leader and that leader-to-body fusion most likely took place at the first or the second nucleotide of the second repeat. Using a similar system, van der Most et al. (1994)
concluded that the leader–body junction occurs at multiple sites within the duplex, with a preference for 3'-proximal nucleotides within the body TRS. The same authors showed that the contribution of the leader TRS to the sg mRNA can be as little as 3 nt. At that time, these data were interpreted in the context of the leader-primed transcription model and a variant involving a back-trimming mechanism (Baker & Lai, 1990), which was proposed to employ a 3'→5' nuclease to trim the leader sequence prior to its extension with the mRNA body.
Studies using the EAV full-length cDNA clone system (van Marle et al., 1999a
; Pasternak et al., 2001
, 2003
) have rigorously demonstrated that transcription depends on leader TRS–body TRS duplex formation (Fig. 4), and have yielded initial information concerning additional RNA and protein determinants of arterivirus transcription. More recently these conclusions were corroborated for the coronavirus TGEV using mutagenesis of a full-length cDNA clone (Zuniga et al., 2004
). It was shown that disruption of the base-pairing interaction dramatically affects transcription and that the introduction of compensatory mutations could restore activity (see also below and Fig. 4
). Generally, these studies revealed that the relative amount of sg mRNA in both arteriviruses and coronaviruses correlates with the calculated stability of the corresponding leader TRS-body TRS duplex. However, this is clearly not the only factor determining sg mRNA amounts. Sequences flanking the core TRS were also shown to influence transcription in the context of the TGEV full-length genome (Curtis et al., 2004
; Sola et al., 2005
), confirming the earlier results obtained with DI RNA replicons (van der Most et al., 1994
; Jeong et al., 1996
; An & Makino, 1998
; Ozdarendeli et al., 2001
; Alonso et al., 2002
). In addition to the effect of flanking sequences, it should be noted that when core TRS and TRS-flanking sequences were standardized by head-to-tail insertion of several copies of a body TRS-containing cassette into an EAV full-length clone-derived replicon, a perfect gradient of sg mRNA abundance, progressively favouring smaller RNA species, was observed (Pasternak et al., 2004
). This confirmed earlier theoretical work on the MHV genome (Konings et al., 1988
) and studies with coronavirus DI RNA replicon systems (van Marle et al., 1995
; Joo & Makino, 1995
; Krishnan et al., 1996
), which have shown that relative order and/or location of TRSs in the genome play an important role.
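One simple way to see why identical body TRS cassettes could yield a gradient favouring smaller sg RNAs is a toy attenuation model. The Python sketch below assumes, purely for illustration, that minus-strand synthesis starting at the genomic 3' end stops at each body TRS it meets with the same probability `p`; this uniform probability and the function name are assumptions of the sketch and are not taken from the studies cited above.

```python
def attenuation_gradient(n_sites: int, p: float) -> tuple[list[float], float]:
    """Toy model of minus-strand attenuation at identical body TRSs.

    Sites are numbered from the genomic 3' end (index 0 gives the smallest sg RNA).
    Returns the fraction of minus strands ending at each site and the fraction
    that reads through all sites to genome length.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    # To end at site k the polymerase must pass k earlier sites, then stop.
    fractions = [p * (1.0 - p) ** k for k in range(n_sites)]
    full_length = (1.0 - p) ** n_sites
    return fractions, full_length


# Hypothetical numbers: four identical cassettes, 30 % attenuation per site.
FRACTIONS, FULL_LENGTH = attenuation_gradient(n_sites=4, p=0.3)
for index, fraction in enumerate(FRACTIONS):
    print(f"site {index} (from 3' end): {fraction:.3f}")
print(f"genome-length: {FULL_LENGTH:.3f}")
```

Under this assumption each successive upstream site receives fewer polymerases, so the smallest sg RNA is the most abundant. Real body TRSs differ in sequence and context, so the model is a sketch of the positional effect only.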
Cis-acting signals directing nidovirus transcription: primary sequence or higher order RNA structure?
At first glance, the sequence conservation of nidovirus body TRSs might suggest sequence-specific recognition by a protein factor. Indeed, the RdRps of some positive-strand RNA viruses are able to bind to sg promoters in a sequence-specific manner, the best studied example being the RdRp of Brome mosaic virus (BMV) (Siegel et al., 1997
, 1998
; Adkins et al., 1997
; Stawicki & Kao, 1999
). However, BMV transcription is initiated internally on the genome-length negative strand (Miller et al., 1985
), a mechanism obviously requiring distinct cis-acting signals and trans-acting factors, but clearly different from the transcription models proposed for nidoviruses.
In recent years, an increasing number of base-pairing interactions, often spanning more than 1000 nt, have been implicated in transcription or replication of various bacterial, plant and animal viral systems (Klovins et al., 1998
; Zhang et al., 1999
; Kim & Hemenway, 1999
; Choi et al., 2001
; Lindenbach et al., 2002
; Choi & White, 2002
; Lin & White, 2004
). In one extreme case, transcription was found to require an intermolecular RNA–RNA interaction (Sit et al., 1998). Interestingly, compensatory mutations simultaneously introduced in the distal and proximal regulatory elements of the Potato virus X could restore transcription, but not to wild-type levels, indicating that both this long-distance RNA–RNA interaction and the sequence of the proximal elements are involved (Kim & Hemenway, 1999).
The latter situation is strikingly similar to the results obtained during leader TRS–body TRS co-variation mutagenesis studies with EAV (Pasternak et al., 2001
) and TGEV (Zuniga et al., 2004
). EAV studies provided genetic evidence that body TRSs have a specific function in transcription distinct from the formation of a duplex with the leader TRS (Pasternak et al., 2001
). Transcription defects caused by a subset of body TRS point mutations could not be restored by introduction of the compensatory mutation in the leader TRS (Fig. 4b
). Strikingly, despite their primary structure conservation, these body TRS-specific requirements apparently differ between body TRSs. For example, the EAV RNA6 body TRS contains a C at position 6, whereas this nucleotide is absolutely not tolerated at the same position in the RNA7 body TRS. Moreover, whereas the U1A substitution in the latter allowed for 40 % of mRNA7 transcription and could not be rescued by the compensatory leader TRS mutation (Pasternak et al., 2001
), the same substitution in the RNA6 body TRS caused an almost complete shutdown of transcription, which could be efficiently rescued in the double mutant (Fig. 4c
; D. D. Nedialkova, A. O. Pasternak & E. J. Snijder; unpublished data). Also the fact that a mutant carrying five mutations (5'-UCAACU-3' → 5'-AGUUGU-3') in leader TRS and RNA7 body TRS could support a certain level of mRNA7 transcription argues against sequence-specific recognition of the body TRS by a protein factor (van Marle et al., 1999a
). However, this does not exclude the involvement of a regulatory protein factor (Pasternak et al., 2004
) that would recognize e.g. higher order, rather than primary, structural motifs in body TRS regions. Furthermore, whereas it was found that the extent of base pairing and the stability of the leader TRS–body TRS duplex play an important role in regulation of transcription (Pasternak et al., 2003
; Zuniga et al., 2004
; Sola et al., 2005
), some of the leader TRS mutations tested for EAV (Pasternak et al., 2001
) were much more deleterious than would be expected under the assumption that they only destabilized the duplex (see Pasternak et al., 2003
, for details). Although, according to the free-energy threshold concept proposed for TGEV (Zuniga et al., 2004
; Sola et al., 2005
), a minimum free-energy value of the leader TRS–body TRS duplex may be required to promote strand transfer, the transcription mechanism may require base pairing of central (core) nucleotides of the TRS to properly position the nascent strand on the template before elongation. Hence, conservation of the TRS primary structure may serve to guarantee the fidelity of the strand transfer process, which could be driven by protein factors bound to secondary structure motifs in the leader TRS and/or body TRS regions.
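The logic of the co-variation experiments can be sketched in a few lines of Python. The example counts Watson-Crick pairs between a plus-sense leader TRS and the minus-sense copy of a body TRS, using the 6 nt EAV sequence 5'-UCAACU-3' quoted above; the position-1 mutant and the helper names are illustrative. The pair count is only a crude stand-in for duplex stability (it is not a free-energy calculation and ignores G-U wobble pairs), and, as discussed in the text, restoring base pairing did not always restore transcription.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anti_trs(trs: str) -> str:
    """Minus-sense copy of a TRS, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(trs))


def watson_crick_pairs(leader_trs: str, body_trs: str) -> int:
    """Count Watson-Crick pairs in the antiparallel duplex formed by the
    plus-sense leader TRS and the minus-sense (anti-body) TRS."""
    if len(leader_trs) != len(body_trs):
        raise ValueError("TRS sequences must have equal length")
    anti_body = anti_trs(body_trs)
    # Leader position i faces anti-body position (length - 1 - i).
    return sum(
        COMPLEMENT[leader_base] == anti_base
        for leader_base, anti_base in zip(leader_trs, reversed(anti_body))
    )


WILD_TYPE = "UCAACU"  # EAV TRS as quoted in the text
MUTANT = "ACAACU"     # illustrative U-to-A change at position 1

PAIRS_WT = watson_crick_pairs(WILD_TYPE, WILD_TYPE)     # intact duplex
PAIRS_SINGLE = watson_crick_pairs(WILD_TYPE, MUTANT)    # body TRS mutated only
PAIRS_DOUBLE = watson_crick_pairs(MUTANT, MUTANT)       # compensatory leader change
print(PAIRS_WT, PAIRS_SINGLE, PAIRS_DOUBLE)
```

A single body TRS mutation removes one pair and the matching leader TRS mutation restores it. Comparing such predictions with measured sg mRNA levels is what separates duplex-dependent effects from body TRS-specific ones.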
Secondary structure predictions of the arterivirus leader region (van Marle et al., 1999a
; van den Born et al., 2004
) place the leader TRS in a hairpin loop structure referred to as the leader TRS hairpin or LTH (Fig. 5a
). A similar conformation can be predicted for most coronaviruses and previously the existence of a leader TRS-presenting hairpin in Bovine coronavirus was firmly supported by structure-probing experiments (Chang et al., 1996
). Recent studies, in which the EAV LTH was duplicated to allow its mutagenesis without affecting replication and genome translation, support a critical role for this domain in transcription. The leader TRS duplicate yielded novel sg mRNAs with significantly extended leaders. Furthermore, a construct with two functional LTHs was able to produce a perfect double-nested set of sg mRNAs (van den Born et al., 2005
).
Transcription of the largest sg mRNA of toroviruses, which carries a short leader derived from the genomic 5' end, was proposed to involve an RNA structure in the body TRS region. An RNA hairpin is predicted to be present in the genomic template, just upstream of the proposed attenuation site for minus-strand synthesis (Fig. 5b
). A short piece of sequence at this position is identical to a sequence in the genomic 5'-proximal region and appears to serve as a crossover site for the acquisition of the 18 nt leader sequence (van Vliet et al., 2002
). RNA structure predictions have also been made for the body TRS regions of EAV (Pasternak et al., 2000
) and of Bovine coronavirus (Ozdarendeli et al., 2001
). Remarkably, both studies predicted the active TRSs in the plus-strand genomic template to be located in non-base-paired regions, whereas the less active or non-active TRSs are predicted to be completely or partially base paired. None of these predictions, however, could be verified experimentally (Ozdarendeli et al., 2001
; Pasternak, 2003
).
In addition to leader and body TRSs, the 3' end of the viral genome is obviously crucial for both transcription and replication. Although based on data obtained with DI RNA replicons, it was concluded that
300 nt at the MHV genomic 3' end are required for transcription (Lin et al., 1996
), whereas only the 3'-proximal 55 nt of the same domain [and the poly(A) sequence] are required for initiation of negative-stranded RNA synthesis (Lin et al., 1994
). This difference suggests the presence of a transcription-specific cis-acting sequence at the 3' end of the viral RNA.
Viral proteins involved in nidovirus transcription
Components of the nidovirus replicase are obvious candidates for a role as regulatory factors in transcription. In addition to the core proteins of the RNA-synthesizing machinery, accessory protein functions specifically involved in transcription may have evolved that are either common to nidoviruses or specific for certain nidovirus subgroups. In the case of arteriviruses, nsp1 and nsp10 have been implicated in transcription (van Dinten et al., 1997
; van Marle et al., 1999b
; Tijms et al., 2001
; Tijms, 2004
; Seybert et al., 2005
). Remarkably, both proteins contain (putative) zinc-binding domains, which may facilitate RNA–protein interactions (van Dinten et al., 2000
; Tijms et al., 2001
). Mutations in the conserved putative zinc-binding domain of nsp1 either abolish transcription completely, without dramatic effects on genome replication, or seem to influence the balance between replication and transcription (Tijms et al., 2001
; M. A. Tijms, J. C. Zevenhoven, A. E. Gorbalenya & E. J. Snijder, unpublished data). Interestingly, zinc finger structures in the HIV nucleocapsid (NC) protein facilitate nascent DNA-strand transfer between RNA templates (Guo et al., 2000
). A zinc finger domain in the p23 protein of Citrus tristeza virus (CTV), a plant closterovirus with a genome organization remarkably similar to that of nidoviruses, was shown to mediate the activity of this protein in asymmetrical RNA accumulation in CTV-infected plants (Satyanarayana et al., 2002
).
Structure–function studies of the coronavirus replicase were promoted by the recent SARS outbreak. Bioinformatic analyses by Alexander Gorbalenya and others identified five domains with distant relationships to cellular enzymes involved in RNA metabolism (Snijder et al., 2003
; von Grotthuss et al., 2003
; Yan et al., 2003
). Particularly relevant to viral RNA synthesis was the analysis of an array of conserved domains in the C-terminal region of pp1ab (now named nsp14, nsp15 and nsp16), which were predicted to possess 3'→5' exonuclease (ExoN), uridylate-specific endoribonuclease (NendoU) and S-adenosylmethionine-dependent ribose 2'-O-methyltransferase activities. Strikingly, all three domains are conserved in coronaviruses, toroviruses and roniviruses, but only the NendoU domain is also found in arteriviruses (nsp11). It is this domain, which may function as a homohexamer (Guarino et al., 2005
), for which experimental studies recently confirmed Mn²⁺-dependent in vitro endoribonuclease activity and importance for virus replication (Ivanov et al., 2004
; Bhardwaj et al., 2004
). Although the precise determinants of NendoU specificity remain to be determined, a preference for cleavage at specific uridylate-containing sequences in dsRNA was reported and an interesting hypothesis of the involvement of NendoU in transcription was formulated (Ivanov et al., 2004
). In the EAV reverse genetics system, NendoU mutagenesis exerted pleiotropic effects on viral RNA synthesis (Posthuma et al., 2006
). While some mutations rendered RNA synthesis undetectable, others induced only a moderate reduction, with sg RNA synthesis consistently being more strongly affected than genome replication.
In coronaviruses, one or more of the small replicase subunits from the C-terminal region of pp1a (nsp7–10), which do not appear to have a counterpart in arteriviruses, may be directly or indirectly involved in viral RNA synthesis. The MHV RdRp-containing subunit (nsp12) was proposed to interact with nsp8 and nsp9 (Brockway et al., 2003
). Structural and biochemical studies characterized SARS-CoV nsp9 as a single-stranded RNA-binding protein displaying a new variant of the OB-fold (Egloff et al., 2004
; Sutton et al., 2004
). Furthermore, SARS-CoV nsp7 and nsp8 were recently reported to form a unique hexadecameric structure that is probably capable of encircling RNA and may operate as an accessory factor for the RdRp complex (Zhai et al., 2005
). Recently, nsp10 was also implicated in viral RNA synthesis, on the basis of the phenotype of an MHV mutant carrying a temperature-sensitive mutation in this subunit (Sawicki et al., 2005
; see below).
The latter elegant study by Stanley Sawicki, Stuart Siddell and colleagues also underlines that, in addition to reverse genetics, classical forward genetics, based on the use of virus mutants with a conditional defect in RNA synthesis, is a powerful approach to dissect the complex web of protein interactions that governs nidovirus RNA synthesis. Using MHV, the only nidovirus for which a substantial set of temperature-sensitive mutants is available (Sturman et al., 1987
; Schaad et al., 1990
; Sawicki et al., 2005
), mutations in six different nsps were identified (nsp4, nsp5, nsp10, nsp12, nsp14 and nsp16), which under restrictive conditions all interfered with the assembly of a functional RdRp complex. Different types of RNA synthesis defects were observed, including the inability to synthesize minus-strand RNA (nsp10 mutation) and the apparent inability to switch from minus- to plus-strand RNA synthesis (interestingly caused by a mutation in the C-terminal domain of nsp5, the viral main proteinase subunit). The detailed characterization of these mutant phenotypes, and the possibility of hunting for second site revertants, may provide valuable information on the role of different nidovirus nsps in RNA synthesis and key functional interactions within the RdRp complex.
Finally, the emerging role of the N protein in coronavirus RNA synthesis should be mentioned. In arteriviruses this role seems to be non-existent, since neither replication nor transcription is affected by inactivation of N protein expression (Molenkamp et al., 2000b
; Pasternak et al., 2001
). The coronavirus N protein, on the other hand, has long been implicated in regulation of RNA synthesis and was, for example, reported to interact specifically with the leader TRS in the case of MHV (Nelson et al., 2000
). Recent studies in reverse genetic systems revealed that the N protein is required for the efficient initiation of replication following transfection of in vitro generated Human coronavirus 229E (HCoV-229E) infectious RNA (Schelle et al., 2005
) or of a plasmid expressing a TGEV replicon (Almazan et al., 2004
). Strikingly, using a replicon system for HCoV-229E, it was found that replication, but not transcription, was impaired in the absence of the N protein, suggesting that this structural protein may be involved in regulating the balance between these two processes (Schelle et al., 2005
).
In general, whereas it can be hard to identify cis-acting higher-order RNA structure motifs and to prove their function by reverse genetics (Pasternak, 2003
), it should be possible to identify viral and/or host proteins that interact with body TRS regions. For example, yeast three-hybrid systems (Sengupta et al., 1999
) and biochemical assays in a cell-free environment can help to identify such factors, and their role may subsequently be confirmed in vivo by reverse genetics.
Coronavirus and arterivirus transcription: a variant of similarity-assisted RNA recombination?
The transcription mechanism described above for arteriviruses and coronaviruses, where both leader–body TRS duplex formation and a distinct function of the body TRS are required for proper transfer of the nascent strand, would resemble the mechanism of the typical similarity-assisted (copy-choice) RNA recombination described for e.g. Turnip crinkle virus (TCV). In this system, both sequence similarity between the parental strands and an RdRp-binding hairpin in the acceptor strand are necessary for strand transfer (Nagy et al., 1998
; Nagy & Simon, 1998a
, b
; for a review see Nagy & Simon, 1997
). Indeed, coronavirus and arterivirus discontinuous RNA synthesis has been proposed to resemble high-frequency copy-choice RNA recombination (see e.g. Spaan et al., 1983
; Chang et al., 1996
; Brian & Spaan, 1997
; van Marle et al., 1999a
and references therein). In cells simultaneously infected with two different MHV strains, up to half of the mRNAs may carry a leader sequence from the co-infecting strain (Makino et al., 1986
), suggesting free-leader exchange during transcription and a mechanism that can join leader and body sequences derived from different templates (Zhang & Lai, 1994
; Jeong & Makino, 1994
). Later, Britton and colleagues demonstrated highly efficient leader switching during the rescue of defective RNAs by heterologous strains of the coronavirus Infectious bronchitis virus, suggesting that the discontinuous step may be a part of the normal DI RNA replication cycle (Stirrups et al., 2000
). Like retroviral reverse transcriptases (Peliska & Benkovic, 1992
, 1994
; Wu et al., 1995
), coronavirus and arterivirus RdRps may be especially prone to switch templates, producing recombinant genomes in the case of homologous recombination (Lai et al., 1985
; Keck et al., 1988a
, b
; Liao & Lai, 1992
; Molenkamp et al., 2000a
; Pasternak et al., 2004
) or sg RNAs in the special case of discontinuous transcription.
The production of minor sg mRNAs from non-canonical body TRS-like sequences of non-viral origin (Fischer et al., 1997
; de Vries et al., 2001
; Curtis et al., 2002
), as well as induction of minor sg mRNAs produced from non-canonical TRSs upon mutagenesis of the leader TRS (Zhang & Lai, 1994
; Pasternak et al., 2003
; Zuniga et al., 2004
) are consistent with a high template switching frequency of coronavirus and arterivirus polymerases. It has been reported that recombination in both virus groups occurs more frequently at the 3' end of the genome (Fu & Baric, 1992
, 1994
; Molenkamp et al., 2000a
), suggesting that body TRSs may be recombination hot spots. Likewise, Luteovirus sg promoters were proposed to be recombination hot spots (Koev et al., 1999
), and hairpins required for TCV recombination served as replication enhancers (Nagy et al., 1999
). This suggests that RNA recombination may be driven by factors and/or signals similar to those that drive discontinuous transcription in nidoviruses.
In several viral systems, 5' termini in donor templates and stable hairpin structures in both donor and acceptor templates have been shown to promote RNA recombination (Cascone et al., 1990
, 1993
; Nagy & Bujarski, 1993
; Carpenter et al., 1995
; White & Morris, 1995
; Nagy et al., 1998
). Also, sg promoters in several positive-strand RNA viruses include RNA secondary structure motifs (Wang et al., 1999
; Koev et al., 1999
; Haasnoot et al., 2000
). It is not clear whether all such structures interact with specific regulatory proteins or whether they might just mechanistically promote template-switching or pausing. Nagy et al. (1995)
have shown that mutations in the helicase-like domain of BMV protein 1a could alter the sites of RNA recombination. It is interesting that EAV nsp10, which also plays a specific role in transcription (see above), possesses helicase activity (Seybert et al., 2000
), although it remains to be studied whether this activity is directly involved in transcription.
(Dis)continuous transcription: to jump or not to jump?
The question whether arterivirus and coronavirus transcription is truly discontinuous (involving a jump of nascent strand and/or RdRp from one template site to the other) or quasi-continuous, as suggested by Zhang et al. (1994)
, is an interesting unresolved issue. The latter mechanism was proposed to involve looping-out of the template and formation of a triple-stranded intermediate, possibly mediated by viral or host proteins, in a manner similar to DNA-dependent transcription for which promoter and enhancer elements are also brought together by protein–protein interactions (Lai, 1998
). In the first model, the RdRp complex is assumed to temporarily dissociate from template and possibly also nascent strand, whereas in the second model it remains associated with the RNA strands. In the first model, the body TRSs in the plus-strand template would behave as terminators of RNA synthesis, whereas in the second model they would only promote pausing. A key parameter in this debate is the processivity of the nidovirus RdRp complex, which is unknown. As in the case of copy-choice RNA recombination (Jarvis & Kirkegaard, 1991
; Nagy & Simon, 1997
), a non-processive polymerase would tend to dissociate from the template, whereas a processive one would be more likely to remain associated with it. The non-processive model for RNA recombination, as proposed for TCV (Cascone et al., 1990
, 1993
), implies that donor or nascent strand would contain termination signals and the acceptor strand would contain signa